Shallow copy and deep copy in Python

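While writing code today I ran into a strange problem, described as follows:

The code declares a list and passes it as an argument to function1(), which uses del on the list to remove one element.

function2() also takes the list as an argument. Calling function2() after function1() caused a bug, because the values in the list had already been changed.

Here is the code: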

list = [0, 1, 2, 3, 4, 5]


def function1(list):
    del list[1]
    print(list)


def function2(list):
    print(list)


function1(list)
function2(list)

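I did not want the list seen by function2() to change. The fix I found is to make a copy of the list, before function1() modifies it, and pass that copy to function2():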

newList = list.copy()
function2(newList)
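
For the copy to help, it has to be taken before function1() runs; copying afterwards only duplicates the already modified list. Below is a minimal sketch of the complete fix. The names `numbers`, `snapshot`, `result1` and `result2` are made up for illustration (they also avoid shadowing the built-in `list`), and the functions return their result instead of printing it so the effect is easy to check:

```python
def function1(items):
    # Mutates the list object it receives, so the caller sees the change too.
    del items[1]
    return items


def function2(items):
    # Only reads the list and returns its contents as a new list.
    return list(items)


numbers = [0, 1, 2, 3, 4, 5]
snapshot = numbers.copy()      # shallow copy taken before any mutation

result1 = function1(numbers)   # numbers is modified in place
result2 = function2(snapshot)  # snapshot still holds the original values

print(result1)  # [0, 2, 3, 4, 5]
print(result2)  # [0, 1, 2, 3, 4, 5]
```

A shallow copy is enough here because the list holds only integers; with nested lists the distinction discussed in the rest of this post starts to matter.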

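While looking for a solution I found that there is also a function called deepcopy(). So the question is: what is the difference between deepcopy() and copy()?

I first opened the source of the copy module and was happy to find a docstring there. It reads: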

"""Generic (shallow and deep) copying operations.

Interface summary:

        import copy

        x = copy.copy(y)        # make a shallow copy of y
        x = copy.deepcopy(y)    # make a deep copy of y

For module specific errors, copy.Error is raised.

The difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances).

- A shallow copy constructs a new compound object and then (to the
  extent possible) inserts *the same objects* into it that the
  original contains.

- A deep copy constructs a new compound object and then, recursively,
  inserts *copies* into it of the objects found in the original.

Two problems often exist with deep copy operations that don't exist
with shallow copy operations:

 a) recursive objects (compound objects that, directly or indirectly,
    contain a reference to themselves) may cause a recursive loop

 b) because deep copy copies *everything* it may copy too much, e.g.
    administrative data structures that should be shared even between
    copies

Python's deep copy operation avoids these problems by:

 a) keeping a table of objects already copied during the current
    copying pass

 b) letting user-defined classes override the copying operation or the
    set of components copied

This version does not copy types like module, class, function, method,
nor stack trace, stack frame, nor file, socket, window, nor array, nor
any similar types.

Classes can use the same interfaces to control copying that they use
to control pickling: they can define methods called __getinitargs__(),
__getstate__() and __setstate__().  See the documentation for module
"pickle" for information on these methods.
"""

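The two safeguards the docstring describes can be tried out directly. The sketch below first deep-copies a list that contains itself, then defines a hypothetical `Document` class (not part of the copy module; `cache` stands in for data that copies should keep sharing) that overrides `__deepcopy__`:

```python
import copy

# A list that contains itself: a "recursive object" in the docstring's terms.
loop = [1, 2]
loop.append(loop)

loop_copy = copy.deepcopy(loop)   # terminates thanks to the memo table

print(loop_copy is loop)          # False: a new list was built
print(loop_copy[2] is loop_copy)  # True: the cycle points at the copy itself


class Document:
    """Hypothetical class used only for illustration."""

    def __init__(self, lines, cache):
        self.lines = lines
        self.cache = cache

    def __deepcopy__(self, memo):
        # Deep-copy the lines, but keep sharing the same cache object.
        return Document(copy.deepcopy(self.lines, memo), self.cache)


doc = Document([["a"], ["b"]], cache={})
doc_copy = copy.deepcopy(doc)

print(doc_copy.lines[0] is doc.lines[0])  # False: lines were copied
print(doc_copy.cache is doc.cache)        # True: cache is still shared
```
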
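After reading it, though, I was still completely confused, so I went back to Baidu to look for more material:

https://iaman.actor/blog/2016/04/17/copy-in-python is an excellent summary.

copy() is simply a shallow copy; its counterpart is the deep copy.

Conclusions:

1. For simple objects, there is no difference between a shallow copy and a deep copy.

>>> import copy
>>> origin = 1
>>> cop1 = copy.copy(origin)  # cop1 is a shallow copy of origin
>>> cop2 = copy.deepcopy(origin)  # cop2 is a deep copy of origin
>>> origin = 2
>>> origin
2
>>> cop1
1
>>> cop2
1
# neither cop1 nor cop2 changes its value along with origin
>>> cop1 == cop2
True
>>> cop1 is cop2
True
# an int is immutable, so both copy() and deepcopy() hand back the same object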

2. For compound objects, such as a list nested inside a list, the sub-list in a shallow copy is not truly independent of the original object.

If you change an element of a sub-list in the original object, your copy changes with it. This differs from our intuitive understanding of "copying".

>>> import copy
>>> origin = [1, 2, [3, 4]]
# origin contains three elements: 1, 2, [3, 4]
>>> cop1 = copy.copy(origin)
>>> cop2 = copy.deepcopy(origin)
>>> cop1 == cop2
True
>>> cop1 is cop2
False
# cop1 and cop2 look the same, but they are not the same object
>>> origin[2][0] = "hey!"
>>> origin
[1, 2, ['hey!', 4]]
>>> cop1
[1, 2, ['hey!', 4]]
>>> cop2
[1, 2, [3, 4]]
# one element of origin's sub-list [3, 4] was changed; compare cop1 and cop2

cop1, the shallow copy, changed along with origin, while cop2, the deep copy, did not.
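
One way to see why is to compare object identity with `is` instead of comparing values. A small sketch along the lines of the session above (the variable names on the left of each check are made up for illustration):

```python
import copy

origin = [1, 2, [3, 4]]
cop1 = copy.copy(origin)      # shallow copy
cop2 = copy.deepcopy(origin)  # deep copy

# Both copies are new outer lists...
outer_is_new = cop1 is not origin and cop2 is not origin
# ...but only the shallow copy still holds the very same inner list.
shallow_shares_inner = cop1[2] is origin[2]
deep_shares_inner = cop2[2] is origin[2]

print(outer_is_new)          # True
print(shallow_shares_inner)  # True
print(deep_shares_inner)     # False
```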

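That raises another question: if deepcopy exists, why not just use it everywhere? Why do we need copy at all?

To answer this we have to start with how Python stores variables. In Python, rather than saying a value is assigned to a variable, it is more accurate to say that the variable is bound to a reference to a concrete value.

>>> a = [1, 2, 3]
>>> b = a
>>> a = [4, 5, 6]  # assign a new value to a
>>> a
[4, 5, 6]
>>> b
[1, 2, 3]
# after a's value changed, b did not change with it

>>> a = [1, 2, 3]
>>> b = a
>>> a[0], a[1], a[2] = 4, 5, 6  # change the elements of the original list
>>> a
[4, 5, 6]
>>> b
[4, 5, 6]
# after a's value changed, b changed along with it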

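Both snippets change the value of a. The difference is that the first assigns a new value to a, while the second changes the elements of the list directly.

Here is the explanation for this odd-looking behavior:

First, think of [1, 2, 3] as an item. a = [1, 2, 3] sticks a label named a onto this item, and b = a sticks a second label, b, onto the same item.

Case 1:

a = [4, 5, 6] peels the label a off [1, 2, 3] and sticks it onto [4, 5, 6].

During this process the item [1, 2, 3] does not disappear. The label b stays on [1, 2, 3] the whole time; since this reference never changes, the value of b naturally does not change either.

Case 2:

a[0], a[1], a[2] = 4, 5, 6 changes the item [1, 2, 3] itself, refitting each of its internal parts. Once that is done, the item that was [1, 2, 3] has become [4, 5, 6].

Throughout this process neither a nor b moves; both labels are still stuck to that same item. So the values of both a and b become [4, 5, 6].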
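
The label picture can be checked with the `is` operator, which tells whether two names are attached to the same object. A short sketch of both cases (the `shared_...` and `b_after_...` names are made up for illustration):

```python
a = [1, 2, 3]
b = a
shared_at_start = a is b        # True: two labels on one list object

# Case 1: rebinding. The label a moves to a brand new list.
a = [4, 5, 6]
shared_after_rebind = a is b    # False
b_after_rebind = list(b)        # [1, 2, 3], untouched

# Case 2: mutation. The single shared list is edited in place.
a = [1, 2, 3]
b = a
a[0], a[1], a[2] = 4, 5, 6
shared_after_mutation = a is b  # True
b_after_mutation = list(b)      # [4, 5, 6]

print(shared_at_start, shared_after_rebind, b_after_rebind)
print(shared_after_mutation, b_after_mutation)
```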

Now consider copy.copy(). It turns out that the original and its copy are not fully independent: sometimes changing one also changes the other. This is the example shown earlier in this post:

>>> import copy
>>> origin = [1, 2, [3, 4]]
# origin contains three elements: 1, 2, [3, 4]
>>> cop1 = copy.copy(origin)
>>> cop2 = copy.deepcopy(origin)
>>> cop1 == cop2
True
>>> cop1 is cop2
False
# cop1 and cop2 look the same, but they are not the same object
>>> origin[2][0] = "hey!"
>>> origin
[1, 2, ['hey!', 4]]
>>> cop1
[1, 2, ['hey!', 4]]
>>> cop2
[1, 2, [3, 4]]
# one element of origin's sub-list [3, 4] was changed; compare cop1 and cop2

The official explanation:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

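The two kinds of copy differ only for compound objects, that is, objects that contain other objects (such as nested lists and class instances).

In the new compound object created by a shallow copy, each child simply refers to the corresponding child object of the original. The compound object created by a deep copy instead stores copies of the original's child objects, and this copying continues recursively all the way down.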
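
The list examples in this post cover only one kind of compound object; class instances behave the same way. A minimal sketch, assuming a hypothetical `Team` class invented purely for illustration:

```python
import copy


class Team:
    """Hypothetical class used only for illustration."""

    def __init__(self, name, members):
        self.name = name
        self.members = members


team = Team("core", ["ann", "bob"])
shallow = copy.copy(team)      # new Team object, same members list
deep = copy.deepcopy(team)     # new Team object, new members list

team.members.append("cat")     # mutate the list held by the original

print(shallow.members)  # ['ann', 'bob', 'cat']
print(deep.members)     # ['ann', 'bob']
```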

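Look at the shallow copy first. As the figure shows, cop1 is a mirror image of origin as it was at that moment: wherever an element of origin points, the corresponding element of cop1 points there too. This is what the official documentation means by "inserts references into it to the objects found in the original".

The key here is origin[2], the list [3, 4]. By the definition of a shallow copy, cop1[2] refers to the very same list [3, 4]. So if we change this list, both origin and cop1 change. That is why, after origin[2][0] = "hey!" above, cop1 also became [1, 2, ['hey!', 4]].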
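
Note that only changes made inside the shared sub-list show up in the shallow copy. Replacing a top-level element of origin just re-points one slot of origin and leaves cop1 alone. A small sketch of the contrast:

```python
import copy

origin = [1, 2, [3, 4]]
cop1 = copy.copy(origin)

origin[0] = "changed"  # re-points slot 0 of origin only
origin[2].append(5)    # mutates the inner list that both outer lists share

print(origin)  # ['changed', 2, [3, 4, 5]]
print(cop1)    # [1, 2, [3, 4, 5]]
```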