python3-cookbook Notes: Chapter 4, Iterators and Generators

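Each recipe in python3-cookbook is split into a problem, a solution, and a discussion. Together they explore the best way to solve a class of problems in Python 3, or, put differently, how to make better use of Python 3's own data structures, functions, classes, and other features for that class of problems. The book is a real help for deepening your understanding of Python 3 and improving your Python programming skills, and especially for learning how to improve the performance of Python programs. If you have the time, I strongly recommend reading it.
These are study notes. They cover only the parts of the book that I picked out based on my own work needs and everyday usage, and most of the sample code is pasted directly from the original text. That said, most of the code has been verified in a Python 3.6 environment. Different fields of programming care about different things, so read the full text if you are interested.
python3-cookbook: https://python3-cookbook.readthedocs.io/zh_CN/latest/index.html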

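4.2 Delegating Iteration

To make an object that is not iterable support iteration, just define an __iter__() method on it. __iter__() must return an iterator object, that is, an object that implements __next__().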

class Node:
    """A Node is similar to a tree node"""
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        # iter(s) simply returns the underlying iterator by calling s.__iter__(), much as len(s) calls s.__len__()
        return iter(self._children)


if __name__ == '__main__':
    root = Node(0)
    child1 = Node(1)
    child2 = Node(2)
    root.add_child(child1)
    root.add_child(child2)
    # Print each child node of the current node
    for ch in root:
        print(ch)

Node(1)
Node(2)
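
What the for loop does with such an object can be spelled out by hand. The following is only a minimal sketch; the Basket class and its contents are hypothetical names made up for illustration and are not from the book.

```python
class Basket:
    """Hypothetical container that delegates iteration to an internal list."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # Delegate to the iterator of the internal list
        return iter(self._items)


basket = Basket(['apple', 'pear'])
it = iter(basket)            # calls basket.__iter__()
first = next(it)             # calls it.__next__()
second = next(it)
exhausted = next(it, None)   # the default is returned instead of raising StopIteration
```

A for loop performs the same iter() and next() calls internally and stops when StopIteration is raised.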

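4.4 Implementing the Iterator Protocol

To control how an object is iterated, the simplest approach is to define a generator function with yield. Note that outside a for loop, an object that only defines __iter__() (such as Node) must first be passed to iter() before next() can be called on it; the generator object returned by a generator function is already an iterator and needs no such conversion. The sample code below defines a depth-first generator method on a tree structure: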

class Node:
    """A Node is similar to a tree node"""

    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        # Return an iterator over the child nodes
        return iter(self._children)

    def depth_first(self):
        """Traverse the nodes depth-first"""
        # Define a generator with yield
        yield self
        for c in self:
            # Note that this is yield from
            yield from c.depth_first()


if __name__ == '__main__':
    root = Node(0)
    child1 = Node(1)
    child2 = Node(2)
    root.add_child(child1)
    root.add_child(child2)
    child1.add_child(Node(3))
    child1.add_child(Node(4))
    child2.add_child(Node(5))
    # Traverse the nodes depth-first
    for ch in root.depth_first():
        print(ch)

Node(0)
Node(1)
Node(3)
Node(4)
Node(2)
Node(5)
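
To see why a generator is the simplest route, it may help to compare it with an iterator written by hand. This is only an illustrative sketch: CountdownIterator and countdown are hypothetical names, not code from the book.

```python
class CountdownIterator:
    """Hypothetical hand-written iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self._n = n

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self._n <= 0:
            # Signal the end of iteration
            raise StopIteration
        value = self._n
        self._n -= 1
        return value


def countdown(n):
    """The same behaviour written as a generator function."""
    while n > 0:
        yield n
        n -= 1


by_hand = list(CountdownIterator(3))
by_generator = list(countdown(3))

gen = countdown(2)
first = next(gen)  # a generator object is already an iterator
```

Both versions produce the same values; the generator just lets Python keep track of the state that the class has to store explicitly.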

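4.7 Taking a Slice of an Iterator

To slice an iterator, or take just one segment of it, use itertools.islice. Note that this consumes the iterator: the items that islice skips or returns cannot be visited again, because an iterator cannot be rewound.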

>>> def count(n):
    while True:
        yield n
        n += 1

        
>>> c = count(0)
>>> c[10:20]
Traceback (most recent call last):
  File "<pyshell#105>", line 1, in <module>
    c[10:20]
TypeError: 'generator' object is not subscriptable
>>> import itertools
>>> for x in itertools.islice(c, 10, 20):
    print(x)

    
10
11
12
13
14
15
16
17
18
19
>>>
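
The consumption mentioned above can be observed directly. A small sketch, assuming itertools.count(0) as a stand-in for the count(n) generator defined in the session:

```python
from itertools import count, islice

c = count(0)                        # yields 0, 1, 2, ... like count(n) above
window = list(islice(c, 10, 20))    # discards items 0..9, then takes 10..19
after = next(c)                     # iteration resumes where islice stopped
```

Here window holds the numbers 10 through 19 and after is 20; the first twenty items are gone from c for good.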

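4.8 Skipping the First Part of an Iterable

To skip some leading elements while iterating over an iterable, use itertools.dropwhile and pass it a function and the iterable. It discards elements for as long as the function returns true and yields everything from the first failing element onward. If you know the exact index to start from, itertools.islice works as well.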

>>> from itertools import dropwhile, islice
>>> items = ['a', 'b', 'c', 1, 4, 10, 15]
>>> for x in dropwhile(lambda i: isinstance(i, str), items):
    print(x)

    
1
4
10
15
>>> for x in islice(items, 3, None):
    print(x)

    
1
4
10
15
>>>
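
One detail worth illustrating is that dropwhile only affects the start of the iterable. In this sketch the lines list is made-up sample data:

```python
from itertools import dropwhile

# Hypothetical file content: leading comment lines followed by data
lines = ['# header', '# more header', 'data 1', '# inline comment', 'data 2']

# Only the leading comment lines are dropped; later ones are kept
body = list(dropwhile(lambda line: line.startswith('#'), lines))
```

The result keeps '# inline comment', because the predicate is no longer consulted once the first non-matching element has been seen.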

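4.11 Iterating Over Multiple Sequences Simultaneously

The built-in zip is often convenient, but it stops as soon as the shortest sequence is exhausted. To iterate until the longest sequence is exhausted, use itertools.zip_longest(), which fills the missing values with None by default.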

>>> a = [1, 2, 3]
>>> b = ['w', 'x', 'y', 'z']
>>> for i in zip(a,b):
    print(i)

    
(1, 'w')
(2, 'x')
(3, 'y')
>>> from itertools import zip_longest
>>> for i in zip_longest(a, b):
    print(i)

    
(1, 'w')
(2, 'x')
(3, 'y')
(None, 'z')
>>>
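
If None is not a suitable placeholder, zip_longest accepts a fillvalue argument. A short sketch reusing the same two lists:

```python
from itertools import zip_longest

a = [1, 2, 3]
b = ['w', 'x', 'y', 'z']

# Pad the shorter sequence with 0 instead of the default None
pairs = list(zip_longest(a, b, fillvalue=0))
```

The last pair is then (0, 'z') rather than (None, 'z').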

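4.12 Iterating on Items in Separate Containers

To iterate over the elements of several iterables without looping over each one separately or first combining them into a single object, use itertools.chain().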

>>> from itertools import chain
>>> a = [1, 2, 3, 4]
>>> b = ['x', 'y', 'z']
>>> for x in chain(a, b):
    print(x)

    
1
2
3
4
x
y
z
>>>
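
chain() also works when the inputs are of different types, where a + b would fail, and chain.from_iterable() handles the case where the inputs themselves arrive as one iterable. The variable names and data below are made up for illustration:

```python
from itertools import chain

active = {'alice'}            # a set
pending = ['bob', 'carol']    # a list; active + pending would raise TypeError

# chain() walks each input in turn without building a combined copy first
names = list(chain(active, pending))

# chain.from_iterable() takes a single iterable of iterables
nested = [[1, 2], [3], [4, 5]]
flat = list(chain.from_iterable(nested))
```

Note that chain.from_iterable() flattens only one level; deeper nesting needs a different approach.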

4.14 Flattening a Nested Sequence

There may be other ways to flatten a nested sequence, but the recursive generator used in the book is a very nice one.

from collections.abc import Iterable  # importing Iterable from collections fails on Python 3.10+


def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            # Pass ignore_types along so nested levels honor a custom value
            yield from flatten(x, ignore_types)
        else:
            yield x


items = [1, 2, [3, 4, [5, 6], 7], 8]
for x in flatten(items):
    print(x)
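
The ignore_types parameter exists so that strings and bytes are treated as single values instead of being expanded character by character. The sketch below repeats the function so that it runs on its own; it assumes the collections.abc import, and the sample names are illustrative data only.

```python
from collections.abc import Iterable


def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        # Recurse into nested iterables, but treat str/bytes as atoms
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            yield from flatten(x, ignore_types)
        else:
            yield x


numbers = list(flatten([1, 2, [3, 4, [5, 6], 7], 8]))
names = list(flatten(['Dave', 'Paula', ['Thomas', 'Lewis']]))
```

Without the ignore_types check, a string would be iterated into one-character strings, each of which is itself iterable, and the recursion would never terminate.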

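4.15 Iterating in Sorted Order Over Merged Sorted Iterables

If you have several sorted iterables and want to iterate over their elements in merged sorted order, use heapq.merge(*iterables, key=None, reverse=False). Note that every input must already be sorted, because the function only ever compares the current first element of each input and emits the smallest (or, with reverse=True, the largest). Because it returns an iterator rather than building a list, it can handle very long sequences without worrying about memory use.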

>>> import heapq
>>> a = [1, 4, 7, 10]  # an already-sorted sequence
>>> b = [2, 5, 6, 11]
>>> for c in heapq.merge(a, b):
    print(c)

    
1
2
4
5
6
7
10
11
>>>
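
The requirement that the inputs be pre-sorted is easy to demonstrate. In this sketch the input lists are made-up data; the second call shows reverse=True, which expects inputs sorted from largest to smallest.

```python
import heapq

# An unsorted input is not repaired: merge only compares the current heads
unsorted_result = list(heapq.merge([3, 1, 2], [0, 5]))

# With reverse=True each input must be sorted in descending order
descending = list(heapq.merge([10, 7, 4, 1], [11, 6, 5, 2], reverse=True))
```

The first result comes out as [0, 3, 1, 2, 5], which is not sorted, while the second is a correctly merged descending sequence.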

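4.16 Replacing Infinite while Loops with an Iterator

In some cases you can replace a while loop with an iterator created by iter. iter optionally accepts a zero-argument callable and a sentinel (terminating) value. Used this way, it creates an iterator that calls the callable repeatedly until the return value equals the sentinel. The book does not say whether the two approaches differ in performance, but the iter version reads more elegantly.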

CHUNKSIZE = 8192

def reader(s):
    while True:
        # Receive data
        data = s.recv(CHUNKSIZE)
        if data == b'':
            break
        # Process the data
        process_data(data)

def reader2(s):
    for data in iter(lambda: s.recv(CHUNKSIZE), b''):
        # Process the data
        process_data(data)
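
The socket version cannot be run without a connection, so here is a runnable sketch of the same pattern. It assumes an in-memory io.BytesIO stream as a stand-in for the socket and a small CHUNKSIZE chosen only to make the chunks visible; functools.partial is used in place of the lambda.

```python
import io
from functools import partial

CHUNKSIZE = 4  # deliberately tiny for the demonstration

# Assumption: an in-memory stream stands in for the socket
stream = io.BytesIO(b'abcdefghij')

chunks = []
# Call stream.read(CHUNKSIZE) repeatedly until it returns the sentinel b''
for data in iter(partial(stream.read, CHUNKSIZE), b''):
    chunks.append(data)
```

After the loop, chunks contains b'abcd', b'efgh', and b'ij'; the empty bytes object that ends the stream is never passed to the loop body.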
