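Synopsis: Python's built-in containers such as list, tuple, str, bytes, dict, set, and collections.deque are all iterable objects, but they are not iterators. An iterator can be passed to next(), which returns the next value on each call. Python obtains iterators from iterables. Both iterators and generators exist to support lazy evaluation, which avoids wasting memory and makes it efficient to process large amounts of data. Generators are widely used in Python 3, and every generator is an iterator, because generators fully implement the iterator interface. Iterators retrieve items from a collection, while generators can produce items "out of thin air". PEP 342 added the send() method to generators, enabling "generator-based coroutines". PEP 380 allows generators to return a value and added the yield from syntax, which opens a two-way channel between the caller and the subgenerator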

The code is available at https://github.com/wangy8961/python3-concurrency ; stars are welcome

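1. Iterable objects

An iterable is any object from which the built-in iter() function can obtain an iterator. Whenever the Python interpreter needs to iterate over an object x, it automatically calls iter(x). The built-in iter() function does the following:

Checks whether x implements __iter__(); if so, calls it to obtain an iterator
If __iter__() is not implemented but __getitem__(index) is, creates an iterator that fetches items in order, passing int indexes that start from 0. The check for __getitem__(index) exists for backward compatibility
If both attempts fail, the interpreter raises TypeError, usually with the message 'X' object is not iterable, where X is the class of the target object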
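
The three steps above can be modelled roughly in pure Python. This is only a simplified sketch for illustration, not how CPython actually implements iter(); the names iter_sketch and Legacy are made up for this example.

```python
def iter_sketch(x):
    """Simplified model of the lookup order used by iter(x) (illustrative only)."""
    cls = type(x)

    # Step 1: prefer __iter__() and insist that it returns an iterator.
    if getattr(cls, '__iter__', None) is not None:
        iterator = cls.__iter__(x)
        if not hasattr(type(iterator), '__next__'):
            raise TypeError(
                f"iter() returned non-iterator of type {type(iterator).__name__!r}")
        return iterator

    # Step 2: fall back to __getitem__() with indexes 0, 1, 2, ...
    if hasattr(cls, '__getitem__'):
        def legacy():
            index = 0
            while True:
                try:
                    item = x[index]
                except IndexError:  # running off the end finishes the iteration
                    return
                yield item
                index += 1
        return legacy()

    # Step 3: neither method is available.
    raise TypeError(f"{cls.__name__!r} object is not iterable")


class Legacy:
    """Hypothetical class that only implements __getitem__()."""
    def __getitem__(self, index):
        return 'abc'[index]


print(list(iter_sketch(Legacy())))  # ['a', 'b', 'c']
print(list(iter_sketch([1, 2])))    # [1, 2]
```

Calling iter_sketch(42) ends up in step 3 and raises TypeError, just as iter(42) does.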

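So which objects are iterable, concretely?

An object is iterable if it implements an __iter__() method that returns an iterator
An object is also iterable if it implements a __getitem__(index) method that accepts integer indexes starting from 0. Python's built-in sequence types such as list, tuple, str, bytes, and collections.deque satisfy this condition, although the standard sequences also implement __iter__(). dict and set are not sequences (set has no __getitem__() at all, and dict's __getitem__() takes keys rather than positions); they are iterable because they implement __iter__()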

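1.1 Checking whether an object is iterable

As of Python 3.4, the most accurate way to check whether an object x is iterable is to call iter(x), which raises TypeError if it is not. This is more accurate than isinstance(x, abc.Iterable), because iter(x) also takes the legacy __getitem__(index) method into account, while abc.Iterable does not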
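
A check based on iter(x) can be wrapped in a small helper. The sketch below is one possible way to write it; is_iterable and OldStyle are names invented for this example, not part of the standard library.

```python
from collections import abc


def is_iterable(x):
    """Return True if iter(x) succeeds (hypothetical helper)."""
    try:
        iter(x)
    except TypeError:
        return False
    return True


class OldStyle:
    """Iterable only through the legacy __getitem__() protocol."""
    def __getitem__(self, index):
        return (10, 20, 30)[index]


print(is_iterable(OldStyle()))               # True
print(isinstance(OldStyle(), abc.Iterable))  # False
print(is_iterable(42))                       # False
```

Calling iter() on an iterator only returns the iterator itself, so the helper does not consume any items.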

1.2 `__getitem__()`

The class below implements __getitem__(). Its constructor takes a string of text, and instances can then be iterated word by word:

'''Create the test.py module'''
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string, so \w is not treated as a string escape

class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __getitem__(self, index):
        return self.words[index]

    def __len__(self):  # not needed for iteration; it completes the sequence protocol so len(s) returns the word count
        return len(self.words)

    def __repr__(self):
        return 'Sentence({})'.format(reprlib.repr(self.text))

Test whether Sentence instances are iterable:

In [1]: from test import Sentence  # import the class just created

In [2]: s = Sentence('I love Python')  # create a Sentence instance from a string

In [3]: s
Out[3]: Sentence('I love Python')

In [4]: s[0]
Out[4]: 'I'

In [5]: s.__getitem__(0)
Out[5]: 'I'

In [6]: for word in s:  # Sentence instances are iterable
   ...:     print(word)
   ...:
I
love
Python

In [7]: list(s)  # being iterable, a Sentence can be used to build lists and other iterable types
Out[7]: ['I', 'love', 'Python']

In [8]: from collections import abc

In [9]: isinstance(s, abc.Iterable)  # fails to recognize that s is iterable
Out[9]: False

In [10]: iter(s)  # no exception; an iterator is returned, so s is iterable
Out[10]: <iterator at 0x7f82a761e5f8>

1.3 `__iter__()`

If __iter__() is implemented but does not return an iterator:

In [1]: class Foo:
   ...:     def __iter__(self):
   ...:         pass
   ...:

In [2]: from collections import abc

In [3]: f = Foo()

In [4]: isinstance(f, abc.Iterable)  # wrongly reports that f is iterable
Out[4]: True

In [5]: iter(f)  # iter() raises an exception: f is not iterable and cannot be used in a for loop
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-a2fd621ca1d7> in <module>()
----> 1 iter(f)

TypeError: iter() returned non-iterator of type 'NoneType'

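The Python iteration protocol requires __iter__() to return an iterator object. Iterators are covered in the next section; an iterator must implement __next__() and signal the end of iteration by raising StopIteration. In the version below, __iter__() returns a real iterator:

In [1]: class Foo:
   ...:     def __iter__(self):  # delegates the iteration request to a list
   ...:         return iter([1, 2, 3])  # iter() creates an iterator from the list, equivalent to [1, 2, 3].__iter__()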
   ...:

In [2]: from collections import abc

In [3]: f = Foo()

In [4]: isinstance(f, abc.Iterable)
Out[4]: True

In [5]: iter(f)
Out[5]: <list_iterator at 0x7fbe0e4f2d30>

In [6]: for i in f:
   ...:     print(i)
   ...:
1
2
3

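1.4 More on the `iter()` function

iter() can be called in two ways:

iter(iterable) -> iterator: takes an iterable and returns an iterator
iter(callable, sentinel) -> iterator: takes two arguments. The first must be a callable, which is invoked repeatedly with no arguments to produce values. The second is a sentinel, a marker value: when the callable returns it, the iterator raises StopIteration instead of yielding the sentinel

The following example uses the second form of iter() to roll a die until a 1 comes up: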

In [1]: from random import randint

In [2]: def d6():
   ...:     return randint(1, 6)
   ...:

In [3]: d6_iter = iter(d6, 1)  # the first argument is the d6 function, the second is the sentinel

In [4]: d6_iter  # here iter() returns a callable_iterator object
Out[4]: <callable_iterator at 0x473c5d0>

In [5]: for roll in d6_iter:  # the loop may run for a long time, but it never prints 1, because 1 is the sentinel
   ...:     print(roll)
   ...:
6
3
5
2
4
4

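A practical example: read a file line by line until a blank line or the end of the file is reached. At end of file, fp.readline() keeps returning '', so '' is used as the sentinel. With '\n' as the sentinel, the loop would stop at a blank line but would never terminate on a file that has none, so the blank line is handled with an explicit break instead:

with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):  # fp.readline returns one line per call, and '' at end of file
        if line == '\n':  # stop at the first blank line
            break
        print(line)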
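
The same two-argument form is also handy for reading fixed-size blocks. The sketch below uses an in-memory io.BytesIO object as a stand-in for a real binary file, and the block size of 4 bytes is arbitrary; both are assumptions made to keep the example self-contained.

```python
import io
from functools import partial

stream = io.BytesIO(b'abcdefghij')  # stand-in for a file opened in binary mode

# partial(stream.read, 4) is a no-argument callable that reads up to 4 bytes;
# read() returns b'' at end of stream, which serves as the sentinel.
blocks = list(iter(partial(stream.read, 4), b''))
print(blocks)  # [b'abcd', b'efgh', b'ij']
```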

1.5 Iterable reducing functions

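Some functions take an iterable and return a single result. Every built-in function listed in the table below can be implemented with functools.reduce; they exist as built-ins because they make common use cases easier. In addition, all and any have an important optimization that reduce cannot provide: they short-circuit, that is, they stop consuming the iterator as soon as the result is determined: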

Module	Function	Description
functools	`reduce`(function, sequence[, initial])	Apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, `reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])` calculates `((((1+2)+3)+4)+5)`
(built-in)	`all`(iterable, /)	Return True if bool(x) is True for all values x in the iterable. If the iterable is empty, return True.
(built-in)	`any`(iterable, /)	Return True if bool(x) is True for any x in the iterable. If the iterable is empty, return False.
(built-in)	`min`(iterable, *[, default=obj, key=func])	With a single iterable argument, return its smallest item. If the provided iterable is empty, return the default obj.
(built-in)	`max`(iterable, *[, default=obj, key=func])	With a single iterable argument, return its biggest item. If the provided iterable is empty, return the default obj.
(built-in)	`sum`(iterable, start=0, /)	Return the sum of a 'start' value (default: 0) plus an iterable of numbers. When the iterable is empty, return the start value.
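
The difference in short-circuit behaviour can be observed by checking what is left in an iterator afterwards. The following is a small sketch; the sample data and the lambda used to emulate any with reduce are chosen only for illustration.

```python
from functools import reduce
import operator

numbers = [0, 0, 7, 0, 9]

# sum() expressed with reduce() gives the same result as the built-in.
print(reduce(operator.add, numbers, 0) == sum(numbers))  # True

# any() stops at the first truthy item (7), leaving the rest unconsumed.
it = iter(numbers)
print(any(it))   # True
print(list(it))  # [0, 9]

# An any() emulated with reduce() always consumes the whole iterator.
it2 = iter(numbers)
print(reduce(lambda acc, x: acc or bool(x), it2, False))  # True
print(list(it2))  # []
```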

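2. Iterators

Iteration is fundamental to data processing. When scanning a dataset that does not fit in memory, we need a way to fetch items lazily, that is, one at a time and on demand. This is what the Iterator pattern provides

An iterator is an object that implements a no-argument __next__() method, which returns the next item in the series and raises StopIteration when there are no more items. In other words, an iterator can be passed to next(), which keeps returning the next value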

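Internally, Python uses iterators to support:

for loops
Constructing and extending collection types
Looping over text files line by line
List, dict, and set comprehensions
Tuple unpacking
Unpacking actual arguments with * in function calls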
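
Several of these uses can be tried directly by passing explicit iterators. The snippet below is a minimal sketch; the function total and the sample values are made up for the example.

```python
# Tuple unpacking consumes an iterator item by item.
first, second, third = iter('abc')
print(first, second, third)  # a b c


def total(x, y, z):
    return x + y + z


# The * operator unpacks an iterator into positional arguments.
print(total(*iter([1, 2, 3])))  # 6

# A list comprehension iterates over its source.
squares = [n * n for n in iter(range(4))]
print(squares)  # [0, 1, 4, 9]

# Extending a collection accepts any iterable, including an iterator.
extended = [0]
extended.extend(iter((1, 2)))
print(extended)  # [0, 1, 2]
```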

2.1 Checking whether an object is an iterator

The best way to check whether an object x is an iterator is to call isinstance(x, abc.Iterator):

In [1]: from collections import abc

In [2]: isinstance([1,3,5], abc.Iterator)
Out[2]: False

In [3]: isinstance((2,4,6), abc.Iterator)
Out[3]: False

In [4]: isinstance({'name': 'wangy', 'age': 18}, abc.Iterator)
Out[4]: False

In [5]: isinstance({1, 2, 3}, abc.Iterator)
Out[5]: False

In [6]: isinstance('abc', abc.Iterator)
Out[6]: False

In [7]: isinstance(100, abc.Iterator)
Out[7]: False

In [8]: isinstance((x*2 for x in range(5)), abc.Iterator)  # a generator expression, covered later
Out[8]: True

Python's built-in containers such as list, tuple, str, bytes, dict, set, and collections.deque are all iterables, but they are not iterators. A generator, on the other hand, is always an iterator

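2.2 `__next__()` and `__iter__()`

The standard iterator interface:

__next__(): returns the next available item, raising StopIteration when there are no more items. Calling next(x) is equivalent to calling x.__next__()
__iter__(): returns the iterator itself (self), so that an iterator can be used wherever an iterable is expected, for example in a for loop, list(iterable), or sum(iterable, start=0, /). Note: as described in section 1, any object that implements an __iter__() method returning an iterator is iterable, so every iterator is also an iterable!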
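
Both parts of the interface can be observed on the iterator of an ordinary list. This is just a quick sketch with made-up sample data.

```python
numbers = [1, 2, 3]
it = iter(numbers)

print(iter(it) is it)  # True: an iterator's __iter__() returns itself
print(next(it))        # 1
print(sum(it))         # 5: sum() accepts the iterator and consumes the remaining 2 and 3
print(list(it))        # []: the iterator is now exhausted

fresh = iter(numbers)  # the iterable hands out a new, independent iterator
print(fresh is it)     # False
print(list(fresh))     # [1, 2, 3]
```

An iterable such as a list can be traversed any number of times, whereas each iterator can be traversed only once.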

In the example below, Sentence objects are iterables, while the SentenceIterator class implements the classic Iterator design pattern:

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string, so \w is not treated as a string escape

class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        return SentenceIterator(self.words)  # the iteration protocol requires __iter__ to return an iterator


class SentenceIterator:
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]  # fetch the word at position self.index (starting from 0)
        except IndexError:
            raise StopIteration()  # no word at self.index, so signal the end of iteration
        self.index += 1
        return word

    def __iter__(self):
        return self  # return the iterator itself

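2.3 Getting the next item from an iterator with next()

Besides processing an iterator's items with a for loop, you can also use the next() function, which actually calls iterator.__next__(). Each call returns the iterator's next item. Once the last item has been returned, calling next() again raises StopIteration. In general, StopIteration is how an iterator signals that iteration is finished: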