Python 协程

2017-04-24

最近在研究协程，但到最后发现对其概念其实还是理解的肤浅，所以专门写一文章，详细说明协程相关内容。

进入协程主题之前，先来看一下子例程（subroutine）概念。

子例程

按照维基的定义，：

In computer programming, a subroutine is a sequence of program instructions that perform a specific task, packaged as a unit.

也就是说，一个子例程就是一个执行具体任务的代码指令单元，说白了就是一个函数（function）。对于 Subroutine 和 function 等概念，维基已经说的很清楚了，是一样的东西：

In different programming languages, a subroutine may be called a procedure, a function, a routine, a method, or a subprogram.

协程

协程定义

对于协程（Coroutine），维基上解释的挺好的：

Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations.

这个定义有几点特别重要：

协程是对子例程的泛化（generalize），也就是协程是子例程的抽象。其实 Donald Knuth 早就说过：

Subroutines are special cases of coroutines.
也就是说，子协程其实是协程的特例。
协程用于非抢占式多任务（non-preemptive multitasking）。对这点，个人的理解不足，暂就不多加说明了，以后理解深入了，再多加说明。
协程有多个入口（entry point），可以从某个地方挂起和恢复执行。这点是跟子例程区别开来最重要的一点：子例程只有一个入口，也即第一行代码；而协程却有多个入口，可以是第一行代码，也可以是其他地方。这个也可以用来补充说明上边提到的“子协程其实是协程的特例”。

协程本质

不过这个定义没提到的一点，就是协程的“本质”是执行权的转移（yield）是由“程序”自身来负责的，这跟一般由操作系统来负责是非常不一样的。操作系统的调度单位的是线程，而协程本身是在一个线程内部，自然由程序自身来调度。而这个协程调度是通过调用 yield 来做到的。yield 本身有放弃、让出的意思，也就是说，如果我们在一个函数中调用了 yield，就意味着当执行到该关键字所在的地方时，返回（值）并让出执行权给其他协程。这种调度其实很依赖于每个协程的自觉性，如果其中一个协程永远不让出执行权（同时也因为协程不由操作系统调度），那其它协程就得不到执行的机会。

博文对 Python 中 yield 和协程的理解举了一个通俗易懂的例子，可以用来加深理解：

假设说有十个人去食堂打饭，这个食堂比较穷，只有一个打饭的窗口，并且也只有一个打饭阿姨，那么打饭就只能一个一个排队来打咯。这十个人胃口很大，每个人都要点5个菜，但这十个人又有个毛病就是做事情都犹豫不决，所以点菜的时候就会站在那里，每点一个菜后都会想下一个菜点啥，因此后面的人等的很着急呀。这样一直站着也不是个事情吧，所以打菜的阿姨看到某个人犹豫5秒后就开始吼一声，会让他排到队伍最后去，先让别人打菜，等轮到他的时候他也差不多想好吃啥了。这确实是个不错的方法，但也有一个缺点，那就是打菜的阿姨会的等每个人5秒钟，如果那个人在5秒内没有做出决定吃啥，其实这5秒就是浪费了。一个人点一个菜就是浪费5秒，十个人每个人点5个菜可就浪费的多啦（菜都凉了要）。那咋办呢？这个时候阿姨发话了：大家都是学生，学生就要自觉，我以后也不主动让你们排到最后去了，如果你们觉得自己会犹豫不决，就自己主动点直接点一个菜就站后面去，等下次排到的时候也差不多想好吃啥了。这个方法果然有效，大家点了菜后想的第一件事情不是下一个菜吃啥，而是自己会不会犹豫，如果会犹豫那直接排到队伍后面去，如果不会的话就直接接着点菜就行了。这样一来整个队伍没有任何时间是浪费的，效率自然就高了。

调度器

上边的例子我们提到协程调度依赖于协程的自觉性，但这里还有一个问题，就是但某一个协程让出执行权之后，哪一个协程可以获得执行权呢？这就涉及到调度器了。

在 PEP 342 上有一个调度器样例：

import collections


class Trampoline:
    """Manage communications between coroutines"""

    running = False

    def __init__(self):
        self.queue = collections.deque()

    def add(self, coroutine):
        """Request that a coroutine be executed"""
        self.schedule(coroutine)

    def run(self):
        result = None
        self.running = True
        try:
            while self.running and self.queue:
                func = self.queue.popleft()
                result = func()
            return result
        finally:
            self.running = False

    def stop(self):
        self.running = False

    def schedule(self, coroutine, stack=(), val=None, *exc):
        def resume():
            value = val
            try:
                if exc:
                    value = coroutine.throw(value, *exc)
                else:
                    value = coroutine.send(value)
            except:
                if stack:
                    # send the error back to the "caller"
                    self.schedule(
                        stack[0], stack[1], *sys.exc_info()
                    )
                else:
                    # Nothing left in this pseudothread to
                    # handle it, let it propagate to the
                    # run loop
                    raise

            if isinstance(value, types.GeneratorType):
                # Yielded to a specific coroutine, push the
                # current one on the stack, and call the new
                # one with no args
                self.schedule(value, (coroutine, stack))

            elif stack:
                # Yielded a result, pop the stack and send the
                # value to the caller
                self.schedule(stack[0], stack[1], value)

                # else: this pseudothread has ended

        self.queue.append(resume)

其他复杂的可参考 GTasklet、peak.events。

PEP 342

在 Python 2.5 PEP 342之前，生成器几乎与协程一样。在该提议中，有两点值得注意。下边详细说一下。

重定义 yield 为一个表达式而不是一个语句

表达式和语句概念辨别

这个提议一开始看会觉得很奇怪，yield 是一个表达式（expression）还是一个语句（statement）很重要？看了一下表达式和语句的区别之后，还真的很重要！StackOverflow 上有一个回答挺不错的：

Expressions only contain identifiers, literals and operators, where operators include arithmetic and boolean operators, the function call operator () the subscription operator [] and similar, and can be reduced to some kind of "value", which can be any Python object. Examples:
1
2
3
4
3 + 5
map(lambda x: x*x, range(10))
[a.x for a in some_iterable]
yield 7
Statements (see 1, 2), on the other hand, are everything that can make up a line (or several lines) of Python code. Note that expressions are statements as well. Examples:
1
2
3
4
5
# all the above expressions
print 42
if x: do_y()
return
a = 7

简单的说，每一行代码都是一个语句（包括表达式），但表达式是能够产生值的语句（expression statement）。这意味着，表达式的值可以赋给其他变量，而一般语句是不行的。这太重要了，因为这样一来，再结合新提议的 send() 方法，就可以从一个生成器将某个值发给另一个生成器，然后后者将该值赋予某变量，进行下一步的操作，从而达到协同工作（cooperative routine, coroutine）的效果。这就是协程！！！

yield 表达式

提议说明

根据 PEP 342 说明，当把 yield 语句（yield-statement，如 yield 7）放在赋值语句右边时，它就变成了一个 yield 表达式（yield-expression，如 a = yield 7，这时 yield 7 是一个“表达式”，它的值会赋予变量 a）。

对于 yield 表达式的返回值，

当调用 send(value) 时，该表达式的值为传入的值（即 value）。
当调用 next() 时，该表达式的值 None。
该“表达式的值”和“生成器生成（yield）的值”“不是”一回事，如 a = yield 7，表达式是 yield 7，其值由 send()/next() 决定，而生成器生成并返回的值是 7。
当 yield 表达式是一个 yield 语句时，其返回值被忽略（没有赋值对象）。
生成器下一次迭代从 yield 的“下一行”开始执行，需要特别注意的是，对于 yield 表达式，赋值和真正的挂起并返回操作实际是分开的，并不是如赋值语句那样给我们造成先把 yield 表达式的值赋予变量，然后再挂起并返回结果。也就是说，在字节码层面，yield 操作排在赋值操作的前边，如 a = yield 7 的字节码就为：
1
2
3
4
......
YIELD_VALUE <<<<<<<<<<<
STORE_FAST 1 (a) <<<<<<<<<<<
......

最后，需要注意的是，Python 中同时支持 yield 语句和 yield 表达式。

表达式样例

PEP 342 还列出了一些合法和不合法的 yield 表达式：

合法

x = yield 42
x = yield
x = 12 + (yield 42)
x = 12 + (yield)
foo(yield 42)
foo(yield)
不合法

x = 12 + yield 42
x = 12 + yield
foo(yield 42, 12)
foo(yield, 12)

为生成器增加 send() 方法

提议说明

提议的 send() 方法允许我们向生成器发送数据，最后该方法返回该生成器生成（yield）的值。不理解没关系，我们接下来看 PEP 342 详细解释。

send() 方法只接受一个参数。
！！！调用 send(None) 等同于调用 next() 方法。
因为生成器迭代器从生成器函数体顶部开始执行，生成器刚刚创建时没有 yield 表达式来接收传过来的值，所以这时调用 send() 方法传递非 None 的值是禁止的，而且会抛出 TypeError 异常。因此，当你首次跟协程（注意这时不叫生成器）通信之前，需要调用 next() 方法或 send(None)，以推进执行点至第一个 yield 表达式，也即，首次调用 next() 方法或 send(None) 会使得该协程执行至第一个 yield 表达式处（注意 yield 表达式“赋值”和真正的“挂起并返回操作”是分开的，也即执行至 yield 处，下次迭代就从下一句的赋值语句开始执行）。
与 next() 方法一样，调用 send() 方法会返回生成器生成（yield）的下一个值，或者当生成器正常退出或已经退出时抛出 StopIteration 异常。当生成器抛出没有捕获的异常时，该异常会传递给 send() 调用者。

注：

send() 方法跟 next() 方法不一样
生成器本身就是一个函数，只是比较特殊而已
生成器本身是一种迭代器（实现了某些方法，如 next()）

例子解析

下边我们来看一个例子，来加深对 send() 方法的理解：

# bogus.py
#
# Bogus example of a generator that produces and receives values


def countdown(n):
    print "Counting down from", n
    while n >= 0:
        newvalue = (yield n)	# yield n 就是一个表达式，其值由 send()/next() 决定，然后赋予 newvalue
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1


# The holy grail countdown
c = countdown(5)
for x in c:
    print x,
    if x == 5:
        c.send(3)

问题解析

这个程序很奇葩，结果应该是：

1 2	Counting down from 5 5 3 2 1 0

也即过程应该是这样子的（注意：yield 表达式“赋值”和真正的“挂起并返回操作”是分开的。在字节码层面，yield 操作排在赋值操作的前边）：

for 循环开始迭代生成器（隐含调用 next()）时，第一次调用 next(c)，执行了函数体前两行，执行至第 9 行 yield 所在的地方，yield n 表达式的值为 None，挂起（suspend）并返回 n，这时 n 的值为 5。注意，因为还没执行到赋值语句，这时 newvlue 并没有被赋值，也没有为其分配内存；
这时刚好满足 if x == 5 的条件，向生成器发送数据，从第 9 行也即赋值语句开始执行，yield n 表达式的值 3 赋予 newvalue，n 的值仍为 5。向下执行 if/else，将 n 的值赋为 newvalue 的值 3，接着回到第 9 行，挂起并返回 n 3，这时 n 的值为 3；
send 执行完后，又由 for 循环开始迭代调用 next(c)，yield n 表达式的值为 None，赋予 newvalue；此时 n 为 3；继续向下执行，n 自减 1 变为 2，接着回到第 9 行，挂起并返回 n 2；
依次类推，最后分别返回 1 0。

但实际结果不是这样的，正确的结果是：

1 2	Counting down from 5 5 2 1 0

那 3 哪里去了？？？还记得我们在前边提到的，与 next() 方法一样，调用 send() 方法会返回生成器生成（yield）的下一个值，也就是说，send() 函数相当于在 for 循环中“插了一脚”，让 for 循环少了一次迭代。这样说其实不是很准确，但比较形象。我们知道 send() 函数能够获得返回值，所以我们只要打印 c.send(3) 就会看到打印出来的 3。其实，send() 返回后，生成器里保存的 n 的值就已经为 3 了，接下来 for 循环得到的结果必然是 2 1 0。

造成以上误解的最大原因在于 send() 方法与 next() 一样，都会改变迭代器的内部状态。

生成器执行点

其实，除了以上提到的容易造成的误解，还有一点很奇怪，就是我们都说，下一次生成器的执行从上一次执行离开的地方开始执行（left off），但貌似在这里不成立，理由如下：

for 循环第一次调用 next(c) 的时候，执行到 yield 所在的地方时，就直接挂起并返回了，并没有把 yield n 表达式的值赋予 newvalue，更没有为 newvalue 开辟内存空间；
后续的迭代，都是先将 yield n 表达式的值赋予 newvalue，为 newvalue 开辟了内存空间，而不是挂起并返回；并执行后后边的 if/else 语句后才最终挂起并返回（yield n）。

上边两点都指向一个点：yield 表达式“赋值”和真正的“挂起并返回操作”是分开的，也即在字节码层面，yield n 排在表达式赋值的前边。

其实这个时候，只要看一下字节码就很清楚了（“<<<<<<<<<<<”地方）：

import dis


def countdown(n):
    print "Counting down from", n
    while n >= 0:
        newvalue = (yield n)
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1

print dis.dis(countdown.func_code)
# Results：
#  9           0 LOAD_CONST               1 ('Counting down from')
#              3 PRINT_ITEM          
#              4 LOAD_FAST                0 (n)
#              7 PRINT_ITEM          
#              8 PRINT_NEWLINE       
#
# 10           9 SETUP_LOOP              54 (to 66)
#        >>   12 LOAD_FAST                0 (n)
#             15 LOAD_CONST               2 (0)
#             18 COMPARE_OP               5 (>=)
#             21 POP_JUMP_IF_FALSE       65
#
# 11          24 LOAD_FAST                0 (n)
#             27 YIELD_VALUE                               <<<<<<<<<<<
#             28 STORE_FAST               1 (newvalue)     <<<<<<<<<<<
#
# 13          31 LOAD_FAST                1 (newvalue)
#             34 LOAD_CONST               0 (None)
#             37 COMPARE_OP               9 (is not)
#             40 POP_JUMP_IF_FALSE       52
# ......

异常及资源清理

这部分暂不涉及，待后需添加…

协程与生成器区别

看完上边的解释，我想很多人都会想，生成器跟协程不就一个东西吗？我也是这么想，不过它们确实不一样。协程其实有协同工作（cooperative routine）的意思，而生成器就是一个“独立”的函数、迭代器。课程 A Curious Course on Coroutines and Concurrency 就说生成器用于生成数据，而协程则倾向于消费数据。

或者你会说，生成器是只有 yield 语句的协程，生成器是协程的一种（因为协程同时支持 yield 语句和表达式）。但这样说，你可能又会说，在 PEP 342，本身就提议通过改进生成器来实现协程，这样协程不应该是一种生成器吗？

个人觉得，对于这个问题，只要抓住本质，就啥都好理解了。协程本质上是执行权的切换，从而实现协同工作，它的含义不单单包括了生成器，还涵盖了协同工作；而这些生成器是没有的，它只是一个独立的函数。协程的这种执行权切换很像操作系统对线程的切换，所以，利用协程，我们也可以写一个程序层面的操作系统，详细可参考课程 A Curious Course on Coroutines and Concurrency 。

另外，该课程作者提到的非常重要的一点，协程跟迭代没有关系（Coroutines are not related to iteration）。协程更像是一个流水线，一级一级向下传递执行权。

对于协程和生成器两者的详细区别可以直接看英文原文：

GENERATORS

Generators add two new abstract operations, "suspend" and "resume". When a generator suspends, it's exactly like a return today except we simply decline to decref the frame. That's it! The locals, and where we are in the computation, aren't thrown away. A "resume" then consists of *re*starting the frame at its next bytecode instruction, with the retained frame's locals and eval stack just as they were.

Some generator properties:

In implementation terms a trivial variation on what Python currently does.

They’re asymmetric: “suspend” is something only a generator can do, and “resume” something only its caller can do (this does not preclude a generator from being “the caller” wrt to some other generator, though, and indeed that’s very useful in practice).

A generator always returns control directly to its caller, at the point the caller invoked the generator. And upon resumption, a generator always picks up where it left off.

Because a generator remembers where it is and what its locals are, its state and "what to do next" don't have to be encoded in global data structures then decoded from scratch upon entry. That is, whenever you build a little (or large!) state machine to figure out “what to do next” from a collection of persistent flags and state vrbls, chances are good there’s a simple algorithm dying to break free of that clutter .

COROUTINES

Coroutines add only one new abstract operation, "transfer". They’re fully symmetric so can get away with only one. “transfer” names a coroutine to transfer to, and gives a value to deliver to it (there are variations, but this one is common & most useful). When A transfers to B, it acts like a generator “suspend” wrt A and like a generator “resume” wrt B. So A remembers where it is, and what its locals etc are, and B gets restarted from the point it last transfered to someone else.

Coroutines grew up in simulation languages because they’re an achingly natural way to model independent objects that interact with feedback. There each object (which may itself be a complex system of other stuff) is written as an infinite loop, transferring control to other objects when it has something to tell them, and transferred to by other objects when they have something to tell it.

A Unix pipeline "A | B | C | D" doesn’t exploit the full power but is suggestive. A may be written as

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
while 1:
   x = compute my next output
   B.transfer(x)     # resume B with my output

B as

while 1:
   x = A.transfer()  # resume A to get my input
   y = compute something from x and my own history
   C.transfer(y)     # resume C with my output

C as

while 1:
   x = B.transfer()  # resume B to get my input
   y = compute something from x and my own history
   D.transfer(y)     # resume D with my output

and D as

while 1:
   x = C.transfer()  # resume C to get my input
   y = compute something from x and my own history
   print y
If e.g. C collapses pairs of values from B, it can be written instead as
1
2
3
4
5
6
while 1:
   # get a pair of B's
   x = B.transfer()
   y = B.transfer()
   z = f(x, y, whatever)
   D.transfer(z)     # resume D with my output
It’s a local modification to C: B doesn’t know and shouldn’t need to know. This keeps complex algorithms manageable as things evolve.

Initialization and shutdown can be delicate, but once the pipe is set up it doesn’t even matter which of {A, B, C, D} first gets control! You can view A as pushing results through the pipe, or D as pulling them, or whatever. In reality they’re all equal partners.

Why these are so much harder to implement than generators: "transfer" *names* who next gets control, while generators always return to their (unnamed) caller. So a generator simply "pops the stack" when it suspends, while coroutine flow need not be (and typically isn't) stack-like.

In Python this is currently a coroutine-killer, because the C stack gets intertwined. So if coroutine A merely calls (in the regular sense) function F, and F tries to transfer to coroutine B, the info needed to resume A includes the chunk of the C stack between A and F. And that’s why the Python coroutine implementation I referenced earlier uses threads under the covers (where capturing pieces of the C stack isn’t a problem).

Early versions of coroutines didn’t allow for this, though! At first coroutines could only transfer directly to other coroutines, and as soon as a coroutine made “a regular call” transfers were prohibited until the call returned (unless the called function kicked off a brand new collection of coroutines, which could then transfer among themselves – making the distinction leads to convoluted rules, so modern practice is to generalize from the start).

Then the current state of each coroutine was contained in a single frame, and it’s really no harder to implement than generators. Knuth seems to have this restricted flavor of coroutine in mind when he describes generator behavior as "semi-coroutine".