In Silicon Valley is a very exclusive fast-food restaurant, which is always open. There is one table, where one guest at a time is served an absolutely fabulous hamburger. When you arrive, you wait in line until the table is available. Then the host takes you to the table and, this being America, you are asked a seemingly endless series of questions about how you would like your hamburger to be cooked and served.
But today we're not talking about culinary delights. We're talking about the queuing system used by the restaurant. If you are lucky to arrive at the restaurant when the table is available and there are no other guests waiting, you are seated right away. Otherwise, the host gives you a buzzer (from an infinite stack of buzzers!) and you are free to roam the neighborhood until your buzzer goes off. It is the host's job to ensure that guests are seated in order of arrival. When it is your turn, the host will cause your buzzer go off and you make your way back to the restaurant, where you will be seated.
If you change your mind, you can return the buzzer to the host, who will take it back without lifting an eyebrow. If your buzzer has already gone off, the host will buzz the next guest, if any. Guests are always polite and don't abscond with their buzzers. The host is always fair and doesn't seat another guest ahead of you even if you take your time making it back.
The above description fits that of a Lock. A guest arriving corresponds to the acquire() call; leaving is a release() call. Changing your mind is like getting cancelled while waiting in acquire(). You can change your mind before or after your buzzer goes off, i.e., you can be cancelled before or after the lock has awakened your call (but before you return from acquire()).
One day the restaurant expands, hiring extra sous-chefs and opening several new tables. There is still only one host, whose job is not really changed. However, since multiple guests can be seated concurrently, a Semaphore must now be used instead of a simple Lock.
It turns out that implementing synchronization primitives is hard. This is somewhat surprising in the case of asyncio, since only one task can be executing at a time, and task switches only happen at await. But in the past year its fairness, correctness, semantics and performance have all been challenged. In fact, the last three complaints happened in the last month, and, being the last asyncio expert standing, I had to learn in a hurry what's the best way to think about semaphores.
The restaurant metaphor was very useful. For example, there is a difference between the number of open tables and the number of guests who may be seated immediately, and it equals the number of guests whose buzzer has gone off but who haven't come back to the host yet.
There was one particular challenge to fairness, where a task that released a semaphore and then immediately tried to acquire it again could starve other tasks. This is like a guest walking out, turning around, and getting seated again ahead of other waiting guests.
And there was a bug where a cancelled acquire() call could leave the lock in a bad state. This is like the host getting confused when a guest with a buzzing buzzer returns it but declines to be seated.
The restaurant metaphor didn't help with everything: cancellation behavior in asyncio is just complex. In Python 3.11 we have started putting extra strain on cancellation, because of two new asynchronous context managers we added:
- Class TaskGroup for managing a group of related tasks. When one task fails, the others are cancelled, and the context manager waits for all tasks to exit.
- timeout() function for managing timeouts. When the timeout goes off, the current task is cancelled.
Here is the main complication of cancellation handling:
- When waiting for a Future, that Future may be cancelled, and then the await operation fails, raising CancelledError.
- But when awaiting a Future raises CancelledError you cannot assume that the Future was cancelled! It is also possible that the Future was already marked as having a result (so it can no longer be cancelled), and your task has been marked as runnable, but another (also runnable) task runs first and cancels your task. I am grateful to Cyker Way for pointing out this corner case.
It helps to think of Futures as being in one of four states:
- Waiting
- Done, holding a result
- Done, holding an exception
- Done, but cancelled
From the waiting state a Future can transition to one of the other states, and then it cannot change state again. (Insert cute picture of state diagram here. :-)
The semaphore manages a FIFO queue of waiters. It does not use the exception state, but it does use the other three states:
- Waiting: a guest with a buzzer that hasn't gone off yet
- Holding a result: a guest who has been buzzed
- Cancelled: a guest who returns their buzzer before it goes off
Fairness is supposed to be ensured by always appending a new Future to the queue to the end when acquire() finds the semaphore locked, and by always marking the leftmost (i.e., oldest) Future in the queue as holding a result when release() is called while queue isn't empty. The fairness bug was due acquire() taking a shortcut when the Semaphore's level (the number of open tables) is nonzero. It should not do this when there are still Futures in the queue. In other words we were sometimes seating a newly arrived guest when there was an open table even though there was already a guest waiting.
Guess what caused the cancellation bug? The scenario where a Future is holding a result (guest with buzzer buzzing) but the task awaiting that Future gets cancelled (guest declining to be seated).
I struggled to visualize the state of the Semaphore for myself, with its level and FIFO queue of waiting Futures. I also struggled with the definition of locked(). If the level variable had been public I would have struggled with its semantics too. In the end I came up with the following definitions:
- W, the list of waiting futures, or [f for f in queue if not f.done()]
- R, the list of futures holding a result, or [f for f in queue if f.done() and not f.cancelled()]
- C, the list of cancelled futures, or [f for f in queue if f.cancelled()]
and some invariants:
- set(W + R + C) == set(queue) — all futures are either waiting, have a result, or are cancelled.
- level >= len(R) — we must have at least as many open tables as there are guests holding buzzing buzzers.
- define locked() as (len(W) > 0 or len(R) > 0 or level == 0) — we cannot immediately seat anyone unless (a) there are no guests waiting for their buzzer to go off, (b) there are no guests holding a buzzing buzzer, and (c) there is at least one open table.
I leave you with a final link, to the current code.