Tuesday, October 4, 2022

Reasoning about asyncio.Semaphore

In Silicon Valley is a very exclusive fast-food restaurant, which is always open. There is one table, where one guest at a time is served an absolutely fabulous hamburger. When you arrive, you wait in line until the table is available. Then the host takes you to the table and, this being America, you are asked a seemingly endless series of questions about how you would like your hamburger to be cooked and served.

But today we're not talking about culinary delights. We're talking about the queuing system used by the restaurant. If you are lucky to arrive at the restaurant when the table is available and there are no other guests waiting, you are seated right away. Otherwise, the host gives you a buzzer (from an infinite stack of buzzers!) and you are free to roam the neighborhood until your buzzer goes off. It is the host's job to ensure that guests are seated in order of arrival. When it is your turn, the host will cause your buzzer go off and you make your way back to the restaurant, where you will be seated.

If you change your mind, you can return the buzzer to the host, who will take it back without lifting an eyebrow. If your buzzer has already gone off, the host will buzz the next guest, if any. Guests are always polite and don't abscond with their buzzers. The host is always fair and doesn't seat another guest ahead of you even if you take your time making it back.

The above description fits that of a Lock. A guest arriving corresponds to the acquire() call; leaving is a release() call. Changing your mind is like getting cancelled while waiting in acquire(). You can change your mind before or after your buzzer goes off, i.e., you can be cancelled before or after the lock has awakened your call (but before you return from acquire()).

One day the restaurant expands, hiring extra sous-chefs and opening several new tables. There is still only one host, whose job is not really changed. However, since multiple guests can be seated concurrently, a Semaphore must now be used instead of a simple Lock.

It turns out that implementing synchronization primitives is hard. This is somewhat surprising in the case of asyncio, since only one task can be executing at a time, and task switches only happen at await. But in the past year its fairness, correctness, semantics and performance have all been challenged. In fact, the last three complaints happened in the last month, and, being the last asyncio expert standing, I had to learn in a hurry what's the best way to think about semaphores.

The restaurant metaphor was very useful. For example, there is a difference between the number of open tables and the number of guests who may be seated immediately, and it equals the number of guests whose buzzer has gone off but who haven't come back to the host yet.

There was one particular challenge to fairness, where a task that released a semaphore and then immediately tried to acquire it again could starve other tasks. This is like a guest walking out, turning around, and getting seated again ahead of other waiting guests.

And there was a bug where a cancelled acquire() call could leave the lock in a bad state. This is like the host getting confused when a guest with a buzzing buzzer returns it but declines to be seated.

The restaurant metaphor didn't help with everything: cancellation behavior in asyncio is just complex. In Python 3.11 we have started putting extra strain on cancellation, because of two new asynchronous context managers we added:

Class TaskGroup for managing a group of related tasks. When one task fails, the others are cancelled, and the context manager waits for all tasks to exit.
timeout() function for managing timeouts. When the timeout goes off, the current task is cancelled.

Here is the main complication of cancellation handling:

When waiting for a Future, that Future may be cancelled, and then the await operation fails, raising CancelledError.
But when awaiting a Future raises CancelledError you cannot assume that the Future was cancelled! It is also possible that the Future was already marked as having a result (so it can no longer be cancelled), and your task has been marked as runnable, but another (also runnable) task runs first and cancels your task. I am grateful to Cyker Way for pointing out this corner case.

It helps to think of Futures as being in one of four states:

Waiting
Done, holding a result
Done, holding an exception
Done, but cancelled

From the waiting state a Future can transition to one of the other states, and then it cannot change state again. (Insert cute picture of state diagram here. :-)

The semaphore manages a FIFO queue of waiters. It does not use the exception state, but it does use the other three states:

Waiting: a guest with a buzzer that hasn't gone off yet
Holding a result: a guest who has been buzzed
Cancelled: a guest who returns their buzzer before it goes off

Fairness is supposed to be ensured by always appending a new Future to the queue to the end when acquire() finds the semaphore locked, and by always marking the leftmost (i.e., oldest) Future in the queue as holding a result when release() is called while queue isn't empty. The fairness bug was due acquire() taking a shortcut when the Semaphore's level (the number of open tables) is nonzero. It should not do this when there are still Futures in the queue. In other words we were sometimes seating a newly arrived guest when there was an open table even though there was already a guest waiting.

Guess what caused the cancellation bug? The scenario where a Future is holding a result (guest with buzzer buzzing) but the task awaiting that Future gets cancelled (guest declining to be seated).

I struggled to visualize the state of the Semaphore for myself, with its level and FIFO queue of waiting Futures. I also struggled with the definition of locked(). If the level variable had been public I would have struggled with its semantics too. In the end I came up with the following definitions:

W, the list of waiting futures, or [f for f in queue if not f.done()]
R, the list of futures holding a result, or [f for f in queue if f.done() and not f.cancelled()]
C, the list of cancelled futures, or [f for f in queue if f.cancelled()]

and some invariants:

set(W + R + C) == set(queue) — all futures are either waiting, have a result, or are cancelled.
level >= len(R) — we must have at least as many open tables as there are guests holding buzzing buzzers.
define locked() as (len(W) > 0 or len(R) > 0 or level == 0) — we cannot immediately seat anyone unless (a) there are no guests waiting for their buzzer to go off, (b) there are no guests holding a buzzing buzzer, and (c) there is at least one open table.

I leave you with a final link, to the current code.

Monday, February 28, 2022

Meeting Mike Burrows

In late 2005 I joined Google. The interviews took a surprising long time, which is a tale for another time. Today I want to tell a story that happened in one of my first weeks on campus.

In the main building was an impressive staircase going up to the second floor. Somewhere near the top was a spacious office. A very important engineer worked there. I checked the name on the door and realized I knew him: he had been a grad student from the UK who had spent some time visiting our research group (the Amoeba project) at CWI in Amsterdam in the early '90s.

Happy to find someone I knew long ago, one day I knocked on the door and introduced myself. Yes, he remembered me too, but my delight was soon over. Not only was Python the bane of Mike's existence at Google (he detested everything that wasn't C++), but the one memory from his stay in Amsterdam that stood out was about a time I had given him a ride across town on the back of my bike: "Worst ride of my life."

Friday, March 15, 2019

Why operators are useful

This is something I posted on python-ideas, but I think it's interesting to a wider audience.

There's been a lot of discussion recently about an operator to merge two dicts.

It prompted me to think about the reason (some) people like operators, and a discussion I had with my mentor Lambert Meertens over 30 years ago came to mind.

For mathematicians, operators are essential to how they think. Take a simple operation like adding two numbers, and try exploring some of its behavior.

    add(x, y) == add(y, x)    (1)

Equation (1) expresses the law that addition is commutative. It's usually written using an operator, which makes it more concise:

    x + y == y + x    (1a)

That feels like a minor gain.

Now consider the associative law:

    add(x, add(y, z)) == add(add(x, y), z)    (2)

Equation (2) can be rewritten using operators:

    x + (y + z) == (x + y) + z    (2a)

This is much less confusing than (2), and leads to the observation that the parentheses are redundant, so now we can write

    x + y + z    (3)

without ambiguity (it doesn't matter whether the + operator binds tighter to the left or to the right).

Many other laws are also written more easily using operators. Here's one more example, about the identity element of addition:

    add(x, 0) == add(0, x) == x    (4)

compare to

    x + 0 == 0 + x == x    (4a)

The general idea here is that once you've learned this simple notation, equations written using them are easier to *manipulate* than equations written using functional notation -- it is as if our brains grasp the operators using different brain machinery, and this is more efficient.

I think that the fact that formulas written using operators are more easily processed *visually* has something to do with it: they engage the brain's visual processing machinery, which operates largely subconsciously, and tells the conscious part what it sees (e.g. "chair" rather than "pieces of wood joined together"). The functional notation must take a different path through our brain, which is less subconscious (it's related to reading and understanding what you read, which is learned/trained at a much later age than visual processing).

The power of visual processing really becomes apparent when you combine multiple operators. For example, consider the distributive law:

    mul(n, add(x, y)) == add(mul(n, x), mul(n, y)) (5)

That was painful to write, and I believe that at first you won't see the pattern (or at least you wouldn't have immediately seen it if I hadn't mentioned this was the distributive law).

Compare to:

    n * (x + y) == n * x + n * y    (5a)

Notice how this also uses relative operator priorities. Often mathematicians write this even more compact:

    n(x+y) == nx + ny    (5b)

but alas, that currently goes beyond the capacities of Python's parser.

Another very powerful aspect of operator notation is that it is convenient to apply them to objects of different types. For example, laws (1) through (5) also work when x, y and z are same-size vectors and n is a scalar (substituting a vector of zeros for the literal "0"), and also if they are matrices (again, n has to be a scalar).

And you can do this with objects in many different domains. For example, the above laws (1) through (5) apply to functions too (n being a scalar again).

By choosing the operators wisely, mathematicians can employ their visual brain to help them do math better: they'll discover new interesting laws sooner because sometimes the symbols on the blackboard just jump at you and suggest a path to an elusive proof.

Now, programming isn't exactly the same activity as math, but we all know that Readability Counts, and this is where operator overloading in Python comes in. Once you've internalized the simple properties which operators tend to have, using + for string or list concatenation becomes more readable than a pure OO notation, and (2) and (3) above explain (in part) why that is.

Of course, it's definitely possible to overdo this -- then you get Perl. But I think that the folks who point out "there is already a way to do this" are missing the point that it really is easier to grasp the meaning of this:

    d = d1 + d2

compared to this:

    d = d1.copy()
    d.update(d2)    # CORRECTED: This line was previously wrong

and it is not just a matter of fewer lines of code: the first form allows us to use our visual processing to help us see the meaning quicker -- and without distracting other parts of our brain (which might already be occupied by keeping track of the meaning of d1 and d2, for example).

Of course, everything comes at a price. You have to learn the operators, and you have to learn their properties when applied to different object types. (This is true in math too -- for numbers, x*y == y*x, but this property does not apply to functions or matrices; OTOH x+y == y+x applies to all, as does the associative law.)

"But what about performance?" I hear you ask. Good question. IMO, readability comes first, performance second. And in the basic example (d = d1 + d2) there is no performance loss compared to the two-line version using update, and a clear win in readability. I can think of many situations where performance difference is irrelevant but readability is of utmost importance, and for me this is the default assumption (even at Dropbox -- our most performance critical code has already been rewritten in ugly Python or in Go). For the few cases where performance concerns are paramount, it's easy to transform the operator version to something else -- *once you've confirmed it's needed* (probably by profiling).

Monday, November 26, 2018

What to do with your computer science career

I regularly receive questions from students in the field of computer science looking for career advice.

Here's an answer I wrote to one of them. It's not comprehensive or anything, but I thought people might find it interesting.

[A question about whether to choose a 9-5 job or be an entrepreneur]

The question about "9-5" vs. "entrepreneur" is a complex one -- not everybody can be a successful entrepreneur (who would do the work? :-) and not everybody has the temperament for it. For me personally it was never an option -- there are vast parts of management and entrepreneurship that I wouldn't enjoy doing, such as hiring (I hate interviewing and am bad at it) and firing (too emotionally draining -- even just giving negative feedback is hard for me). Pitching ideas to investors is another thing that I'd rather do without.

If any of that resonates with you, you may be better off not opting for entrepreneurship -- the kind of 9-5 software development jobs I have had are actually (mostly) very rewarding: I get to write software that gets used by hundreds or thousands of other developers (or millions in the case of Python), and those other developers in turn use my software to produce product that get uses by hundreds of thousands or, indeed hundreds of millions of users. Not every 9-5 job is the same! For me personally, I don't like the product stuff (since usually that means it's products I have no interest in using myself), but "your mileage may vary" (as they say in the US). Just try to do better than an entry-level web development job; that particular field (editing HTML and CSS) is likely to be automated away, and would feel repetitive to me.

[A question about whether AI would make human software developers redundant (not about what I think of the field of AI as a career choice)]

Regarding AI, I'm not worried at all. The field is focused on automating boring, repetitive tasks like driving a car or recognizing faces, which humans can learn to do easily but find boring if they have to do it all the time. The field of software engineering (which includes the field of AI) is never boring, since as soon as a task is repetitive, you automate it, and you start solving new problems.

Saturday, July 23, 2016

About spammers and comments

I'm turning off commenting for my blogs. While I've enjoyed some feedback, the time wasted to moderate spam posts just isn't worth it. Thank you, spammers! :-(

Wednesday, May 18, 2016

Union syntax

(I'm trying to do this as a quick post in response to some questions I received on this topic. I realize this will probably reopen the whole discussion about the best syntax for types, but sorry folks, PEP 484 was accepted nearly a year ago, after many months of discussions and hundreds of messages. It's unlikely that any idea you can think of here would be new. This post just explains the rationale of one particular decision and tries to put it in some context.)

I've heard some grumbling about the union syntax in PEP 484: Union[X, Y, Z] (where X, Y and Z are arbitrary type expressions). In the past people have suggested X|Y|Z for this, or (X, Y, Z) or {X, Y, Z}. Why did we go with the admittedly clunkier Union[X, Y, Z]?

First of all, despite all the attention drawn to it, unions are actually a pretty minor feature, and you shouldn't be using them much. So you also shouldn't care that much.

Why not X|Y|Z?

This won't fly because we want compatibility with versions of Python 3 that were already frozen (see below). We want to be able to express e.g. a union of int and str, which under this notation would be written as int|str. But for that to fly we'd have to modify the builtin 'type' class to implement __or__ -- and that wouldn't fly on already-frozen Python versions. Supporting X|Y only for types (like List) imported from the typing module and some other notation for builtin types would only sow confusion. So X|Y|Z is out.

Why not {X, Y, Z}?

That's the set with elements X, Y and Z, using the builtin set notation. We can usefully consider types to be sets of values, and this makes a union a set of values too (that's why it's called union :-).

However, {X, Y, Z} confuses the set of types with the set of values, which I consider a mortal sin. This would just cause endless confusion.

This notation would also confuse things when taking the union of several classes that overlap, e.g. if we have classes B and C, where C inherits from B, then the union of B and C is just B. But the builtin set doesn't see it that way. In contrast, the X|Y notation could actually solve this (since in principle we could overload __or__ to do whatever we want), and the Union[] operator ("functor"?) from PEP 484 indeed solves this -- in this example Union[B, C] returns the (non-union) type B, both in the type checker and at runtime.

Why not (X, Y, Z)?

That's the tuple (X, Y, Z). It has the same disadvantages as {X, Y, Z}, but at least it has the advantage of being similar to how unions are expressed as arguments to isinstance(), for example isinstance(x, (int, str, list)) or isinstance(x, (Sequence, Mapping)). (Similarly the except clause: try: ... / except (KeyError, IndexError): ...)

Another problem with tuples is that the tuple syntax is already overloaded in so many ways that it would be confused with other uses even more easily. One particular confusion would be other generic types, for which we'd still want to use square brackets. (You can't really beat Iterable[int] for clarity if you have an iterable of integers. :-) Suppose you have a sequence of values that could be integers or strings. In PEP 484 notation we write this as Sequence[Union[int, str]]. Using the tuple notation we'd want to write this as Sequence[(int, str)]. But it turns out that the __getitem__ overload on the metaclass can't tell the difference between Sequence[(int, str)] and Sequence[int, str] -- and we would like to reject the latter as a mistake since Sequence[] is a generic class over a single parameter. (An example of a generic class over two parameters would be Mapping[K, V].) Disambiguating all this would place us on very thin ice indeed.

The nail in this idea's coffin is the competing idea of using (X, Y, Z) to indicate a tuple with three items, with respective types, X, Y and Z. At first sight this seems an even better use of the tuple syntax than unions would be, and tuples are way more common than unions. But it runs afoul of the same problems with Foo[(X, Y)] vs. Foo[X, Y]. (Also, there would be no easy way to describe what PEP 484 calls Tuple[X, ...], i.e. a variable-length tuple with uniform item type X.)

PS. Why support old Python 3 versions?

The reason for supporting older versions is adoption. Only a relatively small crowd of early adopters can upgrade to the latest Python version as soon as it's out; the rest of us are stuck on older versions (even Python 2.7!).

So for PEP 484 and the typing module, we wanted to support 3.2 and up -- we chose 3.2 because it's the newest Python 3 supported by some older but still popular Ubuntu and Debian distributions. (Also, 3.0 and 3.1 were too immature at their time of release to ever have a large following.)

There's a typing package that you can install easily using pip, and this defines all sorts of useful things for typing, from Any and Union to generic versions of List and Sequence. But such a package can't modify existing builtins like int or list.

(Eventually we also added Python 2.7 support, using type comments for function signatures.)

Adding type annotations for fspath

Type annotations for fspath

Python 3.6 will have a new dunder protocol, __fspath__() , which should be supported by classes that represent filesystem paths. Example of such classes are the pathlib.Path family and os.DirEntry (returned by os.scandir() ).

You can read more about this protocol in the brand new PEP 519. In this blog post I’m going to discuss how we would add type annotations for these additions to the standard library.

I’m making frequent use of AnyStr , a quite magical type variable predefined in the typing module. If you’re not familiar with it, I recommend reading my blog post about AnyStr . You may also want to read up on generics in PEP 484 (or read mypy’s docs on the subject).

Adding os.scandir() to the stubs for os.py

For practice, let’s see if we can add something to the stub file for os.py. As of this writing there’s no typeshed information for os.scandir() , which I think is a shame. I think the following will do nicely. Note how we only define DirEntry and scandir() for Python versions >= 3.5. (Mypy doesn’t support this yet, but it will soon, and the example here still works — it just doesn’t realize scandir() is only available in Python 3.5.) This could be added to the end of stdlib/3/os/__init__.pyi:

from typing import Generic, AnyStr, overload, Iterator

if sys.version_info >= (3, 5):

class DirEntry(Generic[AnyStr]):

name = ... # type: AnyStr

path = ... # type: AnyStr

def inode(self) -> int: ...

def is_dir(self, *, follow_symlinks: bool = ...) -> bool: ...

def is_file(self, *, follow_symlinks: bool = ...) -> bool: ...

def is_symlink(self) -> bool: ...

def stat(self, *, follow_symlinks: bool = ...) -> stat_result: ...

@overload

def scandir() -> Iterator[DirEntry[str]]: ...

@overload

def scandir(path: AnyStr) -> Iterator[DirEntry[AnyStr]]: ...

Deconstructing this a bit, we see a generic class (that’s what the Generic[AnyStr] base class means) and an overloaded function. The scandir() definition uses @overload because it can also be called without arguments. We could also write it as follows; it’ll work either way:

@overload

def scandir(path: str = ...) -> Iterator[DirEntry[str]]: ...

@overload

def scandir(path: bytes) -> Iterator[DirEntry[bytes]]: ...

Either way there really are three ways to call scandir() , all three returning an iterable of DirEntry objects:

scandir() -> Iterator[DirEntry[str]]
scandir(str) -> Iterator[DirEntry[str]]
scandir(bytes) -> Iterator[DirEntry[bytes]]

Adding os.fspath()

Next I’ll show how to add os.fspath() and how to add support for the __fspath__() protocol to DirEntry .

PEP 519 defines a simple ABC (abstract base class), PathLike , with one method, __fspath__() . We need to add this to the stub for os.py , as follows:

class PathLike(Generic[AnyStr]):

@abstractmethod

def __fspath__(self) -> AnyStr: ...

That’s really all there is to it (except for the sys.version_info check, which I’ll leave out here since it doesn’t really work yet). Next we define os.fspath() , which wraps this protocol. It’s slightly more complicated than just calling its argument’s __fspath__() method, because it also handles strings and bytes. So here it is:

@overload

def fspath(path: PathLike[AnyStr]) -> AnyStr: ...

@overload

def fspath(path: AnyStr) -> AnyStr: ...

Easy enough! Next is update the definition of DirEntry . That’s easy too — in fact we only need to make it inherit from PathLike[AnyStr] , the rest is the same as the definition I gave above:

class DirEntry(PathLike[AnyStr], Generic[AnyStr]):

# Everything else unchanged!

The only slightly complicated bit here is the extra base class Generic[AnyStr] . This seems redundant, and in fact PEP 484 says we can leave it off, but mypy doesn’t support that yet, and it’s quite harmless — this just rubs into mypy’s face that this is a generic class of one type variable (the by-now famous AnyStr ).

Finally we need to make a similar change to the stub for pathlib.py . Again, all we need to do is to make PurePath inherit from PathLike[str] , like so:

from os import PathLike

class PurePath(PathLike[str]):

# Everything else unchanged!

However, here we don’t add Generic , because this is not a generic class! It inherits from PathLike[str] , which is quite un-generic, since it’s PathLike specialized for just str .

Note that we don’t actually have to define the __fspath__() method in these stubs — we’re not supposed to call them directly, and stubs don’t provide implementations, only interfaces.

Putting it all together, we see that it’s quite elegant:

for a in os.scandir('.'):

b = os.fspath(a)

# Here, the typechecker will know that the type of b is str!

The derivation that b has type str is not too complicated: first, os.scandir('.') has a str argument, so it returns an iterator of DirEntry objects parameterized with str , which we write as DirEntry[str] . Passing this DirEntry[str] to os.fspath() then takes the first of that function’s two overloads (the one with PathLike[AnyStr] ), since it doesn’t match the second one ( DirEntry doesn’t inherit from AnyStr , because it’s neither a str nor bytes ). Further the AnyStr type variable in PathLike[AnyStr] is solved to stand for just str , because DirEntry[str] inherits from PathLike[str] . This is the specialized version of what the code says: DirEntry[AnyStr] inherits from PathLike[AnyStr] .

Okay, so maybe that last paragraph was intermediate or advanced. And maybe it could be expanded. Maybe I’ll write another blog about how type inference works, but there’s a lot on that topic, and other authors have probably already written better introductory material about generics (in other languages, though).

Making things accept PathLike

There’s a bit of cleanup work that I’ve left out. PEP 519 says that many stdlib functions that currently take strings for pathnames will be modified to also accept PathLike . For example, here’s how the signatures for os.scandir() would change:

@overload

def scandir() -> Iterator[DirEntry[str]]: ...

@overload

def scandir(path: AnyStr) -> Iterator[DirEntry[AnyStr]]: ...

@overload

def scandir(path: PathLike[AnyStr]) -> Iterator[DirEntry[AnyStr]]: ...

The first two entries are unchanged; I’ve just added a third overload. (Note that the alternative way of defining scandir() would require more changes — an indication that this way is more natural.)

I also tried doing this with a union:

@overload

def scandir() -> Iterator[DirEntry[str]]: ...

@overload

def scandir(path: Union[AnyStr, PathLike[AnyStr]]) -> Iterator[DirEntry[AnyStr]]: ...

But I couldn’t get this to work, so the extra overload is probably the best we can do. Quite a few functions will require a similar treatment, sometimes introducing overloading where none exists today (but that shouldn’t hurt anything).

A note about pathlib : since it only deals with strings, its methods (the ones that PEP 519 says should be changed anyway) should use PathLike[str] rather than PathLike[AnyStr] .

Acknowledgments

(Thanks for comments on the draft to Stephen Turnbull, Koos Zevenhoven, Ethan Furman, and Brett Cannon.)