Split receive_data into receive_data + next_event #4

njsmith · 2016-05-18T07:50:29Z

So now receive_data no longer returns a list of events; instead, it just
appends to the internal buffer, and then you have to call next_event to
get it. (So next_event is like the old receive_data(None).)
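
A minimal sketch of that split, for readers who want to see its shape. This is a toy model and not h11's actual implementation: `ToyConnection`, the newline-delimited "events", the `NEED_DATA` sentinel, and the `drain` helper are all assumptions made up for illustration.

```python
NEED_DATA = object()  # hypothetical sentinel: "no complete event buffered yet"


class ToyConnection:
    """Toy stand-in for the split receive API; not h11's real code."""

    def __init__(self):
        self._buffer = bytearray()

    def receive_data(self, data):
        # Only buffers the bytes; it never parses and never returns events.
        self._buffer += data

    def next_event(self):
        # Pull at most one complete, newline-terminated "event" off the buffer.
        line, sep, rest = bytes(self._buffer).partition(b"\n")
        if not sep:
            return NEED_DATA
        self._buffer = bytearray(rest)
        return line.decode("ascii")


def drain(conn):
    """Collect events until the connection asks for more data."""
    events = []
    while True:
        event = conn.next_event()
        if event is NEED_DATA:
            return events
        events.append(event)


conn = ToyConnection()
conn.receive_data(b"first\nsec")
before = drain(conn)   # the partial "sec" stays buffered
conn.receive_data(b"ond\n")
after = drain(conn)
```

The caller decides when to feed bytes and when to ask for the next event, so the two steps can be interleaved however the surrounding I/O loop needs.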

Maybe some other small changes snuck in, but that's the main one... this
involved a major rework of the code and docs.

The big advantage of this is that when trying to solve the really hard
problem of implementing an HTTP server that handles errors gracefully,
life is much simpler if you can take advantage of h11's knowledge of
the connection state. But with the old way of doing receive_data, you
might get back multiple events at once, and then while you're processing
the first event, conn.states is already updated to reflect the future
events you haven't processed yet. So you couldn't really use the state,
because in general it'd be out-of-sync with your processing.

In the new model, we return events one at a time, and we only update the
state machine as we return them, so now the API follows causal law
again. Down with spooky time travel!
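
To make the out-of-sync problem concrete, here is a hedged sketch contrasting the two models. Everything in it is hypothetical: the class names, the state names, and the `NEXT_STATE` table are invented for illustration, and the events are passed in already parsed to keep the example short. It is not h11's state machine.

```python
# Hypothetical transitions: each event moves the toy state machine forward.
NEXT_STATE = {"Request": "SEND_BODY", "Data": "SEND_BODY", "EndOfMessage": "DONE"}


class EagerConnection:
    """Old style: receive_data returns every event and updates state up front."""

    def __init__(self):
        self.state = "IDLE"

    def receive_data(self, parsed_events):
        for event in parsed_events:
            self.state = NEXT_STATE[event]
        return list(parsed_events)


class LazyConnection:
    """New style: state advances only as next_event hands out each event."""

    def __init__(self):
        self.state = "IDLE"
        self._pending = []

    def receive_data(self, parsed_events):
        self._pending.extend(parsed_events)

    def next_event(self):
        if not self._pending:
            return None
        event = self._pending.pop(0)
        self.state = NEXT_STATE[event]
        return event


incoming = ["Request", "Data", "EndOfMessage"]

eager = EagerConnection()
eager_first = eager.receive_data(incoming)[0]
eager_state_at_first_event = eager.state  # already reflects the last event

lazy = LazyConnection()
lazy.receive_data(incoming)
lazy_first = lazy.next_event()
lazy_state_at_first_event = lazy.state    # reflects only the first event
```

In the eager version the handler for `Request` sees a state that belongs to `EndOfMessage`; in the lazy version the state it sees is the one that `Request` itself produced.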

I'm also much happier with the flow control section of the docs, which
now explicitly talks about how to use these primitives to implement both
push-style and pull-style servers.

Plus: super-awesome new example curio-server.py, which easily scales to
thousands of requests/second, thousands of concurrent connections, and
has robust error handling without spaghetti (!!).

I'm not sure why I thought the Sphinx syntax for short link text was like :meth:`~.Class.foo`, but let's switch to the more standard :meth:`~Class.foo`. (Apparently both ways work?) (ref: http://www.sphinx-doc.org/en/stable/markup/inline.html#cross-referencing-syntax)

Discovered that the Paused pseudo-event made writing the curio server example rather tricky -- my server code already paused reading at the appropriate times, but then I had to watch out for these Paused events cluttering up my event stream...

njsmith · 2016-05-18T07:58:59Z

@Lukasa: Not asking for a code review, but you might be interested in this -- it implements a fundamental change in how the h11 receive API works.

Probably the simplest way to understand what + why is to look at the core of the new curio-server.py example and how it's a super-elegant little graceful-error-handling thing... BUT totally dependent on having a consistent view of the connection's internal state, which wasn't possible with the old API that returned multiple received events at a time.

Ironically this means that h11 now has a split API for receiving bytes from receiving events, which is exactly opposite and parallel to how h2 has a split API for sending events from sending bytes... :-/

Lukasa · 2016-05-18T08:10:50Z

> Ironically this means that h11 now has a split API for receiving bytes from receiving events, which is exactly opposite and parallel to how h2 has a split API for sending events from sending bytes... :-/

Heh, yeah, I clocked that. I thought that was pretty funny. ;)

So, this does touch on a constant concern I've had, which is that it's possible that the state machine will reject actions you take on it based on events if the user hasn't yet handled certain stream-state-affecting events (e.g. RST_STREAM, GOAWAY).

I'm honestly not sure how best to handle that concern. In the case of h2, at least, if you try to send data on a stream that got reset, nothing bad will happen aside from the exception that gets thrown. That seems ok to me now, but I'm definitely not certain about it.

Interestingly, however, h2 has a perfect solution for dealing with this problem. Right now receive_data returns a list, but there's no reason it couldn't return a generator. If it did, we'd be able to make the guarantee that the state machine reflects the state only for the frame that was just processed.
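
A rough sketch of the generator idea, under loud assumptions: `ToyFrameConnection`, its `open_streams` set, and the `(type, stream_id)` tuples are hypothetical stand-ins, and this is not how h2's `receive_data` is actually implemented.

```python
class ToyFrameConnection:
    """Toy model of a generator-returning receive_data; not h2's actual API."""

    def __init__(self):
        self.open_streams = {1, 3}

    def receive_data(self, frames):
        # `frames` stands in for already-parsed (type, stream_id) pairs.
        for frame_type, stream_id in frames:
            if frame_type == "RST_STREAM":
                # State changes only when the consumer reaches this frame.
                self.open_streams.discard(stream_id)
            yield frame_type, stream_id


conn = ToyFrameConnection()
seen = []
for frame in conn.receive_data([("DATA", 1), ("RST_STREAM", 1), ("DATA", 3)]):
    # Record which streams look open at the moment each frame is handled.
    seen.append((frame, sorted(conn.open_streams)))
```

While the first `DATA` frame is being handled, stream 1 still looks open; it only disappears once the loop reaches the `RST_STREAM` frame. One caveat of this design is that a Python generator does nothing until it is iterated, so a caller that ignores the return value would never advance the state.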

I'm not 100% sure that this is a good idea for h2, but it's worth discussing. I'll open an issue on h2 and try to think it over with some h2 contributors as well.

njsmith added 7 commits May 16, 2016 20:28

Small updates to changes.rst

816ee1c

Get rid of Paused pseudo-event

2f68f17

Discovered that it made writing the curio server example rather tricky and difficult -- my server code already paused reading at the appropriate times, but then I had to watch out for these Paused events cluttering up my event stream...

WIP

b12d31d

Small doc text cleanups

5671ed0

First working curio-server commit!

d19984f

Update fuzz harness for new receive API

235430e

Lukasa mentioned this pull request May 18, 2016

[RFC]: Lazy state changes. python-hyper/h2#228

Open

Small doc tweak

ca1a5b5

njsmith merged commit dfc7cec into master May 18, 2016

njsmith deleted the new-receive-api branch July 1, 2017 06:02

njsmith mentioned this pull request Feb 14, 2019

Rationale for API Choices? #78

Open
