Skip to content

Commit d49ad83

Browse files
authored
Document coordinator interface. (#20413)
This documents some of the control flow and assupmtions between pgwire and the coordinator that I personally found non-obvious, and so hopefully it can help someone else. The biggest deficiency of this text is that it does not discuss transactions at all. I did not feel like my knowledge of those was sufficient for documenting them properly, but it would be great if someone else added that information in the future.
1 parent 64beec8 commit d49ad83

File tree

1 file changed

+247
-0
lines changed

1 file changed

+247
-0
lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,247 @@
1+
# Coordinator Interface
2+
3+
This document describes the external interface that the coordinator
4+
presents to the protocol layers (pgwire and HTTP). In general, the
5+
concepts exposed by the coordinator API are close to native
6+
pgwire concepts (prepare, bind, execute, and so on), but there is not
7+
a 1-to-1 correspondence, and so this document is not simply a
8+
restatement of the [Postgres protocol documentation][pgwire].
9+
10+
## The Adapter Client
11+
12+
During system startup, all users of the coordinator are instantiated
13+
with an _[adapter client]_, in some documentation also referred to as
14+
a _coordinator client_. This is a reference-counted handle to the
15+
coordinator, with the capability to send messages to the coordinator
16+
instructing it to perform various tasks.
17+
18+
## Sessions
19+
20+
Approximately the only useful actions that an adapter client allows
21+
one to take (via, behind the scenes, sending messages to the
22+
coordinator) are [creating][create session] and [starting][start
23+
session] sessions. A _session_ is an abstraction of a network
24+
connection that is authenticated (that is, always associated with a
25+
valid user) and further associated with various metadata such as the
26+
values of various session variables and the set of prepared statements
27+
and executing cursors.
28+
29+
The state associated with a session is partly stored in the
30+
[`Session`][session] struct, and partly in a map maintained by the
31+
coordinator and keyed on the session's connection ID.
32+
33+
## Basic Concepts
34+
35+
A _statement_ is the most general term for what is sometimes called a
36+
query or a command. In SQL source text, statements are typically
37+
delimited by semicolons. That said, the coordinator does not directly
38+
accept SQL source text, but rather deals with the
39+
[`Statement`][statement] AST. Statements may have _parameters_. These
40+
are unspecified values that appear in places where an actual value is
41+
expected.
42+
43+
Typically, before a statement may be executed, it must be _prepared_;
44+
preparation does some planning and type-checking and binds the
45+
statement to a name.
46+
47+
The next step after preparation is binding the statement to a
48+
_portal_, also called a _cursor_. It is at this stage that the values
49+
for all parameters must be specified. The lifecycles and names of
50+
portals and the prepared statements that are bound to them are not
51+
strongly connected: one prepared statement may be bound to many
52+
different portals, and may be deallocated or reassigned while the
53+
portals still exist, without affecting them.
54+
55+
As a convenience, preparation and binding may be combined in one step,
56+
using an interface function called `declare`. This prepares a
57+
statement (whose parameter values must all be speciified up front) and
58+
binds it to a portal in one step, without naming the prepared
59+
statement.
60+
61+
Once a prepared statement has been bound to a portal, it may be
62+
_executed_. Execution may cause the catalog to be mutated in various
63+
ways, and may return a result set. Describe the full set of effects
64+
and return values of statements is outside the scope of this document;
65+
the interested reader should consult the general Materialize SQL
66+
documentation.
67+
68+
Note that these concepts are not only exposed by the controller
69+
interface, but also at a higher level, to the end-user, in SQL. For
70+
example, the user may bind the statement `SELECT * FROM t` to the
71+
portal `c` by executing the SQL `DECLARE c CURSOR for SELECT * FROM
72+
t`, and may later execute this portal using the SQL `FETCH FORWARD ALL
73+
FROM c`. Alternatively, the user may make that statement into a
74+
prepared statement named `ps` by executing `PREPARE ps AS SELECT *
75+
FROM t`, and later execute it with `EXECUTE ps`. It's important to
76+
note that these statements are distinct from the ones they reference;
77+
for example, the statement `DECLARE c CURSOR for SELECT * FROM t` must
78+
itself be prepared, bound to a portal (distinct from the portal `c`),
79+
and executed.
80+
81+
## Communication Flow
82+
83+
Once a session has been established, the owner of the adapter client
84+
gains access to the much richer interface of the [session
85+
client]. Most functions of this interface are implemented according to
86+
the same pattern: they send a command to the coordinator and wait for
87+
a response.
88+
89+
We avoid the need to design intricate locking protocols by simply
90+
requiring that the coordinator have exclusive access to the session
91+
object during the entire execution of most commands. Therefore, as
92+
part of the common communication pattern for most commands mentioned
93+
above, the entire session object is sent to the coordinator, and then
94+
sent back. The session client then simply crashes if any of these
95+
functions are called while it does not have possession of the session
96+
object.
97+
98+
In the upcoming sections, we will describe the various interface
99+
functions in more detail.
100+
101+
## `prepare`
102+
103+
The [`SessionClient::prepare`][prepare] function corresponds to the
104+
`Command::Prepare` coordinator command. Its purpose is to bind a
105+
statement (possibly with parameters) to a name. It is called with a
106+
description of a parsed parameterized SQL statement, a name to bind it
107+
to, and a description of the types of its parameters (for those that
108+
are known). The coordinator plans the statement in order to determine
109+
the types of all parameters and the output relation (if any), and
110+
associates this information with the given name in its per-session
111+
state.
112+
113+
Using the empty string for the name may be used to specify the default
114+
prepared statement, matching postgres protocol semantics.
115+
116+
When successful, this function returns no value.
117+
118+
## `set_portal`
119+
120+
The [`Session::set_portal`][setportal] function does not correspond to
121+
any coordinator command and, in fact, does not require communication
122+
with the coordinator at all -- note that it is a function on the
123+
`Session` object itself, rather than on either of the coordinator
124+
clients. Its most important parameters are a portal name, a statement
125+
AST and its description (discovered during preparation), and a list of
126+
values for the parameters of the statement. The session assigns the
127+
given statement to a portal of the given name.
128+
129+
When successful, this function returns no value.
130+
131+
## `declare`
132+
133+
The [`SessionClient::declare`][declare] function corresponds to the
134+
`Command::Declare` coordinator command. It behaves essentially like a
135+
combination of `SessionClient::prepare` and `Session::set_portal`, preparing a statement and
136+
binding it to a portal for execution, except that it does not cause
137+
the statement to be bound to a prepared statement name (only to a
138+
portal name).
139+
140+
When successful, this function returns no value.
141+
142+
## `execute`
143+
144+
The [`SessionClient::execute`][execute] function corresponds to the
145+
`Command::Execute` coordinator command. Its purpose is to begin
146+
execution of a portal to which a statement and set of parameters has
147+
previously been bound. It is called with the name of the portal and a
148+
wire called `cancel_future` on which the coordinator can listen for
149+
requests to cancel the execution.
150+
151+
When successful, this function returns an `ExecuteResponse`, whose
152+
meaning varies depending on the kind of statement being executed. In
153+
some cases it means that the statement's execution has finished; in
154+
others, it only means that it has _begun_, and the client must either
155+
wait for results, or take further actions to drive it to
156+
completion. The specific protocols for various types of statements are
157+
described in more detail below.
158+
159+
### `SELECT` queries
160+
161+
The coordinator responds with an instance of
162+
[`ExecuteResponse::SendingRows`][sendingrows], which contains a future
163+
that will ultimately resolve to a batch of rows, an error, or a
164+
cancelation signal. If the query is successful, all rows it returns
165+
will be available as the output of the future as soon as it resolves
166+
(i.e., this is not a streaming API).
167+
168+
In general, queries may take arbitrarily long to finish, including
169+
possibly never terminating. If the client is no longer interested in
170+
the results of a query, for example because the end user disconnected
171+
or requested a cancellation, then the client is expected to
172+
communicate that fact to the coordinator by causing the
173+
`cancel_future` wire discussed above to activate.
174+
175+
### Subscriptions
176+
177+
The coordinator responds to a bare `SUBSCRIBE` statement (that is, one
178+
that is not part of a `COPY ... TO STDOUT` statement) with the an instance of
179+
[`ExecuteResponse::Subscribing`][subscribing]. This response contains
180+
a channel on which batches of rows are received. The client may
181+
continue drawing from this channel until either it closes gracefully
182+
(indicating that the subscribe has finished, as can happen for
183+
constant relations), or an error or cancellation is indicated.
184+
185+
As is the case for [`SELECT` queries](#select-queries), if the client is no longer
186+
interested in further responses, it should indicate that fact by
187+
actuating the corresponding `cancel_future`.
188+
189+
### `COPY ... TO STDOUT`
190+
191+
The coordinator responds to such a statement with an instance of
192+
[`ExecuteResponse::CopyTo`][copyto], which simply wraps an instance of
193+
the `ExecuteResponse` for the inner [subscription](#subscriptions) or
194+
[`SELECT` query](#select-queries).
195+
196+
### `COPY ... FROM STDIN`
197+
198+
The coordinator responds with an instance of
199+
[`ExecuteResponse::CopyFrom`][copyfrom] containing some metadata about the
200+
expected input format. The execution of the statement has by no means
201+
finished at this point; this `ExecuteResponse` is merely an
202+
instruction to the client to collect some data from the end user,
203+
decode it into rows, and pass them to the coordinator in the
204+
`SessionClient::insert_rows` method (which corresponds to a
205+
[`Command::CopyRows`][commandcopyrows] coordinator command). After the
206+
coordinator receives this second command, the statement execution is
207+
considered finished.
208+
209+
### `FETCH`
210+
211+
The coordinator responds with an instance of
212+
[`ExecuteResponse::Fetch`][fetch], which instructs the session to
213+
execute a cursor with a given name (i.e., the cursor specified
214+
textually in the `FETCH` statement being executed). Thus, a total of
215+
two calls to `execute` should happen as part of this flow (an "outer"
216+
one to `execute` the `FETCH` statement, and an "inner" one `execute`
217+
the cursor referenced in the `FETCH` statement). The entire statement
218+
execution is considered to have finished once this "inner" execute
219+
finishes.
220+
221+
### All others
222+
223+
Unless described differently elsewhere, the typical statement
224+
terminates in the coordinator, and no further action by the client is
225+
necessary. To give one arbitrary example among many, if a `CREATE
226+
SOURCE` statement is successful, the coordinator responds with an
227+
instace of [`ExecuteResponse::CreatedSource`][createdsource], which
228+
carries no additional information and requires no response.
229+
230+
[adapter client]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L85-L92
231+
[create session]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L115-L123
232+
[start session]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L125-L170
233+
[session]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/session.rs#L57-L87
234+
[session client]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L225-L239
235+
[prepare]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L281-L299
236+
[declare]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L301-L317
237+
[execute]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/client.rs#L319-L336
238+
[sendingrows]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L370-L375
239+
[subscribing]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L386-L386
240+
[copyto]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L286-L289
241+
[copyfrom]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L290-L295
242+
[commandcopyrows]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L95-L102
243+
[fetch]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L344-L352
244+
[createdsource]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/command.rs#L315-L315
245+
[statement]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/sql-parser/src/ast/defs/statement.rs#L42-L100
246+
[setportal]: https://github.com/MaterializeInc/materialize/blob/0495d6272f/src/adapter/src/session.rs#L526-L563
247+
[pgwire]: https://www.postgresql.org/docs/15/protocol-flow.html

0 commit comments

Comments
 (0)