# [RFC] Replace HTTP+SSE with new "Streamable HTTP" transport #206
The protocol currently defines two standard transport mechanisms for client-server
communication:

1. [stdio](#stdio), communication over standard in and standard out
2. [Streamable HTTP](#streamable-http)

Clients **SHOULD** support stdio whenever possible.
    deactivate Server Process
```

## Streamable HTTP
||
In the **SSE** transport, the server operates as an independent process that can handle | ||
multiple client connections. | ||
{{< callout type="info" >}} This replaces the [HTTP+SSE | ||
transport]({{< ref "/specification/2024-11-05/basic/transports#http-with-sse" >}}) from | ||
protocol version 2024-11-05. See the [backwards compatibility](#backwards-compatibility) | ||
guide below. {{< /callout >}} | ||
|
||
The server **MUST** provide two endpoints: | ||
In the **Streamable HTTP** transport, the server operates as an independent process that | ||
can handle multiple client connections. This transport uses standard HTTP with optional | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure we should prescribe it being an "independent process" because of two things (1) I am not fully sure what "independent process" refers to (2) if we were to deprecate stdio in favor of HTTP everywhere, it might still be beneficial to launch the server as a parent process in order to control lifecycle easily, in which case the server process is not independent. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This language was here before, but happy to revise it. |
||
[Server-Sent Events](https://en.wikipedia.org/wiki/Server-sent_events) (SSE) for | ||
streaming multiple server messages. This permits "plain HTTP" MCP servers, as well as | ||
jspahrsummers marked this conversation as resolved.
Show resolved
Hide resolved
|
||
more feature-rich servers supporting streaming and server-to-client notifications and | ||
requests. | ||
|
||
1. An SSE endpoint, for clients to establish a connection and receive messages from the | ||
server | ||
2. A regular HTTP POST endpoint for clients to send messages to the server | ||
The server **MUST** provide a single HTTP endpoint path (hereafter referred to as the | ||
**MCP endpoint**) that supports both POST and GET methods. For example, this could be a | ||
URL like `https://example.com/mcp`. | ||
samuelcolvin marked this conversation as resolved.
Show resolved
Hide resolved
|
### Message Exchange

1. Every JSON-RPC message sent from the client **MUST** be a new HTTP POST request to the
   MCP endpoint.
2. When the client sends a JSON-RPC _request_ to the MCP endpoint via POST:

   - The client **MUST** include an `Accept` header, listing both `application/json` and
     `text/event-stream` as supported content types.
   - The server **MUST** either return `Content-Type: text/event-stream`, to initiate an
     SSE stream, or `Content-Type: application/json`, to return a single JSON-RPC
     _response_. The client **MUST** support both these cases.
   - If the server initiates an SSE stream:
     - The SSE stream **SHOULD** eventually include a JSON-RPC _response_ message.
     - The server **MAY** send JSON-RPC _requests_ and _notifications_ before sending a
       JSON-RPC _response_. These messages **SHOULD** relate to the originating client
       _request_.
     - The server **SHOULD NOT** close the SSE stream before sending the JSON-RPC
       _response_, unless the [session](#session-management) expires.
     - After the JSON-RPC _response_ has been sent, the server **MAY** close the SSE
       stream at any time.
     - Disconnection **MAY** occur at any time (e.g., due to network conditions).
       Therefore:
       - Disconnection **SHOULD NOT** be interpreted as the client cancelling its
         request.
       - To cancel, the client **SHOULD** explicitly send an MCP `CancelledNotification`.
       - To avoid message loss due to disconnection, the server **MAY** make the stream
         [resumable](#resumability-and-redelivery).
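A client implementing the request flow above has to branch on the response's `Content-Type`: either parse a single JSON body, or decode SSE events until the JSON-RPC _response_ for its request arrives. A minimal sketch in Python; the helper names and the simplified SSE framing (only `id:` and `data:` fields, one event per blank-line-terminated block) are illustrative, not mandated by the spec:

```python
import json

def parse_sse_stream(raw: str):
    """Yield (event_id, json_message) pairs from a raw SSE stream body.

    Follows basic SSE framing: events end at a blank line, `data:` lines
    carry the payload, and `id:` lines set the event ID.
    """
    event_id, data_lines = None, []
    for line in raw.splitlines() + [""]:        # trailing "" flushes the last event
        if line == "":                           # blank line terminates an event
            if data_lines:
                yield event_id, json.loads("\n".join(data_lines))
            data_lines = []
        elif line.startswith("id:"):
            event_id = line[len("id:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())

def handle_post_response(content_type: str, body: str, request_id):
    """Branch on Content-Type as the transport requires.

    Returns (response, earlier_messages): the JSON-RPC response for
    `request_id`, plus any related notifications/requests that preceded it.
    """
    if content_type.startswith("application/json"):
        return json.loads(body), []              # single JSON-RPC response
    if content_type.startswith("text/event-stream"):
        earlier = []
        for _eid, msg in parse_sse_stream(body):
            if msg.get("id") == request_id and ("result" in msg or "error" in msg):
                return msg, earlier              # the eventual response
            earlier.append(msg)                  # related server messages
        raise RuntimeError("stream closed before the JSON-RPC response arrived")
    raise ValueError(f"unsupported Content-Type: {content_type}")
```

Note that a disconnection surfaces here as a truncated stream, which, per the rules above, the client should treat as a delivery failure to recover from, not as a cancellation.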
3. When the client sends a JSON-RPC _notification_ or _response_ to the MCP endpoint via
   POST:

   - If the server accepts the message, it **MUST** return HTTP status code 202 Accepted
     with no body.
   - If the server cannot accept the message, it **MUST** return an HTTP error status
     code (e.g., 400 Bad Request). The HTTP response body **MAY** comprise a JSON-RPC
     _error response_ that has no `id`.
4. The client **MAY** also issue an HTTP GET to the MCP endpoint. This can be used to
   open an SSE stream, allowing the server to communicate to the client without the
   client first sending a JSON-RPC _request_.
   - The client **MUST** include an `Accept` header, listing `text/event-stream` as a
     supported content type.
   - The server **MUST** either return `Content-Type: text/event-stream` in response to
     this HTTP GET, or else return HTTP 405 Method Not Allowed, indicating that the
     server does not offer an SSE stream at this endpoint.
   - If the server initiates an SSE stream:
     - The server **MAY** send JSON-RPC _requests_ and _notifications_ on the stream.
       These messages **SHOULD** be unrelated to any concurrently-running JSON-RPC
       _request_ from the client.
     - The server **MUST NOT** send a JSON-RPC _response_ on the stream **unless**
       [resuming](#resumability-and-redelivery) a stream associated with a previous
       client request.
     - The server **MAY** close the SSE stream at any time.
     - The client **MAY** close the SSE stream at any time.
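On the server side, steps 1 through 3 reduce to classifying each incoming POST body and choosing a status code. A hypothetical sketch (assuming the server answers accepted requests with a 200 whose Content-Type it then chooses per step 2):

```python
import json

def classify_post_body(body: str) -> str:
    """Classify a JSON-RPC message per its shape: a *request* has both
    "method" and "id"; a *notification* has "method" but no "id"; a
    *response* has an "id" with "result" or "error"."""
    msg = json.loads(body)
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    if ("result" in msg or "error" in msg) and "id" in msg:
        return "response"
    raise ValueError("not a recognizable JSON-RPC message")

def status_for(kind: str) -> int:
    """Notifications and responses are acknowledged with 202 and no body;
    requests get a 200 whose Content-Type (JSON or SSE) the server picks."""
    return 200 if kind == "request" else 202
```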
### Multiple Connections

1. The client **MAY** remain connected to multiple SSE streams simultaneously.
2. The server **MUST** send each of its JSON-RPC messages on only one of the connected
   streams; that is, it **MUST NOT** broadcast the same message across multiple streams.
   - The risk of message loss **MAY** be mitigated by making the stream
     [resumable](#resumability-and-redelivery).
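One way a server can honor the "exactly one stream" rule is to route each outgoing message either to the stream opened by the related POST request, or to the standalone GET stream. This routing policy is an inference from the message-exchange rules above, not something the spec prescribes; a sketch with plain lists standing in for stream writers:

```python
class MessageRouter:
    """Deliver each server message on exactly one stream (no broadcasting):
    messages related to an in-flight POST request go on that request's SSE
    stream; unrelated messages go on the standalone GET stream, if open."""

    def __init__(self):
        self.request_streams = {}   # request id -> that POST's stream (a queue)
        self.get_stream = None      # standalone GET stream, if one is open

    def route(self, message: dict, related_request_id=None):
        if related_request_id in self.request_streams:
            target = self.request_streams[related_request_id]
        elif self.get_stream is not None:
            target = self.get_stream
        else:
            raise RuntimeError("no open stream can carry this message")
        target.append(message)      # exactly one stream receives the message
        return target
```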
### Resumability and Redelivery

To support resuming broken connections, and redelivering messages that might otherwise be
lost:

1. Servers **MAY** attach an `id` field to their SSE events, as described in the
   [SSE standard](https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation).
   - If present, the ID **MUST** be globally unique across all streams within that
     [session](#session-management)—or all streams with that specific client, if session
     management is not in use.
2. If the client wishes to resume after a broken connection, it **SHOULD** issue an HTTP
   GET to the MCP endpoint, and include the
   [`Last-Event-ID`](https://html.spec.whatwg.org/multipage/server-sent-events.html#the-last-event-id-header)
   header to indicate the last event ID it received.
   - The server **MAY** use this header to replay messages that would have been sent
     after the last event ID, _on the stream that was disconnected_, and to resume the
     stream from that point.
   - The server **MUST NOT** replay messages that would have been delivered on a
     different stream.
In other words, these event IDs should be assigned by servers on a _per-stream_ basis, to
act as a cursor within that particular stream.
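A per-stream replay buffer that satisfies both constraints (session-unique IDs, no cross-stream replay) can be quite small. This sketch is one possible implementation, not the spec's; the ID format (stream tag plus a session-wide counter) is an arbitrary choice that happens to make IDs unique across streams:

```python
import itertools

class StreamBuffer:
    """Per-stream replay buffer: assigns event IDs that are unique across
    the session and can replay everything after a given Last-Event-ID."""

    _counter = itertools.count(1)   # session-wide counter => unique IDs

    def __init__(self, stream_tag: str):
        self.stream_tag = stream_tag
        self.events = []            # list of (event_id, message)

    def append(self, message) -> str:
        event_id = f"{self.stream_tag}-{next(StreamBuffer._counter)}"
        self.events.append((event_id, message))
        return event_id

    def replay_after(self, last_event_id: str):
        """Messages recorded after `last_event_id` on *this* stream only.
        An ID minted by another stream is simply never found here, so a
        message can never be replayed onto the wrong stream."""
        for i, (eid, _) in enumerate(self.events):
            if eid == last_event_id:
                return self.events[i + 1:]
        return []                   # unknown cursor: replay nothing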
### Session Management

An MCP "session" consists of logically related interactions between a client and a
server, beginning with the [initialization phase]({{< ref "lifecycle" >}}). To support
servers which want to establish stateful sessions:

1. A server using the Streamable HTTP transport **MAY** assign a session ID at
   initialization time, by including it in the `sessionId` field of the
   `InitializeResult`.
   - The session ID **SHOULD** be globally unique and cryptographically secure (e.g., a
     securely generated UUID, a JWT, or a cryptographic hash).
   - The session ID **MUST** only contain visible ASCII characters.
2. If a `sessionId` is returned in the `InitializeResult`, clients using the Streamable
   HTTP transport **MUST** include it in the `Mcp-Session-Id` header on all of their
   subsequent HTTP requests.
   - Servers that require a session ID **SHOULD** respond to requests without an
     `Mcp-Session-Id` header (other than initialization) with HTTP 400 Bad Request.
3. The server **MAY** terminate the session at any time, after which it **MUST** respond
   to requests containing that session ID with HTTP 404 Not Found.
4. When a client receives HTTP 404 in response to a request containing an
   `Mcp-Session-Id`, it **MUST** start a new session by sending a new `InitializeRequest`
   without a session ID attached.
5. Clients that no longer need a particular session (e.g., because the user is leaving
   the client application) **SHOULD** send an HTTP DELETE to the MCP endpoint with the
   `Mcp-Session-Id` header, to explicitly terminate the session.
   - The server **MAY** respond to this request with HTTP 405 Method Not Allowed,
     indicating that the server does not allow clients to terminate sessions.
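Generating and validating session IDs per rule 1 is straightforward; this sketch interprets "visible ASCII" as the printable range 0x21 through 0x7E (excluding space), which keeps the ID safe to carry in an HTTP header value:

```python
import re
import secrets

# Visible ASCII: 0x21 through 0x7E (printable characters, excluding space).
_VISIBLE_ASCII = re.compile(r"^[\x21-\x7e]+$")

def new_session_id() -> str:
    """Generate a cryptographically secure, URL-safe session ID."""
    return secrets.token_urlsafe(32)

def is_valid_session_id(session_id: str) -> bool:
    """Check the transport's constraint: visible ASCII characters only."""
    return bool(_VISIBLE_ASCII.match(session_id))
```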
### Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Server

    note over Client, Server: initialization

    Client->>+Server: POST InitializeRequest
    Server->>-Client: InitializeResponse<br>Mcp-Session-Id: 1868a90c...

    Client->>+Server: POST InitializedNotification<br>Mcp-Session-Id: 1868a90c...
    Server->>-Client: 202 Accepted

    note over Client, Server: client requests
    Client->>+Server: POST ... request ...<br>Mcp-Session-Id: 1868a90c...

    alt single HTTP response
        Server->>Client: ... response ...
    else server opens SSE stream
        loop while connection remains open
            Server-)Client: ... SSE messages from server ...
        end
        Server-)Client: SSE event: ... response ...
    end
    deactivate Server

    note over Client, Server: client notifications/responses
    Client->>+Server: POST ... notification/response ...<br>Mcp-Session-Id: 1868a90c...
    Server->>-Client: 202 Accepted

    note over Client, Server: server requests
    Client->>+Server: GET<br>Mcp-Session-Id: 1868a90c...
    loop while connection remains open
        Server-)Client: ... SSE messages from server ...
    end
    Client->>Server: Close SSE connection
    deactivate Server
```
### Backwards Compatibility

Clients and servers can maintain backwards compatibility with the deprecated [HTTP+SSE
transport]({{< ref "/specification/2024-11-05/basic/transports#http-with-sse" >}}) (from
protocol version 2024-11-05) as follows:

**Servers** wanting to support older clients should:

- Continue to host both the SSE and POST endpoints of the old transport, alongside the
  new "MCP endpoint" defined for the Streamable HTTP transport.
  - It is also possible to combine the old POST endpoint and the new MCP endpoint, but
    this may introduce unneeded complexity.

**Clients** wanting to support older servers should:

1. Accept an MCP server URL from the user, which may point to either a server using the
   old transport or the new transport.
2. Attempt to POST an `InitializeRequest` to the server URL, with an `Accept` header as
   defined above:
   - If it succeeds, or opens an SSE stream in response, the client can assume this is a
     server supporting the new Streamable HTTP transport.
   - If it fails with an HTTP 4xx status code (e.g., 405 Method Not Allowed or 404 Not
     Found):
     - Issue a GET request to the server URL, expecting that this will open an SSE stream
       and return an `endpoint` event as the first event.
     - When the `endpoint` event arrives, the client can assume this is a server running
       the old HTTP+SSE transport, and should use that transport for all subsequent
       communication.
## Custom Transports

Clients and servers **MAY** implement additional custom transport mechanisms to suit
their specific needs.