Commit c9c0dc1

split the proxy and model server protocols for easy reference
1 parent 2913da4 commit c9c0dc1

2 files changed

+36
-38
lines changed

docs/proposals/003-endpoint-picker-protocol/README.md docs/proposals/003-model-server-protocol/README.md

+1-38
@@ -1,41 +1,4 @@
1-
# Endpoint Picker Protocol
2-
3-
The Endpoint Picker, or EPP, is a core component of the inference extension. Ultimately it's
4-
responsible for picking an endpoint from the `InferencePool`. A reference implementation can be
5-
found [here](../../../pkg/epp/).
6-
7-
## Proxy Protocol
8-
9-
This is the protocol between the EPP and the proxy (e.g., Envoy).
10-
11-
The EPP MUST implement the Envoy
12-
[external processing service](https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/ext_proc/v3/external_processor) protocol.
13-
14-
For each HTTP request, the EPP MUST communicate to the proxy the picked model server endpoint via:
15-
16-
1. Setting the `x-gateway-destination-endpoint` HTTP header to the selected endpoint in `<ip:port>` format.
17-
18-
2. Setting an unstructured entry in the [dynamic_metadata](https://github.com/envoyproxy/go-control-plane/blob/c19bf63a811c90bf9e02f8e0dc1dcef94931ebb4/envoy/service/ext_proc/v3/external_processor.pb.go#L320) field of the ext-proc response. The metadata entry for the picked endpoint MUST be wrapped in an outer key that represents the metadata namespace, which defaults to `envoy.lb`.
19-
20-
The resulting metadata looks like:
21-
```go
22-
dynamicMetadata: {
23-
"envoy.lb": {
24-
"x-gateway-destination-endpoint": "<ip:port>"
25-
}
26-
}
27-
```
28-
29-
Note:
30-
- If the EPP does not communicate the server endpoint via these two methods, it MUST return an error.
31-
- The EPP MUST NOT set different values in the header and in the inner response metadata entry.
32-
33-
### Why envoy.lb namespace as a default?
34-
The `envoy.lb` namespace is a predefined namespace used for subsetting. One common way to use the selected endpoint returned from the server is [Envoy subsets](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/subsets), where host metadata for subset load balancing must be placed under `envoy.lb`.
35-
36-
Setting different values leads to unpredictable behavior because proxies aren't guaranteed to support both paths, so this protocol does not define which one takes precedence.
37-
38-
## Model Server Protocol
1+
# Model Server Protocol
39 2

40 3
This is the protocol between the EPP and the model servers.
41 4

@@ -0,0 +1,35 @@
1+
# Endpoint Picker Protocol
2+
3+
The Endpoint Picker, or EPP, is a core component of the inference extension. Ultimately it's
4+
responsible for picking an endpoint from the `InferencePool`. A reference implementation can be
5+
found [here](../../../pkg/epp/).
6+
7+
This doc defines the protocol between the EPP and the proxy (e.g., Envoy).
8+
9+
The EPP MUST implement the Envoy
10+
[external processing service](https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/ext_proc/v3/external_processor) protocol.
11+
12+
For each HTTP request, the EPP MUST communicate to the proxy the picked model server endpoint via:
13+
14+
1. Setting the `x-gateway-destination-endpoint` HTTP header to the selected endpoint in `<ip:port>` format.
15+
16+
2. Setting an unstructured entry in the [dynamic_metadata](https://github.com/envoyproxy/go-control-plane/blob/c19bf63a811c90bf9e02f8e0dc1dcef94931ebb4/envoy/service/ext_proc/v3/external_processor.pb.go#L320) field of the ext-proc response. The metadata entry for the picked endpoint MUST be wrapped in an outer key that represents the metadata namespace, which defaults to `envoy.lb`.
17+
18+
The resulting metadata looks like:
19+
```go
20+
dynamicMetadata: {
21+
"envoy.lb": {
22+
"x-gateway-destination-endpoint": "<ip:port>"
23+
}
24+
}
25+
```
26+
27+
Note:
28+
- If the EPP does not communicate the server endpoint via these two methods, it MUST return an error.
29+
- The EPP MUST NOT set different values in the header and in the inner response metadata entry.
30+
31+
## Why envoy.lb namespace as a default?
32+
The `envoy.lb` namespace is a predefined namespace used for subsetting. One common way to use the selected endpoint returned from the server is [Envoy subsets](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/subsets), where host metadata for subset load balancing must be placed under `envoy.lb`.
33+
34+
Setting different values leads to unpredictable behavior because proxies aren't guaranteed to support both paths, so this protocol does not define which one takes precedence.
35+
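
The two communication paths and the two notes in the Endpoint Picker Protocol can be sketched in Go using only the standard library. This is an illustrative sketch, not the reference implementation: the `Destination` type and the `BuildDestination` and `Validate` functions are hypothetical names, and plain maps stand in for the Envoy ext-proc response types a real EPP would populate. It also assumes the first note means that both the header and the metadata entry must be present.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// Key and default namespace as named in the protocol text.
const (
	DestinationEndpointKey   = "x-gateway-destination-endpoint"
	DefaultMetadataNamespace = "envoy.lb"
)

// Destination is a hypothetical, simplified stand-in for the parts of an
// ext-proc response that carry the picked endpoint.
type Destination struct {
	Headers         map[string]string
	DynamicMetadata map[string]map[string]string
}

// BuildDestination records the picked endpoint in both the header and the
// namespaced dynamic metadata. An empty namespace falls back to "envoy.lb".
func BuildDestination(endpoint, namespace string) (Destination, error) {
	// The protocol asks for <ip:port>; reject anything that does not parse.
	host, port, err := net.SplitHostPort(endpoint)
	if err != nil {
		return Destination{}, fmt.Errorf("endpoint %q is not in ip:port format: %w", endpoint, err)
	}
	if net.ParseIP(host) == nil || port == "" {
		return Destination{}, fmt.Errorf("endpoint %q is not in ip:port format", endpoint)
	}
	if namespace == "" {
		namespace = DefaultMetadataNamespace
	}
	return Destination{
		Headers: map[string]string{DestinationEndpointKey: endpoint},
		DynamicMetadata: map[string]map[string]string{
			namespace: {DestinationEndpointKey: endpoint},
		},
	}, nil
}

// Validate checks the two notes: the endpoint must be present in both
// places (an assumption about the first note), and the values must match.
func Validate(d Destination, namespace string) error {
	if namespace == "" {
		namespace = DefaultMetadataNamespace
	}
	header, inHeader := d.Headers[DestinationEndpointKey]
	meta, inMetadata := d.DynamicMetadata[namespace][DestinationEndpointKey]
	if !inHeader || !inMetadata {
		return errors.New("endpoint missing from header or dynamic metadata")
	}
	if header != meta {
		return fmt.Errorf("header %q and metadata %q disagree", header, meta)
	}
	return nil
}

func main() {
	d, err := BuildDestination("10.0.0.7:8000", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("header:", d.Headers[DestinationEndpointKey])
	fmt.Println("metadata:", d.DynamicMetadata[DefaultMetadataNamespace][DestinationEndpointKey])
	fmt.Println("valid:", Validate(d, "") == nil)
}
```

Writing the same value to both places from a single function makes it hard to violate the rule against differing values, which matters because the protocol does not define which path takes precedence.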
