
Set selected endpoint in extproc response metadata #258

Closed
ahg-g opened this issue Jan 30, 2025 · 11 comments · Fixed by #270

Comments

@ahg-g
Contributor

ahg-g commented Jan 30, 2025

To offer optionality for gateway integrations

@AndresGuedez

Let's use the same key suggested in #181: x-gateway-destination-endpoint, but added to the dynamic_metadata field in the ext_proc ProcessingResponse: https://github.com/envoyproxy/envoy/blob/main/api/envoy/service/ext_proc/v3/external_processor.proto#L174.
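A minimal sketch of the shape this suggestion implies, using plain Go maps as a stand-in for the `google.protobuf.Struct` that `dynamic_metadata` carries. The helper name `buildDynamicMetadata`, the example address, and the choice of `envoy.lb` as the top-level namespace are assumptions for illustration; the real extension would build the generated protobuf types rather than maps.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// destinationEndpointKey is the metadata key proposed in this thread.
const destinationEndpointKey = "x-gateway-destination-endpoint"

// buildDynamicMetadata returns a plain-map stand-in for the Struct placed in
// ProcessingResponse.dynamic_metadata. The top-level key is treated as the
// metadata namespace (an assumption of this sketch).
func buildDynamicMetadata(namespace, endpoint string) map[string]any {
	return map[string]any{
		namespace: map[string]any{
			destinationEndpointKey: endpoint,
		},
	}
}

func main() {
	// "10.0.0.12:8000" is a made-up pod address used only for illustration.
	md := buildDynamicMetadata("envoy.lb", "10.0.0.12:8000")
	out, err := json.MarshalIndent(md, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```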

@hzxuzhonghu
Member

@ahg-g What is the metadata used for? I cannot find anywhere it is used right now.

@ahg-g
Contributor Author

ahg-g commented Feb 20, 2025

This is used to communicate back the selected endpoint, see the protocol: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/003-endpoint-picker-protocol

We are speccing two ways for the endpoint picker to communicate the picked endpoint to the proxy: one is via an HTTP header, the other is via the ext-proc request metadata.

@hzxuzhonghu
Member

Hmm. The Envoy docs do not require the metadata to be set. FYI: this project does not set it at all: https://github.com/envoyproxy/ai-gateway/blob/1192716fce86c4a66e275884bbf3d528d5f99ac2/internal/extproc/chatcompletion_processor.go#L104-L108

@ahg-g
Contributor Author

ahg-g commented Feb 21, 2025

Yes, that is why we are doing #377; isn't that sufficient?

@hzxuzhonghu
Member

No, I mean it does not explain why the metadata is a must in the EPP.

@LiorLieberman
Member

@hzxuzhonghu how is Envoy AI Gateway using the header you referenced? An original_dst cluster?

A common way to use the selected endpoint (which IMO is slightly better than original_dst, for reasons I can explain) is Envoy subsets, as the PR describes. For that you need the metadata under envoy.lb.

@hzxuzhonghu
Member

https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto.html#envoy-v3-api-msg-config-cluster-v3-cluster-originaldstlbconfig

http_header_name
([string](https://developers.google.com/protocol-buffers/docs/proto#scalar)) The HTTP header to override the destination address if [use_http_header](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto.html#envoy-v3-api-field-config-cluster-v3-cluster-originaldstlbconfig-use-http-header) is set to true. If the value is empty, [x-envoy-original-dst-host](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/headers#config-http-conn-man-headers-x-envoy-original-dst-host) will be used.
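The rule quoted above can be sketched in a few lines of Go. This only models the documented behavior (use the header when `use_http_header` is true, fall back to `x-envoy-original-dst-host` when `http_header_name` is empty); the `OriginalDstLbConfig` struct and `overrideDestination` function are hypothetical stand-ins, not Envoy's generated types or its actual implementation, and header names are assumed to be lower-cased already.

```go
package main

import "fmt"

// defaultOriginalDstHeader is the fallback header named in the quoted docs.
const defaultOriginalDstHeader = "x-envoy-original-dst-host"

// OriginalDstLbConfig mirrors only the two fields quoted above.
// It is a hypothetical stand-in for illustration.
type OriginalDstLbConfig struct {
	UseHTTPHeader  bool
	HTTPHeaderName string
}

// overrideDestination returns the destination address taken from the request
// headers and true, or "" and false when no header override applies.
func overrideDestination(cfg OriginalDstLbConfig, headers map[string]string) (string, bool) {
	if !cfg.UseHTTPHeader {
		return "", false
	}
	name := cfg.HTTPHeaderName
	if name == "" {
		name = defaultOriginalDstHeader
	}
	dst, ok := headers[name]
	if !ok || dst == "" {
		return "", false
	}
	return dst, true
}

func main() {
	// The address is made up for illustration.
	headers := map[string]string{"x-gateway-destination-endpoint": "10.0.0.12:8000"}
	cfg := OriginalDstLbConfig{
		UseHTTPHeader:  true,
		HTTPHeaderName: "x-gateway-destination-endpoint",
	}
	dst, ok := overrideDestination(cfg, headers)
	fmt.Println(dst, ok)
}
```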

@hzxuzhonghu
Member

@LiorLieberman Do you mean it is for future use together with metadata_match? https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/endpoint/v3/endpoint_components.proto

@LiorLieberman
Member

LiorLieberman commented Feb 21, 2025

The Envoy cluster for the model pods can use lb_subset_config (ref: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/subsets).

Now, the subset_selector can be x-gateway-destination-endpoint.

In order for subsetting to work, x-gateway-destination-endpoint has to be under envoy.lb.
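To make the namespace requirement concrete, here is a rough Go simulation of subset matching: hosts carry metadata under `envoy.lb`, and a request selects only the hosts whose values match its own `envoy.lb` metadata for the selector keys. The types `Metadata` and `Host`, the function `selectSubset`, and the addresses are all hypothetical; this is a simplified model and leaves out Envoy's fallback policies and other subset load balancer behavior.

```go
package main

import "fmt"

const (
	lbNamespace = "envoy.lb"
	endpointKey = "x-gateway-destination-endpoint"
)

// Metadata maps a namespace to its key/value pairs (simplified to strings).
type Metadata map[string]map[string]string

// Host is a hypothetical upstream endpoint with its own metadata.
type Host struct {
	Address  string
	Metadata Metadata
}

// matchesAll reports whether hostMD has the same value as reqMD for every key.
func matchesAll(hostMD, reqMD map[string]string, keys []string) bool {
	for _, k := range keys {
		want, ok := reqMD[k]
		if !ok {
			return false
		}
		got, ok := hostMD[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// selectSubset keeps the hosts whose envoy.lb metadata matches the request's
// envoy.lb metadata. Metadata in any other namespace is ignored.
func selectSubset(hosts []Host, request Metadata, selectorKeys []string) []Host {
	var matched []Host
	for _, h := range hosts {
		if matchesAll(h.Metadata[lbNamespace], request[lbNamespace], selectorKeys) {
			matched = append(matched, h)
		}
	}
	return matched
}

func main() {
	hosts := []Host{
		{Address: "10.0.0.12:8000", Metadata: Metadata{lbNamespace: {endpointKey: "10.0.0.12:8000"}}},
		{Address: "10.0.0.13:8000", Metadata: Metadata{lbNamespace: {endpointKey: "10.0.0.13:8000"}}},
	}
	keys := []string{endpointKey}

	// Key set under envoy.lb: the matching host is selected.
	inLb := Metadata{lbNamespace: {endpointKey: "10.0.0.13:8000"}}
	fmt.Println(len(selectSubset(hosts, inLb, keys)))

	// Same key under a different (made-up) namespace: nothing is selected.
	elsewhere := Metadata{"some.other.namespace": {endpointKey: "10.0.0.13:8000"}}
	fmt.Println(len(selectSubset(hosts, elsewhere, keys)))
}
```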

@hzxuzhonghu
Member

Ah, I see. I would like to see the full feature supported soon.
