Set selected endpoint in extproc response metadata #258
Comments
Let's use the same key suggested in #181:
@ahg-g What is the metadata used for? I cannot find it being used anywhere right now.
This is used to communicate the selected endpoint back to the proxy; see the protocol: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/003-endpoint-picker-protocol We are speccing two ways for the endpoint picker to communicate the picked endpoint to the proxy: one is via an HTTP header, the other is via the ext-proc request metadata.
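As a rough illustration of the two channels described above, here is a minimal Go sketch that records one picked endpoint both as a header and as a metadata entry. It uses plain maps rather than the real ext-proc protobuf types. The `Selection` type, the `reportEndpoint` helper, the key `example-endpoint-key`, and the address are all hypothetical placeholders; the actual key is whatever #181 settles on, and the `envoy.lb` namespace is the one mentioned later in this thread.

```go
package main

import "fmt"

// Selection is a hypothetical stand-in for what an endpoint picker hands
// back to the proxy. It is NOT the real ext-proc response type.
type Selection struct {
	// Headers carries the picked endpoint as an HTTP header.
	Headers map[string]string
	// Metadata carries the picked endpoint as namespace -> key -> value.
	Metadata map[string]map[string]string
}

// reportEndpoint records the same endpoint through both channels so a
// proxy integration can consume whichever one it supports.
func reportEndpoint(namespace, key, endpoint string) Selection {
	return Selection{
		Headers:  map[string]string{key: endpoint},
		Metadata: map[string]map[string]string{namespace: {key: endpoint}},
	}
}

func main() {
	// The key name and the address are assumptions for illustration only.
	sel := reportEndpoint("envoy.lb", "example-endpoint-key", "10.0.0.7:8000")
	fmt.Println("header:  ", sel.Headers["example-endpoint-key"])
	fmt.Println("metadata:", sel.Metadata["envoy.lb"]["example-endpoint-key"])
}
```

Emitting both forms is one way to keep the integration choice with the gateway rather than with the endpoint picker.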
Hmm. The Envoy docs do not require the metadata to be set. FYI: this project does not set it at all: https://github.com/envoyproxy/ai-gateway/blob/1192716fce86c4a66e275884bbf3d528d5f99ac2/internal/extproc/chatcompletion_processor.go#L104-L108
Yes, that is why we are doing #377; isn't that sufficient?
No, I mean it does not explain why the metadata is a must in the EPP.
@hzxuzhonghu How is Envoy AI Gateway using the header you referenced? With an original_dst cluster? A common way to use the selected endpoint (which IMO is slightly better than original_dst, for reasons I can explain) is Envoy subsets, as the PR describes. For that you need the metadata under envoy.lb.
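To make the subset idea concrete, here is a simplified Go sketch of the matching step: keep only the endpoints whose `envoy.lb` metadata contains every key/value pair in the request's match criteria. This is a conceptual model under stated assumptions, not Envoy's actual subset load balancer. The `Endpoint` type, the `matchSubset` function, the key `example-endpoint-key`, and the addresses are hypothetical.

```go
package main

import "fmt"

// Endpoint is a hypothetical model of an upstream host together with the
// entries it advertises under the envoy.lb metadata namespace.
type Endpoint struct {
	Address    string
	LBMetadata map[string]string
}

// matchSubset returns the endpoints whose envoy.lb metadata contains every
// key/value pair in criteria. Empty criteria match all endpoints.
func matchSubset(endpoints []Endpoint, criteria map[string]string) []Endpoint {
	var matched []Endpoint
	for _, ep := range endpoints {
		ok := true
		for k, want := range criteria {
			got, present := ep.LBMetadata[k]
			if !present || got != want {
				ok = false
				break
			}
		}
		if ok {
			matched = append(matched, ep)
		}
	}
	return matched
}

func main() {
	// Key name and addresses are assumptions for illustration only.
	const key = "example-endpoint-key"
	pods := []Endpoint{
		{Address: "10.0.0.7:8000", LBMetadata: map[string]string{key: "10.0.0.7:8000"}},
		{Address: "10.0.0.8:8000", LBMetadata: map[string]string{key: "10.0.0.8:8000"}},
	}
	// The criteria stand in for the metadata the endpoint picker set for this request.
	for _, ep := range matchSubset(pods, map[string]string{key: "10.0.0.8:8000"}) {
		fmt.Println("route to", ep.Address)
	}
}
```

In this model, when each pod advertises its own address under the selector key, the subset collapses to the single endpoint that the picker chose.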
@LiorLieberman Do you mean it is for future collaboration with
the envoy cluster for the model pods can use Now, the In order for subsetting to work,
Ah, I see. I would like to see the full feature supported soon.
To offer optionality for gateway integrations