Add code for Envoy extension that supports body-to-header translation #355
/ok-to-test
I am not sure how this is related to this project. I mean, you can deploy this without depending on the gateway inference extension.
@hzxuzhonghu Sorry, I forgot to link the related issue in the initial comment. Fixed that.
Thanks, now I get the intention. But I am wondering, should we separate it from the ext-proc binary? IMO, it can be merged into the current binary. Correct me if I am missing some context.
I think it should be a separate binary because this particular extension is intended to execute before the routing decision, while the endpoint picker is intended to execute after the routing decision.
Hmm, the model name header will be used by the HTTPRoute match later. What confused me most is that the EPP currently only supports one model pool. It seems no model match needs to be specified in the HTTPRoute.
@hzxuzhonghu For a bit more context, there are different points in time where we can attach an ext_proc extension. We want to attach this extension early in the process so it can add headers before routing logic is computed. That would allow someone to say that requests for the "foo" model should go to the "bar" InferencePool while requests for the "baz" model should go to a different InferencePool. Then, when the request gets to an InferencePool, the Endpoint Picker extension can select the best endpoint to serve a request for that model within that InferencePool.
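To make the body-to-header step concrete, here is a minimal sketch of the translation described above. It is not the code in this PR: it assumes the request body is JSON with a top-level `model` field, and the header name `X-Model-Name` and both function names are hypothetical, chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelHeader is a hypothetical header name used only in this sketch.
const modelHeader = "X-Model-Name"

// modelFromBody extracts the top-level "model" field from a JSON request body.
// Assumption: the body is a JSON object that carries the model name in "model".
func modelFromBody(body []byte) (string, error) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return "", fmt.Errorf("parsing request body: %w", err)
	}
	if req.Model == "" {
		return "", errors.New("request body has no model field")
	}
	return req.Model, nil
}

// headersForBody returns the headers an early extension could add
// so that routing rules can match on the model name.
func headersForBody(body []byte) (map[string]string, error) {
	model, err := modelFromBody(body)
	if err != nil {
		return nil, err
	}
	return map[string]string{modelHeader: model}, nil
}

func main() {
	headers, err := headersForBody([]byte(`{"model": "foo", "prompt": "hello"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(modelHeader+":", headers[modelHeader])
}
```

With a header like this in place before routing, a route could match on the header value and send "foo" requests to one InferencePool and "baz" requests to another.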
Ah, that makes sense. Do we plan to support multiple InferencePools within one EPP process?
That's definitely been a point of discussion. I think it's inevitable that it will be a mode at some point, but it's not part of the initial releases. The rationale for this is:
With all that said, I think these could all be temporary limitations.
Thanks @rramkumar1! Once we get the image pipeline set up and some corresponding docs, this will be a great part of the v0.2 release.
Can you also please add a README.md file? I added one for pkg/epp at #386.
fb09c91 to 72a56e8
cbbce6f to e5a3458
I added a unit test for the code in request.go. Please let me know if any other code should have unit test coverage, as some of it is boilerplate. For integration tests, should I file a follow-up issue to track them? Those seem more involved.
Thanks for adding the unit tests. Yes please, let's open a tracking issue for integration test coverage.
dd9db1a to b25275f
Almost there! Can you please confirm that this passes manual testing?
As a follow-up, we need:
- Manifests for configuring it with Envoy Gateway
- User guide
b25275f to 31ede57
Yes, this passes manual testing! Will file issues for those follow-ups.
31ede57 to 5e9514a
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ahg-g, rramkumar1
Ref: #321