Dynamically select pod instead of statically passing Pods/PodIPs into ext proc #12

Closed
Xunzhuo opened this issue Sep 29, 2024 · 4 comments

Xunzhuo commented Sep 29, 2024

Currently we pass static pod names and IPs to the ext proc server. We should use a more dynamic approach and fetch this data from the Kubernetes cluster, for example via label selectors.
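
To make the selector idea concrete, here is a minimal sketch of equality-based label matching over a pod list. It is only an illustration, not the project's implementation: the `Pod` class, `matches`, and `select_pod_ips` are hypothetical names, and the in-memory list stands in for what would, in a real cluster, come from listing or watching pods through the Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical, simplified view of a pod: name, IP, labels, readiness."""
    name: str
    ip: str
    labels: dict = field(default_factory=dict)
    ready: bool = True


def matches(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every key/value pair must appear in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pod_ips(pods, selector):
    """Return a name -> IP mapping for ready pods whose labels match."""
    return {
        pod.name: pod.ip
        for pod in pods
        if pod.ready and pod.ip and matches(selector, pod.labels)
    }


# Sample data (assumed for illustration only).
pods = [
    Pod("vllm-0", "10.0.0.11", {"app": "vllm", "model": "llama"}),
    Pod("vllm-1", "10.0.0.12", {"app": "vllm", "model": "llama"}, ready=False),
    Pod("web-0", "10.0.0.20", {"app": "web"}),
]

selected = select_pod_ips(pods, {"app": "vllm"})
print(selected)
```

With this shape, the ext proc server would only need a selector as configuration, and the pod set would follow scale-ups, restarts, and IP changes without a redeploy.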

Joffref commented Sep 29, 2024

In the near future, a new CR, tentatively called BackendPool, will be introduced. https://github.com/kubernetes-sigs/llm-instance-gateway/blob/d385c80d17f531e5ad155ed3da09de665834f02b/docs/proposals/002-api-proposal/proposal.md?plain=1#L197-L221
This resource will reference the services exposing inference servers that share certain characteristics, mainly the same set of loaded adapters. The final name of this resource is still being discussed, and you can review the related documents for more information: https://docs.google.com/document/d/1v1Rp6v_AfY5EfwpLqDadDpAaCg7OcnrUutzBUNxGoJE/edit?pli=1

Once introduced, BackendPool will be referenced in the backendRefs field of HTTPRoute to manage routing.

The idea of referencing services directly could be re-evaluated, given that using selectors was the original approach. However, directly referencing pods would involve managing a structure similar to EndpointSlices within the gateway. IMO, this adds unnecessary complexity and introduces potential security risks.

Let me know your thoughts on this—I’d be happy to discuss it further.
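
As a rough illustration of the bookkeeping that the comment above describes as "a structure similar to EndpointSlices", the sketch below walks EndpointSlice-shaped data and collects the addresses of ready, pod-backed endpoints. The field names (`endpoints`, `addresses`, `conditions.ready`, `targetRef`) follow the `discovery.k8s.io/v1` EndpointSlice API; the function name and the sample data are hypothetical.

```python
def ready_pod_addresses(endpoint_slices):
    """Collect sorted (pod name, address) pairs for ready, pod-backed endpoints."""
    result = []
    for endpoint_slice in endpoint_slices:
        for endpoint in endpoint_slice.get("endpoints", []):
            conditions = endpoint.get("conditions") or {}
            # An unset `ready` condition means "unknown"; consumers are
            # generally expected to treat that as ready, so only an
            # explicit False is skipped here.
            if conditions.get("ready") is False:
                continue
            target = endpoint.get("targetRef") or {}
            if target.get("kind") != "Pod":
                continue
            for address in endpoint.get("addresses", []):
                result.append((target["name"], address))
    return sorted(result)


# Sample data (assumed for illustration only).
slices = [
    {
        "metadata": {"labels": {"kubernetes.io/service-name": "vllm"}},
        "addressType": "IPv4",
        "endpoints": [
            {
                "addresses": ["10.0.0.11"],
                "conditions": {"ready": True},
                "targetRef": {"kind": "Pod", "name": "vllm-0"},
            },
            {
                "addresses": ["10.0.0.12"],
                "conditions": {"ready": False},
                "targetRef": {"kind": "Pod", "name": "vllm-1"},
            },
        ],
    }
]

addresses = ready_pod_addresses(slices)
print(addresses)
```

Referencing a Service lets Kubernetes maintain this data; referencing pods directly would mean the gateway has to maintain an equivalent table itself.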

liu-cong commented Oct 1, 2024

cc @kfswain @robscott

liu-cong commented

/close

Closed by #36

k8s-ci-robot commented

@liu-cong: Closing this issue.