feat(conformance): Add HTTPRouteMultipleRulesDifferentPools test #834

Open
wants to merge 7 commits into
base: main

SinaChavoshi
Contributor

Description

This PR introduces a new conformance test, HTTPRouteMultipleRulesDifferentPools, which validates that a single Gateway and a single HTTPRoute can route traffic to multiple, distinct InferencePool backends.

Local run results (ran on commit 2a67e7):

go test -v ./conformance -args -debug -gateway-class gke-l7-regional-external-managed -cleanup-base-resources=false -run-test HTTPRouteMultipleRulesDifferentPools
=== RUN   TestConformance
...
=== NAME  TestConformance/HTTPRouteMultipleRulesDifferentPools
    apply.go:279: 2025-05-14T22:27:20.155498191Z: Deleting httproute-multi-pool-rules HTTPRoute
...
--- PASS: TestConformance (161.74s)
    --- SKIP: TestConformance/InferencePoolAccepted (0.00s)
    --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools (158.09s)
        --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/HTTPRoute_should_be_Accepted_and_Reconciled (155.11s)
        --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/InferencePool_A_should_be_Accepted (0.08s)
        --- PASS: TestConformance/HTTPRouteMultipleRulesDifferentPools/InferencePool_B_should_be_Accepted (0.11s)
PASS
ok      sigs.k8s.io/gateway-api-inference-extension/conformance 161.955s

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 14, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: SinaChavoshi
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and kfswain May 14, 2025 22:40
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 14, 2025
@k8s-ci-robot
Contributor

Hi @SinaChavoshi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

netlify bot commented May 14, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 66f325d
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/682cb5942511380008ab0d5b
😎 Deploy Preview https://deploy-preview-834--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 14, 2025
Comment on lines +146 to +179
# --- HTTPRoute Definition for Multiple Pools ---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httproute-multi-pool-rules
  namespace: gateway-conformance-app-backend
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: conformance-gateway
    namespace: gateway-conformance-infra
    sectionName: http
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /app-a
    backendRefs:
    - group: inference.networking.x-k8s.io
      kind: InferencePool
      name: pool-a
      port: 8080
      weight: 1
  - matches:
    - path:
        type: PathPrefix
        value: /app-b
    backendRefs:
    - group: inference.networking.x-k8s.io
      kind: InferencePool
      name: pool-b
      port: 8080
      weight: 1
Contributor

Adding a comment here. It is not really related to conformance; it is more of a general comment on the approach we take for multi-pool setups (though it might of course affect conformance as well).

Currently, when using a single EPP and sending an inference request, the model name is specified in the body. We have BBR, which can take that and also put it in a header. There are several issues here:

  • To align with how BBR works, it might be better for this test to use a header matcher rather than a path prefix.
  • The bigger issue is that, even if the previous point is addressed, the ModelName we get in the request body is the LoRA adapter name, not the base model name. We need some way to determine the correct EPP from the specified LoRA name; otherwise we push the problem to the user, who will need to set a prefix matcher (or header matcher) that matches the correct pool. That does not sound like good UX to me.

The UX I'm expecting is:
The user sends a request the same way as with a single pool.
Some component logic is triggered BEFORE calling the EPP (similar to how BBR works). That logic is responsible for injecting the right header/path matcher. The flow then continues as usual.

Member

Hey @nirrozenbaum I think the idea here is that we're trying to test splitting two entirely different and disconnected InferencePools. So the /app-a and /app-b prefixes are not meant to be model names, but instead just generic ways to route to different InferencePools. You could theoretically match on any arbitrary request attribute such as header or query param, but I'd hate to have this confused with matching on a model name, as that's not really the intent of the test.

I do think it would be useful to add an e2e test in the future that includes BBR, but that would require us to have a portable way to configure BBR + use it in some conformance tests like this. I definitely want to get to that point, but I think it will take some time.
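
To make the intent of the two rules concrete, here is a minimal sketch of how a `PathPrefix` rule set like the one under review selects a backend. It is not the gateway's implementation and not code from this PR. The `rule` type and the `matchesPrefix` and `selectPool` helpers are hypothetical names. It assumes the Gateway API semantics that prefixes match per path element and that the longest matching prefix takes precedence.

```go
package main

import (
	"fmt"
	"strings"
)

// rule pairs a path prefix with the name of the pool it routes to.
// This is a simplified, hypothetical stand-in for an HTTPRoute rule.
type rule struct {
	prefix string
	pool   string
}

// matchesPrefix reports whether path matches prefix element by element:
// "/app-a" matches "/app-a" and "/app-a/x", but not "/app-ab".
func matchesPrefix(path, prefix string) bool {
	prefix = strings.TrimSuffix(prefix, "/")
	if prefix == "" {
		return true // a prefix of "/" matches every path
	}
	return path == prefix || strings.HasPrefix(path, prefix+"/")
}

// selectPool returns the pool of the longest matching prefix, and false
// when no rule matches the path.
func selectPool(rules []rule, path string) (string, bool) {
	best := ""
	bestLen := -1
	for _, r := range rules {
		if matchesPrefix(path, r.prefix) && len(r.prefix) > bestLen {
			best, bestLen = r.pool, len(r.prefix)
		}
	}
	return best, bestLen >= 0
}

func main() {
	rules := []rule{
		{prefix: "/app-a", pool: "pool-a"},
		{prefix: "/app-b", pool: "pool-b"},
	}
	for _, p := range []string{"/app-a/v1/completions", "/app-b", "/app-ab"} {
		if pool, ok := selectPool(rules, p); ok {
			fmt.Printf("%s -> %s\n", p, pool)
		} else {
			fmt.Printf("%s -> no matching rule\n", p)
		}
	}
}
```

Nothing in this selection looks at the request body or a model name, which is the distinction being drawn in this reply: the prefixes only pick a pool.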

Contributor

fair enough

@spencerhance

/cc

@k8s-ci-robot k8s-ci-robot requested a review from spencerhance May 16, 2025 18:16
@ahg-g
Contributor

ahg-g commented May 20, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 20, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants