Add README.md file to the epp pkg #386

ahg-g · 2025-02-21T15:46:08Z

No description provided.

k8s-ci-robot · 2025-02-21T15:46:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

netlify · 2025-02-21T15:46:30Z

✅ Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `fa4a77d` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67b8ea6c7853ef00084fb04f |
| 😎 Deploy Preview | https://deploy-preview-386--gateway-api-inference-extension.netlify.app |

To edit notification comments on pull requests, go to your Netlify site configuration.

pkg/epp/README.md

kfswain · 2025-02-21T20:47:07Z

pkg/epp/README.md

@@ -0,0 +1,25 @@
+# The EndPoint Picker (EPP)
+This package provides the reference implementation for the Endpoint Picker (EPP). It implements the [extension protocol](../../docs/proposals/003-endpoint-picker-protocol), enabling a proxy or gateway to request endpoint hints from an extension. As it is implemented now, an EPP instance handles a single `InferencePool` (and so for each `InferencePool`, one must create a dedicated EPP deployment).


Nit: Consider dropping 'As it is implemented now'. My thinking is that we can just update this when/if we make the EPP multitenant.

Makes sense, done.

kfswain · 2025-02-21T20:51:59Z

pkg/epp/README.md

+
+- Endpoint Selection
+  - The EPP determines the appropriate Pod endpoint for the load balancer (LB) to route requests.
+  - It selects from the pool of ready Pods designated by the assigned InferencePool.


Suggested change

- It selects from the pool of ready Pods designated by the assigned InferencePool.

- It selects from the pool of ready Pods designated by the assigned InferencePool's [Selector](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/7e3cd457cdcd01339b65861c8e472cf27e6b6e80/api/v1alpha1/inferencepool_types.go#L53) field.

This is also a nit
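
For readers following this thread, the rule in the hunk above (pick only from ready Pods that the pool's Selector designates) can be sketched in Go. This is a minimal illustration, assuming the Selector behaves like a plain label map; the `pod` type and the `matchesSelector` and `candidatePods` helpers are hypothetical and are not taken from the EPP source.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for a Kubernetes Pod.
type pod struct {
	Name   string
	Labels map[string]string
	Ready  bool
}

// matchesSelector reports whether every key/value pair in selector
// is present in labels.
func matchesSelector(selector, labels map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// candidatePods returns the names of Pods that are ready and match
// the pool's selector.
func candidatePods(pods []pod, selector map[string]string) []string {
	var names []string
	for _, p := range pods {
		if p.Ready && matchesSelector(selector, p.Labels) {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	selector := map[string]string{"app": "vllm"} // assumed example label
	pods := []pod{
		{Name: "pod-a", Labels: map[string]string{"app": "vllm"}, Ready: true},
		{Name: "pod-b", Labels: map[string]string{"app": "vllm"}, Ready: false},
		{Name: "pod-c", Labels: map[string]string{"app": "other"}, Ready: true},
	}
	fmt.Println(candidatePods(pods, selector)) // prints [pod-a]
}
```

Only `pod-a` qualifies here: `pod-b` matches the selector but is not ready, and `pod-c` is ready but carries a different label.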

kfswain · 2025-02-21T20:53:58Z

pkg/epp/README.md

+  - It selects from the pool of ready Pods designated by the assigned InferencePool.
+  - Endpoint selection is contingent on the request's ModelName matching an `InferenceModel` that references the `InferencePool`.
+  - Requests with unmatched ModelName values trigger an error response to the proxy.
+  - The endpoint selection algorithm is detailed below.


Nit: consider removing this line; I think the headers are prominent enough to draw attention.
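
As a reading aid for the hunk above, here is a rough Go sketch of the matching rule it describes: a request is served only if its ModelName matches an `InferenceModel` that references the pool, and anything else produces an error for the proxy. The `inferenceModel` type, its fields, and `resolveModel` are hypothetical names chosen for illustration, not the EPP's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// inferenceModel is a hypothetical, simplified stand-in for an
// InferenceModel object.
type inferenceModel struct {
	ModelName string // name that clients put in the request
	PoolRef   string // name of the InferencePool it references
}

// errModelNotFound is returned when no InferenceModel matches.
var errModelNotFound = errors.New("no InferenceModel matches the requested model name")

// resolveModel returns the InferenceModel whose ModelName equals the
// requested name and which references the given pool. Any other
// request yields an error that the caller can relay to the proxy.
func resolveModel(models []inferenceModel, pool, requested string) (inferenceModel, error) {
	for _, m := range models {
		if m.ModelName == requested && m.PoolRef == pool {
			return m, nil
		}
	}
	return inferenceModel{}, fmt.Errorf("%w: %q", errModelNotFound, requested)
}

func main() {
	models := []inferenceModel{{ModelName: "tweet-summary", PoolRef: "my-pool"}}
	if _, err := resolveModel(models, "my-pool", "unknown-model"); err != nil {
		fmt.Println("error response to proxy:", err)
	}
}
```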

kfswain · 2025-02-21T20:55:33Z

pkg/epp/README.md

+  - Requests with unmatched ModelName values trigger an error response to the proxy.
+  - The endpoint selection algorithm is detailed below.
+- Traffic Splitting and ModelName Rewriting
+  - The EPP facilitates controlled rollouts of new adapter versions by implementing traffic splitting between adapters within the same `InferencePool`, as defined by the `InferenceModel`.


More nits: Linking to the InfModel's targetmodel field could be useful
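
To make the traffic-splitting bullet in the hunk above concrete, the following Go sketch shows one common way to split requests between adapters. It assumes each target carries an integer weight, which this thread does not confirm; `targetModel`, `totalWeight`, and `pickTarget` are hypothetical names and this is not the EPP's implementation.

```go
package main

import (
	"fmt"
	"math/rand"
)

// targetModel is a hypothetical pairing of an adapter name with a
// relative weight (the weight field is an assumption).
type targetModel struct {
	Name   string
	Weight int
}

// totalWeight sums the weights of all targets.
func totalWeight(targets []targetModel) int {
	sum := 0
	for _, t := range targets {
		sum += t.Weight
	}
	return sum
}

// pickTarget maps a draw in [0, totalWeight) onto the targets in
// order, so each target is chosen in proportion to its weight.
// It returns "" if the draw is out of range or there are no targets.
func pickTarget(targets []targetModel, draw int) string {
	for _, t := range targets {
		if draw < t.Weight {
			return t.Name
		}
		draw -= t.Weight
	}
	return ""
}

func main() {
	targets := []targetModel{
		{Name: "adapter-v1", Weight: 90},
		{Name: "adapter-v2", Weight: 10},
	}
	draw := rand.Intn(totalWeight(targets))
	// The request's model name would be rewritten to the chosen adapter.
	fmt.Println("rewritten model name:", pickTarget(targets, draw))
}
```

Passing the draw in as an argument keeps `pickTarget` deterministic, so the split boundaries can be checked without depending on the random source.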

kfswain · 2025-02-21T20:57:26Z

This all looks great! Left some comments, but these are true nits, feel free to disregard. This is already a major improvement from what we had before. Thanks!

/lgtm
/hold

ahg-g

Thanks for the detailed review, addressed all comments

ahg-g · 2025-02-21T21:02:21Z

pkg/epp/README.md

@@ -0,0 +1,25 @@
+# The EndPoint Picker (EPP)
+This package provides the reference implementation for the Endpoint Picker (EPP). It implements the [extension protocol](../../docs/proposals/003-endpoint-picker-protocol), enabling a proxy or gateway to request endpoint hints from an extension. As it is implemented now, an EPP instance handles a single `InferencePool` (and so for each `InferencePool`, one must create a dedicated EPP deployment).


Makes sense, done.

ahg-g · 2025-02-21T21:03:07Z

pkg/epp/README.md

+
+- Endpoint Selection
+  - The EPP determines the appropriate Pod endpoint for the load balancer (LB) to route requests.
+  - It selects from the pool of ready Pods designated by the assigned InferencePool.


ahg-g · 2025-02-21T21:03:30Z

pkg/epp/README.md

+  - It selects from the pool of ready Pods designated by the assigned InferencePool.
+  - Endpoint selection is contingent on the request's ModelName matching an `InferenceModel` that references the `InferencePool`.
+  - Requests with unmatched ModelName values trigger an error response to the proxy.
+  - The endpoint selection algorithm is detailed below.


ahg-g · 2025-02-21T21:04:25Z

pkg/epp/README.md

+  - Requests with unmatched ModelName values trigger an error response to the proxy.
+  - The endpoint selection algorithm is detailed below.
+- Traffic Splitting and ModelName Rewriting
+  - The EPP facilitates controlled rollouts of new adapter versions by implementing traffic splitting between adapters within the same `InferencePool`, as defined by the `InferenceModel`.


kfswain · 2025-02-21T21:06:59Z

/lgtm

Thanks!

ahg-g · 2025-02-21T21:11:36Z

/hold cancel

* Polish the epp README.md file
* Addressed comments

k8s-ci-robot added the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels Feb 21, 2025

k8s-ci-robot requested a review from danehans February 21, 2025 15:46

k8s-ci-robot requested a review from liu-cong February 21, 2025 15:46

k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) and size/S (denotes a PR that changes 10-29 lines, ignoring generated files) labels Feb 21, 2025

ahg-g force-pushed the epp-readme branch from 0f5f38e to 3c490d1 Compare February 21, 2025 16:15

k8s-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) and removed the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) Feb 21, 2025

ahg-g changed the title ~~[WIP] Polish the epp README.md file~~ Polish the epp README.md file Feb 21, 2025

k8s-ci-robot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Feb 21, 2025

ahg-g changed the title ~~Polish the epp README.md file~~ Add README.md file to the epp pkg Feb 21, 2025

ahg-g mentioned this pull request Feb 21, 2025

Add code for Envoy extension that supports body-to-header translation #355

Merged

kfswain reviewed Feb 21, 2025

Polish the epp README.md file

9f73d52

ahg-g force-pushed the epp-readme branch from 3c490d1 to 9f73d52 Compare February 21, 2025 20:45

kfswain reviewed Feb 21, 2025

k8s-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Feb 21, 2025

k8s-ci-robot assigned kfswain Feb 21, 2025

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Feb 21, 2025

Addressed comments

fa4a77d

k8s-ci-robot removed the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Feb 21, 2025

ahg-g commented Feb 21, 2025

k8s-ci-robot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) and removed the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Feb 21, 2025

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Feb 21, 2025

k8s-ci-robot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Feb 21, 2025

k8s-ci-robot merged commit 9bd136a into kubernetes-sigs:main Feb 21, 2025
6 of 7 checks passed

kaushikmitr pushed a commit to kaushikmitr/llm-instance-gateway that referenced this pull request Feb 27, 2025

Add README.md file to the epp pkg (kubernetes-sigs#386)

4280306

* Polish the epp README.md file
* Addressed comments
