Envoy update #18
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
name: envoy-gateway-config | ||
namespace: envoy-gateway-system | ||
data: | ||
# This manifest's main purpose is to set `enableEnvoyPatchPolicy` to `true`. | ||
# The fields under `admin` are optional and only enable the admin endpoints for debugging. | ||
# Admin Interface: https://www.envoyproxy.io/docs/envoy/latest/operations/admin | ||
# PatchPolicy docs: https://gateway.envoyproxy.io/docs/tasks/extensibility/envoy-patch-policy/#enable-envoypatchpolicy | ||
envoy-gateway.yaml: | | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyGateway | ||
provider: | ||
type: Kubernetes | ||
gateway: | ||
controllerName: gateway.envoyproxy.io/gatewayclass-controller | ||
extensionApis: | ||
enableEnvoyPatchPolicy: true | ||
# admin: | ||
# enablePprof: true | ||
# address: | ||
# host: 127.0.0.1 | ||
# port: 19000 | ||
# enableDumpConfig: true |
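The `controllerName` in the configuration above is what ties a GatewayClass to this Envoy Gateway installation. A minimal sketch of a matching GatewayClass follows. The class name `inference-gateway` is an assumption chosen for illustration and is not part of this PR.

```yaml
# Hypothetical GatewayClass; only controllerName is taken from the ConfigMap above.
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: inference-gateway   # assumed name, not from this PR
spec:
  # Must match gateway.controllerName in envoy-gateway.yaml
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
```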
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
name: instance-gateway-ext-proc | ||
namespace: default | ||
labels: | ||
app: instance-gateway-ext-proc | ||
spec: | ||
replicas: 1 | ||
selector: | ||
matchLabels: | ||
app: instance-gateway-ext-proc | ||
template: | ||
metadata: | ||
labels: | ||
app: instance-gateway-ext-proc | ||
spec: | ||
containers: | ||
- name: instance-gateway-ext-proc | ||
image: ghcr.io/tomatillo-and-multiverse/ext-proc:demo | ||
args: | ||
#TODO: specify label selector and dynamically update pods | ||
Review comment: We should actually pass the name of the LLMServerPool. Reply: Yeah, totally agree. I plan to have ext-proc pull the selection from the config on the LSP so there is always a single source of truth (once we have that set up). |
||
- -pods | ||
- "vllm-78665f78c4-h4kx4,vllm-78665f78c4-hnz84" | ||
- -podIPs | ||
- "10.24.11.6:8000,10.24.5.7:8000" | ||
- -enable-fairness | ||
- "false" | ||
ports: | ||
- containerPort: 9002 | ||
- name: curl | ||
image: curlimages/curl | ||
command: ["sleep", "3600"] | ||
--- | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: instance-gateway-ext-proc | ||
namespace: default | ||
spec: | ||
selector: | ||
app: instance-gateway-ext-proc | ||
ports: | ||
- protocol: TCP | ||
port: 9002 | ||
targetPort: 9002 | ||
type: ClusterIP | ||
--- | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyExtensionPolicy | ||
metadata: | ||
name: ext-proc-policy | ||
namespace: default | ||
spec: | ||
extProc: | ||
- backendRefs: | ||
- group: "" | ||
kind: Service | ||
name: instance-gateway-ext-proc | ||
port: 9002 | ||
processingMode: | ||
request: | ||
body: Buffered | ||
response: | ||
messageTimeout: 5s | ||
targetRef: | ||
group: gateway.networking.k8s.io | ||
kind: Gateway | ||
name: inference-gateway |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyPatchPolicy | ||
metadata: | ||
name: custom-response-patch-policy | ||
namespace: default | ||
spec: | ||
targetRef: | ||
group: gateway.networking.k8s.io | ||
kind: Gateway | ||
name: inference-gateway | ||
type: JSONPatch | ||
jsonPatches: | ||
# A cluster of type ORIGINAL_DST is required so that requests can be sent | ||
# directly to a specific pod, which our scheduling relies on heavily. | ||
# Specifically, the `original_dst_lb_config` field lets us enable | ||
# `use_http_header` and set `http_header_name`. | ||
# Source: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto | ||
- type: "type.googleapis.com/envoy.config.cluster.v3.Cluster" | ||
name: original_destination_cluster | ||
operation: | ||
op: add | ||
path: "" | ||
value: | ||
name: original_destination_cluster | ||
type: ORIGINAL_DST | ||
original_dst_lb_config: | ||
use_http_header: true | ||
http_header_name: "target-pod" | ||
connect_timeout: 6s | ||
lb_policy: CLUSTER_PROVIDED | ||
dns_lookup_family: V4_ONLY | ||
|
||
# The listener is required to route requests to the original destination | ||
# cluster we just made. | ||
- type: "type.googleapis.com/envoy.config.listener.v3.Listener" | ||
# The listener name is of the form <GatewayNamespace>/<GatewayName>/<GatewayListenerName> | ||
name: default/inference-gateway/http | ||
operation: | ||
op: add | ||
path: "/filter_chains" | ||
value: | ||
- filters: | ||
- name: envoy.filters.network.http_connection_manager | ||
typed_config: | ||
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager | ||
stat_prefix: http | ||
codec_type: AUTO | ||
route_config: | ||
name: local_route | ||
virtual_hosts: | ||
- name: backend | ||
domains: ["*"] | ||
routes: | ||
- match: | ||
prefix: "/" | ||
route: | ||
cluster: original_destination_cluster | ||
timeout: 10s | ||
http_filters: | ||
- name: envoy.filters.http.ext_proc | ||
typed_config: | ||
"@type": type.googleapis.com/envoy.extensions.filters.http.ext_proc.v3.ExternalProcessor | ||
failure_mode_allow: false | ||
grpc_service: | ||
envoy_grpc: | ||
# This is the cluster name as created by the EnvoyExtensionPolicy | ||
# Name is of the form <CRDKind>/<GatewayNamespace>/<ExtensionPolicyName>/<IndexOfBackend> | ||
cluster_name: envoyextensionpolicy/default/ext-proc-policy/0 | ||
processing_mode: | ||
request_header_mode: "SEND" | ||
response_header_mode: "SEND" | ||
request_body_mode: "BUFFERED" | ||
response_body_mode: "NONE" | ||
request_trailer_mode: "SKIP" | ||
response_trailer_mode: "SKIP" | ||
- name: envoy.filters.http.router | ||
typed_config: | ||
"@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router |
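The listener name used in the patch (`default/inference-gateway/http`) assumes a Gateway named `inference-gateway` in the `default` namespace with a listener called `http`. A minimal sketch of such a Gateway is below. The `gatewayClassName` and the port are assumptions for illustration and do not come from this PR.

```yaml
# Hypothetical Gateway whose listener would be named
# default/inference-gateway/http, following the
# <GatewayNamespace>/<GatewayName>/<GatewayListenerName> form noted above.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
  namespace: default
spec:
  gatewayClassName: inference-gateway   # assumed class name
  listeners:
    - name: http
      protocol: HTTP
      port: 8080                        # assumed port
```

Renaming the Gateway or its listener changes the generated listener name, so the `name` field in the patch would need to change with it.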
This file was deleted.
Do we have instructions for deploying the Envoy gateway controller?
Yup! The quickstart on line 10 points them there