
Commit fcad109

Envoy update (#18)
* moving all yaml to default namespace
* adding new files
* small updates
* Update PoC to support Envoy best Practices
* moving llm-gw ext-proc port to 8081
* Update envoy to patch an HTTPRoute virtual host. Also adding the manifests to the top level Ext-Proc implementation
* Removing image ref so the most recent image is used
* Grammatical changes
1 parent af75860 commit fcad109

15 files changed, +429 -176 lines changed

examples/poc/README.md

+21-9
Original file line numberDiff line numberDiff line change
@@ -17,29 +17,41 @@ This project sets up an Envoy gateway with a custom external processing which im
1717
### Steps
1818

1919
1. **Deploy Sample vLLM Application**
20+
2021
NOTE: Create a HuggingFace API token and store it in a secret named `hf-token` with key `token`. The deployment reads it through the `HUGGING_FACE_HUB_TOKEN` and `HF_TOKEN` environment variables in `./manifests/vllm/vllm-lora-deployment.yaml`.
2122

2223
```bash
23-
kubectl apply -f ./manifests/samples/vllm-lora-deployment.yaml
24-
kubectl apply -f ./manifests/samples/vllm-lora-service.yaml
24+
kubectl apply -f ./manifests/vllm/vllm-lora-deployment.yaml
25+
kubectl apply -f ./manifests/vllm/vllm-lora-service.yaml
2526
```
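
The NOTE above asks for a secret named `hf-token` with key `token`. A minimal sketch of creating it is shown below. The function name `create_hf_token_secret` and the `KUBECTL` override variable are hypothetical helpers for illustration, not part of this repository; the secret is assumed to live in the current namespace.

```bash
# Hypothetical helper: creates the `hf-token` secret with key `token`.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command
# instead of running it against a cluster.
create_hf_token_secret() {
  local token="${1:?usage: create_hf_token_secret <huggingface-token>}"
  "${KUBECTL:-kubectl}" create secret generic hf-token \
    --from-literal=token="${token}"
}

# Example (assumes HF_TOKEN is exported in your shell):
#   create_hf_token_secret "${HF_TOKEN}"
```

Create the secret before applying the deployment so the pods can start with the token available.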
2627

27-
2. **Install GatewayClass with Ext Proc**
28-
A custom GatewayClass `llm-gateway` which is configured with the llm routing ext proc will be installed into the `llm-gateway` namespace. It's configured to listen on port 8081 for traffic through ext-proc (in addition to the default 8080), see the `EnvoyProxy` configuration in `installation.yaml`. When you create Gateways, make sure the `llm-gateway` GatewayClass is used.
28+
1. **Update Envoy Gateway Config to enable Patch Policy**
29+
30+
Our custom LLM Gateway ext-proc is patched into the existing Envoy Gateway via `EnvoyPatchPolicy`. To enable this feature, extend the Envoy Gateway ConfigMap and restart the controller by running:
31+
```bash
32+
kubectl apply -f ./manifests/gateway/enable_patch_policy.yaml
33+
kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
2934

30-
NOTE: Ensure the `llm-route-ext-proc` deployment is updated with the pod names and internal IP addresses of the vLLM replicas. This step is crucial for the correct routing of requests based on headers. This won't be needed once we make ext proc dynamically read the pods.
35+
```
36+
Additionally, to enable the admin interface, uncomment the `admin` lines in `./manifests/gateway/enable_patch_policy.yaml` and run both commands again.
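
To confirm the patch policy flag took effect, one option is to read the ConfigMap back and check for the flag. The sketch below is an illustration only: `patch_policy_enabled` is a hypothetical helper name, and it assumes the config is stored under the `envoy-gateway.yaml` key of the `envoy-gateway-config` ConfigMap, as in the manifest applied above.

```bash
# Hypothetical helper: reads an EnvoyGateway config from stdin and
# succeeds only if `enableEnvoyPatchPolicy` is set to `true`.
patch_policy_enabled() {
  grep -Eq '^[[:space:]]*enableEnvoyPatchPolicy:[[:space:]]*true[[:space:]]*$'
}

# Example against a live cluster:
#   kubectl get configmap envoy-gateway-config -n envoy-gateway-system \
#     -o jsonpath='{.data.envoy-gateway\.yaml}' \
#     | patch_policy_enabled && echo "EnvoyPatchPolicy is enabled"
```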
37+
38+
39+
1. **Deploy Gateway**
3140

3241
```bash
33-
kubectl apply -f ./manifests/installation.yaml
42+
kubectl apply -f ./manifests/gateway/gateway.yaml
3443
```
3544

36-
3. **Deploy Gateway**
45+
1. **Deploy Ext-Proc**
3746

3847
```bash
39-
kubectl apply -f ./manifests/samples/gateway.yaml
48+
kubectl apply -f ./manifests/gateway/ext_proc.yaml
49+
kubectl apply -f ./manifests/gateway/patch_policy.yaml
4050
```
51+
**NOTE**: Ensure the `instance-gateway-ext-proc` deployment is updated with the pod names and internal IP addresses of the vLLM replicas. This step is crucial for the correct routing of requests based on headers. This won't be needed once we make ext proc dynamically read the pods.
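
Until the ext proc discovers pods dynamically, the `-pods` and `-podIPs` values have to be assembled by hand. The sketch below shows one way to derive them; `format_ext_proc_args` is a hypothetical helper, the label selector `app=vllm` is an assumption about how the vLLM pods are labeled, and port `8000` is taken from the example values in the ext-proc deployment.

```bash
# Hypothetical helper: reads "<pod-name> <pod-ip>" lines from stdin and
# prints two lines: the comma-separated -pods value, then the
# comma-separated -podIPs value (each IP suffixed with the port).
format_ext_proc_args() {
  local port="${1:-8000}"
  awk -v port="${port}" '
    NF >= 2 {
      pods = pods (n ? "," : "") $1
      ips  = ips  (n ? "," : "") $2 ":" port
      n++
    }
    END { print pods; print ips }
  '
}

# Example against a live cluster (label selector is an assumption):
#   kubectl get pods -l app=vllm \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.podIP}{"\n"}{end}' \
#     | format_ext_proc_args
```

Paste the two printed lines into the `-pods` and `-podIPs` arguments of the `instance-gateway-ext-proc` deployment, then re-apply it. Pod names and IPs change on every reschedule, so the values need refreshing whenever the vLLM pods restart.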
52+
53+
1. **Try it out**
4154

42-
4. **Try it out**
4355
Wait until the gateway is ready.
4456

4557
```bash
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
apiVersion: v1
2+
kind: ConfigMap
3+
metadata:
4+
name: envoy-gateway-config
5+
namespace: envoy-gateway-system
6+
data:
7+
# This manifest's main purpose is to set `enableEnvoyPatchPolicy` to `true`.
8+
# Any field under `admin` is optional, and only for enabling the admin endpoints, for debugging.
9+
# Admin Interface: https://www.envoyproxy.io/docs/envoy/latest/operations/admin
10+
# PatchPolicy docs: https://gateway.envoyproxy.io/docs/tasks/extensibility/envoy-patch-policy/#enable-envoypatchpolicy
11+
envoy-gateway.yaml: |
12+
apiVersion: gateway.envoyproxy.io/v1alpha1
13+
kind: EnvoyGateway
14+
provider:
15+
type: Kubernetes
16+
gateway:
17+
controllerName: gateway.envoyproxy.io/gatewayclass-controller
18+
extensionApis:
19+
enableEnvoyPatchPolicy: true
20+
enableBackend: true
21+
# admin:
22+
# enablePprof: true
23+
# address:
24+
# host: 127.0.0.1
25+
# port: 19000
26+
# enableDumpConfig: true
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
1+
apiVersion: apps/v1
2+
kind: Deployment
3+
metadata:
4+
name: instance-gateway-ext-proc
5+
namespace: default
6+
labels:
7+
app: instance-gateway-ext-proc
8+
spec:
9+
replicas: 1
10+
selector:
11+
matchLabels:
12+
app: instance-gateway-ext-proc
13+
template:
14+
metadata:
15+
labels:
16+
app: instance-gateway-ext-proc
17+
spec:
18+
containers:
19+
- name: instance-gateway-ext-proc
20+
image: ghcr.io/tomatillo-and-multiverse/ext-proc:demo
21+
args:
22+
#TODO: specify label selector and dynamically update pods
23+
- -pods
24+
- "vllm-78665f78c4-h4kx4,vllm-78665f78c4-hnz84"
25+
- -podIPs
26+
- "10.24.11.6:8000,10.24.5.7:8000"
27+
- -enable-fairness
28+
- "false"
29+
ports:
30+
- containerPort: 9002
31+
- name: curl
32+
image: curlimages/curl
33+
command: ["sleep", "3600"]
34+
---
35+
apiVersion: v1
36+
kind: Service
37+
metadata:
38+
name: instance-gateway-ext-proc
39+
namespace: default
40+
spec:
41+
selector:
42+
app: instance-gateway-ext-proc
43+
ports:
44+
- protocol: TCP
45+
port: 9002
46+
targetPort: 9002
47+
type: ClusterIP
48+
---
49+
apiVersion: gateway.envoyproxy.io/v1alpha1
50+
kind: EnvoyExtensionPolicy
51+
metadata:
52+
name: ext-proc-policy
53+
namespace: default
54+
spec:
55+
extProc:
56+
- backendRefs:
57+
- group: ""
58+
kind: Service
59+
name: instance-gateway-ext-proc
60+
port: 9002
61+
processingMode:
62+
request:
63+
body: Buffered
64+
response:
65+
messageTimeout: 5s
66+
targetRef:
67+
group: gateway.networking.k8s.io
68+
kind: HTTPRoute
69+
name: llm-route
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
2+
---
3+
apiVersion: gateway.networking.k8s.io/v1
4+
kind: Gateway
5+
metadata:
6+
name: <GATEWAY-NAME>
7+
spec:
8+
gatewayClassName: <GATEWAY-NAME>
9+
listeners:
10+
- name: http
11+
protocol: HTTP
12+
port: 8080
13+
- name: llm-gw
14+
protocol: HTTP
15+
port: 8081
16+
---
17+
apiVersion: gateway.networking.k8s.io/v1
18+
kind: GatewayClass
19+
metadata:
20+
name: <GATEWAY-NAME>
21+
spec:
22+
controllerName: gateway.envoyproxy.io/gatewayclass-controller
23+
---
24+
apiVersion: gateway.envoyproxy.io/v1alpha1
25+
kind: Backend
26+
metadata:
27+
name: backend-dummy
28+
spec:
29+
endpoints:
30+
- fqdn:
31+
# Both these values are arbitrary and unused as the PatchPolicy redirects requests.
32+
hostname: 'foo.bar.com'
33+
port: 8080
34+
---
35+
apiVersion: gateway.networking.k8s.io/v1
36+
kind: HTTPRoute
37+
metadata:
38+
name: llm-route
39+
spec:
40+
parentRefs:
41+
- name: <GATEWAY-NAME>
42+
sectionName: llm-gw
43+
rules:
44+
- backendRefs:
45+
- group: gateway.envoyproxy.io
46+
kind: Backend
47+
name: backend-dummy
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
apiVersion: gateway.envoyproxy.io/v1alpha1
2+
kind: EnvoyPatchPolicy
3+
metadata:
4+
name: custom-response-patch-policy
5+
namespace: default
6+
spec:
7+
targetRef:
8+
group: gateway.networking.k8s.io
9+
kind: Gateway
10+
name: <GATEWAY-NAME>
11+
type: JSONPatch
12+
jsonPatches:
13+
# A cluster of type ORIGINAL_DST is necessary to allow for
14+
# direct pod scheduling, which our scheduling relies on heavily.
15+
# Specifically the field `original_dst_lb_config` allows us to enable
16+
# `use_http_header` and `http_header_name`.
17+
# Source: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto
18+
- type: "type.googleapis.com/envoy.config.cluster.v3.Cluster"
19+
name: original_destination_cluster
20+
operation:
21+
op: add
22+
path: ""
23+
value:
24+
name: original_destination_cluster
25+
type: ORIGINAL_DST
26+
original_dst_lb_config:
27+
use_http_header: true
28+
http_header_name: "target-pod"
29+
connect_timeout: 6s
30+
lb_policy: CLUSTER_PROVIDED
31+
dns_lookup_family: V4_ONLY
32+
33+
- type: "type.googleapis.com/envoy.config.route.v3.RouteConfiguration"
34+
name: default/<GATEWAY-NAME>/llm-gw
35+
operation:
36+
op: replace
37+
path: "/virtual_hosts/1/routes/0/route/cluster"
38+
value: original_destination_cluster
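
The gateway and patch policy manifests use a `<GATEWAY-NAME>` placeholder, which presumably has to be replaced with a real name before applying. A minimal sketch of doing that at apply time follows; `render_gateway_name` is a hypothetical helper and `inference-gateway` is only an example value, not something this commit prescribes.

```bash
# Hypothetical helper: replaces every `<GATEWAY-NAME>` placeholder in
# the manifest read from stdin. The name is assumed to be a valid
# Kubernetes resource name, so it contains no `/` or `&` that would
# confuse sed.
render_gateway_name() {
  local name="${1:?usage: render_gateway_name <gateway-name>}"
  sed "s/<GATEWAY-NAME>/${name}/g"
}

# Example (example name; use the same one for both manifests):
#   render_gateway_name inference-gateway < ./manifests/gateway/gateway.yaml \
#     | kubectl apply -f -
#   render_gateway_name inference-gateway < ./manifests/gateway/patch_policy.yaml \
#     | kubectl apply -f -
```

The same name must be used everywhere, because the patched `RouteConfiguration` is addressed as `default/<GATEWAY-NAME>/llm-gw` and will not match if the Gateway is named differently.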

examples/poc/manifests/installation.yaml

-155
This file was deleted.
