Commit 9270ff6

Merge pull request #35 from liu-cong/envoy
Some minor fixes in Envoy setup
2 parents e80791b + 9288a9f commit 9270ff6

4 files changed, +8 -9 lines changed

pkg/README.md

+4 -4
@@ -10,7 +10,7 @@
 
 Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run:
 ```bash
-kubectl apply -f ./manifests/gateway/enable_patch_policy.yaml
+kubectl apply -f ./manifests/enable_patch_policy.yaml
 kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
 
 ```
@@ -20,14 +20,14 @@
 1. **Deploy Gateway**
 
 ```bash
-kubectl apply -f ./manifests/gateway/gateway.yaml
+kubectl apply -f ./manifests/gateway.yaml
 ```
 
 1. **Deploy Ext-Proc**
 
 ```bash
-kubectl apply -f ./manifests/gateway/ext_proc.yaml
-kubectl apply -f ./manifests/gateway/patch_policy.yaml
+kubectl apply -f ./manifests/ext_proc.yaml
+kubectl apply -f ./manifests/patch_policy.yaml
 ```
 **NOTE**: Ensure the `instance-gateway-ext-proc` deployment is updated with the pod names and internal IP addresses of the vLLM replicas. This step is crucial for the correct routing of requests based on headers. This won't be needed once we make ext proc dynamically read the pods.
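
The README hunk above moves every manifest reference from `./manifests/gateway/` to `./manifests/`. A pre-flight check along the following lines could catch a stale path before `kubectl apply` fails. This is only a sketch: the function name `missing_manifests` is hypothetical and not part of the repository, and the file list is taken from the commands in the hunk.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print each manifest that
# the README commands reference but that is absent from the given directory.
missing_manifests() {
  dir=${1:-./manifests}
  for f in enable_patch_policy.yaml gateway.yaml ext_proc.yaml patch_policy.yaml; do
    # Print the full path only when the file does not exist.
    [ -f "$dir/$f" ] || printf '%s\n' "$dir/$f"
  done
}
```

Running `missing_manifests ./manifests` from the `pkg` directory prints nothing when all four files are in place.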

pkg/manifests/ext_proc.yaml

+2 -3
@@ -17,15 +17,14 @@ spec:
     spec:
       containers:
       - name: instance-gateway-ext-proc
+        # TODO(https://github.com/kubernetes-sigs/llm-instance-gateway/issues/34) Update the image and args.
         image: <BUILT-IMAGE>
         args:
-        #TODO: specify label selector and dynamically update pods
+        # TODO(https://github.com/kubernetes-sigs/llm-instance-gateway/issues/12) Remove this once ext proc can dynamically reconcile on LLMServerPool.
         - -pods
         - "vllm-78665f78c4-h4kx4,vllm-78665f78c4-hnz84"
         - -podIPs
         - "10.24.11.6:8000,10.24.5.7:8000"
-        - -enable-fairness
-        - "false"
         ports:
         - containerPort: 9002
       - name: curl
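
Until the TODO above is resolved, the `-pods` and `-podIPs` values have to be filled in by hand. One way to build them is sketched below, under stated assumptions: `format_pod_args` is a hypothetical helper, the default port 8000 is taken from the addresses in the manifest, and the `app=vllm` label in the example feed is a guess that the source does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: read "<pod-name> <pod-ip>" pairs on stdin and print the
# values for the -pods and -podIPs args, one per line.
# The port defaults to 8000, matching the addresses in the manifest.
format_pod_args() {
  port=${1:-8000}
  awk -v port="$port" '
    NF >= 2 {
      pods = pods sep $1            # comma-separated pod names
      ips  = ips sep $2 ":" port    # comma-separated ip:port pairs
      sep  = ","
    }
    END { print pods; print ips }
  '
}

# Example feed (assumes the vLLM pods carry the label app=vllm, which the
# source does not state):
#   kubectl get pods -l app=vllm \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}' \
#     | format_pod_args
```

The first output line is the value for `-pods` and the second is the value for `-podIPs`.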

pkg/manifests/gateway.yaml

+1 -1
@@ -38,7 +38,7 @@ metadata:
   name: llm-route
 spec:
   parentRefs:
-  - name: inference-gateway
+  - name: <GATEWAY-NAME>
     sectionName: llm-gw
   rules:
   - backendRefs:
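
With this hunk, `gateway.yaml` uses the same `<GATEWAY-NAME>` placeholder that appears in `patch_policy.yaml`, so the manifests need a substitution pass before they are applied. A minimal sketch of a check for leftover placeholders follows. The helper name `unresolved_placeholders` is hypothetical, and the pattern assumes placeholders are upper-case words in angle brackets, as `<GATEWAY-NAME>` and `<BUILT-IMAGE>` are.

```shell
#!/bin/sh
# Hypothetical helper: list every line in the given files that still contains
# an angle-bracket placeholder such as <GATEWAY-NAME> or <BUILT-IMAGE>.
# Exit status follows grep: 0 when a placeholder remains, 1 when none do.
unresolved_placeholders() {
  LC_ALL=C grep -n -E '<[A-Z][A-Z-]*>' "$@"
}
```

For example, `unresolved_placeholders ./manifests/*.yaml && echo "fill in placeholders first"` would flag manifests that are not yet ready to apply.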

pkg/manifests/patch_policy.yaml

+1 -1
@@ -34,5 +34,5 @@ spec:
     name: default/<GATEWAY-NAME>/llm-gw
   operation:
     op: replace
-    path: "/virtual_hosts/1/routes/0/route/cluster"
+    path: "/virtual_hosts/0/routes/0/route/cluster"
     value: original_destination_cluster
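
JSON Patch paths are JSON Pointers with zero-based array indices, so this change retargets the `replace` operation from the second entry of `virtual_hosts` to the first. The sketch below makes the selected indices explicit for a path of this shape. `patch_path_indices` is a hypothetical helper written for illustration and handles only paths that begin with `/virtual_hosts/<i>/routes/<j>/`.

```shell
#!/bin/sh
# Hypothetical helper: print the virtual host and route indices selected by a
# JSON Patch path of the form /virtual_hosts/<i>/routes/<j>/...
# Returns 1 when the path does not have that shape.
patch_path_indices() {
  rest=${1#/virtual_hosts/}
  [ "$rest" != "$1" ] || return 1       # path does not start with /virtual_hosts/
  vhost=${rest%%/*}                     # first segment after /virtual_hosts/
  after=${rest#"$vhost"/routes/}
  [ "$after" != "$rest" ] || return 1   # no /routes/ segment after the index
  route=${after%%/*}                    # first segment after /routes/
  printf 'virtual_host=%s route=%s\n' "$vhost" "$route"
}
```

Calling it with the old and new paths from the hunk shows that only the virtual host index differs between them.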
