3 files changed: +3 -3 lines changed
@@ -23,7 +23,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 1. **Install the Inference Extension CRDs:**

    ```sh
-   kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml
    ```

 1. **Deploy InferenceModel**
@@ -71,7 +71,7 @@
     spec:
       containers:
       - name: inference-gateway-ext-proc
-        image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
+        image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:v0.1.0
         args:
         - -poolName
         - "vllm-llama2-7b-pool"
@@ -14,7 +14,7 @@
     spec:
       containers:
       - name: lora
-        image: "vllm/vllm-openai:latest"
+        image: "vllm/vllm-openai:0.7.1"
         imagePullPolicy: Always
         command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
         args:
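
Review note: all three changes replace a moving reference (`config/crd` on the default branch, `epp:main`, `vllm-openai:latest`) with a pinned release. A minimal sketch of a check that flags any remaining floating image tags in a manifest follows. The helper names `is_floating` and `floating_images` and the set of tags treated as floating are assumptions made for illustration; they are not part of this repository, and the line-based matching is deliberately simpler than a real YAML parse.

```python
import re

# Tags that move over time. Which tags count as "floating" is an assumption
# of this sketch, not a rule taken from the repository.
FLOATING_TAGS = {"latest", "main", "master"}

# Matches lines such as `image: repo/name:tag` or `- image: "repo/name:tag"`.
IMAGE_LINE = re.compile(r"""^\s*(?:-\s*)?image\s*:\s*["']?\s*([^"'\s#]+)""")


def is_floating(ref):
    """Return True if the image reference is not pinned to a fixed version."""
    # Digest references (name@sha256:...) are immutable.
    if "@" in ref:
        return False
    # Only look at the last path component so a registry port
    # (e.g. localhost:5000/img) is not mistaken for a tag.
    name = ref.rsplit("/", 1)[-1]
    if ":" not in name:
        return True  # no tag at all defaults to `latest`
    return name.rsplit(":", 1)[1] in FLOATING_TAGS


def floating_images(manifest_text):
    """Return the image references in manifest_text that use a floating tag."""
    found = []
    for line in manifest_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match and is_floating(match.group(1)):
            found.append(match.group(1))
    return found
```

Run against the manifests touched here, a check like this would be expected to report the pre-change `epp:main` and `vllm-openai:latest` references and nothing after the change.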