Commit 0068bea

Replacing main for getting started guide
Signed-off-by: Kellen Swain <[email protected]>
1 parent bd43ea4 commit 0068bea

1 file changed: +22 -22 lines changed
Diff for: site-src/guides/index.md

@@ -29,7 +29,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/vllm/gpu-deployment.yaml
 ```
 
 #### CPU-Based Model Server
@@ -38,37 +38,37 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Qwen
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/vllm/cpu-deployment.yaml
 ```
 
 ### Install the Inference Extension CRDs
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/crd/bases/inference.networking.x-k8s.io_inferencepools.yaml
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/crd/bases/inference.networking.x-k8s.io_inferencemodels.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/crd/bases/inference.networking.x-k8s.io_inferencepools.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/crd/bases/inference.networking.x-k8s.io_inferencemodels.yaml
 ```
 
 ### Deploy InferenceModel
 
 Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1`
 [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server.
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencemodel.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/inferencemodel.yaml
 ```
 
 ### Update Envoy Gateway Config to enable Patch Policy**
 
 Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run:
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/enable_patch_policy.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/enable_patch_policy.yaml
 kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.
 
 ### Deploy Gateway
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gateway.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/gateway.yaml
 ```
 > **_NOTE:_** This file couples together the gateway infra and the HTTPRoute infra for a convenient, quick startup. Creating additional/different InferencePools on the same gateway will require an additional set of: `Backend`, `HTTPRoute`, the resources included in the `./config/manifests/gateway/ext-proc.yaml` file, and an additional `./config/manifests/gateway/patch_policy.yaml` file. ***Should you choose to experiment, familiarity with xDS and Envoy are very useful.***
 
@@ -81,13 +81,13 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 ### Deploy the Inference Extension and InferencePool
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/ext_proc.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/ext_proc.yaml
 ```
 ### Deploy Envoy Gateway Custom Policies
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/extension_policy.yaml
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/patch_policy.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/extension_policy.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/patch_policy.yaml
 ```
 > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
 
@@ -96,7 +96,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/traffic_policy.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/traffic_policy.yaml
 ```
 
 ### Try it out
@@ -120,16 +120,16 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 The following cleanup assumes you would like to clean ALL resources that were created in this quickstart guide.
 please be careful not to delete resources you'd like to keep.
 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/traffic_policy.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/extension_policy.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/patch_policy.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/ext_proc.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gateway.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/enable_patch_policy.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencemodel.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/crd/bases/inference.networking.x-k8s.io_inferencepools.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/crd/bases/inference.networking.x-k8s.io_inferencemodels.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/traffic_policy.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/extension_policy.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/patch_policy.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/ext_proc.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/gateway.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/gateway/enable_patch_policy.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/inferencemodel.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/crd/bases/inference.networking.x-k8s.io_inferencepools.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/crd/bases/inference.networking.x-k8s.io_inferencemodels.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/release-0.2/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
 kubectl delete secret hf-token --ignore-not-found
 ```
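
Every changed line in this commit swaps the `main` ref for `release-0.2` in the same base URL. As a reviewer-side illustration only, the sketch below shows one way that ref could be kept in a single place. The names `REF`, `BASE`, and `manifest_url` are hypothetical and do not appear in the guide or the repository. The script only prints a command, so it does not touch a cluster.

```shell
#!/bin/sh
# Sketch only: REF, BASE, and manifest_url are hypothetical names, not part of the guide.
REF="release-0.2"
BASE="https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/${REF}"

# Build the full URL for a manifest path given relative to the repository root.
manifest_url() {
  printf '%s/%s\n' "$BASE" "$1"
}

# Dry run: print the kubectl command rather than executing it.
echo "kubectl apply -f $(manifest_url config/manifests/gateway/gateway.yaml)"
```

Under these assumptions, changing `REF` retargets every generated URL at once, which is the same substitution this commit applies by hand to each line.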
