Commit 015bbc8

fiddling with mkdocs syntax
1 parent f4271cc commit 015bbc8

1 file changed: +9 -17 lines changed

site-src/guides/index.md

@@ -19,31 +19,27 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml
 ```
-
-2. **Install the Inference Extension CRDs:**
+1. **Install the Inference Extension CRDs:**
 
 ```sh
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml
-```
-
-3. **Deploy InferenceModel**
+
+1. **Deploy InferenceModel**
 
 Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1`
 [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server.
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml
 ```
-
-4. **Update Envoy Gateway Config to enable Patch Policy**
+1. **Update Envoy Gateway Config to enable Patch Policy**
 
 Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run:
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/enable_patch_policy.yaml
 kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.
-
-5. **Deploy Gateway**
+1. **Deploy Gateway**
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml
@@ -56,30 +52,26 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 NAME CLASS ADDRESS PROGRAMMED AGE
 inference-gateway inference-gateway <MY_ADDRESS> True 22s
 ```
-
-6. **Deploy the Inference Extension and InferencePool**
+1. **Deploy the Inference Extension and InferencePool**
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml
 ```
-
-7. **Deploy Envoy Gateway Custom Policies**
+1. **Deploy Envoy Gateway Custom Policies**
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml
 ```
 > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
-
-8. **OPTIONALLY**: Apply Traffic Policy
+1. **OPTIONALLY**: Apply Traffic Policy
 
 For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml
 ```
-
-9. **Try it out**
+1. **Try it out**
 
 Wait until the gateway is ready.
