From 36cd051975952328c438d2c6e92035f4d83cdde3 Mon Sep 17 00:00:00 2001 From: Kellen Swain Date: Mon, 10 Feb 2025 19:09:59 +0000 Subject: [PATCH 1/4] Link to v0.1.0 getting started guide --- site-src/guides/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/site-src/guides/index.md b/site-src/guides/index.md index 92f6412a..d5e943d4 100644 --- a/site-src/guides/index.md +++ b/site-src/guides/index.md @@ -1,3 +1,3 @@ # Getting started with Gateway API Inference Extension -TODO \ No newline at end of file +To get started using our project follow this guide [here](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/release-v0.1.0/pkg/README.md)! \ No newline at end of file From e9224d8ec319b4a4a7e3a1ec55feb4fd571c1f06 Mon Sep 17 00:00:00 2001 From: Kellen Swain Date: Mon, 10 Feb 2025 19:24:45 +0000 Subject: [PATCH 2/4] Moving getting started guide to the site --- pkg/README.md | 95 +--------------------------------------- site-src/guides/index.md | 95 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 95 insertions(+), 95 deletions(-) diff --git a/pkg/README.md b/pkg/README.md index 04ebfde2..b53ef777 100644 --- a/pkg/README.md +++ b/pkg/README.md @@ -1,96 +1,3 @@ ## Quickstart -This quickstart guide is intended for engineers familiar with k8s and model servers (vLLM in this instance). The goal of this guide is to get a first, single InferencePool up and running! - -### Requirements - - Envoy Gateway [v1.2.1](https://gateway.envoyproxy.io/docs/install/install-yaml/#install-with-yaml) or higher - - A cluster with: - - Support for Services of type `LoadBalancer`. (This can be validated by ensuring your Envoy Gateway is up and running). For example, with Kind, - you can follow [these steps](https://kind.sigs.k8s.io/docs/user/loadbalancer). - - 3 GPUs to run the sample model server. Adjust the number of replicas in `./manifests/vllm/deployment.yaml` as needed. - -### Steps - -1. 
**Deploy Sample Model Server** - - Create a Hugging Face secret to download the model [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). Ensure that the token grants access to this model. - Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway. - ```bash - kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2 - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml - ``` - -1. **Install the Inference Extension CRDs:** - - ```sh - kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd - ``` - -1. **Deploy InferenceModel** - - Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1` - [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server. - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml - ``` - -1. **Update Envoy Gateway Config to enable Patch Policy** - - Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run: - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/enable_patch_policy.yaml - kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system - ``` - Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again. - -1. 
**Deploy Gateway** - - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml - ``` - > **_NOTE:_** This file couples together the gateway infra and the HTTPRoute infra for a convenient, quick startup. Creating additional/different InferencePools on the same gateway will require an additional set of: `Backend`, `HTTPRoute`, the resources included in the `./manifests/gateway/ext-proc.yaml` file, and an additional `./manifests/gateway/patch_policy.yaml` file. ***Should you choose to experiment, familiarity with xDS and Envoy are very useful.*** - - Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status: - ```bash - $ kubectl get gateway inference-gateway - NAME CLASS ADDRESS PROGRAMMED AGE - inference-gateway inference-gateway True 22s - ``` - -1. **Deploy the Inference Extension and InferencePool** - - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml - ``` - -1. **Deploy Envoy Gateway Custom Policies** - - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml - ``` - > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further. - -1. **OPTIONALLY**: Apply Traffic Policy - - For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors. - - ```bash - kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml - ``` - -1. **Try it out** - - Wait until the gateway is ready. 
- - ```bash - IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}') - PORT=8081 - - curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{ - "model": "tweet-summary", - "prompt": "Write as if you were a critic: San Francisco", - "max_tokens": 100, - "temperature": 0 - }' - ``` \ No newline at end of file +Please refer to our Getting started guide here: https://gateway-api-inference-extension.sigs.k8s.io/guides/ \ No newline at end of file diff --git a/site-src/guides/index.md b/site-src/guides/index.md index d5e943d4..8a175fdc 100644 --- a/site-src/guides/index.md +++ b/site-src/guides/index.md @@ -1,3 +1,96 @@ # Getting started with Gateway API Inference Extension -To get started using our project follow this guide [here](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/release-v0.1.0/pkg/README.md)! \ No newline at end of file +This quickstart guide is intended for engineers familiar with k8s and model servers (vLLM in this instance). The goal of this guide is to get a first, single InferencePool up and running! + +### Requirements + - Envoy Gateway [v1.2.1](https://gateway.envoyproxy.io/docs/install/install-yaml/#install-with-yaml) or higher + - A cluster with: + - Support for Services of type `LoadBalancer`. (This can be validated by ensuring your Envoy Gateway is up and running). For example, with Kind, + you can follow [these steps](https://kind.sigs.k8s.io/docs/user/loadbalancer). + - 3 GPUs to run the sample model server. Adjust the number of replicas in `./manifests/vllm/deployment.yaml` as needed. + +### Steps + +1. **Deploy Sample Model Server** + + Create a Hugging Face secret to download the model [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). Ensure that the token grants access to this model. + Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway. 
+ ```bash + kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2 + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml + ``` + +1. **Install the Inference Extension CRDs:** + + ```sh + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml + ``` + +1. **Deploy InferenceModel** + + Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1` + [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server. + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml + ``` + +1. **Update Envoy Gateway Config to enable Patch Policy** + + Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run: + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/enable_patch_policy.yaml + kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system + ``` + Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again. + +1. **Deploy Gateway** + + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml + ``` + > **_NOTE:_** This file couples together the gateway infra and the HTTPRoute infra for a convenient, quick startup. 
Creating additional/different InferencePools on the same gateway will require an additional set of: `Backend`, `HTTPRoute`, the resources included in the `./manifests/gateway/ext-proc.yaml` file, and an additional `./manifests/gateway/patch_policy.yaml` file. ***Should you choose to experiment, familiarity with xDS and Envoy are very useful.*** + + Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status: + ```bash + $ kubectl get gateway inference-gateway + NAME CLASS ADDRESS PROGRAMMED AGE + inference-gateway inference-gateway True 22s + ``` + +1. **Deploy the Inference Extension and InferencePool** + + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml + ``` + +1. **Deploy Envoy Gateway Custom Policies** + + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml + ``` + > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further. + +1. **OPTIONALLY**: Apply Traffic Policy + + For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors. + + ```bash + kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml + ``` + +1. **Try it out** + + Wait until the gateway is ready. 
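The step above says to wait until the gateway is ready but leaves the check manual. As a hedged convenience sketch (not part of this project's manifests — `wait_for_gateway` is a hypothetical helper), the Gateway's `Programmed` condition can be polled like so:

```bash
# Hypothetical helper (not part of the quickstart manifests): poll the
# Gateway's Programmed condition until it reports True, or fail after a
# timeout. Assumes `kubectl` is on PATH and pointed at the right cluster.
wait_for_gateway() {
  local name=$1 timeout=${2:-300} waited=0
  until [ "$(kubectl get "gateway/${name}" \
      -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}')" = "True" ]; do
    sleep 5
    waited=$((waited + 5))
    if [ "${waited}" -ge "${timeout}" ]; then
      echo "gateway ${name} not Programmed after ${timeout}s" >&2
      return 1
    fi
  done
}
```

For example, `wait_for_gateway inference-gateway` would block until the gateway used in this guide reports `Programmed=True`, before the next step reads its address.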
+ + ```bash + IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}') + PORT=8081 + + curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{ + "model": "tweet-summary", + "prompt": "Write as if you were a critic: San Francisco", + "max_tokens": 100, + "temperature": 0 + }' + ``` \ No newline at end of file From f4271cc542b2a9b6c25ad6243c32a4775639e7c5 Mon Sep 17 00:00:00 2001 From: Kellen Swain Date: Mon, 10 Feb 2025 19:27:46 +0000 Subject: [PATCH 3/4] site doesnt support markdown syntax for ordered lists, making explicit --- site-src/guides/index.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/site-src/guides/index.md b/site-src/guides/index.md index 8a175fdc..580993f5 100644 --- a/site-src/guides/index.md +++ b/site-src/guides/index.md @@ -20,13 +20,13 @@ This quickstart guide is intended for engineers familiar with k8s and model serv kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml ``` -1. **Install the Inference Extension CRDs:** +2. **Install the Inference Extension CRDs:** ```sh kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml ``` -1. **Deploy InferenceModel** +3. **Deploy InferenceModel** Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1` [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server. @@ -34,7 +34,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml ``` -1. **Update Envoy Gateway Config to enable Patch Policy** +4. 
**Update Envoy Gateway Config to enable Patch Policy** Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run: ```bash @@ -43,7 +43,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv ``` Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again. -1. **Deploy Gateway** +5. **Deploy Gateway** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml @@ -57,13 +57,13 @@ This quickstart guide is intended for engineers familiar with k8s and model serv inference-gateway inference-gateway True 22s ``` -1. **Deploy the Inference Extension and InferencePool** +6. **Deploy the Inference Extension and InferencePool** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml ``` -1. **Deploy Envoy Gateway Custom Policies** +7. **Deploy Envoy Gateway Custom Policies** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml @@ -71,7 +71,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv ``` > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further. -1. **OPTIONALLY**: Apply Traffic Policy +8. **OPTIONALLY**: Apply Traffic Policy For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors. @@ -79,7 +79,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml ``` -1. **Try it out** +9. 
**Try it out** Wait until the gateway is ready. From 015bbc8e2657c470db4a86d0ca149022487386a6 Mon Sep 17 00:00:00 2001 From: Kellen Swain Date: Mon, 10 Feb 2025 19:40:57 +0000 Subject: [PATCH 4/4] fiddling with mkdocs syntax --- site-src/guides/index.md | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/site-src/guides/index.md b/site-src/guides/index.md index 580993f5..e4cbec6f 100644 --- a/site-src/guides/index.md +++ b/site-src/guides/index.md @@ -19,22 +19,19 @@ This quickstart guide is intended for engineers familiar with k8s and model serv kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml ``` - -2. **Install the Inference Extension CRDs:** +1. **Install the Inference Extension CRDs:** ```sh kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml - ``` - -3. **Deploy InferenceModel** + +1. **Deploy InferenceModel** Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1` [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server. ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml ``` - -4. **Update Envoy Gateway Config to enable Patch Policy** +1. **Update Envoy Gateway Config to enable Patch Policy** Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. 
To do this, simply run: ```bash @@ -42,8 +39,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system ``` Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again. - -5. **Deploy Gateway** +1. **Deploy Gateway** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml @@ -56,30 +52,26 @@ This quickstart guide is intended for engineers familiar with k8s and model serv NAME CLASS ADDRESS PROGRAMMED AGE inference-gateway inference-gateway True 22s ``` - -6. **Deploy the Inference Extension and InferencePool** +1. **Deploy the Inference Extension and InferencePool** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml ``` - -7. **Deploy Envoy Gateway Custom Policies** +1. **Deploy Envoy Gateway Custom Policies** ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml ``` > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further. - -8. **OPTIONALLY**: Apply Traffic Policy +1. **OPTIONALLY**: Apply Traffic Policy For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors. ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml ``` - -9. **Try it out** +1. **Try it out** Wait until the gateway is ready.
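The quickstart's closing "Try it out" step sends a raw `curl` completion request. As a minimal sketch for scripted checks — assuming the `tweet-summary` model and the `IP`/`PORT` values from that step; `send_completion` is a hypothetical wrapper, not part of the project — the request's HTTP status can be captured like so:

```bash
# Hypothetical wrapper around the guide's curl request: prints only the
# HTTP status code so callers can assert on success.
send_completion() {
  local ip=$1 port=$2
  curl -s -o /dev/null -w '%{http_code}' "${ip}:${port}/v1/completions" \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "tweet-summary",
      "prompt": "Write as if you were a critic: San Francisco",
      "max_tokens": 100,
      "temperature": 0
    }'
}
```

A caller could then test `[ "$(send_completion ${IP} ${PORT})" = "200" ]` to confirm the gateway is routing completions end to end.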