File tree: 4 files changed (+7 −3 lines changed)
````diff
@@ -12,7 +12,7 @@
       - GIT_TAG=$_GIT_TAG
       - EXTRA_TAG=$_PULL_BASE_REF
       - DOCKER_BUILDX_CMD=/buildx-entrypoint
-  - name: lora-adapter-syncer
+  - name: gcr.io/k8s-testimages/gcb-docker-gcloud:v20220830-45cbff55bc
     entrypoint: make
     args:
       - syncer-image-push
````
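Read together with its context lines, the updated build step would look roughly like the sketch below; the enclosing `steps:` key and the exact indentation are assumptions, since the diff shows only a fragment:

```yaml
# Sketch of the Cloud Build step after this change; the surrounding
# `steps:` layout and indentation are assumptions, not shown in the diff.
steps:
  - name: gcr.io/k8s-testimages/gcb-docker-gcloud:v20220830-45cbff55bc
    entrypoint: make
    args:
      - syncer-image-push
      - GIT_TAG=$_GIT_TAG
      - EXTRA_TAG=$_PULL_BASE_REF
      - DOCKER_BUILDX_CMD=/buildx-entrypoint
```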
````diff
@@ -56,6 +56,7 @@
   - Guides:
     - User Guides:
       - Getting started: guides/index.md
+      - Adapter Rollout: guides/adapter-rollout.md
       - Implementer's Guide: guides/implementers.md
   - Reference:
     - API Reference: reference/spec.md
````
````diff
@@ -1,3 +1,3 @@
-# Getting started with Gateway API Inference Extension with Dynamic lora updates on vllm
+# Adapter Rollout
 
 The goal of this guide is to get a single InferencePool running with vLLM and demonstrate use of dynamic lora updating!
@@ -42,6 +42,8 @@ Rest of the steps are same as [general setup](https://github.com/kubernetes-sigs
         - base-model: meta-llama/Llama-2-7b-hf
           id: tweet-summary-2
           source: vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm
+    ```
+
 2. Configure a canary rollout with traffic split using LLMService. In this example, 40% of traffic for tweet-summary model will be sent to the ***tweet-summary-2*** adapter.
 
     ```yaml
````
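The `yaml` block opened at the end of step 2 is truncated in this diff. Purely as an illustration of the split the text describes, a 60/40 canary might look like the sketch below; the `LLMService` field names (`targetModels`, `weight`) and the `apiVersion` are assumptions, not taken from the PR:

```yaml
# Hypothetical LLMService traffic split; kind, apiVersion, and field
# names are assumptions for illustration, not copied from the PR.
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: LLMService
metadata:
  name: tweet-summary
spec:
  targetModels:
    - name: tweet-summary-1
      weight: 60
    - name: tweet-summary-2   # 40% of traffic goes to the new adapter
      weight: 40
```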
````diff
@@ -68,6 +68,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml
    ```
    > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
+
 1. **OPTIONALLY**: Apply Traffic Policy
 
    For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.
@@ -89,4 +90,4 @@
    "max_tokens": 100,
    "temperature": 0
    }'
-    ```
+   ```
````
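The request body whose tail appears in the last hunk can be sanity-checked offline before sending it; a minimal Python sketch, where the `model` and `prompt` values are placeholders rather than values from the PR:

```python
import json

# Hypothetical completion-request payload; only "max_tokens" and
# "temperature" are visible in the diff above, the rest are placeholders.
payload = {
    "model": "tweet-summary",
    "prompt": "example prompt",
    "max_tokens": 100,
    "temperature": 0,
}

body = json.dumps(payload)
print(body)  # this JSON string is what the request would carry as its body
```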