
Commit 5b82374

fixed filepath that points to gpu based model server deployment in few places (#451)

Signed-off-by: Nir Rozenbaum <[email protected]>

Parent: 9bd981b

File tree: 3 files changed (+4 −4 lines)

hack/release-quickstart.sh (+2 −2)

@@ -51,9 +51,9 @@ sed -i.bak '/us-central1-docker.pkg.dev\/k8s-staging-images\/gateway-api-inferen
 sed -i.bak -E "s|us-central1-docker\.pkg\.dev/k8s-staging-images|registry.k8s.io|g" "$EXT_PROC"
 
 # -----------------------------------------------------------------------------
-# Update config/manifests/vllm/deployment.yaml
+# Update config/manifests/vllm/gpu-deployment.yaml
 # -----------------------------------------------------------------------------
-VLLM_DEPLOY="config/manifests/vllm/deployment.yaml"
+VLLM_DEPLOY="config/manifests/vllm/gpu-deployment.yaml"
 echo "Updating ${VLLM_DEPLOY} ..."
 
 # Update the vLLM image version
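The `sed -i.bak -E` idiom in this script rewrites a file in place while keeping a `.bak` backup of the original. A minimal, self-contained sketch of the same registry rewrite applied to a throwaway file (the `epp:v0.1` image name is a hypothetical placeholder, not something taken from the script):

```shell
# Create a scratch file containing a staging-registry image reference
# (the image name "epp:v0.1" is a made-up placeholder).
tmpfile="$(mktemp)"
echo "image: us-central1-docker.pkg.dev/k8s-staging-images/epp:v0.1" > "$tmpfile"

# -i.bak edits the file in place and keeps the original as <file>.bak;
# -E enables extended regular expressions; | is used as the s||| delimiter
# so the slashes in the registry path need no escaping.
sed -i.bak -E "s|us-central1-docker\.pkg\.dev/k8s-staging-images|registry.k8s.io|g" "$tmpfile"

cat "$tmpfile"          # image: registry.k8s.io/epp:v0.1
rm -f "$tmpfile" "$tmpfile.bak"
```

Using `|` as the substitution delimiter is a common choice when the pattern itself contains `/`, which keeps the expression readable.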

site-src/guides/index.md (+1 −1)

@@ -24,7 +24,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 
 #### GPU-Based Model Server
 
-For this setup, you will need 3 GPUs to run the sample model server. Adjust the number of replicas in `./config/manifests/vllm/deployment.yaml` as needed.
+For this setup, you will need 3 GPUs to run the sample model server. Adjust the number of replicas in `./config/manifests/vllm/gpu-deployment.yaml` as needed.
 Create a Hugging Face secret to download the model [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). Ensure that the token grants access to this model.
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 ```bash

test/e2e/e2e_suite_test.go (+1 −1)

@@ -69,7 +69,7 @@ const (
 	// clientManifest is the manifest for the client test resources.
 	clientManifest = "../testdata/client.yaml"
 	// modelServerManifest is the manifest for the model server test resources.
-	modelServerManifest = "../../config/manifests/vllm/deployment.yaml"
+	modelServerManifest = "../../config/manifests/vllm/gpu-deployment.yaml"
 	// modelServerSecretManifest is the manifest for the model server secret resource.
 	modelServerSecretManifest = "../testdata/model-secret.yaml"
 	// inferPoolManifest is the manifest for the inference pool CRD.

0 commit comments