
Commit d1669b8

Configure gpu-deployment.yaml to force vLLM v1 with LoRA
Until vLLM 0.8.3 is released, using the LoRA flag disables the automatic v1 opt-in, so we set VLLM_USE_V1=1 explicitly to force it back on.
1 parent: 731f244

File tree: 1 file changed (+4, -0)


config/manifests/vllm/gpu-deployment.yaml

@@ -33,6 +33,10 @@ spec:
         - '{"name": "tweet-summary-0", "path": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm", "base_model_name": "llama-2"}'
         - '{"name": "tweet-summary-1", "path": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm", "base_model_name": "llama-2"}'
         env:
+        # Enabling LoRA support temporarily disables automatic v1; we want to force it on
+        # until vLLM 0.8.3 is released.
+        - name: VLLM_USE_V1
+          value: "1"
         - name: PORT
           value: "8000"
         - name: HUGGING_FACE_HUB_TOKEN
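
For context, the container's env section after this change reads roughly as below. This is a sketch: the indentation and the surrounding Deployment fields are assumed from a typical vLLM manifest, only the lines shown in the diff are confirmed, and the HUGGING_FACE_HUB_TOKEN value is elided here just as it is in the diff.

    env:
      # Enabling LoRA support temporarily disables automatic v1; we want to force it on
      # until vLLM 0.8.3 is released.
      - name: VLLM_USE_V1
        value: "1"
      - name: PORT
        value: "8000"
      - name: HUGGING_FACE_HUB_TOKEN
        # (value supplied elsewhere in the manifest; omitted in this diff)

Once the Deployment rolls out, the setting can be confirmed with kubectl describe pod <pod-name>, which lists the container's environment variables.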
