Commit 391c444

adjust the gpu deployment to increase max batch size
1 parent a13a123 commit 391c444

1 file changed: +8 -0 lines changed
config/manifests/vllm/gpu-deployment.yaml

@@ -24,9 +24,17 @@ spec:
   - "1"
   - "--port"
   - "8000"
+  - "--max-num-seq"
+  - "1024"
+  - "--max-model-len"
+  - "2048"
+  - "--compilation-config"
+  - "3"
   - "--enable-lora"
   - "--max-loras"
   - "2"
+  - "--max-lora-rank"
+  - "8"
   - "--max-cpu-loras"
   - "12"
   env: