Commit 6c948b2

adjust the gpu deployment to increase max batch size
1 parent a13a123 commit 6c948b2

1 file changed: 4 additions, 0 deletions

config/manifests/vllm/gpu-deployment.yaml

@@ -24,6 +24,10 @@ spec:
  - "1"
  - "--port"
  - "8000"
+ - "--max-num-seq"
+ - "2048"
+ - "--compilation-config"
+ - "3"
  - "--enable-lora"
  - "--max-loras"
  - "2"
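
For readers who want to see the result rather than the patch, the following is a rough sketch of the argument list after this commit is applied. Only the list items come from the diff; the enclosing `args:` key, the indentation, and the elided earlier arguments are assumptions, since the hunk does not show them. The comments reflect the commit message and are not a statement of vLLM's documented behavior. Note that vLLM documents the batch-size flag as `--max-num-seqs`; the spelling below is kept exactly as committed.

```yaml
# Sketch only: `args:` and the indentation are assumed, not shown in the diff.
args:
  # ... earlier arguments outside the hunk are omitted ...
  - "1"
  - "--port"
  - "8000"
  - "--max-num-seq"         # added: per the commit message, raises the max batch size
  - "2048"
  - "--compilation-config"  # added: value "3" is presumably a compilation level
  - "3"
  - "--enable-lora"
  - "--max-loras"
  - "2"
```

Each flag and its value occupy consecutive list items, so the four added lines form two flag/value pairs inserted between the `--port` and `--enable-lora` pairs.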