
Commit 877332a

iacolippo authored and Yi4Liu committed
[Doc] Update nginx guide: remove privileged from vllm container run and add target GPU ID (vllm-project#14217)
Signed-off-by: Iacopo Poli <[email protected]>
1 parent 4c260f0 commit 877332a

File tree

1 file changed: +3 −3 lines changed


docs/source/deployment/nginx.md

Lines changed: 3 additions & 3 deletions
@@ -95,14 +95,14 @@ Notes:
 
 - If you have your HuggingFace models cached somewhere else, update `hf_cache_dir` below.
 - If you don't have an existing HuggingFace cache you will want to start `vllm0` and wait for the model to complete downloading and the server to be ready. This will ensure that `vllm1` can leverage the model you just downloaded and it won't have to be downloaded again.
-- The below example assumes GPU backend used. If you are using CPU backend, remove `--gpus all`, add `VLLM_CPU_KVCACHE_SPACE` and `VLLM_CPU_OMP_THREADS_BIND` environment variables to the docker run command.
+- The below example assumes GPU backend used. If you are using CPU backend, remove `--gpus device=ID`, add `VLLM_CPU_KVCACHE_SPACE` and `VLLM_CPU_OMP_THREADS_BIND` environment variables to the docker run command.
 - Adjust the model name that you want to use in your vLLM servers if you don't want to use `Llama-2-7b-chat-hf`.
 
 ```console
 mkdir -p ~/.cache/huggingface/hub/
 hf_cache_dir=~/.cache/huggingface/
-docker run -itd --ipc host --privileged --network vllm_nginx --gpus all --shm-size=10.24gb -v $hf_cache_dir:/root/.cache/huggingface/ -p 8081:8000 --name vllm0 vllm --model meta-llama/Llama-2-7b-chat-hf
-docker run -itd --ipc host --privileged --network vllm_nginx --gpus all --shm-size=10.24gb -v $hf_cache_dir:/root/.cache/huggingface/ -p 8082:8000 --name vllm1 vllm --model meta-llama/Llama-2-7b-chat-hf
+docker run -itd --ipc host --network vllm_nginx --gpus device=0 --shm-size=10.24gb -v $hf_cache_dir:/root/.cache/huggingface/ -p 8081:8000 --name vllm0 vllm --model meta-llama/Llama-2-7b-chat-hf
+docker run -itd --ipc host --network vllm_nginx --gpus device=1 --shm-size=10.24gb -v $hf_cache_dir:/root/.cache/huggingface/ -p 8082:8000 --name vllm1 vllm --model meta-llama/Llama-2-7b-chat-hf
 ```
 
 :::{note}

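For the CPU-backend case mentioned in the changed note, the equivalent `docker run` invocation might look like the sketch below: `--gpus` is dropped and the two environment variables named in the note are passed explicitly. The values (40 GiB of KV-cache space, binding to cores 0-29) and the assumption that the `vllm` image tag is a CPU-capable build are illustrative, not part of this change; adjust them to your machine.

```console
# Hypothetical CPU-backend counterpart of the vllm0 command above:
# drop --gpus and add the CPU tuning variables (values are examples only).
docker run -itd --ipc host --network vllm_nginx \
  -e VLLM_CPU_KVCACHE_SPACE=40 \
  -e VLLM_CPU_OMP_THREADS_BIND=0-29 \
  --shm-size=10.24gb \
  -v $hf_cache_dir:/root/.cache/huggingface/ \
  -p 8081:8000 --name vllm0 vllm \
  --model meta-llama/Llama-2-7b-chat-hf
```

For the GPU case, `--gpus device=0` and `--gpus device=1` pin each replica to its own GPU, whereas the previous `--gpus all` exposed every GPU to both containers.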