[RFC]: copy pynvml code into vllm codebase #12977
Comments
The approach I take here is inspired by …
Actually, it's not a bad idea, especially since we only need to copy one file. PyTorch also leverages similar ideas to solve build or dependency issues, such as using miniz as its zip library.
This should have been done a long time ago.
Closing as completed by #12963.
This code is copied from nvidia-ml-py and as per vllm-project#12977 we will need to periodically sync the code to pick up bugfixes. Signed-off-by: Mark McLoughlin <[email protected]>
Motivation.
We have suffered a lot from the `pynvml` module recently; see #12847 for an example. `libnvml.so` is the library behind `nvidia-smi`, and `pynvml` is a Python wrapper around it. We use it to query GPU status without initializing a CUDA context in the current process.
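For context, here is a minimal sketch of that kind of NVML query; the device index and the printed fields are illustrative, not vLLM's actual call sites:

```python
# Minimal sketch: read GPU memory via NVML without creating a CUDA
# context in this process. Device index 0 is an illustrative assumption.
import pynvml  # the module provided by the official nvidia-ml-py package

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"free={mem.free} total={mem.total} bytes")
finally:
    pynvml.nvmlShutdown()
```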
Historically, there are two packages that provide a module named `pynvml`:

- `nvidia-ml-py` (https://pypi.org/project/nvidia-ml-py/): the official wrapper. It is a dependency of vLLM and is installed when users install vLLM. It provides a Python module named `pynvml`.
- `pynvml` (https://pypi.org/project/pynvml/): an unofficial wrapper. Prior to version 12.0, it also provides a Python module named `pynvml`, and therefore conflicts with the official one. What's worse, its module is a Python package, which takes priority over the official one, a standalone Python file. This causes errors when both are installed. Starting from version 12.0, it migrated to a new module named `pynvml_utils` to avoid the conflict.

To make vLLM work, we have to ensure that either no `pynvml` package is installed, or the installed `pynvml` package has version 12.0 or higher. However, neither is a doable solution:

- We cannot ask users to uninstall `pynvml` just to make vLLM work.
- If we pin `pynvml==12.0` as vLLM's dependency, vLLM works, but other libraries break. Notably, deepspeed depends on `pynvml==11.5.0`: https://github.com/ray-project/ray/blob/9e3ec5972cd952d2b50f3b20abc24ced5abb8b54/python/requirements_compiled.txt#L1611

The packaging situation is so confusing that many community libraries don't know `nvidia-ml-py` is the official one and depend on `pynvml` instead, e.g. https://github.com/Sygil-Dev/sygil-webui/blob/d88fa9e8c4d9cefbbfb0b445ad79d4ddb85c8e36/requirements.txt#L17. What's worse, even the official NVIDIA container `nvcr.io/nvidia/pytorch:25.01-py3` uses the unofficial `pynvml<12.0`.

To summarize, we are in dependency hell due to these historically confusing packages.
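To make the constraint above concrete, a hypothetical diagnostic (not part of this RFC; the function name is made up) could check for a conflicting `pynvml` distribution using only the standard library:

```python
# Hypothetical diagnostic: detect whether the unofficial "pynvml"
# distribution is installed at a version that shadows the official
# nvidia-ml-py module (any release before 12.0 does).
from importlib.metadata import PackageNotFoundError, version


def unofficial_pynvml_conflicts() -> bool:
    try:
        v = version("pynvml")  # queries the unofficial PyPI distribution
    except PackageNotFoundError:
        return False  # only nvidia-ml-py (if anything) provides the module
    major = int(v.split(".")[0])
    return major < 12  # pre-12.0 releases still ship a "pynvml" module


if __name__ == "__main__":
    print("conflicting pynvml installed:", unofficial_pynvml_conflicts())
```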
Proposed Change.
To solve the problem, I propose to copy the code from `nvidia-ml-py` into vLLM and import it as `vllm.third_party.pynvml`. See #12963 for the prototype.

The solution only rescues us from the dependency hell; we don't need to maintain the code. If there are bugfixes in `nvidia-ml-py` in the future, we can periodically sync the code.

This is the first time we copy a whole package into vLLM, so I'm creating a separate directory, `vllm/third_party`, to hold the code.

This RFC is kept for future reference, for whenever we need to copy code into `vllm/third_party`.
Feedback Period.
No response
CC List.
No response
Any Other Things.
No response
Before submitting a new issue...