[Bug]: RuntimeError: No CUDA GPUs are available in transformers v4.48.0 or above when running Ray RLHF example #13597
Comments
Possibly it can be the serialization of …
I'm having the same error even with transformers==4.47.1.
@johnny12150 What's your use case and your code? How do you use vLLM? Are you using RLHF?
I set …
Update: when I wrap the demo's main logic into a main() function and run it via `from test_ray_vllm_rlhf import main; main()`, with …
Hi @youkaichao, …
@ArthurinRUC I updated the script in #14185 to pass the class by its string name. It should solve the problem now.
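A minimal sketch of what "pass the class by string name" could look like; the parameter name `worker_cls` and the module path `rlhf_utils.WorkerExtension` are assumptions based on this comment, not the exact change from #14185:

```python
# Hedged sketch: hand vLLM the custom worker class as an import-path string
# instead of the class object, so Ray never has to pickle the class itself
# (pickling it can pull CUDA-touching imports into the wrong process).
# "rlhf_utils.WorkerExtension" is a hypothetical module/class name.
from vllm import LLM

llm = LLM(
    model="facebook/opt-125m",
    tensor_parallel_size=2,
    distributed_executor_backend="ray",
    # instead of: worker_cls=WorkerExtension  (the class object itself)
    worker_cls="rlhf_utils.WorkerExtension",
)
```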
Might be related to Ray; not sure what exactly is happening here.
I am running into the same issue when following the distributed offline inference example. However, this only happens when TP>1; with TP=0 I am able to use the same setup without any issues. I created an issue with more details: #14413
Using cmd …
Closing this, since it seems like there are some workarounds. If you are using Ray to serve models, consider using Ray Serve: https://docs.ray.io/en/latest/serve/llm/serving-llms.html
I finally figured out what's wrong :) TL;DR: it is an issue only related to … In … So if you run code like … For each vLLM RayWorker, vLLM will reset its …
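Since the comment above is truncated, here is a small, self-contained sketch of the failure mode it appears to describe: a CUDA availability check that is evaluated (and possibly cached) before the Ray worker's CUDA_VISIBLE_DEVICES is reset by vLLM. That a library caches the result is an assumption about the behavior discussed in this thread, not something verified here:

```python
import os

# Simulate a Ray worker process that starts with no GPUs visible.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

# If any library checks GPU availability at this point, it sees zero devices;
# some libraries cache that answer for the lifetime of the process.
early_count = torch.cuda.device_count()  # 0 in this simulated state

# Later, vLLM resets CUDA_VISIBLE_DEVICES to the GPUs Ray assigned the worker.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Code that relies on the earlier (cached) answer still believes there are no
# GPUs, and a subsequent .to("cuda") / .cuda() call then raises
# "RuntimeError: No CUDA GPUs are available".
print("early:", early_count, "later:", torch.cuda.device_count())
```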
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
Hi all!
I failed to run the vLLM RLHF example script. The code is exactly the same as on the vLLM docs page: https://docs.vllm.ai/en/latest/getting_started/examples/rlhf.html
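For readers who have not opened that page: the core pattern of the example is, roughly, a vLLM engine created inside a Ray actor so it can be colocated with the RLHF training process. The sketch below is a paraphrase under that assumption, not the verbatim docs script; the model name and parallel size are placeholders:

```python
import ray
from vllm import LLM

ray.init()

# Rough shape of the example: the vLLM engine lives in a Ray actor and uses
# the Ray distributed executor, so its tensor-parallel workers share the
# cluster's GPUs with the trainer.
@ray.remote(num_gpus=0)
class InferenceEngine:
    def __init__(self):
        self.llm = LLM(
            model="facebook/opt-125m",       # placeholder model
            tensor_parallel_size=2,          # placeholder TP size
            distributed_executor_backend="ray",
        )

    def generate(self, prompts):
        return self.llm.generate(prompts)

engine = InferenceEngine.remote()
outputs = ray.get(engine.generate.remote(["San Francisco is"]))
```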
The error message is the RuntimeError: No CUDA GPUs are available mentioned in the title.
I found that with transformers==4.47.1 the script runs normally. However, with transformers==4.48.0, 4.48.1, and 4.49.0 I got the error above. I then compared the environments with `pip list` and found that only the transformers versions differ. I've also tried changing the vLLM version between 0.7.0 and 0.7.2; the behavior is the same.
I opened an issue in the transformers repo: huggingface/transformers#36295
Related issue in Ray project: #13230