Avoid mistakenly picking Gaudi/HPU if XPU is requested. #11018
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Nice catch! I think you are using a recent Intel CPU with a GPU and NPU, and
5c4cf90 to 9f23dbf
@jikunshang updated the change to be simpler and more generic. On the same computer with that NPU present, openvino, xpu and cpu are all valid target names, but they are all preceded by the is_hpu() check in setup.py and thus cannot be selected.
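The ordering problem described above can be illustrated with a minimal sketch. It is not the actual setup.py code: `pick_platform` and its `accel_node_present` parameter are hypothetical names, and the hardware probe is reduced to a boolean so the effect of the check order is easy to see.

```python
def pick_platform(target_device: str, accel_node_present: bool) -> str:
    """Hypothetical selection chain where the hardware probe runs first."""
    # Stand-in for the is_hpu() probe: it fires on any accel device node,
    # before the requested target name is ever looked at.
    if accel_node_present:
        return "hpu"
    # These name checks are only reached when no accel node exists.
    if target_device in ("openvino", "xpu", "cpu"):
        return target_device
    # Assumed fallback, matching the default mentioned in this thread.
    return "cuda"


# On a machine whose NPU exposes an accel node, the request is ignored.
print(pick_platform("xpu", accel_node_present=True))   # hpu
print(pick_platform("xpu", accel_node_present=False))  # xpu
```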
I feel the best fix is making
@jikunshang you're right, because VLLM_TARGET_DEVICE is always set, defaulting to cuda :( I think explicitly requiring it to be set for HPU, as is done with most other targets, would be best. Setups that test vllm on Gaudi relying only on hardware detection would need to be updated, and that may be too complicated if CI/production systems are not keeping up closely with changes in vllm. Other ways of fixing this:
I still find the cleanest option would be for every target to be selected explicitly. I feel the heuristics for the hardware check may get even more complex with different types of hardware and device naming in the future, unless there is a single good indicator of Gaudi hardware being present (like a line in dmesg).
Hi, I opened PR #12046 to fix this issue. It definitely is a bug in
In setup.py, _is_hpu() returns true when /dev/accel/accel0 is present, but that can also happen with XPU devices (Intel iGPU).
This change allows VLLM_TARGET_DEVICE="xpu" to override that and proceed with an XPU install.
Maybe a simpler _is_hpu() that relies solely on VLLM_TARGET_DEVICE="hpu" would be cleaner, but the current behavior is probably needed for some existing setups.
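As a rough sketch of the two options in this description (not the code in this PR): the function names, the injectable parameters, and the `ACCEL_NODE` constant are assumptions made for illustration, and the real check in setup.py may probe more than this one device node.

```python
import os

# Device node named in this description; treated as the only probe here.
ACCEL_NODE = "/dev/accel/accel0"


def is_hpu(target_device=None, accel_node_present=None):
    """Hypothetical HPU check that lets an explicit xpu request win."""
    if target_device is None:
        # Assumed default, per the discussion in this thread.
        target_device = os.environ.get("VLLM_TARGET_DEVICE", "cuda")
    if accel_node_present is None:
        accel_node_present = os.path.exists(ACCEL_NODE)
    # Override: an explicit XPU request is never treated as HPU.
    if target_device == "xpu":
        return False
    # Otherwise keep detection by hardware or by explicit request.
    return accel_node_present or target_device == "hpu"


def is_hpu_explicit(target_device):
    """The simpler alternative: HPU only when explicitly requested."""
    return target_device == "hpu"


print(is_hpu("xpu", accel_node_present=True))   # False
print(is_hpu("cuda", accel_node_present=True))  # True
print(is_hpu_explicit("cuda"))                  # False
```

The second call shows why the hardware-based path is kept in the first variant: a setup that never sets the variable to hpu still resolves to HPU when the device node exists.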