[Usage]: How to infer the reward model of nvidia/Llama-3.1-Nemotron-70B-Reward-HF? #11459
Comments
See #8967 (comment)
Please try out #11469. You should be able to run the model just by setting […]
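The exact setting is truncated in this thread. As a sketch only, assuming a vLLM build where reward models are routed through the pooling path via a `task="reward"` argument (the flag and output layout may differ across versions), running the model could look roughly like this; the `build_conversation` helper and the tensor-parallel size are illustrative assumptions:

```python
from typing import List


def build_conversation(user_msg: str, assistant_msg: str) -> str:
    """Flatten a single-turn chat into plain text for scoring.

    Illustrative only: in practice you would apply the model's own chat
    template via its tokenizer rather than this hand-rolled format.
    """
    return f"User: {user_msg}\nAssistant: {assistant_msg}"


def score_responses(prompts: List[str]):
    """Hypothetical helper: score conversations with vLLM's pooling API.

    Assumes task="reward" is supported in your vLLM version and that the
    host has enough GPUs to load the 70B checkpoint.
    """
    from vllm import LLM  # imported lazily; requires a GPU environment

    llm = LLM(
        model="nvidia/Llama-3.1-Nemotron-70B-Reward-HF",
        task="reward",           # assumption: routes through the pooling path
        tensor_parallel_size=4,  # adjust to your hardware
    )
    # Pooling models use encode() rather than generate(); each output's
    # `outputs.data` holds the pooled result (the reward) for its prompt.
    return [out.outputs.data for out in llm.encode(prompts)]
```

With this, `score_responses([build_conversation("What is 2+2?", "2+2 equals 4.")])` would return one reward score per conversation.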
Hi, in my understanding, the code in vllm/model_executor/models/adapters.py (lines 56 to 59 at commit 9ddac56) […]
Looking deeper into this, that alone isn't quite enough to solve the problem because […]
Try this code snippet to get logits. I have verified that the logits here are nearly identical to those obtained with Hugging Face Transformers: […]
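The original snippet did not survive in this copy of the thread. Below is a hedged reconstruction of the kind of check being described: obtaining reference last-token logits from Hugging Face Transformers and comparing them to vLLM's within a loose tolerance (the model name is from the issue title; the tolerance, helper names, and reference path are assumptions, not the commenter's actual code):

```python
def max_abs_diff(a, b):
    """Maximum element-wise absolute difference between two logit vectors."""
    assert len(a) == len(b), "logit vectors must be the same length"
    return max(abs(x - y) for x, y in zip(a, b))


def logits_close(vllm_logits, hf_logits, atol=1e-2):
    """The 'nearly identical' check: agree within a loose tolerance, since
    vLLM and Transformers use different kernels and accumulation orders."""
    return max_abs_diff(vllm_logits, hf_logits) <= atol


def hf_reference_logits(text: str):
    """Hypothetical reference path: last-position logits from Transformers.

    Needs enough GPU memory for the 70B checkpoint; imports are lazy so the
    comparison helpers above stay usable without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "nvidia/Llama-3.1-Nemotron-70B-Reward-HF"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16, device_map="auto"
    )
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt").to(model.device))
    # Logits at the final sequence position, as a plain Python list.
    return out.logits[0, -1].float().tolist()
```

One would then call `logits_close(vllm_logits, hf_reference_logits(text))` on the same input text to confirm the two backends agree.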
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you! |
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you! |
Your current environment
Python 3.9
How would you like to use vllm
I want to run inference of nvidia/Llama-3.1-Nemotron-70B-Reward-HF. I don't know how to integrate it with vLLM.