[New Model]: Support Zyphra/Zamba2-7B #9382
Comments
+1
+1
+1
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Hey, Yury here from Zyphra. We have an internal version that works; going to open a PR sometime soon.
This is done now. Merged into the 0.8.x version of vLLM.
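For anyone landing here after the merge: with that support in place, offline inference through vLLM's standard API should look roughly like the sketch below. This is a minimal sketch, assuming the `Zyphra/Zamba2-7B-Instruct` checkpoint name from the links in this issue is what got registered.

```python
# Minimal sketch, assuming vLLM >= 0.8.x with the merged Zamba2 support.
from vllm import LLM, SamplingParams

llm = LLM(model="Zyphra/Zamba2-7B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["What is a state-space model?"], params)
print(outputs[0].outputs[0].text)
```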
The model to consider.
Announcement blog: https://www.zyphra.com/post/zamba2-7b
Base model: https://huggingface.co/Zyphra/Zamba2-7B
Instruct tuned: https://huggingface.co/Zyphra/Zamba2-7B-Instruct
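For quick experimentation outside vLLM, loading the checkpoint through Hugging Face transformers should look roughly like the sketch below. Minimal sketch, assuming a transformers build that includes the Zamba2 modeling code (e.g. Zyphra's fork linked at the end of this issue); the dtype and device settings are illustrative.

```python
# Minimal sketch: load the HF checkpoint with transformers.
# Assumes a transformers build that includes Zamba2 modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba2-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba2-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Hello from Zamba2!", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```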
The closest model vLLM already supports.
Jamba, as it is a mixture of state-space and transformer blocks.
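For context, here is a toy sketch of the general hybrid pattern these models share: most decoder layers are state-space (Mamba) mixers, with attention blocks placed at certain depths. The layer factories and the interleaving schedule below are illustrative assumptions, not Jamba's or Zamba2's actual configuration (Zamba2 in particular reuses a shared attention block, as discussed under the next question).

```python
# Toy sketch of a hybrid SSM/attention decoder stack (illustrative only;
# the factories and the attn_every schedule are assumptions, not the
# actual Jamba or Zamba2 configuration).
import torch.nn as nn

class HybridDecoder(nn.Module):
    def __init__(self, num_layers: int, make_mamba, make_attention, attn_every: int = 6):
        super().__init__()
        # Mostly state-space mixer layers, with an attention block every
        # `attn_every` layers.
        self.layers = nn.ModuleList([
            make_attention() if (i + 1) % attn_every == 0 else make_mamba()
            for i in range(num_layers)
        ])

    def forward(self, hidden_states):
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        return hidden_states
```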
What's your difficulty of supporting the model you want?
Should be easy once Mamba2 support lands in #9292; however, the `use_shared_attention_lora` case seems possibly complex. All of the HF-compatible modeling code can be found here: https://github.com/Zyphra/transformers_zamba2/tree/main/src/transformers/models/zamba2
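My reading of the Zamba2 materials is that a single attention block is reused at multiple depths, with a distinct LoRA adapter per use site so the shared weights can specialize; that is presumably what `use_shared_attention_lora` toggles. The sketch below only illustrates that pattern: all names are hypothetical, and the real adapters act on the block's internal projections rather than its input, so see Zyphra's modeling code linked above for the actual implementation.

```python
# Toy illustration of shared attention with per-use-site LoRA (the pattern
# `use_shared_attention_lora` appears to control). All names are hypothetical.
import torch.nn as nn

class LoRA(nn.Module):
    def __init__(self, dim: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)

    def forward(self, x):
        return self.up(self.down(x))

class SharedAttentionWithLoRA(nn.Module):
    """One attention block reused at several depths; each use site gets its
    own low-rank adapter. Simplification: the adapter is applied to the block
    input rather than the internal q/k/v projections."""
    def __init__(self, dim: int, num_use_sites: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.loras = nn.ModuleList([LoRA(dim) for _ in range(num_use_sites)])

    def forward(self, x, site: int):
        h = x + self.loras[site](x)  # per-site specialization of shared weights
        out, _ = self.attn(h, h, h)
        return out
```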
Before submitting a new issue...