How to integrate with HF with minimal modification? #242
Replies: 7 comments
-
Thanks for your interest and great question! You can install vLLM from source and directly modify the model code.
-
This is a huge change. Is there an easier way to do this with LLaMA? I don't want to insert this code into my existing transformers-based project.
-
Can you point out in the documentation which modifications are necessary, or provide a tutorial on the steps for adapting a model? The "Rewrite the forward methods" section of the documentation is too brief.
-
@lucasjinreal Is your model different from the original LLaMA? If not, you can simply pass the path to your model weights when initializing vLLM.
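A minimal sketch of that, assuming the weights are a stock LLaMA checkpoint in HF format (the path below is a placeholder for your own directory):

```python
from vllm import LLM, SamplingParams

# No model-code changes are needed for a stock LLaMA architecture:
# just point vLLM at the HF-format weights directory.
llm = LLM(model="/path/to/llama-weights")  # placeholder path

params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```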
-
@liujuncn Thanks for your feedback. We'll add more detail to the docs. To address your issue quickly, could you share the specific model you're interested in using with vLLM? Depending on the model architecture, we might be able to add support for it promptly.
-
@WoosukKwon Can you be more specific? For example, I have an HF-based model. How can I specify it in vLLM? Also, my generation loop uses streaming; is streaming supported?
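For what it's worth, streaming is exposed through vLLM's async engine, which yields partial outputs as tokens are produced. A minimal sketch, with the model path again a placeholder:

```python
import asyncio

from vllm import SamplingParams
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

# Placeholder path: substitute your own HF-format weights directory.
engine = AsyncLLMEngine.from_engine_args(
    AsyncEngineArgs(model="/path/to/llama-weights")
)

async def stream(prompt: str) -> None:
    params = SamplingParams(temperature=0.8, max_tokens=128)
    # Each iteration yields a RequestOutput whose text covers everything
    # generated so far, so this prints a growing prefix of the answer.
    async for output in engine.generate(prompt, params, request_id="0"):
        print(output.outputs[0].text)

asyncio.run(stream("Hello, my name is"))
```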
-
For example, x-transformers lets you combine different tricks. So how would a custom model architecture be possible using vLLM?
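Newer vLLM releases expose a model registry for exactly this case. A hedged sketch, where `MyCustomForCausalLM` is a hypothetical class you would implement against vLLM's model interface:

```python
from vllm import LLM, ModelRegistry

# Hypothetical: your own architecture, implemented against vLLM's
# model interface (attention, layers, and weight loading).
from my_project.modeling import MyCustomForCausalLM

# Register the class under the architecture name that appears in the
# checkpoint's config.json, then load it like any other model.
ModelRegistry.register_model("MyCustomForCausalLM", MyCustomForCausalLM)
llm = LLM(model="/path/to/custom-weights")  # placeholder path
```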
-
From what I can see, these are all wrappers around vLLM. How can I integrate it with HF and get an out-of-the-box speedup for my existing model?
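If the model is a supported architecture, the integration can be as small as swapping the generation call; a sketch under that assumption (the checkpoint path is a placeholder):

```python
# Existing HF transformers loop (for comparison):
#   model = AutoModelForCausalLM.from_pretrained("/path/to/weights")
#   ids = tokenizer(prompt, return_tensors="pt").input_ids
#   out = model.generate(ids, max_new_tokens=128)

# vLLM drop-in for the same checkpoint: it reads the HF config and
# weights directly, so tokenization and decoding are handled for you.
from vllm import LLM, SamplingParams

llm = LLM(model="/path/to/weights")  # placeholder: same HF checkpoint dir
outputs = llm.generate(["My prompt"], SamplingParams(max_tokens=128))
print(outputs[0].outputs[0].text)
```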