Gaudi on TGI #2752
Conversation
A new release of TGI went live yesterday, which means the Gaudi backend image is now available! 🎉 With this update, I think we're ready to publish the blog post.
nice!
@pcuenca Thanks for the review! 😄 This is my first blog post, so I really appreciate all the feedback and comments!
Very nice post, @baptistecolle!
Great blog post @baptistecolle 🚀 🔥
I left a few minor comments. I will also provide you with a new thumbnail tomorrow.
intel-gaudi-backend-for-tgi.md
Outdated
- Gemma (7B)
- Llava-v1.6-Mistral-7B

Furthermore, we also support all models implemented in the [Transformers library](https://huggingface.co/docs/transformers/index), providing a [fallback mechanism](https://huggingface.co/docs/text-generation-inference/basic_tutorials/non_core_models) that ensures you can still run any model on Gaudi hardware even if it's not yet specifically optimized.
Not sure this works on Gaudi XD
Ah, so I investigated and tried running a model with a custom architecture (Refact-1_6B-fim) using TGI Gaudi, but it doesn't work. We actually disabled the fallback to Transformers in TGI Gaudi (see this line). As a result, during startup you get a `ValueError: Unsupported model type gpt_refact`. It would be great to add this fallback back in the future so we could attempt to load more models on a "best effort, no guarantee" basis. But for now, I'll remove it from the blog post.
@IlyasMoutawwakil I guess this could work for most models with the Gaudi support you added to Transformers? We'll have to check
Force-pushed a4cdba6 to fe6e867
Force-pushed fe6e867 to c7953d6
let's ship this soon no?
Yes! We are just waiting for confirmation from Intel.
intel-gaudi-backend-for-tgi.md
Outdated
We've fully integrated Gaudi support into TGI's main codebase in PR [#3091](https://github.com/huggingface/text-generation-inference/pull/3091). Previously, we maintained a separate fork for Gaudi devices at [tgi-gaudi](https://github.com/huggingface/tgi-gaudi). This was cumbersome for users and prevented us from supporting the latest TGI features at launch. Now, using the new [TGI multi-backend architecture](https://huggingface.co/blog/tgi-multi-backend), we support Gaudi directly in TGI – no more fiddling with a custom repository 🙌
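For readers following along, launching the Gaudi backend typically looks like the Docker invocation below. This is a sketch, not the canonical command from the post: the image tag, model ID, and runtime flags are assumptions based on the usual TGI-on-Gaudi setup, so verify them against the official TGI documentation before running.

```shell
# Hypothetical sketch of launching TGI's Gaudi backend on a Gaudi machine.
# Image tag, model, and flags are assumptions -- check the TGI docs for
# the exact values for your release.
model=meta-llama/Llama-3.1-8B-Instruct   # example model, swap in your own
volume=$PWD/data                         # cache model weights between runs

docker run -p 8080:80 \
    --runtime=habana \
    -e HABANA_VISIBLE_DEVICES=all \
    -e HF_TOKEN=$HF_TOKEN \
    --cap-add=sys_nice \
    --ipc=host \
    -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest-gaudi \
    --model-id $model
```

Once the server is up, requests go to the standard TGI endpoint on port 8080, the same as with any other TGI backend.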
This integration supports Intel's full line of Gaudi hardware:
How about adding a cross-link to Intel's Gaudi product page at the end of this section? https://www.intel.com/content/www/us/en/developer/platform/gaudi/develop/overview.html
Great post!
Added link to Gaudi product page
Added co-authors
Updated
Force-pushed 3f93d19 to a4bc9a8
Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.
Preparing the Article
You're not quite done yet, though. Please make sure to follow this process (as documented here): … `md` file. You can also specify `guest` or `org` for the authors. Here is an example of a complete PR: #2382
Getting a Review
Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.
Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews (e.g., checking for proper metadata) rather than content reviews unless explicitly asked.