Gaudi on TGI #2752

Merged
baptistecolle merged 10 commits into main from add-gaudi on Mar 28, 2025

Conversation

baptistecolle
Contributor

Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

  • Add an entry to _blog.yml (see the sketch after this list).
  • Add a thumbnail. There are no requirements here, but there is a template if it's helpful.
  • Check you use a short title and blog path.
  • Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
  • Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
  • Ensure the publication date is correct.
  • Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish, this is just a way to do an early check.
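
To make the first checklist item concrete, here is a hedged sketch of what a `_blog.yml` entry could look like. The slug, thumbnail path, date, and tags below are placeholders, and the field names are assumptions based on existing entries in the repo, so verify against `_blog.yml` itself before copying.

```yaml
# Hypothetical _blog.yml entry — slug, thumbnail path, date, and tags are placeholders.
- local: gaudi-backend-tgi                                  # short blog path
  title: "Gaudi on TGI"                                     # short title (PR title used as a stand-in)
  author: baptistecolle
  thumbnail: /blog/assets/gaudi-backend-tgi/thumbnail.png   # uploaded alongside the other assets
  date: March 28, 2025                                      # publication date
  tags:
    - tgi
    - intel
```

The markdown file's front matter carries the matching metadata, with an `authors:` list whose entries can additionally set `guest: true` or an `org`, as noted in the checklist above.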

Here is an example of a complete PR: #2382

Getting a Review

Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.

Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews
(e.g., check for proper metadata) rather than content reviews unless explicitly asked.

@baptistecolle
Contributor Author

A new release of TGI went live yesterday, which means the Gaudi backend image is now available! 🎉
Release v3.2.1

With this update, I think we’re ready to publish the blog post

@julien-c
Member

nice!

@baptistecolle
Contributor Author

@pcuenca Thanks for the review! 😄

This is my first blog post, so I really appreciate all the feedback and comments!

@pcuenca
Member

pcuenca commented Mar 20, 2025

Very nice post, @baptistecolle!

@regisss
Contributor

Great blog post @baptistecolle 🚀 🔥
I left a few minor comments. I will also provide you with a new thumbnail tomorrow.

- Gemma (7B)
- Llava-v1.6-Mistral-7B

Furthermore, we also support all models implemented in the [Transformers library](https://huggingface.co/docs/transformers/index), providing a [fallback mechanism](https://huggingface.co/docs/text-generation-inference/basic_tutorials/non_core_models) that ensures you can still run any model on Gaudi hardware even if it's not yet specifically optimized.
Contributor

Not sure this works on Gaudi XD

Contributor Author

Ah, so I investigated and tried running a model with a custom architecture (Refact-1_6B-fim) using TGI Gaudi, but it doesn’t work. We actually disabled the fallback to Transformers in TGI Gaudi (see this line).

As a result, during startup, you get a ValueError: Unsupported model type gpt_refact. It would be great to add this fallback back in the future so we could attempt to load more models on a “best effort, no guarantee” basis. But for now, I’ll remove it from the blog post.

Contributor

@IlyasMoutawwakil I guess this could work for most models with the Gaudi support you added to Transformers? We'll have to check

@julien-c
Member

let's ship this soon no?

@baptistecolle
Contributor Author

let's ship this soon no?

Yes! We are just waiting for confirmation from Intel.


We've fully integrated Gaudi support into TGI's main codebase in PR [#3091](https://github.com/huggingface/text-generation-inference/pull/3091). Previously, we maintained a separate fork for Gaudi devices at [tgi-gaudi](https://github.com/huggingface/tgi-gaudi). This was cumbersome for users and prevented us from supporting the latest TGI features at launch. Now, using the new [TGI multi-backend architecture](https://huggingface.co/blog/tgi-multi-backend), we support Gaudi directly in TGI – no more fiddling with a custom repository 🙌
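
For readers landing on this thread, here is a minimal sketch of what launching the merged Gaudi backend looks like in practice. The image tag, model id, and Habana runtime flags below are assumptions based on the v3.2.1 release mentioned earlier and typical Gaudi container setups, not the blog post's exact invocation, so check the TGI documentation before copying.

```bash
# Hypothetical launch of the TGI Gaudi backend — image tag, model id, and flags are assumptions.
model=meta-llama/Llama-3.1-8B-Instruct   # any supported model id
volume=$PWD/data                         # cache model weights outside the container

docker run --runtime=habana --cap-add=sys_nice --ipc=host \
    -p 8080:80 -v $volume:/data \
    -e HF_TOKEN=$HF_TOKEN -e HABANA_VISIBLE_DEVICES=all \
    ghcr.io/huggingface/text-generation-inference:3.2.1-gaudi \
    --model-id $model

# Once the server is up, query it through TGI's standard /generate endpoint:
curl 127.0.0.1:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is Intel Gaudi?", "parameters": {"max_new_tokens": 64}}'
```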

This integration supports Intel's full line of Gaudi hardware:

How about adding a cross-link to Intel's Gaudi product page at the end of this section? https://www.intel.com/content/www/us/en/developer/platform/gaudi/develop/overview.html

@baptistecolle requested a review from kding1 on March 27, 2025
@jeffboudier
Member

Great post!

@baptistecolle merged commit 746d6aa into main on Mar 28, 2025
1 check passed
@baptistecolle deleted the add-gaudi branch on Mar 28, 2025