[Bugfix] Adjust mllama to regional compilation #15112
LGTM! Thanks for your contribution.
This PR involves a cherry-pick of vllm-project#15112 from upstream and a fix for cos_sin preparation in the embedding layers to match regional compilation. --------- Signed-off-by: Jan Kaniecki <[email protected]>
When trying to perform regional compilation with torch.compile (compiling the decoder layers separately instead of calling torch.compile on the whole model) on the mllama model with Gaudi devices, the following error occurs:
ValueError: Unknown decoder layer type <class 'torch._dynamo.eval_frame.OptimizedModule'>
Regional compilation for Gaudi devices was added in #13213.
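For illustration, regional compilation can be sketched as follows. This is a minimal, torch-free sketch: `fake_compile` is a stand-in for `torch.compile`, and the `OptimizedModule` class here only mimics the assumed wrapping behavior of `torch._dynamo.eval_frame.OptimizedModule`, it is not the actual torch implementation.

```python
class OptimizedModule:
    """Stand-in for torch._dynamo.eval_frame.OptimizedModule: it wraps the
    original module instead of subclassing it."""

    def __init__(self, orig_mod):
        self._orig_mod = orig_mod

    def __call__(self, *args, **kwargs):
        return self._orig_mod(*args, **kwargs)


def fake_compile(module):
    # Stand-in for torch.compile: returns the module wrapped in OptimizedModule.
    return OptimizedModule(module)


class DecoderLayer:
    def __call__(self, x):
        return x + 1


model_layers = [DecoderLayer() for _ in range(4)]

# Regional compilation: compile each layer separately rather than the whole model.
model_layers = [fake_compile(layer) for layer in model_layers]

# Each layer is now wrapped in the compile wrapper, but remains callable.
assert all(isinstance(layer, OptimizedModule) for layer in model_layers)
assert model_layers[0](1) == 2
```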
The cause of this issue is that the mllama code checks layer classes with isinstance, e.g.:
if isinstance(decoder_layer, MllamaCrossAttentionDecoderLayer):
torch.compile wraps a module in torch._dynamo.eval_frame.OptimizedModule after compilation, which is why the isinstance check no longer matches. To resolve this, we can distinguish the layers by their indices in self.cross_attention_layers instead, which is what the proposed changes do. We also no longer need to raise ValueError in the layer-type check, since the decoder layers cannot be of any type other than the expected ones.
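A minimal sketch of the failure and the index-based fix. The decoder-layer class names mirror the mllama code, but the wrapper class and the layer indices are illustrative stand-ins, not the actual vLLM or torch implementation.

```python
class MllamaSelfAttentionDecoderLayer:
    pass


class MllamaCrossAttentionDecoderLayer:
    pass


class OptimizedModule:
    """Stand-in for torch._dynamo.eval_frame.OptimizedModule, which wraps
    a compiled module rather than subclassing it."""

    def __init__(self, orig_mod):
        self._orig_mod = orig_mod


# Example layer layout: cross-attention layers at these indices (illustrative).
cross_attention_layers = [3, 8]
layers = [
    OptimizedModule(
        MllamaCrossAttentionDecoderLayer()
        if i in cross_attention_layers
        else MllamaSelfAttentionDecoderLayer()
    )
    for i in range(10)
]

# Broken after compilation: the wrapper is not an instance of the original class,
# so this isinstance check silently stops matching.
assert not isinstance(layers[3], MllamaCrossAttentionDecoderLayer)

# Robust: decide by layer index, which is unaffected by compilation wrapping.
cross_flags = [idx in cross_attention_layers for idx in range(len(layers))]
assert cross_flags[3] and cross_flags[8]
assert not any(cross_flags[i] for i in (0, 1, 2, 4, 5, 6, 7, 9))
```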