
[Model] Add SupportsMultiModal.get_language_model interface #16007

Merged · 7 commits merged into vllm-project:main from NickLucche:mm-get-language-model on Apr 9, 2025

Conversation

@NickLucche (Contributor) commented on Apr 3, 2025

Most VLMs follow the unwritten HF convention of exposing the text backbone as self.language_model, but the naming is not enforced.
This PR adds a getter to abstract away that naming. See the discussion in #15782 (comment) for more context.

I think Whisper is the only outlier in this taxonomy.
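
For reference, a minimal sketch of what such a getter could look like on the SupportsMultiModal interface, assuming the Protocol-based style of vllm/model_executor/models/interfaces.py; the exact signature and docstring in the merged code may differ:

```python
# Minimal sketch of the proposed interface; details may differ from the
# merged implementation.
from typing import Protocol

import torch.nn as nn


class SupportsMultiModal(Protocol):
    """Interface for multi-modal models (sketch)."""

    def get_language_model(self) -> nn.Module:
        """Return the language-model backbone of this multi-modal model.

        Most implementations can simply `return self.language_model`;
        models that name the submodule differently override this to
        return the correct attribute.
        """
        ...
```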

github-actions (bot) commented on Apr 3, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run the other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@hmellor (Member) commented on Apr 3, 2025

This seems like the kind of thing that we could fix on the HF side, similar to the getter and setter functions we have for input and output embeddings.
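
For context, the Transformers accessors being referred to are presumably the standard PreTrainedModel methods shown below; a small illustration (the model name is arbitrary):

```python
# Illustration of the existing Transformers accessors for input/output
# embeddings; these are standard PreTrainedModel methods.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
embed_in = model.get_input_embeddings()    # token embedding table
embed_out = model.get_output_embeddings()  # LM head (often weight-tied)
```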

@jeejeelee (Collaborator)

How about getting all modality modules using get_mm_mapping?
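
For context, get_mm_mapping is a hook some vLLM models already implement (used e.g. for LoRA) to name their per-modality submodules. A rough sketch of that pattern follows; the import path, class, and field values are assumptions based on the codebase at the time and may have changed:

```python
# Rough sketch of the existing get_mm_mapping pattern in some vLLM models;
# names here are illustrative, not authoritative.
from vllm.model_executor.models.module_mapping import MultiModelKeys


class SomeVLM:  # hypothetical model class, for illustration only
    def get_mm_mapping(self) -> MultiModelKeys:
        """Map each modality role to the name of its submodule."""
        return MultiModelKeys.from_string_field(
            language_model="language_model",    # text backbone
            connector="multi_modal_projector",  # projects features into the LM
            tower_model="vision_tower",         # vision encoder
        )
```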

@NickLucche (Contributor, Author)

> This seems like the kind of thing that we could fix on the HF side

Isn't this entirely dependent on how the model is implemented in vLLM?

> How about getting all modality modules using get_mm_mapping?

Unfortunately it's not implemented for every model, but if it were, that would work. I was mostly following @DarkLight1337's advice.

@hmellor (Member) commented on Apr 3, 2025

> Isn't this entirely dependent on how the model is implemented in vLLM?

You're right: since we reimplement the modelling code in vLLM, having nice utilities in the Transformers modelling code doesn't help us here.

However, I do see a future where the Transformers backend is stable and performant enough that much of the modelling code in vLLM will not be needed anymore 🤞

@NickLucche (Contributor, Author)

Yeah, I'd be very happy with that future; modelling would be easier.
But I think we may still have cases where research teams, for whatever reason, don't go through an hf.transformers implementation first (Qwen, maybe?) and contribute their model here directly.
For those cases, I think it's not bad if the interface grows a tiny bit tighter.

@hmellor (Member) commented on Apr 3, 2025

We do hope to make model contributions to Transformers easier in the future, but yes, there may still be some models that need to be modelled in vLLM, which is fine.

@DarkLight1337 (Member) commented on Apr 3, 2025

To avoid merge conflicts, let's wait until #15712 / #16076 is merged first.

@DarkLight1337 (Member)

The other PR has been merged; can you update this one? Also, a couple of new multi-modal models have landed since, so you should update those as well.

@NickLucche force-pushed the mm-get-language-model branch from c3bc449 to f8eea45 on April 8, 2025 08:03
mergify bot removed the tpu (Related to Google TPUs) label on Apr 8, 2025
@NickLucche (Contributor, Author)

Rebased and added llama4. I count 30 architectures; does that check out?

@DarkLight1337 (Member) commented on Apr 8, 2025

You're also missing mllama (Llama 3.2 multimodal) and phi4mm (Phi-4-multimodal).

@NickLucche (Contributor, Author)

Thanks for looking into it!

mergify bot added the documentation (Improvements or additions to documentation) label on Apr 8, 2025
DarkLight1337 enabled auto-merge (squash) on April 9, 2025 08:31
github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label on Apr 9, 2025
vllm-bot merged commit d55244d into vllm-project:main on Apr 9, 2025
46 of 50 checks passed
DarkLight1337 added a commit to DarkLight1337/vllm that referenced this pull request Apr 9, 2025
zRzRzRzRzRzRzR pushed a commit to zRzRzRzRzRzRzR/vllm that referenced this pull request Apr 9, 2025
Labels: documentation, ready, v1
5 participants