[V1] Scatter and gather placeholders in the model runner #16076

Merged on Apr 8, 2025 (39 commits)

ywang96 (Member) commented Apr 4, 2025

Reopened from accidentally merged #15712

This PR attempts to move scatter_patch_features and gather_patch_feature out of the model and into the model runner, so that they do not interfere with TPU graph compilation.

Breaking change for model developers:

  • PromptUpdateDetails.features has been replaced with PromptUpdateDetails.is_embed. You can use the newly added factories PromptUpdateDetails.select_text and PromptUpdateDetails.select_token_id to generate is_embed based on the target text/token ID.
  • BaseProcessingInfo.get_num_image_tokens should now return the equivalent of PromptUpdateDetails.is_embed.sum() instead of the number of tokens in PromptUpdateDetails.features.
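
As a rough illustration of the is_embed idea in the bullets above, the sketch below uses plain Python lists rather than vLLM's actual classes. The helper names (make_is_embed, scatter_embeddings, gather_embeddings) and the token IDs are hypothetical and are not part of the vLLM API. The sketch only assumes that is_embed is a boolean mask over the replacement tokens and that the image token count is the number of True entries in that mask.

```python
from typing import List, Optional, Sequence


def make_is_embed(tokens: Sequence[int], embed_token_id: int) -> List[bool]:
    """Hypothetical helper: mark the positions that hold the target token ID."""
    return [tok == embed_token_id for tok in tokens]


def scatter_embeddings(
    is_embed: Sequence[bool], embeds: Sequence[str]
) -> List[Optional[str]]:
    """Place each embedding at the next True position; other slots stay None."""
    if sum(is_embed) != len(embeds):
        raise ValueError("number of embeddings must equal sum(is_embed)")
    remaining = iter(embeds)
    return [next(remaining) if flag else None for flag in is_embed]


def gather_embeddings(
    is_embed: Sequence[bool], scattered: Sequence[Optional[str]]
) -> List[Optional[str]]:
    """Keep only the entries whose mask position is True."""
    return [item for flag, item in zip(is_embed, scattered) if flag]


# Illustrative token IDs (assumptions, not real vocabulary entries).
IMG_START, IMG_PATCH, IMG_END = 100, 101, 102

# Replacement tokens for one image: start marker, two patches, end marker.
replacement = [IMG_START, IMG_PATCH, IMG_PATCH, IMG_END]

is_embed = make_is_embed(replacement, IMG_PATCH)
num_image_tokens = sum(is_embed)  # counts only the embedding positions

embeds = ["e0", "e1"]  # stand-ins for per-patch embedding vectors
scattered = scatter_embeddings(is_embed, embeds)
gathered = gather_embeddings(is_embed, scattered)
```

Here the start and end markers are part of the replacement but do not count toward num_image_tokens, which is the distinction the second bullet draws between is_embed.sum() and the total number of replacement tokens.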

Originally authored by @DarkLight1337; co-authored by @mgoin

DarkLight1337 and others added 26 commits March 31, 2025 16:41
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Roger Wang and others added 3 commits April 5, 2025 01:22
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>

ywang96 (Member, Author) commented Apr 6, 2025

I'm holding off on merging this PR until #16113 and #16117 are merged.

DarkLight1337 added a commit to houseroad/vllm that referenced this pull request Apr 7, 2025
Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 (Member) commented Apr 7, 2025

Verifying llama4 now...

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot commented Apr 7, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ywang96.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Apr 7, 2025
mergify bot removed the needs-rebase label Apr 7, 2025
DarkLight1337 merged commit f2ebb6f into main Apr 8, 2025
45 checks passed
DarkLight1337 deleted the v1-is-embed branch April 8, 2025 02:43
github-project-automation bot moved this from In Progress to Done in Multi-modality Core Apr 8, 2025
nishith-fujitsu pushed a commit to nishith-fujitsu/vllm that referenced this pull request Apr 9, 2025
…t#16076)

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Co-authored-by: DarkLight1337 <[email protected]>
Co-authored-by: mgoin <[email protected]>
Co-authored-by: Jennifer Zhao <[email protected]>
yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025
…t#16076)

Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Co-authored-by: DarkLight1337 <[email protected]>
Co-authored-by: mgoin <[email protected]>
Co-authored-by: Jennifer Zhao <[email protected]>
Signed-off-by: Yang Wang <[email protected]>
Labels: documentation, multi-modality, ready, tpu, v1
Projects
Status: Done

3 participants