[V1][Performance] Implement custom serializaton for MultiModalKwargs #16279

p88h · 2025-04-08T18:10:49Z

WIP, just handles the basic case of simple str->Tensor dict.

Update the custom msgpack encoding/decoding to work with lists of buffers so that the backing data of tensors/numpy arrays contained in messages is sent directly by zmq without copying. Signed-off-by: Nick Hill <[email protected]>
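To illustrate the idea of sending backing buffers alongside the message instead of inside it, here is a minimal sketch using only the standard library. It is not the vLLM implementation: the `encode`/`decode` names, the `__buf__` marker, and the use of JSON for the header are all assumptions made for illustration (the PR itself uses msgpack via msgspec).

```python
import json


def encode(msg: dict) -> list:
    """Split msg into [header, buf0, buf1, ...] without copying payloads."""
    bufs = []
    header = {}
    for key, value in msg.items():
        if isinstance(value, (bytes, bytearray, memoryview)):
            # Replace the payload with its index in the side list of buffers.
            header[key] = {"__buf__": len(bufs)}  # hypothetical marker
            bufs.append(memoryview(value))  # a view, not a copy
        else:
            header[key] = value
    return [json.dumps(header).encode(), *bufs]


def decode(frames: list) -> dict:
    """Rebuild the message, pointing payload fields back at their frames."""
    header = json.loads(bytes(frames[0]))
    out = {}
    for key, value in header.items():
        if isinstance(value, dict) and "__buf__" in value:
            out[key] = frames[1 + value["__buf__"]]
        else:
            out[key] = value
    return out
```

With pyzmq, a frame list like this could be handed to `socket.send_multipart(frames, copy=False)`, so only the small header is serialized and the large buffers are passed through as they are.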


github-actions · 2025-04-08T18:11:00Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

njhill · 2025-04-08T18:42:26Z

Thanks @p88h, this is pretty much what I had planned! It would be great if you could take this on though.

We are planning to disable the use of pickle by default, so it would be good to do this in a way that avoids it. For handling tensors/ndarrays in general, I have a PR, #13790, which also eliminates some memory copies, so it would be good to base this on top of that (I'll try to get it merged ASAP, I just need to add a unit test).

Unfortunately msgspec doesn't support custom nested types, but I thought to have intermediate types like:

from dataclasses import dataclass
from typing import Union
import torch

@dataclass
class FlatNestedTensors:
    tensors: list[torch.Tensor]
    structure: list[Union[int, list]]

where structure is a nested list of indexes into the tensors list. And then the tensors should automatically be handled by the zero-copy logic.
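The flattening described above can be sketched as follows. This is only an illustration of the idea, not code from either PR: the `flatten` and `unflatten` helpers are hypothetical, and leaves are treated as opaque objects (anything that is not a `list`), so the same logic would apply to tensors.

```python
from typing import Any, Union

# An index into the flat leaf list, or a nested list of such indexes.
Structure = Union[int, list]


def flatten(nested: Any, leaves: list) -> Structure:
    """Append every leaf to `leaves` and return the index structure."""
    if isinstance(nested, list):
        return [flatten(item, leaves) for item in nested]
    leaves.append(nested)
    return len(leaves) - 1


def unflatten(structure: Structure, leaves: list) -> Any:
    """Rebuild the nested value from the index structure and the leaves."""
    if isinstance(structure, list):
        return [unflatten(item, leaves) for item in structure]
    return leaves[structure]
```

Because the flat list holds the leaves directly, a serializer that already knows how to ship a list of tensors needs no knowledge of the nesting; only the small index structure travels in the message body.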


njhill · 2025-04-08T19:37:53Z

@p88h I've added a test to #13790 so we can hopefully merge it shortly.


p88h · 2025-04-08T20:36:02Z

I added a comment on your PR. I think it can be simplified to something like this:
https://gist.github.com/p88h/1daec6374c35293f6bced9333d6f2c4c

... if it worked. That inner msgspec serialization is not working as expected.


p88h · 2025-04-09T19:31:34Z

So when encoding, msgspec is actually smart enough to handle everything recursively.
Unfortunately, when decoding, not everything is automatic, so some fix-up is necessary, but it is a lot simpler now.

Well, except...

_items_by_modality doesn't want to be easily serialized.

For now, I implemented something that works, unfortunately, the MultiModalField part requires a pickle - msgpack doesn't even notice it's a custom type, just encodes the whole MultiModalFieldElem as a dict and just ignores that part.
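The type-loss problem can be reproduced with plain dataclasses, as in the sketch below. Everything here is hypothetical: `Batched`, `Flat`, and `Elem` are stand-ins, not the real vLLM classes, and the explicit type tag is an alternative to the pickle fallback described above, not what this PR does.

```python
from dataclasses import asdict, dataclass


@dataclass
class Batched:  # hypothetical field kind with no parameters
    pass


@dataclass
class Flat:  # hypothetical field kind with one parameter
    sizes: list


@dataclass
class Elem:  # hypothetical stand-in for MultiModalFieldElem
    key: str
    field: object


# Registry used to map a tag back to its class when decoding.
KINDS = {cls.__name__: cls for cls in (Batched, Flat)}


def encode_elem(elem: Elem) -> dict:
    # Store the class name next to the parameters so the type survives.
    return {"key": elem.key,
            "field": [type(elem.field).__name__, asdict(elem.field)]}


def decode_elem(obj: dict) -> Elem:
    name, params = obj["field"]
    return Elem(key=obj["key"], field=KINDS[name](**params))
```

Converting an `Elem` with `asdict` alone turns the nested `field` into a bare dict of its parameters, which is the same information loss described above; the tag restores it without resorting to pickle.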

It also adds complexity to serializing MultiModalKwargs: since the old and new layouts overlap (they share tensors), we likely don't want to send the data twice; even with zero copy that seems unnecessary. The approach is now to use either the items by modality (and reconstruct the UserDict) OR just pass through the dict for V0-style usage (which I think is still in use even in V1).


njhill · 2025-04-10T17:32:55Z

Thanks @p88h I will review again once #13790 is merged (imminently) and this is rebased on top of that.

mergify · 2025-04-10T19:24:22Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @p88h.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


p88h · 2025-04-10T21:25:22Z

@njhill @DarkLight1337 PTAL #16432

njhill and others added 7 commits February 24, 2025 15:28

TypeAlias keyword is python >= 3.10 only

35d1cd9

Signed-off-by: Nick Hill <[email protected]>

use highest pickle protocol

f6f26b6

Signed-off-by: Nick Hill <[email protected]>

Merge remote-tracking branch 'origin/main' into tensor-nocopy

4382a16

# Conflicts: # vllm/v1/engine/core_client.py

Merge remote-tracking branch 'refs/remotes/origin/main' into tensor-n…

9d91483

…ocopy Signed-off-by: Nick Hill <[email protected]>

Merge remote-tracking branch 'refs/remotes/origin/main' into tensor-n…

ea75bd3

…ocopy Signed-off-by: Nick Hill <[email protected]> # Conflicts: # vllm/v1/engine/core.py # vllm/v1/engine/core_client.py # vllm/v1/serial_utils.py

Implement custom serializaton for MultiModalKwargs

06efa46

WIP, just handles the basic case of simple str->Tensor dict.

p88h requested review from WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners April 8, 2025 18:10

mergify bot added the v1 label Apr 8, 2025

p88h mentioned this pull request Apr 8, 2025

[Bug]: Huge memory overhead with V1 (multiprocessing) when handling several multimodal inputs #16185

Closed

1 task

Support all NestedTensors

543ee7b

p88h and others added 3 commits April 8, 2025 20:58

fix formatting

87b7385

proper logger format

f5db471

Add unit test

95b0600

Signed-off-by: Nick Hill <[email protected]>

pre-commit fix

910f30f

Signed-off-by: Nick Hill <[email protected]>

njhill and others added 6 commits April 8, 2025 18:41

Merge remote-tracking branch 'refs/remotes/origin/main' into tensor-n…

747ce1c

…ocopy

Fix unrecognized type decode

478ce09

Signed-off-by: Nick Hill <[email protected]>

use msgspec.Raw for tensor data

2d92af1

Merge branch 'main' into serialize-multimodal-kwargs

7215037

Merge branch 'main' into tensor-nocopy

7789c99

Merge branch 'tensor-nocopy' into serialize-multimodal-kwargs

c1d62ad

njhill and others added 9 commits April 9, 2025 21:14

Merge remote-tracking branch 'refs/remotes/origin/main' into tensor-n…

f946398

…ocopy

Merge branch 'vllm-project:main' into serialize-multimodal-kwargs

095d4fd

Merge remote-tracking branch 'refs/remotes/origin/main' into tensor-n…

60797b4

…ocopy

Update vllm/v1/serial_utils.py

c0c6e43

Co-authored-by: Russell Bryant <[email protected]>

Update vllm/v1/serial_utils.py

3b978ad

Co-authored-by: Russell Bryant <[email protected]>

Update vllm/v1/serial_utils.py

80d90a5

Co-authored-by: Russell Bryant <[email protected]>

Update vllm/v1/serial_utils.py

6bd45dc

Co-authored-by: Russell Bryant <[email protected]>

Update vllm/v1/serial_utils.py

97c144b

Co-authored-by: Russell Bryant <[email protected]>

Comment/docstring updates

c6c2a90

Signed-off-by: Nick Hill <[email protected]>

p88h added 2 commits April 10, 2025 20:34

Merge branch 'vllm-project:main' into serialize-multimodal-kwargs

793c39c

Merge branch 'tensor-nocopy' into serialize-multimodal-kwargs

714d615

mergify bot added the needs-rebase label Apr 10, 2025

p88h and others added 6 commits April 10, 2025 22:25

Get rid of (some) workarounds, slightly more efficient encoding

aa64391

style fixes

2d471bc

Implement support for _items_by_modality, review fixes

9ca2552

[Bugfix] Fix bug when dataset is json (vllm-project#15899)

e1295bc

Signed-off-by: Chenyaaang <[email protected]>

[Model] Reduce redundant computations in mamba2 blocks for Bamba-9B (v…

9a81901

…llm-project#15423) Signed-off-by: Chih-Chieh-Yang <[email protected]> Co-authored-by: Yu Chin Fabian Lim <[email protected]>

[VLM] Avoid unnecessary dummy multimodal data during processing (vllm…

e0483bc

…-project#16416) Signed-off-by: DarkLight1337 <[email protected]>

p88h requested a review from DarkLight1337 as a code owner April 10, 2025 20:36

mergify bot added the multi-modality Related to multi-modality (#4194) label Apr 10, 2025

p88h added 2 commits April 10, 2025 22:40

Merge branch 'main' into serialize-multimodal-kwargs

bdacbb8

Merge branch 'vllm-project:main' into serialize-multimodal-kwargs

7f779ef

mergify bot removed the needs-rebase label Apr 10, 2025

p88h closed this Apr 10, 2025

p88h deleted the serialize-multimodal-kwargs branch April 10, 2025 20:43

p88h mentioned this pull request Apr 10, 2025

[V1][Performance] Implement custom serializaton for MultiModalKwargs [Rebased] #16432

Merged
