[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark #10804

ywang96 · 2024-12-01T04:16:49Z

This PR adds the support for MMMU-Pro vision dataset to the serving benchmark. This dataset is image-token-heavy (single image with average resolution of 1700x1600) and comes with a generic short text prompt, as describe in the dataset model card:

Vision: In this subset, questions are embedded within screenshots or photos, and models must integrate visual and textual information to answer correctly. No separate text is fed into the model.

Example command to run the benchmark

vllm serve llava-hf/llava-v1.6-mistral-7b-hf

python3 benchmarks/benchmark_serving.py \
--dataset-path MMMU/MMMU_Pro \
--dataset-name hf \
--hf-subset "vision" \
--hf-split test \
--model llava-hf/llava-v1.6-mistral-7b-hf  \
--backend openai-chat \
--endpoint /v1/chat/completions \
--num-prompts 100

Co-authored-by: @heheda12345

Signed-off-by: Roger Wang <[email protected]>

github-actions · 2024-12-01T04:17:03Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

Co-authored-by: Chen Zhang <[email protected]> Signed-off-by: Roger Wang <[email protected]>

ywang96 · 2024-12-01T04:24:07Z

Note: Currently the serving benchmark does not correctly count the input image tokens, therefore the reported total token generation rate is not accurate.

Signed-off-by: Roger Wang <[email protected]>

Isotr0py

Overall LGTM! I left a comment about dataset sampling, PTAL!

benchmarks/benchmark_serving.py

Co-authored-by: Isotr0py <[email protected]>

Signed-off-by: Roger Wang <[email protected]>

Isotr0py

LGTM!

…oject#10804) Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Chen Zhang <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: Andrew Feldman <[email protected]>

…oject#10804) Signed-off-by: Roger Wang <[email protected]> Co-authored-by: Chen Zhang <[email protected]> Co-authored-by: Isotr0py <[email protected]>

mmmu_pro

65203d5

Signed-off-by: Roger Wang <[email protected]>

move shuffle and add coauthor

1ae37f8

Co-authored-by: Chen Zhang <[email protected]> Signed-off-by: Roger Wang <[email protected]>

ywang96 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 1, 2024

fix prompt formatting

771f5ef

Signed-off-by: Roger Wang <[email protected]>

Isotr0py reviewed Dec 1, 2024

View reviewed changes

benchmarks/benchmark_serving.py Outdated Show resolved Hide resolved

benchmarks/benchmark_serving.py Outdated Show resolved Hide resolved

ywang96 and others added 3 commits November 30, 2024 23:48

Update benchmarks/benchmark_serving.py

2cd66d5

Co-authored-by: Isotr0py <[email protected]>

format

a3c5c0e

Signed-off-by: Roger Wang <[email protected]>

clean up and add assert

65f4074

Signed-off-by: Roger Wang <[email protected]>

Isotr0py approved these changes Dec 1, 2024

View reviewed changes

Isotr0py enabled auto-merge (squash) December 1, 2024 08:07

Isotr0py merged commit c11f172 into vllm-project:main Dec 1, 2024
28 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark #10804

[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark #10804

Uh oh!

ywang96 commented Dec 1, 2024 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Dec 1, 2024

Uh oh!

ywang96 commented Dec 1, 2024 •

edited

Loading

Uh oh!

Isotr0py left a comment

Uh oh!

Uh oh!

Uh oh!

Isotr0py left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

[Misc] Adding MMMU-Pro vision dataset to serving benchmark #10804

[Misc] Adding MMMU-Pro vision dataset to serving benchmark #10804

Uh oh!

Conversation

ywang96 commented Dec 1, 2024 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Dec 1, 2024

Uh oh!

ywang96 commented Dec 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Isotr0py left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Isotr0py left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark #10804

[Misc] Adding `MMMU-Pro` vision dataset to serving benchmark #10804

ywang96 commented Dec 1, 2024 •

edited by github-actions bot

Loading

ywang96 commented Dec 1, 2024 •

edited

Loading