Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
Which version of vLLM actually supports Qwen2.5-VL-72B inference? I have tried various versions since 0.8.2, including the latest commit as of tonight, and each one fails in some way, for example with out-of-memory errors. I have tried both offline inference and the server, and both have problems. Is there a version that is known to work?
One of the errors I hit:
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] WorkerProc hit an exception.
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] Traceback (most recent call last):
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 465, in worker_busy_loop
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] output = func(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return func(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 263, in execute_model
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] output = self.model_runner.execute_model(scheduler_output)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return func(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1077, in execute_model
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] hidden_states = self.model(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._call_impl(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return forward_call(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1114, in forward
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] hidden_states = self.language_model.model(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 245, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] model_output = self.forward(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/model_executor/models/qwen2.py", line 326, in forward
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] def forward(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._call_impl(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return forward_call(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return fn(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 822, in call_wrapped
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 400, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] raise e
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 387, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._call_impl(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return forward_call(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "<eval_with_key>.162", line 574, in forward
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] submod_1 = self.submod_1(getitem, s0, getitem_1, getitem_2, getitem_3); getitem = getitem_1 = getitem_2 = submod_1 = None
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 822, in call_wrapped
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 400, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] raise e
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/fx/graph_module.py", line 387, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return super(self.cls, obj).__call__(*args, **kwargs) # type: ignore[misc]
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._call_impl(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return forward_call(*args, **kwargs)
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "<eval_with_key>.2", line 5, in forward
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] unified_attention_with_output = torch.ops.vllm.unified_attention_with_output(query_2, key_2, value, output_1, 'language_model.model.layers.0.self_attn.attn'); query_2 = key_2 = value = output_1 = unified_attention_with_output = None
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/_ops.py", line 1123, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._op(*args, **(kwargs or {}))
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/attention/layer.py", line 415, in unified_attention_with_output
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] self.impl.forward(self,
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/v1/attention/backends/flash_attn.py", line 578, in forward
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] cascade_attention(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/v1/attention/backends/flash_attn.py", line 710, in cascade_attention
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] prefix_output, prefix_lse = flash_attn_varlen_func(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/vllm/vllm_flash_attn/flash_attn_interface.py", line 253, in flash_attn_varlen_func
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] out, softmax_lse, _, _ = torch.ops._vllm_fa3_C.fwd(
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] File "/jizhicfs/leoyizhang/anaconda3/envs/vllm_0.8.4_f344107/lib/python3.12/site-packages/torch/_ops.py", line 1123, in __call__
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] return self._op(*args, **(kwargs or {}))
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] RuntimeError: scheduler_metadata must have shape (metadata_size)
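For anyone triaging the log above, here is a minimal sketch of how the worker output could be reduced to its traceback and final exception. It uses only the Python standard library. The helper names `clean_line` and `final_exception` are hypothetical, and the prefix pattern is an assumption based on the lines in this report, not a format that vLLM guarantees.

```python
import re

# ANSI colour sequences such as ESC[1;36m. The ESC byte sometimes survives
# copy-paste only as U+FFFD or is dropped entirely, so it is optional here.
# At least one digit or ';' is required so text like "ignore[misc]" is untouched.
ANSI_RE = re.compile(r"[\x1b\ufffd]?\[[0-9;]+m")

# Per-line prefix as it appears in this report (assumed layout):
# "(VllmWorker rank=3 pid=1863294) ERROR 04-23 01:03:33 [multiproc_executor.py:470] "
PREFIX_RE = re.compile(
    r"^\(VllmWorker rank=\d+ pid=\d+\)\s*"
    r"[A-Z]+ \d{2}-\d{2} \d{2}:\d{2}:\d{2} "
    r"\[[^\]]+\]\s?"
)


def clean_line(line: str) -> str:
    """Return the log line without colour codes and without the worker prefix."""
    no_ansi = ANSI_RE.sub("", line)
    return PREFIX_RE.sub("", no_ansi).rstrip()


def final_exception(lines):
    """Return the last cleaned line that looks like 'SomeError: message', or None."""
    for line in reversed([clean_line(raw) for raw in lines]):
        if re.match(r"^[A-Za-z_][\w.]*(Error|Exception): ", line):
            return line
    return None


if __name__ == "__main__":
    sample = [
        "\x1b[1;36m(VllmWorker rank=3 pid=1863294)\x1b[0;0m ERROR 04-23 01:03:33 "
        "[multiproc_executor.py:470] Traceback (most recent call last):",
        "\x1b[1;36m(VllmWorker rank=3 pid=1863294)\x1b[0;0m ERROR 04-23 01:03:33 "
        "[multiproc_executor.py:470] RuntimeError: scheduler_metadata must have "
        "shape (metadata_size)",
    ]
    print(final_exception(sample))
```

Applied to this report, the last matching line is the `RuntimeError` about `scheduler_metadata`, raised from `cascade_attention` in the FlashAttention backend.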