[Bug][Failing Test]: Multi-Modal Models 3 - models/multimodal/generation/test_common.py #18528

Closed
1 task done
DarkLight1337 opened this issue May 22, 2025 · 7 comments
Labels
bug Something isn't working ci-failure Issue about an unexpected test failure in CI

Comments

@DarkLight1337
Member

Your current environment

N/A

🐛 Describe the bug

models/multimodal/generation/test_common.py::test_single_image_models[gemma3-test_case91] is failing on main. It is another illegal memory access error.

https://buildkite.com/vllm/ci/builds/20503/steps?jid=0196f626-d4d6-4af6-b10f-da8c3145ddfc
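For context, the frame the traceback points at (`block_table.py`, `commit`) copies the first `num_reqs` rows of a CPU-side staging table into the device-side block table. Below is a simplified, CPU-only sketch of that pattern; numpy arrays stand in for the CUDA/pinned tensors, and the shapes and class layout are illustrative, not vLLM's actual implementation:

```python
import numpy as np

# Illustrative stand-in for the block-table "commit" pattern in
# vllm/v1/worker/block_table.py. numpy arrays replace the GPU tensor and the
# pinned CPU tensor; MAX_REQS/MAX_BLOCKS are invented for the sketch.
MAX_REQS, MAX_BLOCKS = 8, 18

class BlockTable:
    def __init__(self) -> None:
        # device-side table (a CUDA torch.Tensor in vLLM)
        self.block_table = np.zeros((MAX_REQS, MAX_BLOCKS), dtype=np.int32)
        # host-side staging table (a pinned CPU torch.Tensor in vLLM)
        self.block_table_cpu = np.zeros((MAX_REQS, MAX_BLOCKS), dtype=np.int32)

    def commit(self, num_reqs: int) -> None:
        # In vLLM this is an async host-to-device copy_(); because CUDA reports
        # kernel errors lazily, an illegal memory access from an earlier kernel
        # can surface at this call even though the copy itself is fine.
        self.block_table[:num_reqs] = self.block_table_cpu[:num_reqs]

table = BlockTable()
table.block_table_cpu[0, :3] = [1, 2, 3]
table.commit(num_reqs=1)
print(table.block_table[0, :3])  # [1 2 3]
```

This is why the log advises `CUDA_LAUNCH_BLOCKING=1`: forcing synchronous launches makes the error surface at the kernel that actually faulted rather than at a later, innocent copy.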

Stack:

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:68] Dumping input data
--- Logging error ---
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model
[2025-05-22T05:33:18Z]     return self.model_executor.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model
[2025-05-22T05:33:18Z]     output = self.collective_rpc("execute_model",
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-22T05:33:18Z]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model
[2025-05-22T05:33:18Z]     output = self.model_runner.execute_model(scheduler_output,
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model
[2025-05-22T05:33:18Z]     self._prepare_inputs(scheduler_output))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs
[2025-05-22T05:33:18Z]     self.input_batch.block_table.commit(num_reqs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit
[2025-05-22T05:33:18Z]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] During handling of the above exception, another exception occurred:
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 1160, in emit
[2025-05-22T05:33:18Z]     msg = self.format(record)
[2025-05-22T05:33:18Z]           ^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 999, in format
[2025-05-22T05:33:18Z]     return fmt.format(record)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/formatter.py", line 13, in format
[2025-05-22T05:33:18Z]     msg = logging.Formatter.format(self, record)
[2025-05-22T05:33:18Z]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 703, in format
[2025-05-22T05:33:18Z]     record.message = record.getMessage()
[2025-05-22T05:33:18Z]                      ^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 392, in getMessage
[2025-05-22T05:33:18Z]     msg = msg % self.args
[2025-05-22T05:33:18Z]           ~~~~^~~~~~~~~~~
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 4488, in __str__
[2025-05-22T05:33:18Z]     f"compilation_config={self.compilation_config!r}")
[2025-05-22T05:33:18Z]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 3872, in __repr__
[2025-05-22T05:33:18Z]     for k, v in asdict(self).items():
[2025-05-22T05:33:18Z]                 ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1329, in asdict
[2025-05-22T05:33:18Z]     return _asdict_inner(obj, dict_factory)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1339, in _asdict_inner
[2025-05-22T05:33:18Z]     f.name: _asdict_inner(getattr(obj, f.name), dict)
[2025-05-22T05:33:18Z]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1382, in _asdict_inner
[2025-05-22T05:33:18Z]     return type(obj)((_asdict_inner(k, dict_factory),
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1383, in <genexpr>
[2025-05-22T05:33:18Z]     _asdict_inner(v, dict_factory))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1386, in _asdict_inner
[2025-05-22T05:33:18Z]     return copy.deepcopy(obj)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 162, in deepcopy
[2025-05-22T05:33:18Z]     y = _reconstruct(x, memo, *rv)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 259, in _reconstruct
[2025-05-22T05:33:18Z]     state = deepcopy(state, memo)
[2025-05-22T05:33:18Z]             ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 136, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(x, memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 221, in _deepcopy_dict
[2025-05-22T05:33:18Z]     y[deepcopy(key, memo)] = deepcopy(value, memo)
[2025-05-22T05:33:18Z]                              ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 143, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/_tensor.py", line 172, in __deepcopy__
[2025-05-22T05:33:18Z]     new_storage = self._typed_storage()._deepcopy(memo)
[2025-05-22T05:33:18Z]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 1134, in _deepcopy
[2025-05-22T05:33:18Z]     return self._new_wrapped_storage(copy.deepcopy(self._untyped_storage, memo))
[2025-05-22T05:33:18Z]                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 143, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 239, in __deepcopy__
[2025-05-22T05:33:18Z]     new_storage = self.clone()
[2025-05-22T05:33:18Z]                   ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 253, in clone
[2025-05-22T05:33:18Z]     return type(self)(self.nbytes(), device=self.device).copy_(self)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] Call stack:
[2025-05-22T05:33:18Z]   File "<string>", line 1, in <module>
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
[2025-05-22T05:33:18Z]     exitcode = _main(fd, parent_sentinel)
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/spawn.py", line 135, in _main
[2025-05-22T05:33:18Z]     return self._bootstrap(parent_sentinel)
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-22T05:33:18Z]     self.run()
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-22T05:33:18Z]     self._target(*self._args, **self._kwargs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core
[2025-05-22T05:33:18Z]     engine_core.run_busy_loop()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop
[2025-05-22T05:33:18Z]     self._process_engine_step()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step
[2025-05-22T05:33:18Z]     outputs = self.step_fn()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step
[2025-05-22T05:33:18Z]     model_output = self.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 210, in execute_model
[2025-05-22T05:33:18Z]     dump_engine_exception(self.vllm_config, scheduler_output,
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/dump_input.py", line 62, in dump_engine_exception
[2025-05-22T05:33:18Z]     _dump_engine_exception(config, scheduler_output, scheduler_stats)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/dump_input.py", line 70, in _dump_engine_exception
[2025-05-22T05:33:18Z]     logger.error(
[2025-05-22T05:33:18Z] Unable to print the message and arguments - possible formatting error.
[2025-05-22T05:33:18Z] Use the traceback above to help find the error.
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:78] Dumping scheduler output for model execution:
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=0,prompt_token_ids_len=281,mm_inputs=[{'pixel_values': tensor([[[[-0.6314, -0.6314, -0.6314,  ...,  0.5922,  0.5451,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.6314, -0.6314, -0.6314,  ...,  0.5922,  0.5451,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.6314, -0.6314, -0.6314,  ...,  0.5529,  0.5059,  0.4980],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373]],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] 

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          [[-0.8980, -0.8980, -0.8980,  ...,  0.5216,  0.4431,  0.4353],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.8980, -0.8980, -0.8980,  ...,  0.5216,  0.4431,  0.4353],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.8980, -0.8980, -0.8980,  ...,  0.4588,  0.3882,  0.3804],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529]],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] 

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          [[-0.9686, -0.9686, -0.9686,  ...,  0.4510,  0.3490,  0.3333],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.9686, -0.9686, -0.9686,  ...,  0.4510,  0.3490,  0.3333],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.9686, -0.9686, -0.9686,  ...,  0.3725,  0.2784,  0.2627],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510]]]]), 'num_crops': tensor([0])}],mm_hashes=['f60a83610bcc902af2e0be4780926de06a310afae0d11f9d2feee331134ff15a'],mm_positions=[PlaceholderRange(offset=4, length=260, is_embed=tensor([False, False,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True, False, False]))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[106], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=128, min_tokens=0, logprobs=5, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None),block_ids=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18],num_computed_tokens=0,lora_request=None)],scheduled_cached_reqs=[],num_scheduled_tokens={0: 281},total_num_scheduled_tokens=281,scheduled_spec_decode_tokens={},scheduled_encoder_inputs={0: [0]},num_common_prefix_blocks=18,finished_req_ids=[],free_encoder_input_ids=[],structured_output_request_ids={},grammar_bitmask=null,kv_connector_metadata=null)
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] EngineCore encountered a fatal error.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] Traceback (most recent call last):

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     engine_core.run_busy_loop()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self._process_engine_step()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     outputs = self.step_fn()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]               ^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     model_output = self.execute_model(scheduler_output)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 213, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     raise err

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return self.model_executor.execute_model(scheduler_output)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     output = self.collective_rpc("execute_model",

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     answer = run_method(self.driver_worker, method, args, kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     output = self.model_runner.execute_model(scheduler_output,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self._prepare_inputs(scheduler_output))

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self.input_batch.block_table.commit(num_reqs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] RuntimeError: CUDA error: an illegal memory access was encountered

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] For debugging consider passing CUDA_LAUNCH_BLOCKING=1

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] 
[2025-05-22T05:33:18Z] Process EngineCore_0:
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-22T05:33:18Z]     self.run()
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-22T05:33:18Z]     self._target(*self._args, **self._kwargs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 497, in run_engine_core
[2025-05-22T05:33:18Z]     raise e
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core
[2025-05-22T05:33:18Z]     engine_core.run_busy_loop()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop
[2025-05-22T05:33:18Z]     self._process_engine_step()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step
[2025-05-22T05:33:18Z]     outputs = self.step_fn()
[2025-05-22T05:33:18Z]               ^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step
[2025-05-22T05:33:18Z]     model_output = self.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 213, in execute_model
[2025-05-22T05:33:18Z]     raise err
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model
[2025-05-22T05:33:18Z]     return self.model_executor.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model
[2025-05-22T05:33:18Z]     output = self.collective_rpc("execute_model",
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-22T05:33:18Z]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model
[2025-05-22T05:33:18Z]     output = self.model_runner.execute_model(scheduler_output,
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model
[2025-05-22T05:33:18Z]     self._prepare_inputs(scheduler_output))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs
[2025-05-22T05:33:18Z]     self.input_batch.block_table.commit(num_reqs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit
[2025-05-22T05:33:18Z]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] 

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@DarkLight1337 DarkLight1337 added bug Something isn't working ci-failure Issue about an unexpected test failure in CI labels May 22, 2025
@DarkLight1337
Member Author

I'm unblocking the test on previous commits on main to see which commit caused the failure.

@DarkLight1337 DarkLight1337 changed the title [Bug][Failing Test]: Multi Modal 3 - models/multimodal/generation/test_common.py [Bug][Failing Test]: Multi-modal Models 3 - models/multimodal/generation/test_common.py May 22, 2025
@DarkLight1337 DarkLight1337 changed the title [Bug][Failing Test]: Multi-modal Models 3 - models/multimodal/generation/test_common.py [Bug][Failing Test]: Multi-Modal Models 3 - models/multimodal/generation/test_common.py May 22, 2025
@DarkLight1337
Member Author

The test appears to be flaky. The earliest failure I found was in #18459, but the test passed in a later commit (#18506) before failing again in #18331.

@russellb
Member

PR #18543 says the error here may still occur, but should no longer result in an illegal memory access. It would be nice to see what the updated failure looks like after that fix went in.

@russellb
Member

I ran this test successfully more than 200 times in a loop, so I haven't been able to reproduce it locally so far.
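For anyone else trying to reproduce, a loop like the following works for stress-running a flaky test. The helper is generic; the pytest node id in the usage line is the one from this issue, and the iteration count is arbitrary:

```shell
# run_n_times: repeat a command N times, stopping at the first failure.
# Generic stress loop for flaky tests; POSIX sh, no external dependencies.
run_n_times() {
    n="$1"; shift
    i=1
    while [ "$i" -le "$n" ]; do
        # Run the command; report which iteration broke and bail out.
        "$@" || { echo "failed on iteration $i"; return 1; }
        i=$((i + 1))
    done
    echo "passed all $n iterations"
}
```

Usage (the test id is taken from this issue):

`run_n_times 200 pytest "models/multimodal/generation/test_common.py::test_single_image_models[gemma3-test_case91]"`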

@DarkLight1337
Member Author

DarkLight1337 commented May 23, 2025

OK, then let's consider the test fixed. Thanks for the follow-up!

Projects
Status: Done

3 participants