[Bug][Failing Test]: Multi-Modal Models 3 - models/multimodal/generation/test_common.py #18528

Closed
1 task done
DarkLight1337 opened this issue May 22, 2025 · 7 comments
Labels
bug Something isn't working ci-failure Issue about an unexpected test failure in CI

Comments

@DarkLight1337
Member

Your current environment

N/A

🐛 Describe the bug

models/multimodal/generation/test_common.py::test_single_image_models[gemma3-test_case91] is failing on main. It is another illegal memory access error.

https://buildkite.com/vllm/ci/builds/20503/steps?jid=0196f626-d4d6-4af6-b10f-da8c3145ddfc
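For context, the frame the traceback points at (`block_table.py`, `commit`) copies the first `num_reqs` rows of a CPU-side staging table into the device-side block table. Below is a simplified, CPU-only sketch of that pattern; numpy arrays stand in for the CUDA/pinned tensors, and the shapes and class layout are illustrative, not vLLM's actual implementation:

```python
import numpy as np

# Illustrative stand-in for the block-table "commit" pattern in
# vllm/v1/worker/block_table.py. numpy arrays replace the GPU tensor and the
# pinned CPU tensor; MAX_REQS/MAX_BLOCKS are invented for the sketch.
MAX_REQS, MAX_BLOCKS = 8, 18

class BlockTable:
    def __init__(self) -> None:
        # device-side table (a CUDA torch.Tensor in vLLM)
        self.block_table = np.zeros((MAX_REQS, MAX_BLOCKS), dtype=np.int32)
        # host-side staging table (a pinned CPU torch.Tensor in vLLM)
        self.block_table_cpu = np.zeros((MAX_REQS, MAX_BLOCKS), dtype=np.int32)

    def commit(self, num_reqs: int) -> None:
        # In vLLM this is an async host-to-device copy_(); because CUDA reports
        # kernel errors lazily, an illegal memory access from an earlier kernel
        # can surface at this call even though the copy itself is fine.
        self.block_table[:num_reqs] = self.block_table_cpu[:num_reqs]

table = BlockTable()
table.block_table_cpu[0, :3] = [1, 2, 3]
table.commit(num_reqs=1)
print(table.block_table[0, :3])  # [1 2 3]
```

This is why the log advises `CUDA_LAUNCH_BLOCKING=1`: forcing synchronous launches makes the error surface at the kernel that actually faulted rather than at a later, innocent copy.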

Stack:

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:68] Dumping input data
--- Logging error ---
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model
[2025-05-22T05:33:18Z]     return self.model_executor.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model
[2025-05-22T05:33:18Z]     output = self.collective_rpc("execute_model",
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-22T05:33:18Z]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model
[2025-05-22T05:33:18Z]     output = self.model_runner.execute_model(scheduler_output,
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model
[2025-05-22T05:33:18Z]     self._prepare_inputs(scheduler_output))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs
[2025-05-22T05:33:18Z]     self.input_batch.block_table.commit(num_reqs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit
[2025-05-22T05:33:18Z]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] During handling of the above exception, another exception occurred:
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 1160, in emit
[2025-05-22T05:33:18Z]     msg = self.format(record)
[2025-05-22T05:33:18Z]           ^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 999, in format
[2025-05-22T05:33:18Z]     return fmt.format(record)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/formatter.py", line 13, in format
[2025-05-22T05:33:18Z]     msg = logging.Formatter.format(self, record)
[2025-05-22T05:33:18Z]           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 703, in format
[2025-05-22T05:33:18Z]     record.message = record.getMessage()
[2025-05-22T05:33:18Z]                      ^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/logging/__init__.py", line 392, in getMessage
[2025-05-22T05:33:18Z]     msg = msg % self.args
[2025-05-22T05:33:18Z]           ~~~~^~~~~~~~~~~
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 4488, in __str__
[2025-05-22T05:33:18Z]     f"compilation_config={self.compilation_config!r}")
[2025-05-22T05:33:18Z]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 3872, in __repr__
[2025-05-22T05:33:18Z]     for k, v in asdict(self).items():
[2025-05-22T05:33:18Z]                 ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1329, in asdict
[2025-05-22T05:33:18Z]     return _asdict_inner(obj, dict_factory)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1339, in _asdict_inner
[2025-05-22T05:33:18Z]     f.name: _asdict_inner(getattr(obj, f.name), dict)
[2025-05-22T05:33:18Z]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1382, in _asdict_inner
[2025-05-22T05:33:18Z]     return type(obj)((_asdict_inner(k, dict_factory),
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1383, in <genexpr>
[2025-05-22T05:33:18Z]     _asdict_inner(v, dict_factory))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/dataclasses.py", line 1386, in _asdict_inner
[2025-05-22T05:33:18Z]     return copy.deepcopy(obj)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 162, in deepcopy
[2025-05-22T05:33:18Z]     y = _reconstruct(x, memo, *rv)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 259, in _reconstruct
[2025-05-22T05:33:18Z]     state = deepcopy(state, memo)
[2025-05-22T05:33:18Z]             ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 136, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(x, memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 221, in _deepcopy_dict
[2025-05-22T05:33:18Z]     y[deepcopy(key, memo)] = deepcopy(value, memo)
[2025-05-22T05:33:18Z]                              ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 143, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/_tensor.py", line 172, in __deepcopy__
[2025-05-22T05:33:18Z]     new_storage = self._typed_storage()._deepcopy(memo)
[2025-05-22T05:33:18Z]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 1134, in _deepcopy
[2025-05-22T05:33:18Z]     return self._new_wrapped_storage(copy.deepcopy(self._untyped_storage, memo))
[2025-05-22T05:33:18Z]                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/copy.py", line 143, in deepcopy
[2025-05-22T05:33:18Z]     y = copier(memo)
[2025-05-22T05:33:18Z]         ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 239, in __deepcopy__
[2025-05-22T05:33:18Z]     new_storage = self.clone()
[2025-05-22T05:33:18Z]                   ^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/storage.py", line 253, in clone
[2025-05-22T05:33:18Z]     return type(self)(self.nbytes(), device=self.device).copy_(self)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] Call stack:
[2025-05-22T05:33:18Z]   File "<string>", line 1, in <module>
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
[2025-05-22T05:33:18Z]     exitcode = _main(fd, parent_sentinel)
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/spawn.py", line 135, in _main
[2025-05-22T05:33:18Z]     return self._bootstrap(parent_sentinel)
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-22T05:33:18Z]     self.run()
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-22T05:33:18Z]     self._target(*self._args, **self._kwargs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core
[2025-05-22T05:33:18Z]     engine_core.run_busy_loop()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop
[2025-05-22T05:33:18Z]     self._process_engine_step()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step
[2025-05-22T05:33:18Z]     outputs = self.step_fn()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step
[2025-05-22T05:33:18Z]     model_output = self.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 210, in execute_model
[2025-05-22T05:33:18Z]     dump_engine_exception(self.vllm_config, scheduler_output,
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/dump_input.py", line 62, in dump_engine_exception
[2025-05-22T05:33:18Z]     _dump_engine_exception(config, scheduler_output, scheduler_stats)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/logging_utils/dump_input.py", line 70, in _dump_engine_exception
[2025-05-22T05:33:18Z]     logger.error(
[2025-05-22T05:33:18Z] Unable to print the message and arguments - possible formatting error.
[2025-05-22T05:33:18Z] Use the traceback above to help find the error.
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:78] Dumping scheduler output for model execution:
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=0,prompt_token_ids_len=281,mm_inputs=[{'pixel_values': tensor([[[[-0.6314, -0.6314, -0.6314,  ...,  0.5922,  0.5451,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.6314, -0.6314, -0.6314,  ...,  0.5922,  0.5451,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.6314, -0.6314, -0.6314,  ...,  0.5529,  0.5059,  0.4980],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3176,  0.3176,  0.3020,  ...,  0.5294,  0.5373,  0.5373]],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] 

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          [[-0.8980, -0.8980, -0.8980,  ...,  0.5216,  0.4431,  0.4353],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.8980, -0.8980, -0.8980,  ...,  0.5216,  0.4431,  0.4353],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.8980, -0.8980, -0.8980,  ...,  0.4588,  0.3882,  0.3804],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.3647,  0.3647,  0.3490,  ...,  0.5451,  0.5529,  0.5529]],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79] 

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          [[-0.9686, -0.9686, -0.9686,  ...,  0.4510,  0.3490,  0.3333],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.9686, -0.9686, -0.9686,  ...,  0.4510,  0.3490,  0.3333],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [-0.9686, -0.9686, -0.9686,  ...,  0.3725,  0.2784,  0.2627],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           ...,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]           [ 0.2863,  0.2863,  0.2706,  ...,  0.4431,  0.4510,  0.4510]]]]), 'num_crops': tensor([0])}],mm_hashes=['f60a83610bcc902af2e0be4780926de06a310afae0d11f9d2feee331134ff15a'],mm_positions=[PlaceholderRange(offset=4, length=260, is_embed=tensor([False, False,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True,  True,  True,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [dump_input.py:79]          True,  True,  True,  True,  True,  True,  True,  True, False, False]))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[106], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=128, min_tokens=0, logprobs=5, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None),block_ids=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18],num_computed_tokens=0,lora_request=None)],scheduled_cached_reqs=[],num_scheduled_tokens={0: 281},total_num_scheduled_tokens=281,scheduled_spec_decode_tokens={},scheduled_encoder_inputs={0: [0]},num_common_prefix_blocks=18,finished_req_ids=[],free_encoder_input_ids=[],structured_output_request_ids={},grammar_bitmask=null,kv_connector_metadata=null)
[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] EngineCore encountered a fatal error.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] Traceback (most recent call last):

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     engine_core.run_busy_loop()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self._process_engine_step()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     outputs = self.step_fn()

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]               ^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     model_output = self.execute_model(scheduler_output)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 213, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     raise err

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return self.model_executor.execute_model(scheduler_output)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     output = self.collective_rpc("execute_model",

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     answer = run_method(self.driver_worker, method, args, kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     output = self.model_runner.execute_model(scheduler_output,

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     return func(*args, **kwargs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]            ^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self._prepare_inputs(scheduler_output))

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self.input_batch.block_table.commit(num_reqs)

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] RuntimeError: CUDA error: an illegal memory access was encountered

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] For debugging consider passing CUDA_LAUNCH_BLOCKING=1

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

[2025-05-22T05:33:18Z] ERROR 05-21 22:33:18 [core.py:495] 
[2025-05-22T05:33:18Z] Process EngineCore_0:
[2025-05-22T05:33:18Z] Traceback (most recent call last):
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-22T05:33:18Z]     self.run()
[2025-05-22T05:33:18Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-22T05:33:18Z]     self._target(*self._args, **self._kwargs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 497, in run_engine_core
[2025-05-22T05:33:18Z]     raise e
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 486, in run_engine_core
[2025-05-22T05:33:18Z]     engine_core.run_busy_loop()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 513, in run_busy_loop
[2025-05-22T05:33:18Z]     self._process_engine_step()
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 538, in _process_engine_step
[2025-05-22T05:33:18Z]     outputs = self.step_fn()
[2025-05-22T05:33:18Z]               ^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 226, in step
[2025-05-22T05:33:18Z]     model_output = self.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 213, in execute_model
[2025-05-22T05:33:18Z]     raise err
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 207, in execute_model
[2025-05-22T05:33:18Z]     return self.model_executor.execute_model(scheduler_output)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 86, in execute_model
[2025-05-22T05:33:18Z]     output = self.collective_rpc("execute_model",
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-22T05:33:18Z]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 276, in execute_model
[2025-05-22T05:33:18Z]     output = self.model_runner.execute_model(scheduler_output,
[2025-05-22T05:33:18Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-22T05:33:18Z]     return func(*args, **kwargs)
[2025-05-22T05:33:18Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1121, in execute_model
[2025-05-22T05:33:18Z]     self._prepare_inputs(scheduler_output))
[2025-05-22T05:33:18Z]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 528, in _prepare_inputs
[2025-05-22T05:33:18Z]     self.input_batch.block_table.commit(num_reqs)
[2025-05-22T05:33:18Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/block_table.py", line 81, in commit
[2025-05-22T05:33:18Z]     self.block_table[:num_reqs].copy_(self.block_table_cpu[:num_reqs],
[2025-05-22T05:33:18Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-22T05:33:18Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-22T05:33:18Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-22T05:33:18Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-22T05:33:18Z] 
[2025-05-22T05:33:18Z] 

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@DarkLight1337 DarkLight1337 added bug Something isn't working ci-failure Issue about an unexpected test failure in CI labels May 22, 2025
@DarkLight1337
Member Author

I'm unblocking the test on previous commits on main to see which commit caused the failure.

@DarkLight1337 DarkLight1337 changed the title [Bug][Failing Test]: Multi Modal 3 - models/multimodal/generation/test_common.py [Bug][Failing Test]: Multi-modal Models 3 - models/multimodal/generation/test_common.py May 22, 2025
@DarkLight1337 DarkLight1337 changed the title [Bug][Failing Test]: Multi-modal Models 3 - models/multimodal/generation/test_common.py [Bug][Failing Test]: Multi-Modal Models 3 - models/multimodal/generation/test_common.py May 22, 2025
@DarkLight1337
Member Author

The test appears to be flaky. The earliest failure I found was in #18459, but the test passed in a later commit (#18506) before failing again in #18331.

@russellb
Member

PR #18543 says the error here may still occur, but should no longer result in an illegal memory access. It would be nice to see what the updated failure looks like after that fix went in.

@russellb
Member

I ran this test successfully more than 200 times in a loop, so I haven't been able to reproduce it locally so far.
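For anyone else trying to reproduce, a loop like the following works for stress-running a flaky test. The helper is generic; the pytest node id in the usage line is the one from this issue, and the iteration count is arbitrary:

```shell
# run_n_times: repeat a command N times, stopping at the first failure.
# Generic stress loop for flaky tests; POSIX sh, no external dependencies.
run_n_times() {
    n="$1"; shift
    i=1
    while [ "$i" -le "$n" ]; do
        # Run the command; report which iteration broke and bail out.
        "$@" || { echo "failed on iteration $i"; return 1; }
        i=$((i + 1))
    done
    echo "passed all $n iterations"
}
```

Usage (the test id is taken from this issue):

`run_n_times 200 pytest "models/multimodal/generation/test_common.py::test_single_image_models[gemma3-test_case91]"`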

@DarkLight1337
Member Author

DarkLight1337 commented May 23, 2025

OK, then let's consider the test fixed. Thanks for the follow-up!

Projects
Status: Done

3 participants