[Bug][Failing Test]: V1 - v1/entrypoints/llm/test_struct_output_generate.py #18525

DarkLight1337 · 2025-05-22T04:41:07Z

Your current environment

N/A

🐛 Describe the bug

v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output_with_reasoning_matrices fails on main

e.g. https://buildkite.com/vllm/ci/builds/20477/steps?jid=0196f3e2-128a-409e-bafa-5d676afc9557

Stack:

[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] EngineCore failed to start.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] Traceback (most recent call last):
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 484, in run_engine_core
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     engine_core = EngineCoreProc(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 383, in __init__
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     super().__init__(vllm_config, executor_class, log_stats,
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 78, in __init__
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     self._initialize_kv_caches(vllm_config)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 164, in _initialize_kv_caches
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     self.model_executor.initialize_from_config(kv_cache_configs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 65, in initialize_from_config
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     self.collective_rpc("compile_or_warm_up_model")
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 254, in compile_or_warm_up_model
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     self.model_runner._dummy_sampler_run(
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1765, in _dummy_sampler_run
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     dummy_spec_decode_metadata = SpecDecodeMetadata.make_dummy(
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/metadata.py", line 38, in make_dummy
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]     draft_token_ids_tensor = torch.tensor(flattened_draft_token_ids,
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] 
[2025-05-21T18:59:44Z] Process EngineCore_0:
[2025-05-21T18:59:44Z] Traceback (most recent call last):
[2025-05-21T18:59:44Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-21T18:59:44Z]     self.run()
[2025-05-21T18:59:44Z]   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-21T18:59:44Z]     self._target(*self._args, **self._kwargs)
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 497, in run_engine_core
[2025-05-21T18:59:44Z]     raise e
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 484, in run_engine_core
[2025-05-21T18:59:44Z]     engine_core = EngineCoreProc(*args, **kwargs)
[2025-05-21T18:59:44Z]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 383, in __init__
[2025-05-21T18:59:44Z]     super().__init__(vllm_config, executor_class, log_stats,
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 78, in __init__
[2025-05-21T18:59:44Z]     self._initialize_kv_caches(vllm_config)
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 164, in _initialize_kv_caches
[2025-05-21T18:59:44Z]     self.model_executor.initialize_from_config(kv_cache_configs)
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 65, in initialize_from_config
[2025-05-21T18:59:44Z]     self.collective_rpc("compile_or_warm_up_model")
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-21T18:59:44Z]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-21T18:59:44Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-21T18:59:44Z]     return func(*args, **kwargs)
[2025-05-21T18:59:44Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 254, in compile_or_warm_up_model
[2025-05-21T18:59:44Z]     self.model_runner._dummy_sampler_run(
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-21T18:59:44Z]     return func(*args, **kwargs)
[2025-05-21T18:59:44Z]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1765, in _dummy_sampler_run
[2025-05-21T18:59:44Z]     dummy_spec_decode_metadata = SpecDecodeMetadata.make_dummy(
[2025-05-21T18:59:44Z]                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/metadata.py", line 38, in make_dummy
[2025-05-21T18:59:44Z]     draft_token_ids_tensor = torch.tensor(flattened_draft_token_ids,
[2025-05-21T18:59:44Z]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-21T18:59:44Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T18:59:44Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T18:59:44Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
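
Because CUDA errors are reported asynchronously, the `torch.tensor(...)` frame in `make_dummy` may not be the call that actually faulted. Following the hint in the log, a minimal sketch for rerunning only the failing test with `CUDA_LAUNCH_BLOCKING=1` could look like the following. The helper names (`build_debug_run`, `rerun`) are hypothetical, and running from a `tests` directory of a vLLM checkout is an assumption, not something stated in the report.

```python
import os
import subprocess
import sys

# Test node ID as given in the bug description.
TEST_ID = (
    "v1/entrypoints/llm/test_struct_output_generate.py"
    "::test_structured_output_with_reasoning_matrices"
)


def build_debug_run(test_id=TEST_ID):
    """Return (command, environment) for a synchronous-CUDA rerun of one test."""
    env = dict(os.environ)
    # Make kernel launches synchronous so the error is raised at the
    # call that actually faulted rather than at a later API call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    # -x: stop at the first failure; -s: do not capture the engine logs.
    cmd = [sys.executable, "-m", "pytest", "-x", "-s", test_id]
    return cmd, env


def rerun(tests_dir="tests"):
    """Run the test; tests_dir is an assumed path inside a vLLM checkout."""
    cmd, env = build_debug_run()
    return subprocess.run(cmd, env=env, cwd=tests_dir, check=False).returncode
```

Calling `rerun()` from the repository root would then launch pytest on that single test; the resulting trace should point closer to the kernel that performed the illegal access.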

DarkLight1337 · 2025-05-22T04:41:26Z

This is another illegal memory access failure, so it's probably related to #15777.

markmc · 2025-05-22T09:33:32Z

Analytics link: https://buildkite.com/organizations/vllm/analytics/suites/ci-1/tests?branch=main&period=1days&query=test_struct_output_generate&commit=Search

DarkLight1337 added the bug (Something isn't working) and ci-failure (Issue about an unexpected test failure in CI) labels May 22, 2025

github-project-automation bot added this to CI Failures May 22, 2025

DarkLight1337 changed the title ~~[Bug][Failing Test]: v1/entrypoints/llm/test_struct_output_generate.py~~ [Bug][Failing Test]: V1 tests - v1/entrypoints/llm/test_struct_output_generate.py May 22, 2025

DarkLight1337 changed the title ~~[Bug][Failing Test]: V1 tests - v1/entrypoints/llm/test_struct_output_generate.py~~ [Bug][Failing Test]: V1 - v1/entrypoints/llm/test_struct_output_generate.py May 22, 2025

abmfy mentioned this issue May 22, 2025

[Bugfix] Use random hidden states in dummy sampler run #18543

Merged

vllm-bot closed this as completed in #18543 May 22, 2025

github-project-automation bot moved this to Done in CI Failures May 22, 2025
