[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] EngineCore failed to start.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] Traceback (most recent call last):
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 484, in run_engine_core
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] engine_core = EngineCoreProc(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 383, in __init__
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] super().__init__(vllm_config, executor_class, log_stats,
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 78, in __init__
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] self._initialize_kv_caches(vllm_config)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 164, in _initialize_kv_caches
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] self.model_executor.initialize_from_config(kv_cache_configs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 65, in initialize_from_config
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] self.collective_rpc("compile_or_warm_up_model")
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 254, in compile_or_warm_up_model
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] self.model_runner._dummy_sampler_run(
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1765, in _dummy_sampler_run
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] dummy_spec_decode_metadata = SpecDecodeMetadata.make_dummy(
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/metadata.py", line 38, in make_dummy
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] draft_token_ids_tensor = torch.tensor(flattened_draft_token_ids,
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[2025-05-21T18:59:44Z] ERROR 05-21 11:59:44 [core.py:493]
[2025-05-21T18:59:44Z] Process EngineCore_0:
[2025-05-21T18:59:44Z] Traceback (most recent call last):
[2025-05-21T18:59:44Z] File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
[2025-05-21T18:59:44Z] self.run()
[2025-05-21T18:59:44Z] File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
[2025-05-21T18:59:44Z] self._target(*self._args, **self._kwargs)
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 497, in run_engine_core
[2025-05-21T18:59:44Z] raise e
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 484, in run_engine_core
[2025-05-21T18:59:44Z] engine_core = EngineCoreProc(*args, **kwargs)
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 383, in __init__
[2025-05-21T18:59:44Z] super().__init__(vllm_config, executor_class, log_stats,
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 78, in __init__
[2025-05-21T18:59:44Z] self._initialize_kv_caches(vllm_config)
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 164, in _initialize_kv_caches
[2025-05-21T18:59:44Z] self.model_executor.initialize_from_config(kv_cache_configs)
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 65, in initialize_from_config
[2025-05-21T18:59:44Z] self.collective_rpc("compile_or_warm_up_model")
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-21T18:59:44Z] answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-21T18:59:44Z] return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 254, in compile_or_warm_up_model
[2025-05-21T18:59:44Z] self.model_runner._dummy_sampler_run(
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[2025-05-21T18:59:44Z] return func(*args, **kwargs)
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1765, in _dummy_sampler_run
[2025-05-21T18:59:44Z] dummy_spec_decode_metadata = SpecDecodeMetadata.make_dummy(
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/metadata.py", line 38, in make_dummy
[2025-05-21T18:59:44Z] draft_token_ids_tensor = torch.tensor(flattened_draft_token_ids,
[2025-05-21T18:59:44Z] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T18:59:44Z] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-21T18:59:44Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T18:59:44Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T18:59:44Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
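The error text notes that CUDA kernel errors may be reported asynchronously, so the `torch.tensor(...)` frame in `make_dummy` is not necessarily where the illegal access happened. A minimal sketch of re-running the failing test from this report with `CUDA_LAUNCH_BLOCKING=1`, as the message suggests. The `tests/` path prefix, the use of `pytest`, and the helper names `build_debug_run` and `run_debug` are assumptions for illustration, not something the report states:

```python
import os
import subprocess
import sys

# Test node id taken from this report; the "tests/" prefix is an assumption
# about where the file lives relative to the repository root.
TEST_ID = (
    "tests/v1/entrypoints/llm/test_struct_output_generate.py"
    "::test_structured_output_with_reasoning_matrices"
)


def build_debug_run(test_id: str = TEST_ID) -> tuple[list[str], dict[str, str]]:
    """Build the pytest command and environment for a synchronous CUDA run."""
    env = dict(os.environ)
    # Force synchronous kernel launches so the Python traceback points at the
    # call that actually triggered the illegal memory access.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    # -x stops at the first failure; -s leaves the engine logs uncaptured.
    cmd = [sys.executable, "-m", "pytest", "-x", "-s", test_id]
    return cmd, env


def run_debug() -> int:
    """Run the test; assumes the current directory is the repository root."""
    cmd, env = build_debug_run()
    return subprocess.run(cmd, env=env).returncode
```

Calling `run_debug()` on a GPU machine should reproduce the failure with a traceback that is more likely to point at the faulting kernel, at the cost of a slower run.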
Before submitting a new issue...
Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Your current environment
N/A
🐛 Describe the bug
v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output_with_reasoning_matrices
fails on main, e.g. https://buildkite.com/vllm/ci/builds/20477/steps?jid=0196f3e2-128a-409e-bafa-5d676afc9557
Stack: see the traceback at the top of this report.