ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory #1369

Closed · DoliteMatheo opened this issue Oct 16, 2023 · 22 comments
Labels: installation (Installation problems)

@DoliteMatheo

DoliteMatheo commented Oct 16, 2023

When I used vllm to serve my local model, the terminal displayed the following message:
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
The traceback pointed to the following code in site-packages/vllm/utils.py, and executing this single line by itself also triggers the same error:

"from vllm import cuda_utils"

I suppose it may be caused by a mismatch between vllm and my CUDA or PyTorch version. The CUDA version on my machine is 12.2 (the only version installed), and installing CUDA 11 alongside it is not so convenient. The PyTorch version is 2.1.0 and the vllm version is 0.2.0.
How can I solve the problem without installing CUDA 11?
Many thanks!

@alan1989

I encountered the same problem, did you solve it?

@DoliteMatheo
Author

> I encountered the same problem, did you solve it?

Unfortunately, I didn't find a better solution than installing CUDA 11, but I don't want to change the CUDA version since the machine is not mine, and re-installing CUDA often causes many more unexpected problems. If you find a solution, please tell me; much appreciated.

@bhupendrathore

bhupendrathore commented Oct 16, 2023

I tried with CUDA 12.2 and get the same error. When trying with CUDA 11.7, I get the following error:
RuntimeError: The NVIDIA driver on your system is too old (found version 11070). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org/ to install a PyTorch version that has been compiled with your version of the CUDA driver.

I updated my xformers with pip install xformers==v0.0.22 and it works fine.

I am using the CUDA 11.7 docker image.
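
For reference, a minimal sketch of that working combination (the image tag, and Python being installable via apt, are assumptions; untested as written):

docker run --gpus all -it --rm nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 bash
# inside the container:
apt-get update && apt-get install -y python3 python3-pip
pip3 install vllm==0.2.0
pip3 install xformers==0.0.22   # per this thread, pinning xformers to 0.0.22 avoided the libcudart.so.11.0 import error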

@alan1989

I have solved it. First find the libcudart.so.11.0 path on your disk, then add it to LD_LIBRARY_PATH:

locate libcudart.so.11.0
export LD_LIBRARY_PATH=<directory containing libcudart.so.11.0>:$LD_LIBRARY_PATH
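
A slightly expanded sketch of the same fix (the paths below are examples only; use whatever your search returns):

find / -name 'libcudart.so.11.0' 2>/dev/null     # alternative if locate is not available
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH   # example directory; use the one you found
# optionally append the export line to ~/.bashrc to make it persistent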

@vemonet

vemonet commented Oct 16, 2023

We are getting the same "error", but with CUDA 12.1

I am not sure whose fault it is, but throwing an error because "we cannot find a file we installed ourselves, so we crash everything" is a bit ridiculous.

I did not see any restriction against using CUDA 12 in the vLLM docs, so one would expect vLLM to work with the latest CUDA version.

Here is the code to reproduce (see below for which docker image to run it in):

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.llms import VLLM

llm = VLLM(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    max_new_tokens=8000,
    top_k=10,
    top_p=0.95,
    temperature=0.8,
)

conversation = ConversationChain(
    llm=llm, verbose=True, memory=ConversationBufferMemory()
)

print(conversation.predict(input="Hi mom!"))

Here is the full error we get:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/langchain/llms/vllm.py", line 79, in validate_environment
    from vllm import LLM as VLLModel
  File "/usr/local/lib/python3.10/dist-packages/vllm/__init__.py", line 3, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 6, in <module>
    from vllm.config import (CacheConfig, ModelConfig, ParallelConfig,
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 8, in <module>
    from vllm.utils import get_cpu_memory
  File "/usr/local/lib/python3.10/dist-packages/vllm/utils.py", line 8, in <module>
    from vllm import cuda_utils
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/share/code-llama/run_vllm.py", line 5, in <module>
    llm = VLLM(
  File "/usr/local/lib/python3.10/dist-packages/langchain/load/serializable.py", line 97, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/usr/local/lib/python3.10/dist-packages/langchain/llms/vllm.py", line 81, in validate_environment
    raise ImportError(
ImportError: Could not import vllm python package. Please install it with `pip install vllm`.

1. We use official nvidia images

Here is our setup: We are literally using the official CUDA image from nvidia: nvcr.io/nvidia/cuda:12.1.0-devel-ubuntu22.04

Starting from that image, the message "can't find libcudart" has no reason to exist.

2. We made sure to have the right CUDA version

We don't use the pytorch image recommended by the vllm docs, because with it we can't control exactly which CUDA version gets installed, and then we get errors like "muuuuh pytorch was compiled with a different CUDA version". Also, the pytorch GPU image is ~9G vs ~3G for the CUDA image.

nvidia-smi shows we are using CUDA 12.1:

| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |

pip list | grep cuda shows we have CUDA 12.1 packages installed everywhere:

nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105

3. It was working on CUDA 12.1 last week 🫠

We managed to make it work last week when running in an old pytorch docker image that was still on Python 3.8. But now it is broken when running on up-to-date images (what a mess), always complaining that it cannot find this non-existent libcudart file.

And last week, when it was working, the main GPU was still on CUDA 12.1 (according to nvidia-smi), but with some old 11.7 pip packages installed (as I said, it's an old pytorch image running Python 3.8, what a wonderful mess, but it seems like vLLM thrives in the mess, since that's the only time it worked!):

cuda-python                   12.1.0rc5+1.gc7fd38c.dirty
cupy-cuda12x                  12.0.0b3
dask-cuda                     23.2.0
nvidia-cuda-cupti-cu11        11.7.101
nvidia-cuda-nvrtc-cu11        11.7.99
nvidia-cuda-runtime-cu11      11.7.99
nvidia-dali-cuda110           1.23.0

Meaning vLLM can work on CUDA 12 drivers, and we don't need to reinstall CUDA 11; some CUDA 11 runtime libs should be enough: nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, or nvidia-cuda-cupti-cu11.
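
A hedged sketch of that runtime-libs-only idea (untested; the package names are the cu11 wheels listed above, and the site-packages layout is an assumption):

pip install nvidia-cuda-runtime-cu11 nvidia-cuda-nvrtc-cu11 nvidia-cuda-cupti-cu11
# the wheels unpack their .so files under site-packages/nvidia/<component>/lib
export LD_LIBRARY_PATH=$(python3 -c "import site; print(site.getsitepackages()[0])")/nvidia/cuda_runtime/lib:$LD_LIBRARY_PATH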

4. Trying the locate libcudart fix

Running locate libcudart.so.11.0 does not find anything inside the CUDA docker image, presumably because we have CUDA 12 installed.

5. Conclusion

vLLM is really sensitive to the CUDA version. It would be really helpful for vLLM to provide a bit of documentation around it, e.g.:

  • vLLM only works on CUDA 11 by default
  • Steps to make vLLM work on CUDA 12
  • Ideally, stop recommending the Pytorch docker image; it is really hard to find out exactly which CUDA version that image is using. It is much more reliable to start from the CUDA image and then add pytorch (pip install works just as well; see the sketch after this list).
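
A minimal sketch of that last suggestion, assuming the CUDA 12.1 image mentioned above and the cu121 PyTorch wheel index (versions are assumptions, not a tested recipe):

# run inside a container started from nvcr.io/nvidia/cuda:12.1.0-devel-ubuntu22.04
apt-get update && apt-get install -y python3 python3-pip
pip3 install torch --index-url https://download.pytorch.org/whl/cu121
pip3 install vllm   # note: at the time of this thread, vllm itself still pinned CUDA 11 dependencies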

@vemonet

vemonet commented Oct 16, 2023

A potential approach to fix it: it could be due to the torch version, which is pinned to 2.0.1: https://github.com/vllm-project/vllm/blob/main/pyproject.toml#L6

Because torch 2.0.1 does not have a build for CUDA 12 (only CUDA 11), maybe installing a newer torch version would work.

I'll try to rebuild vllm without the torch version constraint to see if that helps.
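
A rough sketch of what that rebuild could look like (untested; the repository URL, the pin edit, and the cu121 torch wheel are assumptions):

git clone https://github.com/vllm-project/vllm.git
cd vllm
# relax the torch pin by hand in pyproject.toml and requirements.txt,
# e.g. change "torch == 2.0.1" to "torch >= 2.0.0"
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install -e . --no-build-isolation   # build against the already-installed torch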

@copasseron

> A potential approach to fix it: it could be due to the torch version, which is pinned to 2.0.1: https://github.com/vllm-project/vllm/blob/main/pyproject.toml#L6
>
> Because torch 2.0.1 does not have a build for CUDA 12 (only CUDA 11), maybe installing a newer torch version would work.
>
> I'll try to rebuild vllm without the torch version constraint to see if that helps.

Let me know if you've got any news here.

I've had the same problem since this morning, also with an nvidia image: nvcr.io/nvidia/tritonserver:23.09-py3.

@vemonet

vemonet commented Oct 16, 2023

It's weird because the latest vllm release actually uses torch >= 2.0.0, so I should be able to use torch 2.1.0 with vllm 0.2.0.

But installing vllm always installs torch 2.0.1, and it is due to:

xformers 0.0.22 requires torch==2.0.1, but you have torch 2.1.0 which is incompatible.

If we try to pip install --upgrade xformers:

vllm 0.2.0 requires xformers==0.0.22, but you have xformers 0.0.22.post4 which is incompatible.

But the requirements.txt of release v0.2.0 indicates xformers >= 0.0.22

And whatever combination I try, I always get errors, most of the time this one:

INFO 10-16 16:23:57 llm_engine.py:72] Initializing an LLM engine with config: model='mistralai/Mistral-7B-Instruct-v0.1', tokenizer='mistralai/Mistral-7B-Instruct-v0.1', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, seed=0)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/workspace/share/code-llama/run_vllm.py", line 5, in <module>
    llm = VLLM(
  File "/usr/local/lib/python3.10/dist-packages/langchain/load/serializable.py", line 97, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/usr/local/lib/python3.10/dist-packages/langchain/llms/vllm.py", line 86, in validate_environment
    values["client"] = VLLModel(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 231, in from_engine_args
    engine = cls(*engine_configs,
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self._init_workers(distributed_init_method)
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 128, in _init_workers
    from vllm.worker.worker import Worker  # pylint: disable=import-outside-toplevel
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 10, in <module>
    from vllm.model_executor import get_model, InputMetadata, set_random_seed
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 10, in <module>
    from vllm.model_executor.models import *  # pylint: disable=wildcard-import
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/__init__.py", line 1, in <module>
    from vllm.model_executor.models.aquila import AquilaForCausalLM
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/aquila.py", line 35, in <module>
    from vllm.model_executor.layers.attention import PagedAttentionWithRoPE
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/attention.py", line 10, in <module>
    from vllm import attention_ops
ImportError: /usr/local/lib/python3.10/dist-packages/vllm/attention_ops.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZNK3c1010TensorImpl27throw_data_ptr_access_errorEv

@WoosukKwon added the installation (Installation problems) label Oct 16, 2023
@WoosukKwon
Collaborator

If I understand the problem correctly, the issue was that v0.2.0 didn't pin the pytorch and xformers versions. In v0.2.1, which was released today, we pinned their versions, so the error should not happen as long as you use CUDA 11.8.

We will support CUDA 12 once xformers releases a new stable version with CUDA 12 support. (While xformers==0.0.22.post4 seems to include CUDA 12 binaries, I feel it's a bit unstable at the moment).
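
In other words, in a CUDA 11.8 environment the upgrade path should be as simple as (a sketch, untested):

pip install --upgrade vllm==0.2.1   # v0.2.1 pins compatible torch and xformers versions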

@gesanqiu
Contributor

gesanqiu commented Oct 17, 2023

> If I understand the problem correctly, the issue was that v0.2.0 didn't pin the pytorch and xformers versions. In v0.2.1, which was released today, we pinned their versions, so the error should not happen as long as you use CUDA 11.8.
>
> We will support CUDA 12 once xformers releases a new stable version with CUDA 12 support. (While xformers==0.0.22.post4 seems to include CUDA 12 binaries, I feel it's a bit unstable at the moment.)

Right now PyTorch 2.0.1 is bound to CUDA 11.7, so compiling vLLM with CUDA 11.8 will fail, the same issue as #1283.
I used nvidia/cuda:11.7.1-cudnn8-devel-ubuntu20.04 and installed vLLM with pip install -e . successfully.

@DoliteMatheo
Author

> If I understand the problem correctly, the issue was that v0.2.0 didn't pin the pytorch and xformers versions. In v0.2.1, which was released today, we pinned their versions, so the error should not happen as long as you use CUDA 11.8.
>
> We will support CUDA 12 once xformers releases a new stable version with CUDA 12 support. (While xformers==0.0.22.post4 seems to include CUDA 12 binaries, I feel it's a bit unstable at the moment.)

Finally I installed CUDA 11.7 manually and the problem was fixed immediately. It seems that vllm cannot work if only CUDA 12 is installed on the machine.
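
For anyone in the same situation, a rough sketch of installing the CUDA 11.7 toolkit alongside CUDA 12 without touching the driver (the runfile name/URL and flags are assumptions; check NVIDIA's download archive for the exact installer):

wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda_11.7.1_515.65.01_linux.run
sudo sh cuda_11.7.1_515.65.01_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-11.7   # toolkit only, no driver change
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH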

@s-natsubori

I encountered the same problem,

  • docker base image : nvcr.io/nvidia/pytorch:22.12-py3
  • vllm==0.2.0

and I solved this error by adding the following line to my requirements.txt:
xformers==0.0.22

xformers==0.0.22 requires nvidia-cuda-runtime-cu11==11.7.99 and so on. Unfortunately it uninstalls the originally installed PyTorch 2.1.0, but my code is working!!
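
For reference, the equivalent direct pip command (a sketch; the downgrade behaviour is as described in this comment):

pip install xformers==0.0.22   # pulls nvidia-cuda-runtime-cu11==11.7.99 and downgrades torch to 2.0.1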

@bitsnaps

bitsnaps commented Nov 8, 2023

I'm getting the same error on Colab with TheBloke-Dolphin-2.1-mistral-7B-GPTQ, which was working before...

@sanjana-sudo

> I'm getting the same error on Colab with TheBloke-Dolphin-2.1-mistral-7B-GPTQ, which was working before...

@bitsnaps Same problem here. Did you find any solution?

@bitsnaps

bitsnaps commented Nov 9, 2023

> > I'm getting the same error on Colab with TheBloke-Dolphin-2.1-mistral-7B-GPTQ, which was working before...
>
> @bitsnaps Same problem here. Did you find any solution?

Not yet. I believe this is something to do with a mistral/transformers/huggingface issue (not vllm); I'm not even able to run mistral-7b on Colab, which was working fine last week.

@sanjana-sudo

sanjana-sudo commented Nov 16, 2023

> > > I'm getting the same error on Colab with TheBloke-Dolphin-2.1-mistral-7B-GPTQ, which was working before...
> >
> > @bitsnaps Same problem here. Did you find any solution?
>
> Not yet. I believe this is something to do with a mistral/transformers/huggingface issue (not vllm); I'm not even able to run mistral-7b on Colab, which was working fine last week.

@bitsnaps I tried to run Mistral_7B_Instruct_v0_1_GGUF now and it's working. I just downgraded gradio to gradio==3.32.0 and did not change anything related to flash-attn.

@s-natsubori

s-natsubori commented Nov 16, 2023

Currently, AutoAWQ delivers two builds (CUDA 11 and CUDA 12).
I recommend trying whichever combination suits your environment, and check your other packages too.

pip install autoawq   # torch 2.1.0 + CUDA 12.1.1

From GitHub (torch 2.0 + CUDA 11.8):
pip install https://github.com/casper-hansen/AutoAWQ/releases/download/v0.1.6/autoawq-0.1.6+cu118-cp310-cp310-linux_x86_64.whl

@D-Octopus

D-Octopus commented Nov 28, 2023

Have a go at updating vllm to v0.2.2. Looks like they've sorted out this issue in that version.
Among the v0.2.2 major changes: Upgrade to CUDA 12 (#1527).
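
That is, something along these lines (a sketch; a CUDA 12 environment with a matching torch is assumed):

pip install --upgrade vllm==0.2.2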

@LI-ZHAODONG

I'm using the llmware library and was facing the same error. I upgraded torch (2.0.1 -> 2.1.0) and that solved the problem.
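
A sketch of that upgrade (the cu121 wheel index is an assumption; a plain pip install torch==2.1.0 may also work depending on your setup):

pip install --upgrade torch==2.1.0 --index-url https://download.pytorch.org/whl/cu121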

@ibnzahoor98

> pip install xformers==v0.0.22

Thank you! Worked like a charm!

@hmellor closed this as completed Apr 4, 2024
@Provemj

Provemj commented Oct 17, 2024

It's probably just a CUDA or torch version problem; try downgrading.

@HarikrishnanK9

> I have solved it. First find the libcudart.so.11.0 path on your disk, then add it to LD_LIBRARY_PATH:
>
> locate libcudart.so.11.0
> export LD_LIBRARY_PATH=<directory containing libcudart.so.11.0>:$LD_LIBRARY_PATH

Thank you @alan1989, it worked for me: export LD_LIBRARY_PATH=/home/hari/anaconda3/envs/prod_env/lib
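
For other conda users, a hedged sketch of locating the env's lib directory before exporting it (the env path below is just the example from this comment):

find ~/anaconda3/envs -name 'libcudart.so.11*' 2>/dev/null
export LD_LIBRARY_PATH=/home/hari/anaconda3/envs/prod_env/lib:$LD_LIBRARY_PATH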
