exponentially increasing libnvidia-allocator mount points on host #697

BryanQuigley · 2024-09-17T21:32:21Z

This appears to be the same root issue as #660 or #663 but wanted to detail the impact we are seeing from it.
Is there any timeline for the backported fix to be released?

Our symptoms

When mounting with a specific container image and a shared mount, the number of mounts created on the host increases exponentially.

It creates 2^x - 1 mount points (1, 3, ..., 255, ..., 32767), where x is the number of containers that have run and triggered the issue.
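To make the growth concrete, here is a small sketch of the arithmetic. The closed form 2^x - 1 comes from the report above; the recurrence (each triggering run doubles the existing count and adds one) is only an assumption about how that series could arise, not a confirmed description of the toolkit's behaviour. The function names are hypothetical.

```python
def mounts_after(runs: int) -> int:
    """Mount count reported in this issue after `runs` triggering containers."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return 2 ** runs - 1


def mounts_by_recurrence(runs: int) -> int:
    """Same series, built step by step.

    Assumption: every triggering run doubles the existing mounts and adds one.
    This reproduces 2^x - 1 but is not confirmed by the maintainers.
    """
    count = 0
    for _ in range(runs):
        count = 2 * count + 1
    return count


for x in (1, 2, 8, 15):
    print(x, mounts_after(x))
```

With these inputs the loop prints 1, 3, 255 and 32767 mount points, matching the values quoted above.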

When it gets to the higher numbers it starts causing significant system issues due to the very high number of mounts.

The mount point is basically the same as from the other issue - copying here for searchability (although we are on nvidia 560)

```
/dev/md0 on /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.550.54.15 type ext4 (rw,relatime,nodelalloc,errors=remount-ro)
```
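One way to check whether a host is affected is to count how many mount entries refer to the library. The sketch below is not part of the original report: it assumes the text is in the `/proc/self/mounts` layout (device, mount point, filesystem type, options, ...), and the function name and sample data are hypothetical.

```python
from collections import Counter


def count_mounts(mounts_text: str, needle: str = "libnvidia-allocator") -> Counter:
    """Count mount entries whose mount point contains `needle`.

    Expects text in /proc/self/mounts format, where the second
    whitespace-separated field is the mount point.
    """
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        if needle in mount_point:
            counts[mount_point] += 1
    return counts


# Illustrative sample only: three duplicate mounts plus one unrelated entry.
SAMPLE = "\n".join(
    [
        "/dev/md0 /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.550.54.15 "
        "ext4 rw,relatime,nodelalloc,errors=remount-ro 0 0"
    ] * 3
    + ["proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0"]
)

sample_counts = count_mounts(SAMPLE)
print("total:", sum(sample_counts.values()))
```

On a Linux host, passing the contents of `/proc/self/mounts` to `count_mounts` instead of the sample gives the live count. A total that roughly doubles after each container run would be consistent with the symptom described here.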


BryanQuigley · 2024-09-17T22:32:38Z

@elezar was this what you were seeing too?

elezar · 2024-09-18T21:07:11Z

@BryanQuigley yes, this is the known issue. It is triggered when NVIDIA_DRIVER_CAPABILITIES includes graphics or display, and it should only occur if a container is started with bidirectional mount propagation.
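The two conditions above can be expressed as a rough pre-flight check. This is a sketch under stated assumptions, not toolkit code: it assumes NVIDIA_DRIVER_CAPABILITIES is a comma-separated list, that the value `all` implies graphics and display, and that "bidirectional" corresponds to the `shared` or `rshared` propagation modes. The helper names are hypothetical.

```python
# Capabilities named in the comment above as triggering the issue.
TRIGGERING_CAPABILITIES = {"graphics", "display"}
# Assumption: these propagation modes are the "bidirectional" ones.
BIDIRECTIONAL_PROPAGATION = {"shared", "rshared"}


def parse_capabilities(value: str) -> set:
    """Split a comma-separated NVIDIA_DRIVER_CAPABILITIES value into a set."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def may_trigger(capabilities: str, propagation: str) -> bool:
    """Return True when both conditions described in the thread hold."""
    caps = parse_capabilities(capabilities)
    # Assumption: "all" enables every capability, including graphics/display.
    caps_match = "all" in caps or bool(caps & TRIGGERING_CAPABILITIES)
    propagation_match = propagation.strip().lower() in BIDIRECTIONAL_PROPAGATION
    return caps_match and propagation_match


print(may_trigger("compute,utility,graphics", "rshared"))
print(may_trigger("compute,utility", "rshared"))
```

The first call prints `True` and the second `False`. A container with only compute and utility capabilities, or one using private propagation, would not match the trigger described here.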

We have a fix in place and this will be included in the next patch release. For the time being, please downgrade to a v1.15.x version if possible.

BryanQuigley · 2024-09-27T01:21:36Z

Thanks! I see it was released in 1.16.2.

elezar self-assigned this Sep 18, 2024

BryanQuigley closed this as completed Sep 27, 2024

elezar mentioned this issue Mar 10, 2025

Add rprivate to CDI mount options #980

Merged
