What happened:
If we deploy the blob-csi-driver with the blobfuse proxy enabled, we see the error EMFILE: too many open files after our deployed Node.js application opens a few thousand files:
helm install blob-csi-driver blob-csi-driver/blob-csi-driver --namespace kube-system --version v1.15.0 --set node.enableBlobfuseProxy=true
If we deploy the blob-csi-driver without enabling the blobfuse proxy, we do not have the problem:
helm install blob-csi-driver blob-csi-driver/blob-csi-driver --namespace kube-system --version v1.15.0
What you expected to happen:
We expect to be able to open more files even when the blobfuse proxy is enabled.
How to reproduce it:
Deploy the driver with the blobfuse proxy enabled:
helm install blob-csi-driver blob-csi-driver/blob-csi-driver --namespace kube-system --version v1.15.0 --set node.enableBlobfuseProxy=true
Then deploy an application that opens a few thousand files inside a mounted blob storage.
Environment:
CSI Driver version: v1.15.0
Kubernetes version (use kubectl version): v1.22.6 (AKS)
OS (e.g. from /etc/os-release): Ubuntu 18.04.6 LTS
Kernel (e.g. uname -a): Linux 5.4.0-1083-azure
Anything else we need to know?:
We already tried to increase the maxOpenFileNum setting of the Helm chart, but this does not help:
helm install blob-csi-driver blob-csi-driver/blob-csi-driver --namespace kube-system --version v1.15.0 --set node.enableBlobfuseProxy=true --set node.blobfuseProxy.setMaxOpenFileNum=true --set node.blobfuseProxy.maxOpenFileNum=999000000
At the time of the error, lsof | wc -l counts only 69136 entries, which is far below maxOpenFileNum.
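One way to narrow this down would be to inspect the limit the running proxy process actually has. A minimal diagnostic sketch, assuming the proxy binary on the node is named blobfuse-proxy (the process name and the quoted default values are assumptions, not taken from the driver):

```
# Find the blobfuse-proxy process on the node and read its effective
# per-process file-descriptor limit from /proc (assumed process name).
PID=$(pidof blobfuse-proxy)
grep "Max open files" /proc/$PID/limits
# A unit without LimitNOFILE= inherits systemd's default (often
# 1024 soft / 4096 hard), no matter how high fs.file-max is set.
```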
Possible reason:
As far as I could see, the blobfuse-proxy is installed as a systemd service on the nodes. I am not sure if everybody is aware that processes started by systemd are limited regarding open files by the systemd daemon itself, see https://manpages.ubuntu.com/manpages/bionic/man5/systemd.exec.5.html#process%20properties
The current blobfuse-proxy service is described like this:
[Unit]
Description=Blobfuse proxy service
[Service]
ExecStart=/usr/bin/blobfuse-proxy --v=5 --blobfuse-proxy-endpoint=unix://var/lib/kubelet/plugins/blob.csi.azure.com/blobfuse-proxy.sock
[Install]
WantedBy=multi-user.target
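For comparison, a minimal sketch of what a fix could look like, assuming the unit is installed as blobfuse-proxy.service (the drop-in path and the limit value are illustrative, not part of the current install):

```
# /etc/systemd/system/blobfuse-proxy.service.d/override.conf
# Hypothetical drop-in raising the per-process file-descriptor limit.
[Service]
LimitNOFILE=1048576

# Afterwards, apply it with:
#   systemctl daemon-reload && systemctl restart blobfuse-proxy
```

A drop-in like this would avoid editing the unit file that the driver itself installs.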
That means the systemd limits are not explicitly set and presumably default to a low value. Setting sysctl -w fs.file-max=9000000, which the driver init does, will not get around this systemd limitation, because fs.file-max is a kernel-wide ceiling while systemd's LimitNOFILE caps the per-process RLIMIT_NOFILE. The containerd service on AKS nodes, for example, is deployed like this:
[Unit]
Description=containerd daemon
After=network.target
[Service]
ExecStartPre=/sbin/modprobe overlay
ExecStart=/usr/bin/containerd
Delegate=yes
KillMode=process
Restart=always
OOMScoreAdjust=-999
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
TasksMax=infinity
[Install]
WantedBy=multi-user.target
As one can see, LimitNOFILE and other limits are explicitly set to infinity, which allows the workloads running in that context to open more files than the systemd defaults would permit.
I could not test it, but maybe the problem can be found somewhere in these systemd limits. Let me know if I can provide further useful information.
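For reference, the limits systemd actually applies to the two units could be compared directly (standard systemctl usage; the unit names are taken from the files above):

```
# Show the file-descriptor limit systemd applies to each unit.
systemctl show blobfuse-proxy.service -p LimitNOFILE
systemctl show containerd.service -p LimitNOFILE
# containerd should report infinity; blobfuse-proxy is expected to
# report the distribution default if the unit sets nothing.
```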
@andyzhangx thank you very much for this speedy PR. Could you already check whether it fixes the issue with too many open files? I can test it as soon as a release is available. Is there a time schedule for the next release? And is there a reason why the 1.15 release is not yet shown under GitHub releases?