Description
We are experiencing performance issues when making API calls from many threads at the same time. The issue only occurs on certain clusters, and we suspect it happens when a high number of API calls occur simultaneously.
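A minimal sketch of the kind of workload where we see this (assumptions: kubeconfig auth is already set up, and list_node() stands in for our actual API mix; our application goes through hikaru, but the contention is in the underlying kubernetes client):

from concurrent.futures import ThreadPoolExecutor
from kubernetes import client, config

config.load_kube_config()  # assumption: kubeconfig auth; in-cluster config in production
v1 = client.CoreV1Api()

def list_nodes(_):
    # Each call deserializes a large response into generated model objects,
    # which is where the logging lock shows up in the traces below.
    return v1.list_node()

# Roughly matches the ~20 concurrent threads seen in the stack dumps.
with ThreadPoolExecutor(max_workers=20) as pool:
    list(pool.map(list_nodes, range(200)))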
When looking at stack traces, we see contention around a lock in the logging module.
Thread XYZ (most recent call first):
File \"/usr/local/lib/python3.9/logging/__init__.py\", line 225 in _acquireLock
File \"/usr/local/lib/python3.9/logging/__init__.py\", line 1298 in getLogger
File \"/usr/local/lib/python3.9/logging/__init__.py\", line 2042 in getLogger
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/configuration.py\", line 112 in __init__
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/models/v1_container_image.py\", line 48 in __init__
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 641 in __deserialize_model
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in <listcomp>
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in <listcomp>
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 264 in deserialize
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 192 in __call_api
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 348 in call_api
File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py\", line 16517 in list_node_with_http_info
File \"/usr/local/lib/python3.9/site-packages/hikaru/model/rel_1_16/v1/v1.py\", line 46011 in listNode
... application specific stuff
The above is from a single thread. Each stack trace contains about 20 threads like that, all stuck on different API calls in the client library, all waiting in _acquireLock. (There are several paths to _acquireLock - it's not just getLogger, but also functions like _clear_cache(), setLevel(), etc.)
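Our reading of the trace (a sketch, not a claim about the exact client internals): every generated model constructed during deserialization without an explicit configuration builds a fresh Configuration, and Configuration.__init__ calls logging.getLogger(), which serializes all threads on logging's module-level lock. For example:

from kubernetes.client.models import V1ContainerImage

# Per the trace above, api_client.__deserialize_model constructs models like
# this one without passing a configuration, which lands in configuration.py,
# line 112 (Configuration.__init__). That __init__ calls logging.getLogger(),
# and in CPython's logging module getLogger() takes a single module-wide lock
# (_acquireLock), so ~20 threads deserializing large responses all queue up on it.
img = V1ContainerImage(names=["registry.example/app:tag"], size_bytes=123456789)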
Is there a fundamental reason why deserializing a Kubernetes object must involve calls to getLogger() and other logging functions? If not, it would be great to improve performance here and avoid calling any lock-taking logging functions during deserialization.
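To make the ask concrete, here is a rough sketch of the kind of change we have in mind, assuming the loggers that Configuration.__init__ looks up are always the same ones (the names and structure below are illustrative, not the actual client code): resolve the loggers once at import time so per-object construction never touches the logging lock.

import logging

# Illustrative module-level lookups: getLogger() (and its lock) is hit once
# at import time instead of once per deserialized object.
_PACKAGE_LOGGER = logging.getLogger("kubernetes.client")
_URLLIB3_LOGGER = logging.getLogger("urllib3")

class Configuration:
    def __init__(self):
        # Reuse the pre-resolved loggers rather than calling
        # logging.getLogger() in every __init__.
        self.logger = {
            "package_logger": _PACKAGE_LOGGER,
            "urllib3_logger": _URLLIB3_LOGGER,
        }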