Extremely slow deserialization due to lock contention and logging module #1867

Open
@aantn

Description

We are experiencing performance issues when doing API calls from many threads at the same time. The issue only occurs on certain clusters, and we suspect it happens when a high number of API calls occur simultaneously.
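
For context, the access pattern looks roughly like the sketch below. This is a hypothetical reproduction, not our actual code (which goes through hikaru, as the trace shows): many worker threads issue list calls against the same cluster, and each response is deserialized into kubernetes.client model objects.

```python
# Hypothetical reproduction sketch, not our actual application code.
from concurrent.futures import ThreadPoolExecutor

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

def list_nodes(_):
    # Deserializing the V1NodeList response is where the time goes.
    return v1.list_node()

# Roughly matches the ~20 threads visible in our stack dumps.
with ThreadPoolExecutor(max_workers=20) as pool:
    node_lists = list(pool.map(list_nodes, range(20)))
```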

When looking at stack traces, we see contention around a lock in the logging module.

Thread XYZ (most recent call first):
  File \"/usr/local/lib/python3.9/logging/__init__.py\", line 225 in _acquireLock
  File \"/usr/local/lib/python3.9/logging/__init__.py\", line 1298 in getLogger
  File \"/usr/local/lib/python3.9/logging/__init__.py\", line 2042 in getLogger
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/configuration.py\", line 112 in __init__
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/models/v1_container_image.py\", line 48 in __init__
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 641 in __deserialize_model
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in <listcomp>
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in <listcomp>
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 280 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 639 in __deserialize_model
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 303 in __deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 264 in deserialize
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 192 in __call_api
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api_client.py\", line 348 in call_api
  File \"/usr/local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py\", line 16517 in list_node_with_http_info
  File \"/usr/local/lib/python3.9/site-packages/hikaru/model/rel_1_16/v1/v1.py\", line 46011 in listNode
  ... application specific stuff

The above is from a single thread. Each stack dump contains about 20 threads like this, all stuck on different API calls in the client library and all waiting in _acquireLock. (There are several paths to _acquireLock; it is not just getLogger but also functions like _clear_cache(), setLevel(), etc.)
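
From the trace, the getLogger() calls appear to come from Configuration.__init__: __deserialize_model() constructs each model object without an explicit configuration, the model then creates a fresh Configuration, and that constructor looks up loggers, which takes the logging module's global lock. The snippet below is a simplified paraphrase of what the generated code seems to do, based on the trace; it is not the exact library source.

```python
# Simplified paraphrase based on the trace above; names and bodies are
# abbreviated and may differ from the actual generated sources.
import logging

class Configuration(object):
    def __init__(self):
        # logging.getLogger() acquires the logging module's global lock
        # (via _acquireLock), even when the logger is already cached.
        self.logger = {
            "package_logger": logging.getLogger("kubernetes.client"),
            "urllib3_logger": logging.getLogger("urllib3"),
        }

class V1ContainerImage(object):
    def __init__(self, names=None, size_bytes=None, local_vars_configuration=None):
        # __deserialize_model() does not pass a configuration, so every
        # deserialized object builds its own Configuration -- and with it,
        # two lock-protected getLogger() calls.
        if local_vars_configuration is None:
            local_vars_configuration = Configuration()
        self.local_vars_configuration = local_vars_configuration
        self.names = names
        self.size_bytes = size_bytes
```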

Is there a fundamental reason why deserializing a Kubernetes object must involve a call to getLogger() and other logging functions?

If not, it would be great to improve performance here by avoiding any lock-acquiring logging calls during deserialization.
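
One possible direction, sketched purely as an illustration (this is not a patch against the actual sources, and there may be better options): resolve the loggers once at import time and reuse them, so constructing a Configuration during deserialization never touches the logging lock.

```python
# Illustrative sketch only; attribute names mirror the generated client,
# but this is not the library's actual code.
import logging

# Looked up once at import time: the logging lock is taken here,
# not on every model construction.
_PACKAGE_LOGGER = logging.getLogger("kubernetes.client")
_URLLIB3_LOGGER = logging.getLogger("urllib3")

class Configuration(object):
    def __init__(self):
        # Reuse module-level loggers instead of calling getLogger() per
        # instance, so deserializing thousands of objects does not
        # contend on the logging module's lock.
        self.logger = {
            "package_logger": _PACKAGE_LOGGER,
            "urllib3_logger": _URLLIB3_LOGGER,
        }
```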

    Labels

    kind/bug: Categorizes issue or PR as related to a bug.
    lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
