configure_azure_monitor() takes abnormally long time #34902
Labels
Client
This issue points to a problem in the data-plane of the library.
customer-reported
Issues that are reported by GitHub users external to the Azure organization.
feature-request
This issue requires a new behavior in the product in order be resolved.
Monitor - Exporter
Monitor OpenTelemetry Exporter
needs-team-attention
Workflow: This issue needs attention from Azure service team or SDK team
question
The issue doesn't require a change to the product in order to be resolved. Most issues start as that
Service Attention
Workflow: This issue is responsible by Azure service team.
Describe the bug
With the default configuration (no configuration supplied), configure_azure_monitor() takes an abnormally long time to execute: ~10 seconds.
To Reproduce
long.py:
APPLICATIONINSIGHTS_CONNECTION_STRING="..." python long.py
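The contents of long.py are not preserved in this report. As a rough sketch only (not the original file), a timing script along these lines could be used to measure the call. It assumes the azure-monitor-opentelemetry package is installed; the helper name timed is hypothetical.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def main():
    try:
        # Assumption: the azure-monitor-opentelemetry distro is installed.
        from azure.monitor.opentelemetry import configure_azure_monitor
    except ImportError:
        print("azure-monitor-opentelemetry is not installed")
        return
    _, elapsed = timed(configure_azure_monitor)
    print(f"configure_azure_monitor() took {elapsed:.2f} s")


if __name__ == "__main__":
    main()
```

Run it with the connection string set in the environment, as in the command above, and compare the printed time against the expected < 1 s.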
Expected behavior
Reasonable time to configure (< 1 s).
Additional context
After stepping through the code in a debugger, I found two main places that contribute to the delay. Both check whether the process is running in an Azure VM.
1. Azure VM resource detector
Location: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/37aba928d45713842941c7efc992726a79ea7d8a/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py#L77
The way code gets there:
Then in https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py
2. Statsbeat metrics
Location: azure-sdk-for-python/sdk/monitor/azure-monitor-opentelemetry-exporter/azure/monitor/opentelemetry/exporter/statsbeat/_statsbeat_metrics.py, lines 212 to 215 at commit a9b8513
Call stack:
In both cases the delay comes from requests to the same endpoint, http://169.254.169.254/metadata/instance/compute, though with different API versions. The first request has a timeout of 4 seconds and the second a timeout of 5 seconds, which together account for almost the entire startup delay.
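To see the effect outside the library, the request can be imitated with the standard library. This is a minimal sketch, not the code either library actually runs: the function name probe and its opener parameter are hypothetical, the API version query parameter is deliberately left out because the report does not state which versions are used, and the Metadata: true header is the one the Azure instance metadata service expects.

```python
import time
import urllib.error
import urllib.request

IMDS_URL = "http://169.254.169.254/metadata/instance/compute"


def probe(url, timeout, opener=urllib.request.urlopen):
    """Issue one GET and return (host_answered, elapsed_seconds)."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    start = time.perf_counter()
    try:
        with opener(request, timeout=timeout):
            answered = True
    except urllib.error.HTTPError:
        # The host replied, just not with a success status.
        answered = True
    except OSError:
        # Covers URLError, connection failures and socket timeouts.
        answered = False
    return answered, time.perf_counter() - start
```

On a machine where nothing answers at that address and packets are silently dropped, calling probe(IMDS_URL, timeout=4) followed by probe(IMDS_URL, timeout=5) would be expected to block for up to about 9 seconds in total, which matches the delay described above. On networks that reject the connection immediately, the calls return much faster.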
Workarounds
1. Set the OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=otel environment variable. If it is not set, the library applies a default value that includes the App Service and Azure VM detectors.
2. Set the APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL=TRUE environment variable.
The above tweaks bring the configuration time down to ~0.8 s (and with OTEL_PYTHON_DISABLED_INSTRUMENTATIONS set to azure_sdk,django,fastapi,flask,psycopg2,requests,urllib,urllib3 it completes in under 30 ms).
It took me hours to find the above options for fixing the startup time without touching the code. I think we need to make the library friendlier to running in non-Azure environments.
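For setups where exporting shell variables is inconvenient, the two workaround variables could also be applied from Python before the library is configured. This is a sketch under the assumption that the library reads these variables when configure_azure_monitor() runs, so they must be set before that call; the names WORKAROUND_ENV and apply_workarounds are hypothetical.

```python
import os

# Values taken from the workarounds listed in this report.
WORKAROUND_ENV = {
    "OTEL_EXPERIMENTAL_RESOURCE_DETECTORS": "otel",
    "APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL": "TRUE",
}


def apply_workarounds(env=os.environ):
    """Set each workaround variable unless it is already defined."""
    for name, value in WORKAROUND_ENV.items():
        env.setdefault(name, value)
    return env
```

Calling apply_workarounds() at the very top of the entry point, before importing and calling configure_azure_monitor(), keeps any value already exported in the shell and only fills in the missing ones.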