
Commit 4ebca9c

Auto-enable Azure AI Inference instrumentation in Azure Monitor, update docs (#38071)
* Auto-enable Azure AI Inference instrumentation in Azure Monitor, update docs
1 parent 13c2dc8 commit 4ebca9c

25 files changed, +672 −397 lines changed

.vscode/cspell.json

+2
@@ -337,6 +337,8 @@
     "onmicrosoft",
     "openai",
     "OPENAI",
+    "otlp",
+    "OTLP",
     "owasp",
     "ownerid",
     "PBYTE",
sdk/ai/azure-ai-inference/README.md

+40-19
@@ -224,7 +224,7 @@ The `EmbeddingsClient` has a method named `embedding`. The method makes a REST A
 
 See simple text embedding example below. More can be found in the [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder.
 
-<!--
+<!--
 ### Image Embeddings
 
 TODO: Add overview and link to explain image embeddings.
@@ -242,7 +242,7 @@ In the following sections you will find simple examples of:
 * [Text Embeddings](#text-embeddings-example)
 <!-- * [Image Embeddings](#image-embeddings-example) -->
 
-The examples create a synchronous client assuming a Serverless API or Managed Compute endpoint. Modify client
+The examples create a synchronous client assuming a Serverless API or Managed Compute endpoint. Modify client
 construction code as descirbed in [Key concepts](#key-concepts) to have it work with GitHub Models endpoint or Azure OpenAI
 endpoint. Only mandatory input settings are shown for simplicity.
 
@@ -275,7 +275,7 @@ print(response.choices[0].message.content)
 
 The following types or messages are supported: `SystemMessage`,`UserMessage`, `AssistantMessage`, `ToolMessage`. See also samples:
 
-* [sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) for usage of `ToolMessage`.
+* [sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) for usage of `ToolMessage`.
 * [sample_chat_completions_with_image_url.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_image_url.py) for usage of `UserMessage` that
 includes sending an image URL.
 * [sample_chat_completions_with_image_data.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_image_data.py) for usage of `UserMessage` that
@@ -535,15 +535,44 @@ For more information, see [Configure logging in the Azure libraries for Python](
 
 To report issues with the client library, or request additional features, please open a GitHub issue [here](https://github.com/Azure/azure-sdk-for-python/issues)
 
-## Tracing
+## Observability With OpenTelemetry
+
+The Azure AI Inference client library provides experimental support for tracing with OpenTelemetry.
+
+You can capture prompt and completion contents by setting `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` environment to `true` (case insensitive).
+By default prompts, completions, function name, parameters or outputs are not recorded.
 
-The Azure AI Inferencing API Tracing library provides tracing for Azure AI Inference client library for Python. Refer to Installation chapter above for installation instructions.
+### Setup with Azure Monitor
 
-### Setup
+When using Azure AI Inference library with [Azure Monitor OpenTelemetry Distro](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python),
+distributed tracing for Azure AI Inference calls is enabled by default when using latest version of the distro.
 
-The environment variable AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED controls whether the actual message contents will be recorded in the traces or not. By default, the message contents are not recorded as part of the trace. When message content recording is disabled any function call tool related function names, function parameter names and function parameter values are also not recorded in the trace. Set the value of the environment variable to "true" (case insensitive) for the message contents to be recorded as part of the trace. Any other value will cause the message contents not to be recorded.
+### Setup with OpenTelemetry
 
-You also need to configure the tracing implementation in your code by setting `AZURE_SDK_TRACING_IMPLEMENTATION` to `opentelemetry` or configuring it in the code with the following snippet:
+Check out your observability vendor documentation on how to configure OpenTelemetry or refer to the [official OpenTelemetry documentation](https://opentelemetry.io/docs/languages/python/).
+
+#### Installation
+
+Make sure to install OpenTelemetry and the Azure SDK tracing plugin via
+
+```bash
+pip install opentelemetry
+pip install azure-core-tracing-opentelemetry
+```
+
+You will also need an exporter to send telemetry to your observability backend. You can print traces to the console or use a local viewer such as [Aspire Dashboard](https://learn.microsoft.com/dotnet/aspire/fundamentals/dashboard/standalone?tabs=bash).
+
+To connect to Aspire Dashboard or another OpenTelemetry compatible backend, install OTLP exporter:
+
+```bash
+pip install opentelemetry-exporter-otlp
+```
+
+#### Configuration
+
+To enable Azure SDK tracing set `AZURE_SDK_TRACING_IMPLEMENTATION` environment variable to `opentelemetry`.
+
+Or configure it in the code with the following snippet:
 
 <!-- SNIPPET:sample_chat_completions_with_tracing.trace_setting -->
 
@@ -556,16 +585,7 @@ settings.tracing_implementation = "opentelemetry"
 
 Please refer to [azure-core-tracing-documentation](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme) for more information.
 
-### Exporting Traces with OpenTelemetry
-
-Azure AI Inference is instrumented with OpenTelemetry. In order to enable tracing you need to configure OpenTelemetry to export traces to your observability backend.
-Refer to [Azure SDK tracing in Python](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme?view=azure-python-preview) for more details.
-
-Refer to [Azure Monitor OpenTelemetry documentation](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) for the details on how to send Azure AI Inference traces to Azure Monitor and create Azure Monitor resource.
-
-### Instrumentation
-
-Use the AIInferenceInstrumentor to instrument the Azure AI Inferencing API for LLM tracing, this will cause the LLM traces to be emitted from Azure AI Inferencing API.
+The final step is to enable Azure AI Inference instrumentation with the following code snippet:
 
 <!-- SNIPPET:sample_chat_completions_with_tracing.instrument_inferencing -->
 
@@ -589,7 +609,8 @@ AIInferenceInstrumentor().uninstrument()
 <!-- END SNIPPET -->
 
 ### Tracing Your Own Functions
-The @tracer.start_as_current_span decorator can be used to trace your own functions. This will trace the function parameters and their values. You can also add further attributes to the span in the function implementation as demonstrated below. Note that you will have to setup the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/).
+
+The `@tracer.start_as_current_span` decorator can be used to trace your own functions. This will trace the function parameters and their values. You can also add further attributes to the span in the function implementation as demonstrated below. Note that you will have to setup the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/).
 
 <!-- SNIPPET:sample_chat_completions_with_tracing.trace_function -->
 
sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py

-1
@@ -261,7 +261,6 @@ def __init__(
 
         super().__init__(endpoint, credential, **kwargs)
 
-
     @overload
     def complete(
         self,
@@ -1,5 +1,6 @@
 -e ../../../tools/azure-sdk-tools
 ../../core/azure-core
 ../../core/azure-core-tracing-opentelemetry
+../../monitor/azure-monitor-opentelemetry
 aiohttp
 opentelemetry-sdk

sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py

+1-4
@@ -58,10 +58,7 @@ async def sample_chat_completions_from_input_json_async():
                 "role": "assistant",
                 "content": "The main construction of the International Space Station (ISS) was completed between 1998 and 2011. During this period, more than 30 flights by US space shuttles and 40 by Russian rockets were conducted to transport components and modules to the station.",
             },
-            {
-                "role": "user",
-                "content": "And what was the estimated cost to build it?"
-            },
+            {"role": "user", "content": "And what was the estimated cost to build it?"},
         ]
     }
 
sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py

+1-1
@@ -65,7 +65,7 @@ def sample_chat_completions_azure_openai():
         endpoint=endpoint,
         credential=DefaultAzureCredential(exclude_interactive_browser_credential=False),
         credential_scopes=["https://cognitiveservices.azure.com/.default"],
-        api_version="2024-06-01", # Azure OpenAI api-version. See https://aka.ms/azsdk/azure-ai-inference/azure-openai-api-versions
+        api_version="2024-06-01",  # Azure OpenAI api-version. See https://aka.ms/azsdk/azure-ai-inference/azure-openai-api-versions
     )
 
     response = client.complete(

sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py

+1-4
@@ -58,10 +58,7 @@ def sample_chat_completions_from_input_json():
                 "role": "assistant",
                 "content": "The main construction of the International Space Station (ISS) was completed between 1998 and 2011. During this period, more than 30 flights by US space shuttles and 40 by Russian rockets were conducted to transport components and modules to the station.",
             },
-            {
-                "role": "user",
-                "content": "And what was the estimated cost to build it?"
-            },
+            {"role": "user", "content": "And what was the estimated cost to build it?"},
         ]
     }
 )

sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json_with_image_url.py

+2-7
@@ -54,9 +54,7 @@ def sample_chat_completions_from_input_json_with_image_url():
     model_deployment = None
 
     client = ChatCompletionsClient(
-        endpoint=endpoint,
-        credential=AzureKeyCredential(key),
-        headers={"azureml-model-deployment": model_deployment}
+        endpoint=endpoint, credential=AzureKeyCredential(key), headers={"azureml-model-deployment": model_deployment}
     )
 
     response = client.complete(
@@ -69,10 +67,7 @@ def sample_chat_completions_from_input_json_with_image_url():
             {
                 "role": "user",
                 "content": [
-                    {
-                        "type": "text",
-                        "text": "What's in this image?"
-                    },
+                    {"type": "text", "text": "What's in this image?"},
                     {
                         "type": "image_url",
                         "image_url": {

sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_tools.py

+9-29
@@ -35,6 +35,7 @@
 
 use_azure_openai_endpoint = True
 
+
 def sample_chat_completions_streaming_with_tools():
     import os
     import json
@@ -79,11 +80,9 @@ def get_flight_info(origin_city: str, destination_city: str):
            str: The airline name, fight number, date and time of the next flight between the cities, in JSON format.
         """
         if origin_city == "Seattle" and destination_city == "Miami":
-            return json.dumps({
-                "airline": "Delta",
-                "flight_number": "DL123",
-                "flight_date": "May 7th, 2024",
-                "flight_time": "10:00AM"})
+            return json.dumps(
+                {"airline": "Delta", "flight_number": "DL123", "flight_date": "May 7th, 2024", "flight_time": "10:00AM"}
+            )
         return json.dumps({"error": "No flights found between the cities"})
 
     # Define a function 'tool' that the model can use to retrieves flight information
@@ -117,21 +116,15 @@ def get_flight_info(origin_city: str, destination_city: str):
         )
     else:
         # Create a chat completions client for Serverless API endpoint or Managed Compute endpoint
-        client = ChatCompletionsClient(
-            endpoint=endpoint,
-            credential=AzureKeyCredential(key)
-        )
+        client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
 
     # Make a streaming chat completions call asking for flight information, while providing a tool to handle the request
     messages = [
         SystemMessage(content="You an assistant that helps users find flight information."),
         UserMessage(content="What is the next flights from Seattle to Miami?"),
     ]
 
-    response = client.complete(
-        messages=messages,
-        tools=[flight_info],
-        stream=True)
+    response = client.complete(messages=messages, tools=[flight_info], stream=True)
 
     # Note that in the above call we did not specify `tool_choice`. The service defaults to a setting equivalent
     # to specifying `tool_choice=ChatCompletionsToolChoicePreset.AUTO`. Other than ChatCompletionsToolChoicePreset
@@ -158,11 +151,7 @@ def get_flight_info(origin_city: str, destination_city: str):
             AssistantMessage(
                 tool_calls=[
                     ChatCompletionsToolCall(
-                        id=tool_call_id,
-                        function=FunctionCall(
-                            name=function_name,
-                            arguments=function_args
-                        )
+                        id=tool_call_id, function=FunctionCall(name=function_name, arguments=function_args)
                     )
                 ]
             )
@@ -176,19 +165,10 @@ def get_flight_info(origin_city: str, destination_city: str):
             print(f"Function response = {function_response}")
 
             # Append the function response as a tool message to the chat history
-            messages.append(
-                ToolMessage(
-                    tool_call_id=tool_call_id,
-                    content=function_response
-                )
-            )
+            messages.append(ToolMessage(tool_call_id=tool_call_id, content=function_response))
 
             # With the additional tools information on hand, get another streaming response from the model
-            response = client.complete(
-                messages=messages,
-                tools=[flight_info],
-                stream=True
-            )
+            response = client.complete(messages=messages, tools=[flight_info], stream=True)
 
             print("Model response = ", end="")
             for update in response:
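The sample above accumulates the streamed tool-call fragments and then invokes the requested function by name with JSON-decoded arguments. A stdlib-only sketch of just that dispatch step, with the SDK types left out; the `function_name` and `function_args` values stand in for what the stream would have accumulated:

```python
import json


def get_flight_info(origin_city: str, destination_city: str) -> str:
    # Same toy lookup the sample uses as its tool implementation
    if origin_city == "Seattle" and destination_city == "Miami":
        return json.dumps(
            {"airline": "Delta", "flight_number": "DL123", "flight_date": "May 7th, 2024", "flight_time": "10:00AM"}
        )
    return json.dumps({"error": "No flights found between the cities"})


# Values the sample would have accumulated from the streamed tool-call updates (illustrative)
function_name = "get_flight_info"
function_args = '{"origin_city": "Seattle", "destination_city": "Miami"}'

# Dispatch: resolve the tool by name and call it with the parsed JSON arguments
tools = {"get_flight_info": get_flight_info}
function_response = tools[function_name](**json.loads(function_args))
print(json.loads(function_response)["flight_number"])  # DL123
```

In the real sample this `function_response` is then wrapped in a `ToolMessage` and sent back to the model for the second streaming call.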
