diff --git a/.vscode/cspell.json b/.vscode/cspell.json index 8002a431d349..2f0c5e620306 100644 --- a/.vscode/cspell.json +++ b/.vscode/cspell.json @@ -337,6 +337,8 @@ "onmicrosoft", "openai", "OPENAI", + "otlp", + "OTLP", "owasp", "ownerid", "PBYTE", diff --git a/sdk/ai/azure-ai-inference/README.md b/sdk/ai/azure-ai-inference/README.md index 8f3652318525..49a9e6acdeb8 100644 --- a/sdk/ai/azure-ai-inference/README.md +++ b/sdk/ai/azure-ai-inference/README.md @@ -224,7 +224,7 @@ The `EmbeddingsClient` has a method named `embedding`. The method makes a REST A See simple text embedding example below. More can be found in the [samples](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples) folder. - -The examples create a synchronous client assuming a Serverless API or Managed Compute endpoint. Modify client +The examples create a synchronous client assuming a Serverless API or Managed Compute endpoint. Modify client construction code as described in [Key concepts](#key-concepts) to have it work with GitHub Models endpoint or Azure OpenAI endpoint. Only mandatory input settings are shown for simplicity. @@ -275,7 +275,7 @@ print(response.choices[0].message.content) The following types of messages are supported: `SystemMessage`, `UserMessage`, `AssistantMessage`, `ToolMessage`. See also samples: -* [sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) for usage of `ToolMessage`. +* [sample_chat_completions_with_tools.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py) for usage of `ToolMessage`. * [sample_chat_completions_with_image_url.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_image_url.py) for usage of `UserMessage` that includes sending an image URL. * [sample_chat_completions_with_image_data.py](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_image_data.py) for usage of `UserMessage` that @@ -535,15 +535,44 @@ For more information, see [Configure logging in the Azure libraries for Python]( To report issues with the client library, or request additional features, please open a GitHub issue [here](https://github.com/Azure/azure-sdk-for-python/issues) -## Tracing +## Observability With OpenTelemetry + +The Azure AI Inference client library provides experimental support for tracing with OpenTelemetry. + +You can capture prompt and completion contents by setting the `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` environment variable to `true` (case insensitive). +By default, prompts, completions, function names, parameters, and outputs are not recorded. -The Azure AI Inferencing API Tracing library provides tracing for Azure AI Inference client library for Python. Refer to Installation chapter above for installation instructions. +### Setup with Azure Monitor -### Setup +When using the Azure AI Inference library with [Azure Monitor OpenTelemetry Distro](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python), +distributed tracing for Azure AI Inference calls is enabled by default when using the latest version of the distro. -The environment variable AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED controls whether the actual message contents will be recorded in the traces or not. 
By default, the message contents are not recorded as part of the trace. When message content recording is disabled any function call tool related function names, function parameter names and function parameter values are also not recorded in the trace. Set the value of the environment variable to "true" (case insensitive) for the message contents to be recorded as part of the trace. Any other value will cause the message contents not to be recorded. +### Setup with OpenTelemetry -You also need to configure the tracing implementation in your code by setting `AZURE_SDK_TRACING_IMPLEMENTATION` to `opentelemetry` or configuring it in the code with the following snippet: +Check out your observability vendor's documentation on how to configure OpenTelemetry, or refer to the [official OpenTelemetry documentation](https://opentelemetry.io/docs/languages/python/). + +#### Installation + +Make sure to install OpenTelemetry and the Azure SDK tracing plugin via + +```bash +pip install opentelemetry-sdk +pip install azure-core-tracing-opentelemetry +``` + +You will also need an exporter to send telemetry to your observability backend. You can print traces to the console or use a local viewer such as [Aspire Dashboard](https://learn.microsoft.com/dotnet/aspire/fundamentals/dashboard/standalone?tabs=bash). + +To connect to Aspire Dashboard or another OpenTelemetry-compatible backend, install the OTLP exporter: + +```bash +pip install opentelemetry-exporter-otlp +``` + +#### Configuration + +To enable Azure SDK tracing, set the `AZURE_SDK_TRACING_IMPLEMENTATION` environment variable to `opentelemetry`. + +Alternatively, configure it in code with the following snippet: @@ -556,16 +585,7 @@ settings.tracing_implementation = "opentelemetry" Please refer to [azure-core-tracing-documentation](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme) for more information. -### Exporting Traces with OpenTelemetry - -Azure AI Inference is instrumented with OpenTelemetry. In order to enable tracing you need to configure OpenTelemetry to export traces to your observability backend. -Refer to [Azure SDK tracing in Python](https://learn.microsoft.com/python/api/overview/azure/core-tracing-opentelemetry-readme?view=azure-python-preview) for more details. - -Refer to [Azure Monitor OpenTelemetry documentation](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) for the details on how to send Azure AI Inference traces to Azure Monitor and create Azure Monitor resource. - -### Instrumentation - -Use the AIInferenceInstrumentor to instrument the Azure AI Inferencing API for LLM tracing, this will cause the LLM traces to be emitted from Azure AI Inferencing API. +The final step is to enable Azure AI Inference instrumentation with the following code snippet: @@ -589,7 +609,8 @@ AIInferenceInstrumentor().uninstrument() ### Tracing Your Own Functions -The @tracer.start_as_current_span decorator can be used to trace your own functions. This will trace the function parameters and their values. You can also add further attributes to the span in the function implementation as demonstrated below. Note that you will have to setup the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/). + +The `@tracer.start_as_current_span` decorator can be used to trace your own functions. This will trace the function parameters and their values. 
You can also add further attributes to the span in the function implementation as demonstrated below. Note that you will have to set up the tracer in your code before using the decorator. More information is available [here](https://opentelemetry.io/docs/languages/python/). diff --git a/sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py b/sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py index 050a8d1ab96c..2adfd99ecc43 100644 --- a/sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py +++ b/sdk/ai/azure-ai-inference/azure/ai/inference/_patch.py @@ -261,7 +261,6 @@ def __init__( super().__init__(endpoint, credential, **kwargs) - @overload def complete( self, diff --git a/sdk/ai/azure-ai-inference/dev_requirements.txt b/sdk/ai/azure-ai-inference/dev_requirements.txt index 4f5b55a5a48a..9c82a165e327 100644 --- a/sdk/ai/azure-ai-inference/dev_requirements.txt +++ b/sdk/ai/azure-ai-inference/dev_requirements.txt @@ -1,5 +1,6 @@ -e ../../../tools/azure-sdk-tools ../../core/azure-core ../../core/azure-core-tracing-opentelemetry +../../monitor/azure-monitor-opentelemetry aiohttp opentelemetry-sdk \ No newline at end of file diff --git a/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py b/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py index ec2dd6afae75..25d6ce20cce7 100644 --- a/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py +++ b/sdk/ai/azure-ai-inference/samples/async_samples/sample_chat_completions_from_input_json_async.py @@ -58,10 +58,7 @@ async def sample_chat_completions_from_input_json_async(): "role": "assistant", "content": "The main construction of the International Space Station (ISS) was completed between 1998 and 2011. During this period, more than 30 flights by US space shuttles and 40 by Russian rockets were conducted to transport components and modules to the station.", }, - { - "role": "user", - "content": "And what was the estimated cost to build it?" - }, + {"role": "user", "content": "And what was the estimated cost to build it?"}, ] } diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py index f025eea212cb..e4b03dbe50f9 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_azure_openai.py @@ -65,7 +65,7 @@ def sample_chat_completions_azure_openai(): endpoint=endpoint, credential=DefaultAzureCredential(exclude_interactive_browser_credential=False), credential_scopes=["https://cognitiveservices.azure.com/.default"], - api_version="2024-06-01", # Azure OpenAI api-version. See https://aka.ms/azsdk/azure-ai-inference/azure-openai-api-versions + api_version="2024-06-01", # Azure OpenAI api-version. 
See https://aka.ms/azsdk/azure-ai-inference/azure-openai-api-versions ) response = client.complete( diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py index 925583af4772..78a9b9a42690 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json.py @@ -58,10 +58,7 @@ def sample_chat_completions_from_input_json(): "role": "assistant", "content": "The main construction of the International Space Station (ISS) was completed between 1998 and 2011. During this period, more than 30 flights by US space shuttles and 40 by Russian rockets were conducted to transport components and modules to the station.", }, - { - "role": "user", - "content": "And what was the estimated cost to build it?" - }, + {"role": "user", "content": "And what was the estimated cost to build it?"}, ] } ) diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json_with_image_url.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json_with_image_url.py index 912b98afccb8..83f3afceaa19 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json_with_image_url.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_from_input_json_with_image_url.py @@ -54,9 +54,7 @@ def sample_chat_completions_from_input_json_with_image_url(): model_deployment = None client = ChatCompletionsClient( - endpoint=endpoint, - credential=AzureKeyCredential(key), - headers={"azureml-model-deployment": model_deployment} + endpoint=endpoint, credential=AzureKeyCredential(key), headers={"azureml-model-deployment": model_deployment} ) response = client.complete( @@ -69,10 +67,7 @@ def sample_chat_completions_from_input_json_with_image_url(): { "role": "user", "content": [ - { - "type": "text", - "text": "What's in this image?" - }, + {"type": "text", "text": "What's in this image?"}, { "type": "image_url", "image_url": { diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_tools.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_tools.py index dfa62afa2127..8eb5c7472af4 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_tools.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_streaming_with_tools.py @@ -35,6 +35,7 @@ use_azure_openai_endpoint = True + def sample_chat_completions_streaming_with_tools(): import os import json @@ -79,11 +80,9 @@ def get_flight_info(origin_city: str, destination_city: str): str: The airline name, fight number, date and time of the next flight between the cities, in JSON format. 
""" if origin_city == "Seattle" and destination_city == "Miami": - return json.dumps({ - "airline": "Delta", - "flight_number": "DL123", - "flight_date": "May 7th, 2024", - "flight_time": "10:00AM"}) + return json.dumps( + {"airline": "Delta", "flight_number": "DL123", "flight_date": "May 7th, 2024", "flight_time": "10:00AM"} + ) return json.dumps({"error": "No flights found between the cities"}) # Define a function 'tool' that the model can use to retrieves flight information @@ -117,10 +116,7 @@ def get_flight_info(origin_city: str, destination_city: str): ) else: # Create a chat completions client for Serverless API endpoint or Managed Compute endpoint - client = ChatCompletionsClient( - endpoint=endpoint, - credential=AzureKeyCredential(key) - ) + client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key)) # Make a streaming chat completions call asking for flight information, while providing a tool to handle the request messages = [ @@ -128,10 +124,7 @@ def get_flight_info(origin_city: str, destination_city: str): UserMessage(content="What is the next flights from Seattle to Miami?"), ] - response = client.complete( - messages=messages, - tools=[flight_info], - stream=True) + response = client.complete(messages=messages, tools=[flight_info], stream=True) # Note that in the above call we did not specify `tool_choice`. The service defaults to a setting equivalent # to specifying `tool_choice=ChatCompletionsToolChoicePreset.AUTO`. Other than ChatCompletionsToolChoicePreset @@ -158,11 +151,7 @@ def get_flight_info(origin_city: str, destination_city: str): AssistantMessage( tool_calls=[ ChatCompletionsToolCall( - id=tool_call_id, - function=FunctionCall( - name=function_name, - arguments=function_args - ) + id=tool_call_id, function=FunctionCall(name=function_name, arguments=function_args) ) ] ) @@ -176,19 +165,10 @@ def get_flight_info(origin_city: str, destination_city: str): print(f"Function response = {function_response}") # Append the function response as a tool message to the chat history - messages.append( - ToolMessage( - tool_call_id=tool_call_id, - content=function_response - ) - ) + messages.append(ToolMessage(tool_call_id=tool_call_id, content=function_response)) # With the additional tools information on hand, get another streaming response from the model - response = client.complete( - messages=messages, - tools=[flight_info], - stream=True - ) + response = client.complete(messages=messages, tools=[flight_info], stream=True) print("Model response = ", end="") for update in response: diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_azure_monitor_tracing.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_azure_monitor_tracing.py new file mode 100644 index 000000000000..cde4505f3e83 --- /dev/null +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_azure_monitor_tracing.py @@ -0,0 +1,161 @@ +# ------------------------------------ +# Copyright (c) Microsoft Corporation. +# Licensed under the MIT License. +# ------------------------------------ +""" +DESCRIPTION: + This sample demonstrates how to enable distributed tracing with OpenTelemetry + in Azure AI Inference client library and export traces to Azure Monitor. + + This sample assumes the AI model is hosted on a Serverless API or + Managed Compute endpoint. For GitHub Models or Azure OpenAI endpoints, + the client constructor needs to be modified. 
See package documentation: + https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/ai/azure-ai-inference/README.md#key-concepts + +USAGE: + python sample_chat_completions_with_azure_monitor_tracing.py + + Set these four environment variables before running the sample: + 1) AZURE_AI_CHAT_ENDPOINT - Your endpoint URL, in the form + https://<your-deployment-name>.<your-azure-region>.models.ai.azure.com + where `your-deployment-name` is your unique AI Model deployment name, and + `your-azure-region` is the Azure region where your model is deployed. + 2) AZURE_AI_CHAT_KEY - Your model key (a 32-character string). Keep it secret. + 3) APPLICATIONINSIGHTS_CONNECTION_STRING - Your Azure Monitor (Application Insights) connection string. + 4) AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED - Set to 'true' to enable content recording. +""" + + +import os +from opentelemetry import trace +from azure.ai.inference import ChatCompletionsClient +from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason +from azure.core.credentials import AzureKeyCredential +from azure.monitor.opentelemetry import configure_azure_monitor + + +# [START trace_function] +from opentelemetry.trace import get_tracer + +tracer = get_tracer(__name__) + + +# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes +# to the span in the function implementation. Note that this will trace the function parameters and their values. +@tracer.start_as_current_span("get_temperature")  # type: ignore +def get_temperature(city: str) -> str: + + # Adding attributes to the current span + span = trace.get_current_span() + span.set_attribute("requested_city", city) + + if city == "Seattle": + return "75" + elif city == "New York City": + return "80" + else: + return "Unavailable" + + +# [END trace_function] + + +def get_weather(city: str) -> str: + if city == "Seattle": + return "Nice weather" + elif city == "New York City": + return "Good weather" + else: + return "Unavailable" + + +def chat_completion_with_function_call(key, endpoint): + import json + from azure.ai.inference.models import ( + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) + + weather_description = ChatCompletionsToolDefinition( + function=FunctionDefinition( + name="get_weather", + description="Returns description of the weather in the specified city", + parameters={ + "type": "object", + "properties": { + "city": { + "type": "string", + "description": "The name of the city for which weather info is requested", + }, + }, + "required": ["city"], + }, + ) + ) + + temperature_in_city = ChatCompletionsToolDefinition( + function=FunctionDefinition( + name="get_temperature", + description="Returns the current temperature for the specified city", + parameters={ + "type": "object", + "properties": { + "city": { + "type": "string", + "description": "The name of the city for which temperature info is requested", + }, + }, + "required": ["city"], + }, + ) + ) + + client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key), model="gpt-4o-mini") + messages = [ + SystemMessage(content="You are a helpful assistant."), + UserMessage(content="What is the weather and temperature in Seattle?"), + ] + + response = client.complete(messages=messages, tools=[weather_description, temperature_in_city]) + + if response.choices[0].finish_reason == CompletionsFinishReason.TOOL_CALLS: + # Append the previous model response to the chat history + 
messages.append(AssistantMessage(tool_calls=response.choices[0].message.tool_calls)) + # The tool should be of type function call. + if response.choices[0].message.tool_calls is not None and len(response.choices[0].message.tool_calls) > 0: + for tool_call in response.choices[0].message.tool_calls: + if type(tool_call) is ChatCompletionsToolCall: + function_args = json.loads(tool_call.function.arguments.replace("'", '"')) + print(f"Calling function `{tool_call.function.name}` with arguments {function_args}") + callable_func = globals()[tool_call.function.name] + function_response = callable_func(**function_args) + print(f"Function response = {function_response}") + # Provide the tool response to the model, by appending it to the chat history + messages.append(ToolMessage(tool_call_id=tool_call.id, content=function_response)) + # With the additional tools information on hand, get another response from the model + response = client.complete(messages=messages, tools=[weather_description, temperature_in_city]) + + print(f"Model response = {response.choices[0].message.content}") + + +def main(): + # Make sure to set APPLICATIONINSIGHTS_CONNECTION_STRING environment variable before running this sample. + # Or pass the value as an argument to the configure_azure_monitor function. + configure_azure_monitor() + + try: + endpoint = os.environ["AZURE_AI_CHAT_ENDPOINT"] + key = os.environ["AZURE_AI_CHAT_KEY"] + except KeyError: + print("Missing environment variable 'AZURE_AI_CHAT_ENDPOINT' or 'AZURE_AI_CHAT_KEY'") + print("Set them before running this sample.") + exit() + + chat_completion_with_function_call(key, endpoint) + + +if __name__ == "__main__": + main() diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_defaults.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_defaults.py index 011735a7e61f..269ce2d232de 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_defaults.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_defaults.py @@ -43,10 +43,7 @@ def sample_chat_completions_with_defaults(): # Create a client with default chat completions settings client = ChatCompletionsClient( - endpoint=endpoint, - credential=AzureKeyCredential(key), - temperature=0.5, - max_tokens=1000 + endpoint=endpoint, credential=AzureKeyCredential(key), temperature=0.5, max_tokens=1000 ) # Call the service with the defaults specified above diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py index 3d14a550ab68..2074c447fdfe 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tools.py @@ -64,14 +64,11 @@ def get_flight_info(origin_city: str, destination_city: str): str: The airline name, fight number, date and time of the next flight between the cities, in JSON format. 
""" if origin_city == "Seattle" and destination_city == "Miami": - return json.dumps({ - "airline": "Delta", - "flight_number": "DL123", - "flight_date": "May 7th, 2024", - "flight_time": "10:00AM"}) + return json.dumps( + {"airline": "Delta", "flight_number": "DL123", "flight_date": "May 7th, 2024", "flight_time": "10:00AM"} + ) return json.dumps({"error": "No flights found between the cities"}) - # Define a function 'tool' that the model can use to retrieves flight information flight_info = ChatCompletionsToolDefinition( function=FunctionDefinition( @@ -95,10 +92,7 @@ def get_flight_info(origin_city: str, destination_city: str): ) # Create a chat completion client. Make sure you selected a model that supports tools. - client = ChatCompletionsClient( - endpoint=endpoint, - credential=AzureKeyCredential(key) - ) + client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key)) # Make a chat completions call asking for flight information, while providing a tool to handle the request messages = [ diff --git a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py index 875010ebbd26..e2d40ca2f575 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py +++ b/sdk/ai/azure-ai-inference/samples/sample_chat_completions_with_tracing.py @@ -18,7 +18,7 @@ python sample_chat_completions_with_tracing.py Set these two environment variables before running the sample: - 1) AZURE_AI_CHAT_ENDPOINT - Your endpoint URL, in the form + 1) AZURE_AI_CHAT_ENDPOINT - Your endpoint URL, in the form https://..models.ai.azure.com where `your-deployment-name` is your unique AI Model deployment name, and `your-azure-region` is the Azure region where your model is deployed. @@ -28,34 +28,36 @@ import os from opentelemetry import trace -# opentelemetry-sdk is required for the opentelemetry.sdk imports. -# You can install it with command "pip install opentelemetry-sdk". -#from opentelemetry.sdk.trace import TracerProvider -#from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter + +# Install opentelemetry with command "pip install opentelemetry-sdk". +from opentelemetry.sdk.trace import TracerProvider +from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter from azure.ai.inference import ChatCompletionsClient from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason from azure.core.credentials import AzureKeyCredential - # [START trace_setting] +# [START trace_setting] from azure.core.settings import settings + settings.tracing_implementation = "opentelemetry" # [END trace_setting] # Setup tracing to console # Requires opentelemetry-sdk -#exporter = ConsoleSpanExporter() -#trace.set_tracer_provider(TracerProvider()) -#tracer = trace.get_tracer(__name__) -#trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter)) - +span_exporter = ConsoleSpanExporter() +tracer_provider = TracerProvider() +tracer_provider.add_span_processor(SimpleSpanProcessor(span_exporter)) +trace.set_tracer_provider(tracer_provider) - # [START trace_function] +# [START trace_function] from opentelemetry.trace import get_tracer + tracer = get_tracer(__name__) + # The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes # to the span in the function implementation. Note that this will trace the function parameters and their values. 
-@tracer.start_as_current_span("get_temperature") # type: ignore +@tracer.start_as_current_span("get_temperature") # type: ignore def get_temperature(city: str) -> str: # Adding attributes to the current span @@ -68,7 +70,9 @@ def get_temperature(city: str) -> str: return "80" else: return "Unavailable" - # [END trace_function] + + +# [END trace_function] def get_weather(city: str) -> str: @@ -82,7 +86,13 @@ def get_weather(city: str) -> str: def chat_completion_with_function_call(key, endpoint): import json - from azure.ai.inference.models import ToolMessage, AssistantMessage, ChatCompletionsToolCall, ChatCompletionsToolDefinition, FunctionDefinition + from azure.ai.inference.models import ( + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) weather_description = ChatCompletionsToolDefinition( function=FunctionDefinition( @@ -119,7 +129,7 @@ def chat_completion_with_function_call(key, endpoint): ) client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key)) - messages=[ + messages = [ SystemMessage(content="You are a helpful assistant."), UserMessage(content="What is the weather and temperature in Seattle?"), ] @@ -142,13 +152,14 @@ def chat_completion_with_function_call(key, endpoint): messages.append(ToolMessage(tool_call_id=tool_call.id, content=function_response)) # With the additional tools information on hand, get another response from the model response = client.complete(messages=messages, tools=[weather_description, temperature_in_city]) - + print(f"Model response = {response.choices[0].message.content}") def main(): # [START instrument_inferencing] from azure.ai.inference.tracing import AIInferenceInstrumentor + # Instrument AI Inference API AIInferenceInstrumentor().instrument() # [END instrument_inferencing] diff --git a/sdk/ai/azure-ai-inference/samples/sample_embeddings_with_base64_encoding.py b/sdk/ai/azure-ai-inference/samples/sample_embeddings_with_base64_encoding.py index 9d9ec9c5c492..248bccb83a55 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_embeddings_with_base64_encoding.py +++ b/sdk/ai/azure-ai-inference/samples/sample_embeddings_with_base64_encoding.py @@ -44,13 +44,15 @@ def sample_embeddings_with_base64_encoding(): # Request embeddings as base64 encoded strings response = client.embed( - input=["first phrase", "second phrase", "third phrase"], - encoding_format=EmbeddingEncodingFormat.BASE64) + input=["first phrase", "second phrase", "third phrase"], encoding_format=EmbeddingEncodingFormat.BASE64 + ) for item in response.data: # Display the start and end of the resulting base64 string - print(f"data[{item.index}] encoded (string length={len(item.embedding)}): " - f"\"{item.embedding[:32]}...{item.embedding[-32:]}\"") + print( + f"data[{item.index}] encoded (string length={len(item.embedding)}): " + f'"{item.embedding[:32]}...{item.embedding[-32:]}"' + ) # For display purposes, decode the string into a list of floating point numbers. # Display the first and last two elements of the list. 
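For context on the decoding step that the base64 embeddings sample's closing comments describe, a minimal sketch is shown below (not part of the patch). It assumes each embedding comes back as a base64 string of little-endian 32-bit floats, which is the common convention for base64-encoded embeddings; the `decode_embedding` helper name is illustrative only.

```python
import base64
import struct


def decode_embedding(embedding_b64: str) -> list:
    # Decode the base64 payload to raw bytes, then unpack the bytes as
    # little-endian 32-bit floats (4 bytes per embedding value).
    # The float32/little-endian layout is an assumption, not taken from the patch.
    raw = base64.b64decode(embedding_b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round-trip three known floats through base64 to sanity-check the decoding.
encoded = base64.b64encode(struct.pack("<3f", 0.1, 0.2, 0.3)).decode("ascii")
print(decode_embedding(encoded))  # ~[0.1, 0.2, 0.3], within float32 precision
```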
diff --git a/sdk/ai/azure-ai-inference/samples/sample_image_embeddings_with_defaults.py b/sdk/ai/azure-ai-inference/samples/sample_image_embeddings_with_defaults.py index 3ce84554ab4d..5282f22e4f45 100644 --- a/sdk/ai/azure-ai-inference/samples/sample_image_embeddings_with_defaults.py +++ b/sdk/ai/azure-ai-inference/samples/sample_image_embeddings_with_defaults.py @@ -49,10 +49,7 @@ def sample_image_embeddings_with_defaults(): # Create a client with default embeddings settings client = ImageEmbeddingsClient( - endpoint=endpoint, - credential=AzureKeyCredential(key), - dimensions=1024, - input_type=EmbeddingInputType.QUERY + endpoint=endpoint, credential=AzureKeyCredential(key), dimensions=1024, input_type=EmbeddingInputType.QUERY ) # Call the service with the defaults specified above diff --git a/sdk/ai/azure-ai-inference/setup.py b/sdk/ai/azure-ai-inference/setup.py index c264ae00239e..f6a2bea03eb4 100644 --- a/sdk/ai/azure-ai-inference/setup.py +++ b/sdk/ai/azure-ai-inference/setup.py @@ -68,7 +68,5 @@ "typing-extensions>=4.6.0", ], python_requires=">=3.8", - extras_require={ - 'opentelemetry': ['azure-core-tracing-opentelemetry'] - } + extras_require={"opentelemetry": ["azure-core-tracing-opentelemetry"]}, ) diff --git a/sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py b/sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py index 29bb2ef57f47..a105b60cf8ac 100644 --- a/sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py +++ b/sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py @@ -10,11 +10,11 @@ class GenAiTraceVerifier: def check_span_attributes(self, span, attributes): - # Convert the list of tuples to a dictionary for easier lookup + # Convert the list of tuples to a dictionary for easier lookup attribute_dict = dict(attributes) - + for attribute_name in span.attributes.keys(): - # Check if the attribute name exists in the input attributes + # Check if the attribute name exists in the input attributes if attribute_name not in attribute_dict: return False @@ -26,7 +26,7 @@ def check_span_attributes(self, span, attributes): elif isinstance(attribute_value, tuple): # Check if the attribute value in the span matches the provided list if span.attributes[attribute_name] != attribute_value: - return False + return False else: # Check if the attribute value matches the provided value if attribute_value == "+": @@ -62,7 +62,7 @@ def check_event_attributes(self, expected_dict, actual_dict): return False for key, expected_val in expected_dict.items(): if key not in actual_dict: - return False + return False actual_val = actual_dict[key] if self.is_valid_json(expected_val): @@ -72,17 +72,17 @@ def check_event_attributes(self, expected_dict, actual_dict): return False elif isinstance(expected_val, dict): if not isinstance(actual_val, dict): - return False + return False if not self.check_event_attributes(expected_val, actual_val): return False - elif isinstance(expected_val, list): - if not isinstance(actual_val, list): + elif isinstance(expected_val, list): + if not isinstance(actual_val, list): return False if len(expected_val) != len(actual_val): return False - for expected_list, actual_list in zip(expected_val, actual_val): - if not self.check_event_attributes(expected_list, actual_list): - return False + for expected_list, actual_list in zip(expected_val, actual_val): + if not self.check_event_attributes(expected_list, actual_list): + return False elif isinstance(expected_val, str) and expected_val == "*": if actual_val == "": return False @@ -95,8 +95,8 @@ def 
check_span_events(self, span, expected_events): for expected_event in expected_events: for actual_event in span_events: - if expected_event['name'] == actual_event.name: - if not self.check_event_attributes(expected_event['attributes'], actual_event.attributes): + if expected_event["name"] == actual_event.name: + if not self.check_event_attributes(expected_event["attributes"], actual_event.attributes): return False span_events.remove(actual_event) # Remove the matched event from the span_events break diff --git a/sdk/ai/azure-ai-inference/tests/memory_trace_exporter.py b/sdk/ai/azure-ai-inference/tests/memory_trace_exporter.py index 7b609fbf5724..d0007f6f1bdc 100644 --- a/sdk/ai/azure-ai-inference/tests/memory_trace_exporter.py +++ b/sdk/ai/azure-ai-inference/tests/memory_trace_exporter.py @@ -34,6 +34,6 @@ def get_spans_by_name_starts_with(self, name_prefix: str) -> List[Span]: def get_spans_by_name(self, name: str) -> List[Span]: return [span for span in self._trace_list if span.name == name] - + def get_spans(self) -> List[Span]: - return [span for span in self._trace_list] \ No newline at end of file + return [span for span in self._trace_list] diff --git a/sdk/ai/azure-ai-inference/tests/test_model_inference_async_client.py b/sdk/ai/azure-ai-inference/tests/test_model_inference_async_client.py index 3be34667d424..e0f7360dc476 100644 --- a/sdk/ai/azure-ai-inference/tests/test_model_inference_async_client.py +++ b/sdk/ai/azure-ai-inference/tests/test_model_inference_async_client.py @@ -28,6 +28,7 @@ CONTENT_TRACING_ENV_VARIABLE = "AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED" content_tracing_initial_value = os.getenv(CONTENT_TRACING_ENV_VARIABLE) + # The test class name needs to start with "Test" to get collected by pytest class TestModelAsyncClient(ModelClientTestBase): @@ -492,7 +493,7 @@ async def test_async_load_chat_completions_client(self, **kwargs): response1 = await client.get_model_info() self._print_model_info_result(response1) self._validate_model_info_result( - response1, "chat-completion" # TODO: This should be chat_completions based on REST API spec... + response1, "chat-completion" # TODO: This should be chat_completions based on REST API spec... 
) # TODO: This should be ModelType.CHAT once the model is fixed await client.close() @@ -737,27 +738,29 @@ async def test_chat_completion_async_tracing_content_recording_disabled(self, ** spans = exporter.get_spans_by_name("chat") assert len(spans) == 1 span = spans[0] - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(span, expected_attributes) assert attributes_match == True expected_events = [ - { - 'name': 'gen_ai.choice', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"finish_reason": "stop", "index": 0}' - } + { + "name": "gen_ai.choice", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"finish_reason": "stop", "index": 0}', + }, } ] events_match = GenAiTraceVerifier().check_span_events(span, expected_events) assert events_match == True - AIInferenceInstrumentor().uninstrument() \ No newline at end of file + AIInferenceInstrumentor().uninstrument() diff --git a/sdk/ai/azure-ai-inference/tests/test_model_inference_client.py b/sdk/ai/azure-ai-inference/tests/test_model_inference_client.py index a6cfffea8e8a..a26e8c247258 100644 --- a/sdk/ai/azure-ai-inference/tests/test_model_inference_client.py +++ b/sdk/ai/azure-ai-inference/tests/test_model_inference_client.py @@ -27,6 +27,7 @@ CONTENT_TRACING_ENV_VARIABLE = "AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED" content_tracing_initial_value = os.getenv(CONTENT_TRACING_ENV_VARIABLE) + # The test class name needs to start with "Test" to get collected by pytest class TestModelClient(ModelClientTestBase): @@ -559,7 +560,7 @@ def test_get_model_info_on_chat_client(self, **kwargs): self._print_model_info_result(response1) self._validate_model_info_result( - response1, "chat-completion" # TODO: This should be chat_comletions according to REST API spec... + response1, "chat-completion" # TODO: This should be chat_completions according to REST API spec... ) # TODO: This should be ModelType.CHAT once the model is fixed # Get the model info again. 
No network calls should be made here, @@ -810,7 +811,6 @@ def test_embeddings_on_chat_completion_endpoint(self, **kwargs): client.close() assert exception_caught - # ********************************************************************************** # # TRACING TESTS - CHAT COMPLETIONS @@ -942,25 +942,27 @@ def test_chat_completion_tracing_content_recording_disabled(self, **kwargs): spans = exporter.get_spans_by_name("chat") assert len(spans) == 1 span = spans[0] - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(span, expected_attributes) assert attributes_match == True expected_events = [ - { - 'name': 'gen_ai.choice', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"finish_reason": "stop", "index": 0}' - } + { + "name": "gen_ai.choice", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"finish_reason": "stop", "index": 0}', + }, } ] events_match = GenAiTraceVerifier().check_span_events(span, expected_events) @@ -991,40 +993,42 @@ def test_chat_completion_tracing_content_recording_enabled(self, **kwargs): spans = exporter.get_spans_by_name("chat") assert len(spans) == 1 span = spans[0] - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(span, expected_attributes) assert attributes_match == True expected_events = [ { - 'name': 'gen_ai.system.message', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"role": "system", "content": "You are a helpful assistant."}' - } + "name": "gen_ai.system.message", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful assistant."}', + }, }, { - 'name': 'gen_ai.user.message', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"role": "user", "content": "What is the capital of France?"}' - } + "name": "gen_ai.user.message", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"role": "user", "content": "What is the capital of France?"}', + }, }, { - 
'name': 'gen_ai.choice', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}' - } - } + "name": "gen_ai.choice", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}', + }, + }, ] events_match = GenAiTraceVerifier().check_span_events(span, expected_events) assert events_match == True @@ -1047,7 +1051,7 @@ def test_chat_completion_streaming_tracing_content_recording_disabled(self, **kw sdk.models.SystemMessage(content="You are a helpful assistant."), sdk.models.UserMessage(content="What is the capital of France?"), ], - stream=True + stream=True, ) response_content = "" for update in response: @@ -1061,25 +1065,27 @@ def test_chat_completion_streaming_tracing_content_recording_disabled(self, **kw spans = exporter.get_spans_by_name("chat") assert len(spans) == 1 span = spans[0] - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(span, expected_attributes) assert attributes_match == True expected_events = [ { - 'name': 'gen_ai.choice', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"finish_reason": "stop", "index": 0}' - } + "name": "gen_ai.choice", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"finish_reason": "stop", "index": 0}', + }, } ] events_match = GenAiTraceVerifier().check_span_events(span, expected_events) @@ -1103,7 +1109,7 @@ def test_chat_completion_streaming_tracing_content_recording_enabled(self, **kwa sdk.models.SystemMessage(content="You are a helpful assistant."), sdk.models.UserMessage(content="What is the capital of France?"), ], - stream=True + stream=True, ) response_content = "" for update in response: @@ -1117,40 +1123,42 @@ def test_chat_completion_streaming_tracing_content_recording_enabled(self, **kwa spans = exporter.get_spans_by_name("chat") assert len(spans) == 1 span = spans[0] - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(span, 
expected_attributes) assert attributes_match == True expected_events = [ { - 'name': 'gen_ai.system.message', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"role": "system", "content": "You are a helpful assistant."}' - } + "name": "gen_ai.system.message", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful assistant."}', + }, }, { - 'name': 'gen_ai.user.message', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"role": "user", "content": "What is the capital of France?"}' - } + "name": "gen_ai.user.message", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"role": "user", "content": "What is the capital of France?"}', + }, }, { - 'name': 'gen_ai.choice', - 'attributes': { - 'gen_ai.system': 'az.ai.inference', - 'gen_ai.event.content': '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}' - } - } + "name": "gen_ai.choice", + "attributes": { + "gen_ai.system": "az.ai.inference", + "gen_ai.event.content": '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}', + }, + }, ] events_match = GenAiTraceVerifier().check_span_events(span, expected_events) assert events_match == True @@ -1165,7 +1173,16 @@ def test_chat_completion_with_function_call_tracing_content_recording_enabled(se except RuntimeError as e: pass import json - from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason, ToolMessage, AssistantMessage, ChatCompletionsToolCall, ChatCompletionsToolDefinition, FunctionDefinition + from azure.ai.inference.models import ( + SystemMessage, + UserMessage, + CompletionsFinishReason, + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) from azure.ai.inference import ChatCompletionsClient self.modify_env_var(CONTENT_TRACING_ENV_VARIABLE, "True") @@ -1197,7 +1214,7 @@ def get_weather(city: str) -> str: }, ) ) - messages=[ + messages = [ sdk.models.SystemMessage(content="You are a helpful assistant."), sdk.models.UserMessage(content="What is the weather in Seattle?"), ] @@ -1225,26 +1242,30 @@ def get_weather(city: str) -> str: if len(spans) == 0: spans = exporter.get_spans_by_name("chat") assert len(spans) == 2 - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('tool_calls',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("tool_calls",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[0], expected_attributes) assert attributes_match == True - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - 
('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[1], expected_attributes) assert attributes_match == True @@ -1254,25 +1275,25 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}" - } + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful assistant."}', + }, }, { "name": "gen_ai.user.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"user\", \"content\": \"What is the weather in Seattle?\"}" - } + "gen_ai.event.content": '{"role": "user", "content": "What is the weather in Seattle?"}', + }, }, { "name": "gen_ai.choice", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"message\": {\"content\": \"\", \"tool_calls\": [{\"function\": {\"arguments\": \"{\\\"city\\\":\\\"Seattle\\\"}\", \"call_id\": null, \"name\": \"get_weather\"}, \"id\": \"*\", \"type\": \"function\"}]}, \"finish_reason\": \"tool_calls\", \"index\": 0}" - } - } + "gen_ai.event.content": '{"message": {"content": "", "tool_calls": [{"function": {"arguments": "{\\"city\\":\\"Seattle\\"}", "call_id": null, "name": "get_weather"}, "id": "*", "type": "function"}]}, "finish_reason": "tool_calls", "index": 0}', + }, + }, ] events_match = GenAiTraceVerifier().check_span_events(spans[0], expected_events) assert events_match == True @@ -1283,43 +1304,43 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}" - } + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful assistant."}', + }, }, { "name": "gen_ai.user.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"user\", \"content\": \"What is the weather in Seattle?\"}" - } + "gen_ai.event.content": '{"role": "user", "content": "What is the weather in Seattle?"}', + }, }, { "name": "gen_ai.assistant.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"assistant\", \"tool_calls\": [{\"function\": {\"arguments\": \"{\\\"city\\\": \\\"Seattle\\\"}\", \"call_id\": null, \"name\": \"get_weather\"}, \"id\": \"*\", \"type\": \"function\"}]}" - } + "gen_ai.event.content": '{"role": "assistant", "tool_calls": [{"function": {"arguments": "{\\"city\\": \\"Seattle\\"}", "call_id": null, "name": "get_weather"}, "id": "*", "type": "function"}]}', + }, }, { "name": "gen_ai.tool.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"tool\", \"tool_call_id\": \"*\", \"content\": \"Nice weather\"}" - } + "gen_ai.event.content": '{"role": "tool", "tool_call_id": "*", "content": "Nice weather"}', + }, }, { "name": "gen_ai.choice", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": 
"{\"message\": {\"content\": \"*\"}, \"finish_reason\": \"stop\", \"index\": 0}" - } - } - ] - events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) + "gen_ai.event.content": '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}', + }, + }, + ] + events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) assert events_match == True AIInferenceInstrumentor().uninstrument() @@ -1333,7 +1354,16 @@ def test_chat_completion_with_function_call_tracing_content_recording_disabled(s except RuntimeError as e: pass import json - from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason, ToolMessage, AssistantMessage, ChatCompletionsToolCall, ChatCompletionsToolDefinition, FunctionDefinition + from azure.ai.inference.models import ( + SystemMessage, + UserMessage, + CompletionsFinishReason, + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) from azure.ai.inference import ChatCompletionsClient self.modify_env_var(CONTENT_TRACING_ENV_VARIABLE, "False") @@ -1365,7 +1395,7 @@ def get_weather(city: str) -> str: }, ) ) - messages=[ + messages = [ sdk.models.SystemMessage(content="You are a helpful assistant."), sdk.models.UserMessage(content="What is the weather in Seattle?"), ] @@ -1393,26 +1423,30 @@ def get_weather(city: str) -> str: if len(spans) == 0: spans = exporter.get_spans_by_name("chat") assert len(spans) == 2 - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('tool_calls',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("tool_calls",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[0], expected_attributes) assert attributes_match == True - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[1], expected_attributes) assert attributes_match == True @@ -1422,8 +1456,8 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"finish_reason\": \"tool_calls\", \"index\": 0, \"message\": {\"tool_calls\": [{\"function\": {\"call_id\": null}, \"id\": \"*\", \"type\": \"function\"}]}}" - } + 
"gen_ai.event.content": '{"finish_reason": "tool_calls", "index": 0, "message": {"tool_calls": [{"function": {"call_id": null}, "id": "*", "type": "function"}]}}', + }, } ] events_match = GenAiTraceVerifier().check_span_events(spans[0], expected_events) @@ -1435,11 +1469,11 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"finish_reason\": \"stop\", \"index\": 0}" - } + "gen_ai.event.content": '{"finish_reason": "stop", "index": 0}', + }, } - ] - events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) + ] + events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) assert events_match == True AIInferenceInstrumentor().uninstrument() @@ -1453,7 +1487,17 @@ def test_chat_completion_with_function_call_streaming_tracing_content_recording_ except RuntimeError as e: pass import json - from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason, FunctionCall, ToolMessage, AssistantMessage, ChatCompletionsToolCall, ChatCompletionsToolDefinition, FunctionDefinition + from azure.ai.inference.models import ( + SystemMessage, + UserMessage, + CompletionsFinishReason, + FunctionCall, + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) from azure.ai.inference import ChatCompletionsClient self.modify_env_var(CONTENT_TRACING_ENV_VARIABLE, "True") @@ -1485,15 +1529,12 @@ def get_weather(city: str) -> str: }, ) ) - messages=[ + messages = [ sdk.models.SystemMessage(content="You are a helpful AI assistant."), sdk.models.UserMessage(content="What is the weather in Seattle?"), ] - response = client.complete( - messages=messages, - tools=[weather_description], - stream=True) + response = client.complete(messages=messages, tools=[weather_description], stream=True) # At this point we expect a function tool call in the model response tool_call_id: str = "" @@ -1506,17 +1547,13 @@ def get_weather(city: str) -> str: if update.choices[0].delta.tool_calls[0].id is not None: tool_call_id = update.choices[0].delta.tool_calls[0].id function_args += update.choices[0].delta.tool_calls[0].function.arguments or "" - + # Append the previous model response to the chat history messages.append( AssistantMessage( tool_calls=[ ChatCompletionsToolCall( - id=tool_call_id, - function=FunctionCall( - name=function_name, - arguments=function_args - ) + id=tool_call_id, function=FunctionCall(name=function_name, arguments=function_args) ) ] ) @@ -1528,19 +1565,10 @@ def get_weather(city: str) -> str: function_response = callable_func(**function_args_mapping) # Append the function response as a tool message to the chat history - messages.append( - ToolMessage( - tool_call_id=tool_call_id, - content=function_response - ) - ) + messages.append(ToolMessage(tool_call_id=tool_call_id, content=function_response)) # With the additional tools information on hand, get another streaming response from the model - response = client.complete( - messages=messages, - tools=[weather_description], - stream=True - ) + response = client.complete(messages=messages, tools=[weather_description], stream=True) content = "" for update in response: @@ -1551,26 +1579,30 @@ def get_weather(city: str) -> str: if len(spans) == 0: spans = exporter.get_spans_by_name("chat") assert len(spans) == 2 - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - 
('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('tool_calls',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("tool_calls",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[0], expected_attributes) assert attributes_match == True - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[1], expected_attributes) assert attributes_match == True @@ -1580,25 +1612,25 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"}" - } + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful AI assistant."}', + }, }, { "name": "gen_ai.user.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"user\", \"content\": \"What is the weather in Seattle?\"}" - } + "gen_ai.event.content": '{"role": "user", "content": "What is the weather in Seattle?"}', + }, }, { "name": "gen_ai.choice", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"finish_reason\": \"tool_calls\", \"message\": {\"tool_calls\": [{\"id\": \"*\", \"type\": \"function\", \"function\": {\"name\": \"get_weather\", \"arguments\": \"{\\\"city\\\": \\\"Seattle\\\"}\"}}]}, \"index\": 0}" - } - } + "gen_ai.event.content": '{"finish_reason": "tool_calls", "message": {"tool_calls": [{"id": "*", "type": "function", "function": {"name": "get_weather", "arguments": "{\\"city\\": \\"Seattle\\"}"}}]}, "index": 0}', + }, + }, ] events_match = GenAiTraceVerifier().check_span_events(spans[0], expected_events) assert events_match == True @@ -1609,43 +1641,43 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"}" - } + "gen_ai.event.content": '{"role": "system", "content": "You are a helpful AI assistant."}', + }, }, { "name": "gen_ai.user.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"user\", \"content\": \"What is the weather in Seattle?\"}" - } + "gen_ai.event.content": '{"role": "user", "content": "What is the weather in 
Seattle?"}', + }, }, { "name": "gen_ai.assistant.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"assistant\", \"tool_calls\": [{\"id\": \"*\", \"function\": {\"name\": \"get_weather\", \"arguments\": \"{\\\"city\\\": \\\"Seattle\\\"}\"}, \"type\": \"function\"}]}" - } + "gen_ai.event.content": '{"role": "assistant", "tool_calls": [{"id": "*", "function": {"name": "get_weather", "arguments": "{\\"city\\": \\"Seattle\\"}"}, "type": "function"}]}', + }, }, { "name": "gen_ai.tool.message", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"role\": \"tool\", \"tool_call_id\": \"*\", \"content\": \"Nice weather\"}" - } + "gen_ai.event.content": '{"role": "tool", "tool_call_id": "*", "content": "Nice weather"}', + }, }, { "name": "gen_ai.choice", "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"message\": {\"content\": \"*\"}, \"finish_reason\": \"stop\", \"index\": 0}" - } - } - ] - events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) + "gen_ai.event.content": '{"message": {"content": "*"}, "finish_reason": "stop", "index": 0}', + }, + }, + ] + events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) assert events_match == True AIInferenceInstrumentor().uninstrument() @@ -1659,7 +1691,17 @@ def test_chat_completion_with_function_call_streaming_tracing_content_recording_ except RuntimeError as e: pass import json - from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason, FunctionCall, ToolMessage, AssistantMessage, ChatCompletionsToolCall, ChatCompletionsToolDefinition, FunctionDefinition + from azure.ai.inference.models import ( + SystemMessage, + UserMessage, + CompletionsFinishReason, + FunctionCall, + ToolMessage, + AssistantMessage, + ChatCompletionsToolCall, + ChatCompletionsToolDefinition, + FunctionDefinition, + ) from azure.ai.inference import ChatCompletionsClient self.modify_env_var(CONTENT_TRACING_ENV_VARIABLE, "False") @@ -1691,15 +1733,12 @@ def get_weather(city: str) -> str: }, ) ) - messages=[ + messages = [ sdk.models.SystemMessage(content="You are a helpful assistant."), sdk.models.UserMessage(content="What is the weather in Seattle?"), ] - response = client.complete( - messages=messages, - tools=[weather_description], - stream=True) + response = client.complete(messages=messages, tools=[weather_description], stream=True) # At this point we expect a function tool call in the model response tool_call_id: str = "" @@ -1712,17 +1751,13 @@ def get_weather(city: str) -> str: if update.choices[0].delta.tool_calls[0].id is not None: tool_call_id = update.choices[0].delta.tool_calls[0].id function_args += update.choices[0].delta.tool_calls[0].function.arguments or "" - + # Append the previous model response to the chat history messages.append( AssistantMessage( tool_calls=[ ChatCompletionsToolCall( - id=tool_call_id, - function=FunctionCall( - name=function_name, - arguments=function_args - ) + id=tool_call_id, function=FunctionCall(name=function_name, arguments=function_args) ) ] ) @@ -1734,19 +1769,10 @@ def get_weather(city: str) -> str: function_response = callable_func(**function_args_mapping) # Append the function response as a tool message to the chat history - messages.append( - ToolMessage( - tool_call_id=tool_call_id, - content=function_response - ) - ) + messages.append(ToolMessage(tool_call_id=tool_call_id, 
content=function_response)) # With the additional tools information on hand, get another streaming response from the model - response = client.complete( - messages=messages, - tools=[weather_description], - stream=True - ) + response = client.complete(messages=messages, tools=[weather_description], stream=True) content = "" for update in response: @@ -1757,26 +1783,30 @@ def get_weather(city: str) -> str: if len(spans) == 0: spans = exporter.get_spans_by_name("chat") assert len(spans) == 2 - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('tool_calls',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("tool_calls",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[0], expected_attributes) assert attributes_match == True - expected_attributes = [('gen_ai.operation.name', 'chat'), - ('gen_ai.system', 'az.ai.inference'), - ('gen_ai.request.model', 'chat'), - ('server.address', ''), - ('gen_ai.response.id', ''), - ('gen_ai.response.model', 'mistral-large'), - ('gen_ai.usage.input_tokens', '+'), - ('gen_ai.usage.output_tokens', '+'), - ('gen_ai.response.finish_reasons', ('stop',))] + expected_attributes = [ + ("gen_ai.operation.name", "chat"), + ("gen_ai.system", "az.ai.inference"), + ("gen_ai.request.model", "chat"), + ("server.address", ""), + ("gen_ai.response.id", ""), + ("gen_ai.response.model", "mistral-large"), + ("gen_ai.usage.input_tokens", "+"), + ("gen_ai.usage.output_tokens", "+"), + ("gen_ai.response.finish_reasons", ("stop",)), + ] attributes_match = GenAiTraceVerifier().check_span_attributes(spans[1], expected_attributes) assert attributes_match == True @@ -1786,8 +1816,8 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"finish_reason\": \"tool_calls\", \"message\": {\"tool_calls\": [{\"id\": \"*\", \"type\": \"function\"}]}, \"index\": 0}" - } + "gen_ai.event.content": '{"finish_reason": "tool_calls", "message": {"tool_calls": [{"id": "*", "type": "function"}]}, "index": 0}', + }, } ] events_match = GenAiTraceVerifier().check_span_events(spans[0], expected_events) @@ -1799,11 +1829,11 @@ def get_weather(city: str) -> str: "timestamp": "*", "attributes": { "gen_ai.system": "az.ai.inference", - "gen_ai.event.content": "{\"finish_reason\": \"stop\", \"index\": 0}" - } + "gen_ai.event.content": '{"finish_reason": "stop", "index": 0}', + }, } - ] - events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) + ] + events_match = GenAiTraceVerifier().check_span_events(spans[1], expected_events) assert events_match == True - AIInferenceInstrumentor().uninstrument() \ No newline at end of file + AIInferenceInstrumentor().uninstrument() diff --git a/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md b/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md index 8b830841cc34..b7a4e8ba0a25 100644 --- a/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md +++ 
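The assertions above pin down the exact `gen_ai.*` span attributes and events that `AIInferenceInstrumentor` emits for chat completions and tool calls. To reproduce comparable spans outside the test harness, here is a minimal sketch, assuming `opentelemetry-sdk` and `azure-core-tracing-opentelemetry` are installed; the console exporter stands in for the tests' in-memory exporter:

```python
# Minimal sketch, not the test fixture: print Azure AI Inference spans to
# stdout so the gen_ai.* attributes and events can be inspected by eye.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

from azure.ai.inference.tracing import AIInferenceInstrumentor
from azure.core.settings import settings

# Route Azure SDK tracing through OpenTelemetry.
settings.tracing_implementation = "opentelemetry"

# Export every finished span to stdout.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

# After this call, ChatCompletionsClient.complete(...) emits "chat" spans
# carrying the gen_ai.* attributes asserted in the tests above.
AIInferenceInstrumentor().instrument()
```

Whether message contents appear on those spans is still governed by the `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` environment variable, which is exactly what the recording-enabled and recording-disabled test pairs above exercise.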
diff --git a/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md b/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md
index 8b830841cc34..b7a4e8ba0a25 100644
--- a/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md
+++ b/sdk/monitor/azure-monitor-opentelemetry/CHANGELOG.md
@@ -4,6 +4,9 @@
 
 ### Features Added
 
+- Enable Azure AI Inference instrumentation
+  ([38071](https://github.com/Azure/azure-sdk-for-python/pull/38071))
+
 ### Breaking Changes
 
 ### Bugs Fixed
diff --git a/sdk/monitor/azure-monitor-opentelemetry/azure/monitor/opentelemetry/_configure.py b/sdk/monitor/azure-monitor-opentelemetry/azure/monitor/opentelemetry/_configure.py
index 871d9bf4f9e2..1136cb9d96e9 100644
--- a/sdk/monitor/azure-monitor-opentelemetry/azure/monitor/opentelemetry/_configure.py
+++ b/sdk/monitor/azure-monitor-opentelemetry/azure/monitor/opentelemetry/_configure.py
@@ -214,6 +214,7 @@ def _setup_instrumentations(configurations: Dict[str, ConfigurationValue]):
                 lib_name,
                 exc_info=ex,
             )
+    _setup_additional_azure_sdk_instrumentations(configurations)
 
 
 def _send_attach_warning():
@@ -223,3 +224,30 @@ def _send_attach_warning():
             "that telemetry is not being duplicated. This may impact your cost.",
             _DISTRO_DETECTS_ATTACH,
         )
+
+
+def _setup_additional_azure_sdk_instrumentations(configurations: Dict[str, ConfigurationValue]):
+    if _AZURE_SDK_INSTRUMENTATION_NAME not in _ALL_SUPPORTED_INSTRUMENTED_LIBRARIES:
+        return
+
+    if not _is_instrumentation_enabled(configurations, _AZURE_SDK_INSTRUMENTATION_NAME):
+        _logger.debug("Instrumentation skipped for library azure_sdk")
+        return
+
+    try:
+        from azure.ai.inference.tracing import AIInferenceInstrumentor  # pylint: disable=import-error,no-name-in-module
+    except Exception as ex:  # pylint: disable=broad-except
+        _logger.debug(
+            "Failed to import AIInferenceInstrumentor from azure-ai-inference",
+            exc_info=ex,
+        )
+        return
+
+    try:
+        AIInferenceInstrumentor().instrument()
+    except Exception as ex:  # pylint: disable=broad-except
+        _logger.warning(
+            "Exception occurred when instrumenting: %s.",
+            "azure-ai-inference",
+            exc_info=ex,
+        )
diff --git a/sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_ai_inference.py b/sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_ai_inference.py
new file mode 100644
index 000000000000..727e01e36353
--- /dev/null
+++ b/sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_ai_inference.py
@@ -0,0 +1,40 @@
+import os
+
+from azure.ai.inference import ChatCompletionsClient
+from azure.ai.inference.models import UserMessage
+from azure.core.credentials import AzureKeyCredential
+
+from azure.monitor.opentelemetry import configure_azure_monitor
+from opentelemetry import trace
+
+# Set up exporting to Azure Monitor
+configure_azure_monitor()
+
+# Example with Azure AI Inference SDK
+
+try:
+    endpoint = os.environ["AZURE_AI_CHAT_ENDPOINT"]
+    key = os.environ["AZURE_AI_CHAT_KEY"]
+except KeyError:
+    print("Missing environment variable 'AZURE_AI_CHAT_ENDPOINT' or 'AZURE_AI_CHAT_KEY'")
+    print("Set them before running this sample.")
+    exit()
+
+is_content_tracing_enabled = os.environ.get("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "")
+if is_content_tracing_enabled.lower() != "true":
+    print(
+        "Content tracing is disabled. Set 'AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED' to 'true' to record prompts and completions."
+    )
+
+tracer = trace.get_tracer(__name__)
+with tracer.start_as_current_span(name="MyApplication"):
+    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key), model="gpt-4o-mini")
+
+    # Call will be traced
+    response = client.complete(
+        messages=[
+            UserMessage(content="Tell me a joke"),
+        ]
+    )
+
+    print(response.choices[0].message.content)
diff --git a/sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_core.py b/sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_blob_storage.py
similarity index 100%
rename from sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_core.py
rename to sdk/monitor/azure-monitor-opentelemetry/samples/tracing/azure_blob_storage.py
diff --git a/sdk/monitor/azure-monitor-opentelemetry/tests/test_configure.py b/sdk/monitor/azure-monitor-opentelemetry/tests/test_configure.py
index d2f58671b9a8..e63f65d40df6 100644
--- a/sdk/monitor/azure-monitor-opentelemetry/tests/test_configure.py
+++ b/sdk/monitor/azure-monitor-opentelemetry/tests/test_configure.py
@@ -487,6 +487,24 @@ def test_setup_instrumentations_lib_not_supported(
         ep2_mock.load.assert_called_once()
         instrumentor_mock.instrument.assert_called_once()
 
+    @patch("azure.monitor.opentelemetry._configure._setup_additional_azure_sdk_instrumentations")
+    @patch("azure.monitor.opentelemetry._configure._ALL_SUPPORTED_INSTRUMENTED_LIBRARIES", ("azure_sdk",))
+    @patch("azure.monitor.opentelemetry._configure._is_instrumentation_enabled")
+    @patch("azure.monitor.opentelemetry._configure.iter_entry_points")
+    def test_setup_instrumentations_additional_azure(
+        self,
+        iter_mock,
+        enabled_mock,
+        additional_instrumentations_mock,
+    ):
+        ep_mock = Mock()
+        ep_mock.name = "azure_sdk"
+        iter_mock.return_value = (ep_mock,)
+
+        enabled_mock.return_value = True
+        _setup_instrumentations({})
+        additional_instrumentations_mock.assert_called_once()
+
     @patch("azure.monitor.opentelemetry._configure._ALL_SUPPORTED_INSTRUMENTED_LIBRARIES", ("test_instr"))
     @patch("azure.monitor.opentelemetry._configure._is_instrumentation_enabled")
     @patch("azure.monitor.opentelemetry._configure._logger")
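The `test_setup_instrumentations_additional_azure` case only verifies that the new hook is invoked; the enable/disable decision itself flows through `_is_instrumentation_enabled`, so the AI Inference instrumentor follows the same opt-out as the rest of the `azure_sdk` instrumentation. A minimal sketch of that opt-out from application code, assuming the distro's `instrumentation_options` configuration (the connection string is a placeholder):

```python
from azure.monitor.opentelemetry import configure_azure_monitor

# Disabling the azure_sdk instrumentation also makes
# _setup_additional_azure_sdk_instrumentations return early, so no
# Azure AI Inference spans are produced. Assumption: the
# instrumentation_options dict feeds the _is_instrumentation_enabled
# check shown in _configure.py above.
configure_azure_monitor(
    connection_string="InstrumentationKey=00000000-0000-0000-0000-000000000000",  # placeholder
    instrumentation_options={"azure_sdk": {"enabled": False}},
)
```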