Skip to content

Use MEAI for AI telemetry#11237

Merged
eerhardt merged 10 commits intomainfrom
jamesnk/ai-telemetry
Sep 5, 2025
Merged

Use MEAI for AI telemetry#11237
eerhardt merged 10 commits intomainfrom
jamesnk/ai-telemetry

Conversation

@JamesNK
Copy link
Member

@JamesNK JamesNK commented Sep 3, 2025

Description

Aspire chat integrations enable telemetry. Each integration enables telemetry in the respective implementation. This PR changes that to always use MEAI for telemetry.

The reason is implementations aren't keeping up with the GenAI semantic convention spec:

The reality is that only MEAI is keeping up with GenAI semantic convention spec.

I don't want to enable both MEAI and OpenAI/Azure because then you'll see duplicate spans in tracing and duplicate counters in metrics.

The downside is there could be some implementation specific telemetry that is missed out on by only recording MEAI. That could be solved longer term by MEAI folks could reach out to implementations and agree on a way for them to add tags to the GenAI span, even if they didn't create it.

For example, Azure has azure.resource_provider.namespace that could be added to the active GenAI span. Azure could check to see if there is a current activity, and check if it is for GenAI (e.g. has gen_ai.system tag), and then add implementation specific metadata.

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?

Copilot AI review requested due to automatic review settings September 3, 2025 10:27
@github-actions
Copy link
Contributor

github-actions bot commented Sep 3, 2025

🚀 Dogfood this PR with:

curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 11237

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 11237"

@github-actions github-actions bot added the area-integrations Issues pertaining to Aspire Integrations packages label Sep 3, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the Aspire OpenAI, Azure OpenAI, and Azure AI Inference components to use Microsoft.Extensions.AI (MEAI) for AI telemetry instead of the previous experimental OpenAI telemetry implementation. The change introduces a new EnableSensitiveTelemetryData configuration option and updates telemetry source names to work with both experimental and stable MEAI versions.

Key changes:

  • Adds EnableSensitiveTelemetryData configuration option across all AI components
  • Updates telemetry source names from "OpenAI.*" to "Experimental.Microsoft.Extensions.AI" and "Microsoft.Extensions.AI"
  • Removes experimental telemetry configuration logic and simplifies settings properties

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/Components/Aspire.OpenAI/OpenAISettings.cs Removes experimental telemetry logic, adds EnableSensitiveTelemetryData property
src/Components/Aspire.OpenAI/AspireOpenAIExtensions.cs Updates telemetry source names to MEAI sources
src/Components/Aspire.OpenAI/AspireOpenAIClientBuilder.cs Adds EnableSensitiveTelemetryData parameter to constructor
src/Components/Aspire.OpenAI/AspireOpenAIClientBuilderChatClientExtensions.cs Configures EnableSensitiveData on OpenTelemetryChatClient
src/Components/Aspire.OpenAI/AspireOpenAIClientBuilderEmbeddingGeneratorExtensions.cs Configures EnableSensitiveData on OpenTelemetryEmbeddingGenerator
src/Components/Aspire.Azure.AI.OpenAI/AzureOpenAISettings.cs Similar changes to OpenAI settings with MEAI telemetry
src/Components/Aspire.Azure.AI.OpenAI/AspireAzureOpenAIExtensions.cs Updates to use MEAI telemetry sources
src/Components/Aspire.Azure.AI.Inference/ChatCompletionsClientSettings.cs Adds EnableSensitiveTelemetryData and removes experimental logic
src/Components/Aspire.Azure.AI.Inference/AspireAzureAIInferenceExtensions.cs Updates telemetry sources and configures sensitive data handling
tests/Aspire.OpenAI.Tests/OpenAIPublicApiTests.cs Updates test constructor calls to include new parameter
Configuration schema files Adds EnableSensitiveTelemetryData property definitions

string? deploymentName,
bool disableTracing)
bool disableTracing,
bool enableSensitiveTelemetryData)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor breaking change but I doubt folks are using this class directly. Probably not worth providing a constructor overload?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add an overload and avoid breaking this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, this is prerelease . Carry on

Copy link
Member Author

@JamesNK JamesNK Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do the builders use constructors? There will always be changes to types like these. For example, this PR. And literally every change to this type is going to be breaking.

Wouldn't init properties be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely

{
builder.Services.AddOpenTelemetry()
.WithTracing(traceBuilder => traceBuilder.AddSource("OpenAI.*"));
.WithTracing(traceBuilder => traceBuilder.AddSource(telemetrySources));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused at what layers of telemetry are the "right ones" to listen to.

Developers can use Aspire.OpenAI and not talk through Microsoft.Extensions.AI. They can grab the OpenAIClient from DI and use it directly. If they did that, they wouldn't get any telemetry right? Since we would only enable MEAI telemetry and MEAI isn't being invoked.

cc @stephentoub @lmolkova

Copy link
Member Author

@JamesNK JamesNK Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we would only enable MEAI telemetry and MEAI isn't being invoked.

Correct. I think the solution is to clarify in the docs that telemetry is only available when OpenAI client is wrapped in Microsoft.Extensions.AI client.

I think the best approach is to standardize on MEAI as the provider of AI telemetry. The reality is that only MEAI is keeping up with GenAI semantic convention spec.

I don't want to enable both MEAI and OpenAI/Azure because then you'll see duplicate spans in tracing and duplicate counters in metrics.

The downside is there could be some implementation specific telemetry that is missed out on by only recording MEAI. That could be solved longer term by MEAI folks could reach out to implementations and agree on a way for them to add tags to the GenAI span, even if they didn't create it.

For example, Azure has azure.resource_provider.namespace that could be added to the active GenAI span. Azure could check to see if there is a current activity, and check if it is for GenAI (e.g. has gen_ai.system tag), and then add implementation specific metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense if those libraries didn't have their own telemetry at all? The current state of the world is rather confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth considering. Another issue is what happens if implementors have APIs that aren't exposed by MEAI. They might need their own telemetry then. Out of scope of this PR.

I think at least for Aspire integration we can take an opiniated stance that our GenAI telemetry settings are related to MEAI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets start there an open discussions with the right teams and people.

@eerhardt eerhardt mentioned this pull request Sep 3, 2025
18 tasks
@JamesNK JamesNK requested a review from stephentoub September 4, 2025 03:29
@JamesNK
Copy link
Member Author

JamesNK commented Sep 4, 2025

This is ready to review. It is a blocker for #11227 so should be looked at promptly.

@eerhardt eerhardt enabled auto-merge (squash) September 5, 2025 17:07
@eerhardt eerhardt merged commit 2365bd5 into main Sep 5, 2025
305 checks passed
@eerhardt eerhardt deleted the jamesnk/ai-telemetry branch September 5, 2025 17:55
@dotnet-policy-service dotnet-policy-service bot added this to the 9.5 milestone Sep 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-integrations Issues pertaining to Aspire Integrations packages

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants