feat: Add OTel Gen AI semantic convention input / output messages #1666
brightsparc wants to merge 46 commits into pydantic:main
- Add TypedDicts in semconv.py (TextPart, ToolCallPart, ChatMessage, etc.)
- Add convert_anthropic_messages_to_semconv() + _convert_anthropic_content_part()
- Add convert_chat_completions_to_semconv() + helpers
- Add convert_responses_inputs_to_semconv()
- Set gen_ai.input.messages and gen_ai.system_instructions on spans

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add OutputMessage TypedDict in semconv.py
- Add convert_anthropic_response_to_semconv() for Anthropic
- Add convert_openai_response_to_semconv() for OpenAI chat completions
- Add convert_responses_outputs_to_semconv() for OpenAI Responses API
- Set gen_ai.output.messages on spans in on_response and stream state

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Anthropic tests still need manual updates due to inline-snapshot issues. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Combines both branches to add semantic convention attributes for:
- Input messages (gen_ai.input.messages)
- System instructions (gen_ai.system_instructions)
- Output messages (gen_ai.output.messages)

Resolves merge conflicts in:
- semconv.py: Combined InputMessages, SystemInstructions, OutputMessage types
- openai.py: Combined imports from both branches
- anthropic.py: Kept both input and output conversion functions
- test_anthropic.py: Combined expected snapshots with both input and output
- test_anthropic_bedrock.py: Combined expected snapshots

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Pull request overview
Adds OpenTelemetry Gen AI semantic convention message attributes to LLM provider instrumentation, including conversion of provider-specific message formats into standardized semconv messages/parts structures.
Changes:
- Introduces typed semconv definitions for message parts/messages and new attribute constants for input/output/system instructions.
- Updates OpenAI + Anthropic integrations to emit `gen_ai.input.messages`, `gen_ai.output.messages`, and `gen_ai.system_instructions` where applicable.
- Expands and updates OTel integration tests to validate semconv message conversion across text/tool/image and edge cases.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| logfire/_internal/integrations/llm_providers/semconv.py | Adds message attribute constants and TypedDict-based semconv message/part type definitions. |
| logfire/_internal/integrations/llm_providers/openai.py | Emits semconv input/output/system instruction attributes and adds conversion helpers for Chat Completions + Responses API. |
| logfire/_internal/integrations/llm_providers/anthropic.py | Emits semconv input/output/system instruction attributes and adds conversion helpers for Anthropic message formats. |
| tests/otel_integrations/test_openai.py | Updates snapshots and adds coverage for semconv conversion across tool calls/images/empty inputs/None finish_reason. |
| tests/otel_integrations/test_anthropic.py | Updates snapshots and adds new tests for missing system, missing content, and list/multi-part content conversion. |
| tests/otel_integrations/test_anthropic_bedrock.py | Updates mocked responses and snapshots to include semconv message attributes and finish reason. |
The `events` attribute was serialized to a JSON string by `prepare_otlp_attribute`, so the `isinstance(events_attr, list)` check always failed and input events were silently overwritten by output events. Parse the JSON string back to a list when reading existing events. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
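The failure mode in this commit message can be sketched as follows. This is a minimal illustration, not the code from the PR: `read_existing_events` is a hypothetical helper name, and the event dictionaries are made-up examples. It assumes only what the message states, namely that the stored `events` value may already be a JSON string rather than a list.

```python
import json
from typing import Any


def read_existing_events(attributes: dict[str, Any]) -> list[Any]:
    """Hypothetical helper: return stored events whether they are a list or a JSON string."""
    events_attr = attributes.get('events', [])
    if isinstance(events_attr, str):
        # The value was already serialized, so decode it before extending it.
        try:
            events_attr = json.loads(events_attr)
        except json.JSONDecodeError:
            return []
    # Anything that is still not a list is treated as "no existing events".
    return events_attr if isinstance(events_attr, list) else []


# Input events were stored as a JSON string (illustrative payload).
attributes = {'events': json.dumps([{'event.name': 'gen_ai.user.message'}])}

# Output events are appended instead of replacing the input events.
combined = read_existing_events(attributes) + [{'event.name': 'gen_ai.choice'}]
```

With a plain `isinstance(events_attr, list)` check and no decoding step, the serialized string would be discarded and only the output event would survive.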
The non-streaming Responses API tests now correctly show both input and output events in the backward-compat 'events' attribute. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolve conflicts between semconv message attributes (INPUT_MESSAGES, OUTPUT_MESSAGES, SYSTEM_INSTRUCTIONS) and upstream's new semconv attributes (INPUT_TOKENS, OUTPUT_TOKENS, RESPONSE_FINISH_REASONS, RESPONSE_ID, RESPONSE_MODEL). Both sets of attributes are now included. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These paths handle message types (multimodal list content, tool calls in input messages, tool responses, named messages) that are not exercised by the existing VCR-based integration tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@alexmojaki ready for final review from you.
Adds a `version` parameter that accepts `1`, `2`, or `[1, 2]`:
- version=1 (default): legacy request_data/response_data format
- version=2: structured OTel Gen AI semconv messages (gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions)
- version=[1, 2]: both formats simultaneously for migration/testing

Built on top of PR #1666 (julian/semconvs-message).
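One way the gating described in this commit could work is sketched below. The function names `normalize_versions` and `build_attributes` are hypothetical and not taken from the PR; only the accepted values (`1`, `2`, `[1, 2]`) and the attribute names come from the commit message.

```python
from typing import Any, Union


def normalize_versions(version: Union[int, list[int]] = 1) -> frozenset[int]:
    """Hypothetical helper: turn `1`, `2`, or `[1, 2]` into a validated set of versions."""
    versions = frozenset([version] if isinstance(version, int) else version)
    if not versions or versions - {1, 2}:
        raise ValueError(f'unsupported version: {version!r}')
    return versions


def build_attributes(
    request_data: dict[str, Any],
    messages: list[Any],
    version: Union[int, list[int]] = 1,
) -> dict[str, Any]:
    """Hypothetical helper: emit legacy and/or semconv attributes depending on `version`."""
    versions = normalize_versions(version)
    attributes: dict[str, Any] = {}
    if 1 in versions:
        # Legacy format.
        attributes['request_data'] = request_data
    if 2 in versions:
        # Structured semconv messages.
        attributes['gen_ai.input.messages'] = messages
    return attributes
```

Calling `build_attributes(data, messages, version=[1, 2])` would then produce both keys, which matches the migration/testing use case mentioned in the commit.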
Only include media_type key when the value is actually present, respecting the NotRequired[str] type contract of BlobPart.
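A small sketch of what this commit describes, under stated assumptions: the `media_type: NotRequired[str]` field comes from the commit message, while the other `BlobPart` fields and the `make_blob_part` helper are illustrative guesses rather than the PR's actual definitions.

```python
from typing import Literal, Optional, TypedDict

try:
    from typing import NotRequired  # Python 3.11+
except ImportError:
    from typing_extensions import NotRequired


class BlobPart(TypedDict):
    # `type` and `content` are assumed fields for illustration.
    type: Literal['blob']
    content: str
    media_type: NotRequired[str]


def make_blob_part(content: str, media_type: Optional[str] = None) -> BlobPart:
    """Hypothetical helper: build a BlobPart, omitting `media_type` when it is unknown."""
    part: BlobPart = {'type': 'blob', 'content': content}
    if media_type:
        # Only add the key when there is a real value, so it is never None.
        part['media_type'] = media_type
    return part
```

Writing `'media_type': None` into the dict would contradict the `NotRequired[str]` annotation, since the key may be absent but must be a `str` when present.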
Acknowledged, I will review this.
Version 'latest' signals that the semconv format is still unstable and may change between releases. Numbered versions (2, etc.) will be added later once the format is finalized.
The Anthropic SDK added inference_geo to the Usage model, causing test_request_parameters to fail due to a snapshot mismatch.
…butes Make consistent with all other get_attributes implementations by operating on a shallow copy instead of mutating the input dict.
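The difference between mutating the input and working on a shallow copy can be illustrated like this. The function below is a stand-in with an assumed signature, not the `get_attributes` implementation from the PR.

```python
from typing import Any


def get_attributes(span_data: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for a get_attributes method: add keys to a copy, not to the caller's dict."""
    result = dict(span_data)  # shallow copy; top-level keys of the input stay untouched
    result['gen_ai.output.messages'] = [
        {'role': 'assistant', 'parts': [{'type': 'text', 'content': 'Hello'}]}
    ]
    return result


original = {'request_data': {'model': 'example-model'}}
updated = get_attributes(original)
```

Here `original` still has only its `request_data` key after the call. Because the copy is shallow, nested values such as the `request_data` dict are still shared between both dictionaries.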
Update all OpenAI/Anthropic/Bedrock test fixtures and inline instrumentation calls to use version=[1, 'latest'] so semconv code paths get coverage by default. Replace redundant *_version_both tests with minimal *_version_v1_only tests that assert v1-only mode still works.
- Add OUTPUT_MESSAGES handling to OpenaiCompletionStreamState.get_attributes for 'latest' version (streaming text completions were silently dropping output content)
- Fix potential TypeError when inputs_to_events returns None by using `or []` instead of default parameter in dict.get()
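The second fix relies on a general property of `dict.get()`: the default is used only when the key is missing, not when the key is present with a value of `None`. A minimal sketch, with an illustrative `span_data` dict that is not taken from the PR:

```python
# The key exists but holds None, as it would if a converter returned None.
span_data = {'events': None}

# The default is ignored because the key is present, so this is still None.
with_default = span_data.get('events', [])

# `or []` also replaces a stored None (or any other falsy value) with a list.
with_fallback = span_data.get('events') or []

# Safe to extend; doing the same with `with_default` would raise TypeError.
all_events = with_fallback + [{'event.name': 'gen_ai.choice'}]
```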
…to julian/semconvs-message

Conflicts:
- logfire/_internal/integrations/llm_providers/openai.py
- logfire/db_api.py
- tests/otel_integrations/test_anthropic.py
- tests/otel_integrations/test_anthropic_bedrock.py
- tests/otel_integrations/test_openai.py
These branches gate v1 vs latest attribute emission and are always true in tests since fixtures use version=[1, 'latest']. The false paths (skipping one version) are trivial no-ops, and the gating logic is already verified by dedicated version_latest and version_v1_only tests.
Closing in favor of #1705 because I'm having a hard time dealing with the fact that this PR is from a different fork and main is being updated in parallel. Having it in the same repo I think will make things simpler.
Adds OpenTelemetry Gen AI Semantic Convention message attributes including:
- `gen_ai.system_instructions`
- `gen_ai.input.messages`
- `gen_ai.output.messages`

Converts message formats (text, tool_use, tool_result, images) to the standardized semconv message part format.
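As a rough illustration of the conversion described above, the sketch below maps an Anthropic-style message to a role-plus-parts structure. It is not the PR's code: `convert_message` is a hypothetical name, the part shapes (`text`, `tool_call`, `tool_call_response`) are an assumption about the semconv message format, and image handling is left out for brevity.

```python
from typing import Any, Literal, TypedDict, Union


class TextPart(TypedDict):
    type: Literal['text']
    content: str


class ToolCallPart(TypedDict):
    type: Literal['tool_call']
    id: str
    name: str
    arguments: Any


class ToolCallResponsePart(TypedDict):
    type: Literal['tool_call_response']
    id: str
    response: Any


MessagePart = Union[TextPart, ToolCallPart, ToolCallResponsePart]


class ChatMessage(TypedDict):
    role: str
    parts: list[MessagePart]


def convert_message(message: dict[str, Any]) -> ChatMessage:
    """Hypothetical converter from an Anthropic-style message to a semconv-style one."""
    content = message.get('content')
    parts: list[MessagePart] = []
    if isinstance(content, str):
        # Plain string content becomes a single text part.
        parts.append({'type': 'text', 'content': content})
    else:
        for block in content or []:
            kind = block.get('type')
            if kind == 'text':
                parts.append({'type': 'text', 'content': block['text']})
            elif kind == 'tool_use':
                parts.append(
                    {'type': 'tool_call', 'id': block['id'], 'name': block['name'], 'arguments': block['input']}
                )
            elif kind == 'tool_result':
                parts.append(
                    {'type': 'tool_call_response', 'id': block['tool_use_id'], 'response': block.get('content')}
                )
            # Other block types (for example images) are skipped in this sketch.
    return {'role': message['role'], 'parts': parts}
```

For example, `convert_message({'role': 'user', 'content': 'hello'})` yields a message with one text part, and a missing `content` key yields an empty `parts` list instead of raising.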