feat: Add OTel Gen AI semantic convention input / output messages #1666

Closed
brightsparc wants to merge 46 commits into pydantic:main from brightsparc:julian/semconvs-message

@brightsparc
Contributor

Adds OpenTelemetry Gen AI Semantic Convention message attributes including:

gen_ai.system_instructions
gen_ai.input.messages
gen_ai.output.messages

Converts message formats (text, tool_use, tool_result, images) to the standardized semconv message part format.
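
The conversion described above could be sketched roughly as follows. This is not the PR's implementation: `to_semconv_part` and `to_semconv_message` are hypothetical names, the input shapes follow Anthropic-style content blocks, and the output keys (`content`, `arguments`, `response`, `parts`) are assumptions based on one reading of the Gen AI semantic conventions. Image handling is left out to keep the sketch short.

```python
from typing import Any


def to_semconv_part(block: dict[str, Any]) -> dict[str, Any]:
    """Map one Anthropic-style content block to a semconv-style part (hypothetical helper)."""
    kind = block.get("type")
    if kind == "text":
        return {"type": "text", "content": block.get("text", "")}
    if kind == "tool_use":
        return {
            "type": "tool_call",
            "id": block.get("id"),
            "name": block.get("name"),
            "arguments": block.get("input"),
        }
    if kind == "tool_result":
        return {
            "type": "tool_call_response",
            "id": block.get("tool_use_id"),
            "response": block.get("content"),
        }
    # Unknown block types are copied through unchanged rather than dropped.
    return dict(block)


def to_semconv_message(message: dict[str, Any]) -> dict[str, Any]:
    """Convert a provider message with string or list content into role + parts."""
    content = message.get("content")
    if isinstance(content, str):
        parts = [{"type": "text", "content": content}]
    else:
        # A missing or None content field yields an empty parts list.
        parts = [to_semconv_part(block) for block in content or []]
    return {"role": message.get("role", "user"), "parts": parts}
```

A list of such messages would then be the kind of value attached under `gen_ai.input.messages` or `gen_ai.output.messages`.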

brightsparc and others added 14 commits January 24, 2026 12:57
- Add TypedDicts in semconv.py (TextPart, ToolCallPart, ChatMessage, etc.)
- Add convert_anthropic_messages_to_semconv() + _convert_anthropic_content_part()
- Add convert_chat_completions_to_semconv() + helpers
- Add convert_responses_inputs_to_semconv()
- Set gen_ai.input.messages and gen_ai.system_instructions on spans

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add OutputMessage TypedDict in semconv.py
- Add convert_anthropic_response_to_semconv() for Anthropic
- Add convert_openai_response_to_semconv() for OpenAI chat completions
- Add convert_responses_outputs_to_semconv() for OpenAI Responses API
- Set gen_ai.output.messages on spans in on_response and stream state

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Anthropic tests still need manual updates due to inline-snapshot issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Combines both branches to add semantic convention attributes for:
- Input messages (gen_ai.input.messages)
- System instructions (gen_ai.system_instructions)
- Output messages (gen_ai.output.messages)

Resolves merge conflicts in:
- semconv.py: Combined InputMessages, SystemInstructions, OutputMessage types
- openai.py: Combined imports from both branches
- anthropic.py: Kept both input and output conversion functions
- test_anthropic.py: Combined expected snapshots with both input and output
- test_anthropic_bedrock.py: Combined expected snapshots

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@codecov

codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Contributor

Copilot AI left a comment

Pull request overview

Adds OpenTelemetry Gen AI semantic convention message attributes to LLM provider instrumentation, including conversion of provider-specific message formats into standardized semconv messages/parts structures.

Changes:

  • Introduces typed semconv definitions for message parts/messages and new attribute constants for input/output/system instructions.
  • Updates OpenAI + Anthropic integrations to emit gen_ai.input.messages, gen_ai.output.messages, and gen_ai.system_instructions where applicable.
  • Expands and updates OTel integration tests to validate semconv message conversion across text/tool/image and edge cases.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `logfire/_internal/integrations/llm_providers/semconv.py` | Adds message attribute constants and TypedDict-based semconv message/part type definitions. |
| `logfire/_internal/integrations/llm_providers/openai.py` | Emits semconv input/output/system instruction attributes and adds conversion helpers for Chat Completions + Responses API. |
| `logfire/_internal/integrations/llm_providers/anthropic.py` | Emits semconv input/output/system instruction attributes and adds conversion helpers for Anthropic message formats. |
| `tests/otel_integrations/test_openai.py` | Updates snapshots and adds coverage for semconv conversion across tool calls/images/empty inputs/None finish_reason. |
| `tests/otel_integrations/test_anthropic.py` | Updates snapshots and adds new tests for missing system, missing content, and list/multi-part content conversion. |
| `tests/otel_integrations/test_anthropic_bedrock.py` | Updates mocked responses and snapshots to include semconv message attributes and finish reason. |

brightsparc and others added 6 commits February 3, 2026 12:48
The `events` attribute was serialized to a JSON string by
`prepare_otlp_attribute`, so the `isinstance(events_attr, list)` check
always failed and input events were silently overwritten by output events.
Parse the JSON string back to a list when reading existing events.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
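
The `events` fix in the commit message above (parse the JSON string back to a list before appending) might look something like this minimal sketch. `read_existing_events` is a hypothetical helper, not a function from the PR, and the attribute key `events` is the only name taken from the commit message.

```python
import json
from typing import Any


def read_existing_events(attributes: dict[str, Any]) -> list[Any]:
    """Return previously stored events, whether kept as a list or as a JSON string."""
    raw = attributes.get("events")
    if isinstance(raw, list):
        return list(raw)
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            return []
        # Only accept a JSON array; anything else is treated as "no events".
        return parsed if isinstance(parsed, list) else []
    return []


def merge_events(attributes: dict[str, Any], new_events: list[Any]) -> list[Any]:
    """Append output events to the existing input events instead of overwriting them."""
    return read_existing_events(attributes) + new_events
```

With a plain `isinstance(events_attr, list)` check, the string case falls through and the input events are lost; handling both shapes avoids that.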
The non-streaming Responses API tests now correctly show both input
and output events in the backward-compat 'events' attribute.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolve conflicts between semconv message attributes (INPUT_MESSAGES,
OUTPUT_MESSAGES, SYSTEM_INSTRUCTIONS) and upstream's new semconv
attributes (INPUT_TOKENS, OUTPUT_TOKENS, RESPONSE_FINISH_REASONS,
RESPONSE_ID, RESPONSE_MODEL). Both sets of attributes are now included.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These paths handle message types (multimodal list content, tool calls
in input messages, tool responses, named messages) that are not
exercised by the existing VCR-based integration tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@brightsparc
Contributor Author

@alexmojaki ready for final review from you.

dmontagu added a commit that referenced this pull request Feb 10, 2026
Adds a `version` parameter that accepts `1`, `2`, or `[1, 2]`:
- version=1 (default): legacy request_data/response_data format
- version=2: structured OTel Gen AI semconv messages (gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions)
- version=[1, 2]: both formats simultaneously for migration/testing

Built on top of PR #1666 (julian/semconvs-message).
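
One way a `version` argument that accepts `1`, `2`, or `[1, 2]` could be normalized is sketched below. `normalize_versions` and `SUPPORTED_VERSIONS` are hypothetical names for illustration; the referenced commit's actual handling is not shown in this thread.

```python
from typing import FrozenSet, Iterable, Union

SUPPORTED_VERSIONS: FrozenSet[int] = frozenset({1, 2})


def normalize_versions(version: Union[int, Iterable[int]] = 1) -> FrozenSet[int]:
    """Turn a single version or a collection of versions into a validated set."""
    if isinstance(version, int):
        versions = frozenset([version])
    else:
        versions = frozenset(version)
    unknown = versions - SUPPORTED_VERSIONS
    if not versions or unknown:
        raise ValueError(f"unsupported version(s): {sorted(unknown)}")
    return versions


def emits_legacy(versions: FrozenSet[int]) -> bool:
    """True when the legacy request_data/response_data format should be written."""
    return 1 in versions


def emits_semconv(versions: FrozenSet[int]) -> bool:
    """True when the structured semconv message attributes should be written."""
    return 2 in versions
```

Normalizing once up front keeps the emitting code down to simple membership checks, which is convenient when both formats are written side by side during a migration.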
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 6 new potential issues.

View 11 additional findings in Devin Review.

@adityaidev

Acknowledged, I will review this.

Version 'latest' signals that the semconv format is still unstable and
may change between releases. Numbered versions (2, etc.) will be added
later once the format is finalized.
Only include media_type key when the value is actually present,
respecting the NotRequired[str] type contract of BlobPart.
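
A minimal sketch of the `NotRequired[str]` contract mentioned above, under some assumptions: only the `media_type` key and the `BlobPart` name come from the commit message, while the `type` and `content` fields and the `make_blob_part` helper are hypothetical, and "present" is taken here to mean "not `None`".

```python
from typing import Optional

try:  # NotRequired is in the standard library from Python 3.11
    from typing import NotRequired, TypedDict
except ImportError:  # older interpreters need the typing_extensions backport
    from typing_extensions import NotRequired, TypedDict


class BlobPart(TypedDict):
    type: str
    content: str
    media_type: NotRequired[str]  # the key may be absent, but never None


def make_blob_part(content: str, media_type: Optional[str] = None) -> BlobPart:
    """Build a blob part, omitting media_type entirely when it is unknown."""
    part: BlobPart = {"type": "blob", "content": content}
    if media_type is not None:
        part["media_type"] = media_type
    return part
```

Writing `{"media_type": None}` would violate the declared type, since `NotRequired[str]` allows a missing key but not a `None` value.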
Version 'latest' signals that the semconv format is still unstable and
may change between releases. Numbered versions (2, etc.) will be added
later once the format is finalized.
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 14 additional findings in Devin Review.

The Anthropic SDK added inference_geo to the Usage model, causing
test_request_parameters to fail due to a snapshot mismatch.
…butes

Make consistent with all other get_attributes implementations by
operating on a shallow copy instead of mutating the input dict.
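
The shallow-copy pattern described above can be illustrated with a short sketch. `add_output_messages` is a hypothetical stand-in, not one of the PR's `get_attributes` implementations; only the attribute name `gen_ai.output.messages` comes from this thread.

```python
from typing import Any


def add_output_messages(span_data: dict[str, Any], output_messages: list[Any]) -> dict[str, Any]:
    """Return a new attribute dict with output messages added, leaving the input untouched."""
    result = dict(span_data)  # shallow copy: new top-level dict, same nested objects
    result["gen_ai.output.messages"] = output_messages
    return result
```

Because the copy is shallow, nested values are still shared with the caller's dict; that is enough when the function only adds or replaces top-level keys.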
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 4 new potential issues.

View 17 additional findings in Devin Review.

Update all OpenAI/Anthropic/Bedrock test fixtures and inline
instrumentation calls to use version=[1, 'latest'] so semconv code
paths get coverage by default. Replace redundant *_version_both tests
with minimal *_version_v1_only tests that assert v1-only mode still
works.
- Add OUTPUT_MESSAGES handling to OpenaiCompletionStreamState.get_attributes
  for 'latest' version (streaming text completions were silently dropping
  output content)
- Fix potential TypeError when inputs_to_events returns None by using
  `or []` instead of default parameter in dict.get()
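
The `or []` point in the last bullet rests on a general Python detail: the default passed to `dict.get()` is used only when the key is missing, not when the key is present with a value of `None`. A small sketch with hypothetical helper names:

```python
from typing import Any, Optional


def events_with_default(attributes: dict[str, Any]) -> Optional[list[Any]]:
    """The default only applies when the key is absent, so a stored None leaks through."""
    return attributes.get("events", [])


def events_with_or(attributes: dict[str, Any]) -> list[Any]:
    """`or []` also replaces a stored None (and any other falsy value) with an empty list."""
    return attributes.get("events") or []
```

Concatenating or iterating over the first result raises `TypeError` when the stored value is `None`, while the second form always yields a list.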
…to julian/semconvs-message

# Conflicts:
#	logfire/_internal/integrations/llm_providers/openai.py
#	logfire/db_api.py
#	tests/otel_integrations/test_anthropic.py
#	tests/otel_integrations/test_anthropic_bedrock.py
#	tests/otel_integrations/test_openai.py
These branches gate v1 vs latest attribute emission and are always
true in tests since fixtures use version=[1, 'latest']. The false
paths (skipping one version) are trivial no-ops, and the gating
logic is already verified by dedicated version_latest and
version_v1_only tests.
@dmontagu
Contributor

Closing in favor of #1705 because I'm having a hard time dealing with the fact that this PR is from a different fork while main is being updated in parallel. I think having it in the same repo will make things simpler.

@dmontagu dmontagu closed this Feb 11, 2026