
Conversation


galkleinman (Contributor) commented on Nov 25, 2025

Important

Enhance OpenAI API instrumentation by adding request attribute reporting and tests for capturing request parameters.

  • Behavior:
    • Add _set_request_attributes to responses_get_or_create_wrapper, async_responses_get_or_create_wrapper, and responses_cancel_wrapper to report request attributes.
    • Introduce prepare_kwargs_for_shared_attributes() to map max_output_tokens to max_tokens (see the sketch after this list).
  • Functions:
    • Add _extract_model_name_from_provider_format and _set_request_attributes to responses_wrappers.py.
  • Tests:
    • Add test_responses_with_request_params in test_responses.py to verify capturing of request parameters like temperature, max_tokens, and top_p.
    • Add test_responses_with_request_params.yaml cassette for VCR testing.
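
As a rough illustration of the remapping mentioned above: a minimal sketch, assuming max_output_tokens is the only parameter renamed; the actual helper lives in responses_wrappers.py.

```python
# Minimal sketch of the remapping described above, not the code from
# responses_wrappers.py itself. Assumes max_output_tokens -> max_tokens is the
# only rename needed; a shallow copy keeps the caller's kwargs untouched.
def prepare_kwargs_for_shared_attributes(kwargs: dict) -> dict:
    prepared = dict(kwargs)  # shallow copy; only a top-level key is renamed
    if "max_output_tokens" in prepared:
        prepared["max_tokens"] = prepared.pop("max_output_tokens")
    return prepared
```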

This description was created by Ellipsis for c739dad.

Summary by CodeRabbit

  • New Features
    • Enhanced OpenAI instrumentation tracing with improved span attributes, including model identification, detailed token usage metrics (input, output, reasoning tokens), and request parameters (temperature, max output tokens) for comprehensive observability.



coderabbitai bot commented Nov 25, 2025

Walkthrough

Introduced a new helper function prepare_kwargs_for_shared_attributes that remaps OpenAI request parameters for compatibility. Updated response attribute extraction to derive model names using a shared formatter. Threaded request attribute setting through synchronous, asynchronous, and cancellation wrapper paths. Added capture of usage-related span attributes (tokens, reasoning metrics) when available.

Changes

  • Core instrumentation updates: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py
    Added the prepare_kwargs_for_shared_attributes helper to remap max_output_tokens to max_tokens. Updated model name extraction via _extract_model_name_from_provider_format. Replaced direct attribute assignment with GEN_AI_RESPONSE_MODEL, OPENAI_RESPONSE_SERVICE_TIER, and usage metrics. Threaded _set_request_attributes calls through all wrapper paths (sync, async, cancellation). Added span attributes for input/output tokens, total tokens, reasoning tokens, summaries, and efforts.
  • Test cassette: packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses/test_responses_with_request_params.yaml
    New HTTP interaction cassette recording a POST request to the OpenAI responses endpoint with parameters (input, max_output_tokens, model, temperature, top_p) and the corresponding gzip-compressed 200 OK response with full headers.
  • Test coverage: packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py
    Added the test_responses_with_request_params test function verifying that explicit request parameters (temperature, max_output_tokens, top_p) are captured as span attributes (gen_ai.request.temperature, gen_ai.request.max_tokens, gen_ai.request.top_p) alongside span name, system, and model metadata.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Wrapper as Wrapper<br/>(sync/async)
    participant Helper as prepare_kwargs_for_shared_attributes
    participant SetAttrs as _set_request_attributes
    participant Span
    participant SharedModule as Shared Module

    Caller->>Wrapper: Invoke wrapper
    Wrapper->>Helper: prepare_kwargs_for_shared_attributes(kwargs)
    Helper->>Helper: Clone kwargs,<br/>remap max_output_tokens→max_tokens
    Helper-->>Wrapper: Return prepared kwargs
    
    Wrapper->>SetAttrs: _set_request_attributes(span,<br/>prepared_kwargs, instance)
    SetAttrs->>SharedModule: Extract model name via<br/>_extract_model_name_from_provider_format
    SetAttrs->>Span: Set GEN_AI_RESPONSE_MODEL
    SetAttrs->>Span: Set GEN_AI_REQUEST_* attributes<br/>(temperature, max_tokens, top_p)
    
    Wrapper->>Wrapper: Execute OpenAI request
    Wrapper->>Span: Set usage metrics<br/>(tokens, reasoning)
    Wrapper-->>Caller: Return response
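
In Python terms, the flow above reduces to something like the following sketch of a simplified synchronous path. The real wrappers in responses_wrappers.py also cover streaming, error, and cancellation paths; set_request_span_attributes and set_response_span_attributes are placeholders standing in for the PR's _set_request_attributes and set_data_attributes, not the actual functions.

```python
# Rough rendering of the diagrammed flow, assuming a simplified sync path.
def traced_responses_create(tracer, wrapped, instance, kwargs):
    with tracer.start_as_current_span("openai.response") as span:
        prepared = prepare_kwargs_for_shared_attributes(kwargs)
        # Request attributes are set before the call so they are recorded
        # even if the request fails or the response streams.
        set_request_span_attributes(span, prepared, instance)

        response = wrapped(**kwargs)  # execute the actual OpenAI request

        # Response-side attributes: model, service tier, usage metrics.
        set_response_span_attributes(span, response)
        return response
```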

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Parameter remapping logic: Verify prepare_kwargs_for_shared_attributes correctly handles all parameter conversions and doesn't introduce side effects
  • Model name extraction: Confirm _extract_model_name_from_provider_format properly derives model names across different response formats
  • Wrapper path consistency: Ensure _set_request_attributes calls are correctly positioned across all five wrapper decision points (sync/async/cancellation paths)
  • Usage attribute capture: Validate that tokens and reasoning metrics are correctly extracted from responses when present

Suggested reviewers

  • nirga

Poem

🐰✨ A helper hops through kwargs so bright,
Remapping tokens left and right,
Through sync and async paths it flows,
Setting spans with metrics that grow,
OpenAI's wisdom now on display!

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 22.22%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title accurately describes the main change: fixing the OpenAI responses API instrumentation to report request attributes. This aligns with the core modifications across all three files.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c11d45a and c739dad.

📒 Files selected for processing (3)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py (10 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses/test_responses_with_request_params.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/cassettes/**/*.{yaml,yml,json}

📄 CodeRabbit inference engine (CLAUDE.md)

Never commit secrets or PII in VCR cassettes; scrub sensitive data

Files:

  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses/test_responses_with_request_params.yaml
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py (2)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (2)
  • InMemorySpanExporter (22-61)
  • get_finished_spans (40-43)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (1)
  • _ (1094-1124)
🪛 Ruff (0.14.5)
packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py

34-34: Unused function argument: instrument_legacy (ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Lint
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Analyze (python)
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses/test_responses_with_request_params.yaml (1)

1-111: Cassette properly sanitized.

The VCR cassette does not contain API keys or sensitive credentials. The Authorization header appears to be scrubbed. The file is suitable for version control.

packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py (1)

32-55: Well-structured test for request parameter capture.

The test correctly validates that temperature, max_output_tokens (mapped to max_tokens), and top_p are captured as span attributes. The instrument_legacy fixture is needed to activate instrumentation even though it's not directly referenced in the test body—this is a standard pytest pattern, so the static analysis warning is a false positive.
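
For reference, a hedged sketch of what such a test typically looks like in this suite follows. The instrument_legacy fixture name comes from the PR; the span_exporter and openai_client fixture names, the model string, and the literal parameter values are assumptions rather than excerpts from the diff.

```python
import pytest

@pytest.mark.vcr
def test_responses_with_request_params(instrument_legacy, span_exporter, openai_client):
    openai_client.responses.create(
        model="gpt-4o-mini",        # illustrative model name
        input="Tell me a joke",
        temperature=0.5,
        max_output_tokens=128,
        top_p=0.9,
    )

    spans = span_exporter.get_finished_spans()
    attrs = spans[0].attributes
    # max_output_tokens is reported under the shared max_tokens attribute
    assert attrs["gen_ai.request.temperature"] == 0.5
    assert attrs["gen_ai.request.max_tokens"] == 128
    assert attrs["gen_ai.request.top_p"] == 0.9
```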

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py (4)

53-58: LGTM!

The new imports from the shared module are correctly added and both functions are utilized throughout the file.


193-206: Clean helper for parameter normalization.

The function correctly maps max_output_tokens (Responses API parameter) to max_tokens (standard shared attribute name). The shallow copy is sufficient since we're only modifying top-level keys.


208-244: Response model and usage attributes properly extracted.

The refactored set_data_attributes correctly:

  1. Extracts response model name using the shared formatter for consistency
  2. Handles usage token details with robust dict/object access patterns

The separation of concerns is good—request attributes are now set via _set_request_attributes, while this function focuses on response-specific attributes.
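
A minimal sketch of that dict/object access pattern is below. The attribute keys are illustrative only (the real set_data_attributes uses the package's semantic-convention constants), and set_usage_attributes is a stand-in, not the PR's function.

```python
def _get_field(obj, name, default=None):
    # Usage payloads may arrive as plain dicts or as typed SDK objects,
    # so fall back between dict.get and getattr.
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)

def set_usage_attributes(span, response):
    usage = _get_field(response, "usage")
    if usage is None:
        return
    span.set_attribute("gen_ai.usage.input_tokens", _get_field(usage, "input_tokens", 0))
    span.set_attribute("gen_ai.usage.output_tokens", _get_field(usage, "output_tokens", 0))

    # Reasoning token counts sit under output_tokens_details when present.
    details = _get_field(usage, "output_tokens_details")
    if details is not None:
        reasoning = _get_field(details, "reasoning_tokens")
        if reasoning is not None:
            span.set_attribute("gen_ai.usage.reasoning_tokens", reasoning)
```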


464-464: Request attributes consistently threaded through all code paths.

The _set_request_attributes call with prepare_kwargs_for_shared_attributes(kwargs) is correctly added across all wrapper paths:

  • Sync/async streaming (lines 464, 613)
  • Sync/async error handling (lines 523, 668)
  • Sync/async completion (lines 589, 735)
  • Sync/async cancellation (lines 760, 787)

This ensures request parameters are captured regardless of execution path.


Comment @coderabbitai help to get the list of available commands and usage tips.


ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to c739dad in 2 minutes and 3 seconds.
  • Reviewed 261 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:189
  • Draft comment:
    The new function prepare_kwargs_for_shared_attributes maps only 'max_output_tokens' to 'max_tokens'. Consider documenting or expanding this mapping if additional parameters need conversion.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:208
  • Draft comment:
    The explicit setting of the 'gen_ai.system' (and previously 'gen_ai.request.model') attribute in set_data_attributes has been removed. Ensure that these required attributes are now set via _set_request_attributes in the wrappers to avoid regression.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:445
  • Draft comment:
    Multiple wrappers (responses_get_or_create, async_responses_get_or_create, cancel wrappers) call _set_request_attributes with prepare_kwargs_for_shared_attributes. This duplication is acceptable, but consider consolidating the call into a helper if additional shared logic emerges.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses/test_responses_with_request_params.yaml:1
  • Draft comment:
    New cassette file correctly includes a test case for capturing request parameters. Ensure that any changes in response serialization are kept in sync with expected binary responses.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
5. packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py:32
  • Draft comment:
    The test 'test_responses_with_request_params' properly verifies that parameters (temperature, max_tokens, top_p) are reported. Confirm that _set_request_attributes in the shared logic processes these keys as expected.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50% None
6. packages/opentelemetry-instrumentation-openai/tests/traces/test_responses.py:465
  • Draft comment:
    In test_response_stream_init_with_none_tools, the final assertion allows tools to be an empty list or None. Since the comment suggests tools should default to an empty list, consider enforcing a consistent default instead of allowing None.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None

Workflow ID: wflow_ZqsVlCWBWqPPxnkv

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

galkleinman merged commit 5bd93d9 into main on Nov 25, 2025
11 checks passed