fix(openai): add support for realtime api (websockets) #3511
Conversation
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds OpenAI Realtime instrumentation: a new realtime_wrappers module implements session/response/span lifecycle over WebSocket events, registers a connect wrapper during instrumentation, and includes tests plus a sample realtime example script.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as User Client
    participant Wrapper as realtime_connect_wrapper
    participant ConnMgr as RealtimeConnectionManagerWrapper
    participant Session as RealtimeSessionWrapper
    participant Events as Realtime Event Stream
    participant Tracer as OpenTelemetry Tracer
    Client->>Wrapper: connect()
    Wrapper->>ConnMgr: return wrapped connection manager
    ConnMgr->>Tracer: start openai.session span
    ConnMgr->>Session: provide wrapped session
    Client->>Session: send conversation.item / response.create (outgoing)
    Session->>Tracer: start openai.realtime (response) span
    Events->>Session: response.* events (text.delta, audio.delta, function_call, done, error)
    Session->>Tracer: accumulate content, annotate spans, record tool calls
    Events->>Session: response.done
    Session->>Tracer: end response span (status, usage, content)
    Client->>ConnMgr: close()
    ConnMgr->>Tracer: end openai.session span
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
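The walkthrough above describes spans whose lifecycle is driven by WebSocket events. As a rough mental model (a toy sketch, not the PR's actual code; the class and method names here are invented), the response-span handling reduces to a small state machine keyed on event types:

```python
class ResponseSpanTracker:
    """Toy model of the response-span lifecycle driven by realtime events."""

    def __init__(self):
        self.active = False       # stands in for "response span is open"
        self.text_parts = []      # accumulated text deltas
        self.finished = []        # (content, status) for each ended "span"

    def process_event(self, event_type, payload=None):
        if event_type == "response.created":
            self.active = True
            self.text_parts = []
        elif event_type == "response.text.delta" and self.active:
            self.text_parts.append(payload or "")
        elif event_type in ("response.done", "error") and self.active:
            status = "ERROR" if event_type == "error" else "OK"
            self.finished.append(("".join(self.text_parts), status))
            self.active = False

tracker = ResponseSpanTracker()
for ev, data in [
    ("response.created", None),
    ("response.text.delta", "Hel"),
    ("response.text.delta", "lo"),
    ("response.done", None),
]:
    tracker.process_event(ev, data)
print(tracker.finished)  # [('Hello', 'OK')]
```

The real wrappers additionally propagate trace context so each response span nests under the session span, but the created → deltas → done/error sequencing is the same.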
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Changes requested ❌
Reviewed everything up to b80fdbb in 2 minutes and 32 seconds.
- Reviewed 2099 lines of code in 4 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:345
- Draft comment: Consider adding logging in error handlers (e.g. `_handle_error_event` and `_handle_error`) for better traceability of failures.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). This comment suggests adding logging as a "nice to have" for better traceability. However, this is instrumentation code that already captures errors through OpenTelemetry spans (via `record_exception()` and error status). The comment doesn't point to a bug or missing functionality; it's a speculative suggestion about adding additional logging on top of the existing telemetry. The rules state we should not make comments that are purely informative or suggest optional improvements without clear necessity. This falls into the category of "consider doing X", which is not actionable enough: it doesn't say logging is required or that there's a problem, just that it might be nice to have. The comment could be valid if there's a specific debugging scenario where logging would help beyond what spans provide. Perhaps during development, or when spans aren't being exported, logging could be useful. However, without evidence that this is actually needed, it's speculative. While logging might be helpful in some scenarios, this is an instrumentation library where the primary observability mechanism is OpenTelemetry spans, which already capture the errors. The comment uses "Consider", which makes it optional/speculative rather than identifying a clear issue. This comment should be deleted: it's a speculative suggestion for an optional improvement rather than identifying a clear issue, and the code already captures errors through OpenTelemetry spans, which is the appropriate mechanism for instrumentation code.
2. packages/sample-app/sample_app/openai_realtime_example.py:41
- Draft comment: The sample app code cleanly demonstrates realtime conversation flows; ensure production implementations add appropriate error handling and retries.
- Reason this comment was not posted: Confidence changes required: 30% <= threshold 50%.
3. packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py:60
- Draft comment: The tests cover various realtime flows comprehensively. Consider adding tests for edge cases (e.g. partial events or unexpected event types) if not already covered.
- Reason this comment was not posted: Confidence changes required: 40% <= threshold 50%.
Workflow ID: wflow_sHzA1OILQvYikK3j
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
```python
        self._state.response_span.end()
        self._reset_response_state()

    def _start_response_span(self):
```
Consider refactoring duplicate logic for starting response spans (_start_response_span) across different wrappers (e.g., in RealtimeConnectionWrapper and RealtimeResponseWrapper) to reduce maintenance duplication.
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (1)
402-413: Consider using a direct import for better readability. The dynamic `__import__` can be replaced with a regular import:

```diff
 @pytest.fixture
 def tracer_provider_and_exporter(self):
     """Create a tracer provider with in-memory exporter."""
+    from opentelemetry.sdk.trace.export import SimpleSpanProcessor
+
     exporter = InMemorySpanExporter()
     provider = TracerProvider()
     provider.add_span_processor(
-        __import__(
-            "opentelemetry.sdk.trace.export",
-            fromlist=["SimpleSpanProcessor"]
-        ).SimpleSpanProcessor(exporter)
+        SimpleSpanProcessor(exporter)
     )
     return provider, exporter
```

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
241-342: Consider extracting shared response handling logic. The `_handle_response_done` method is largely duplicated between `RealtimeConnectionWrapper` and `RealtimeEventIterator`. While code duplication is acceptable in this package based on learnings, extracting the common logic into a helper function in `RealtimeSessionState` could improve maintainability.

Also applies to: 568-666
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (3 hunks)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1 hunks)
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (1 hunks)
- packages/sample-app/sample_app/openai_realtime_example.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
- packages/sample-app/sample_app/openai_realtime_example.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
🧬 Code graph analysis (3)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (3)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (19)
- RealtimeSessionState (44-59)
- RealtimeConnectionManagerWrapper (861-944)
- RealtimeEventIterator (455-686)
- session (80-82)
- _process_event (137-158)
- _process_event (472-499)
- response (90-92)
- realtime_connect_wrapper (948-959)
- RealtimeResponseWrapper (770-858)
- create (753-767)
- create (780-785)
- RealtimeConnectionWrapper (62-452)
- RealtimeSessionWrapper (689-724)
- update (699-724)
- conversation (85-87)
- item (738-740)
- close (113-118)
- RealtimeConversationItemWrapper (743-767)
- cancel (787-800)

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (1)
- OpenAIV1Instrumentor (52-374)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
- realtime_connect_wrapper (948-959)
🪛 Flake8 (7.3.0)
packages/sample-app/sample_app/openai_realtime_example.py
[error] 85-85: f-string is missing placeholders
(F541)
🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
944-944: return inside finally blocks cause exceptions to be silenced
(B012)
948-948: Unused function argument: instance
(ARG001)
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
672-672: Abstract raise to an inner function
(TRY301)
672-672: Avoid specifying long messages outside the exception class
(TRY003)
packages/sample-app/sample_app/openai_realtime_example.py
85-85: f-string without any placeholders
Remove extraneous f prefix
(F541)
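For context on the F541 finding above: the rule fires when an f-string contains no placeholders, so the fix is simply to drop the prefix. An illustrative snippet (not the sample app's actual line):

```python
# F541: f-string without any placeholders -- the f prefix does nothing here.
status = f"Session updated"      # flagged by Flake8/Ruff as F541
status = "Session updated"       # fix: remove the extraneous f prefix

# The prefix is justified only when something is interpolated:
model = "gpt-realtime"
banner = f"Using model: {model}"
print(banner)  # Using model: gpt-realtime
```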
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (12)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py (3)
36-38: LGTM! The import follows the existing pattern in the file and is appropriately placed with other wrapper imports.

371-372: LGTM! The unwrap calls are correctly placed inside the try-except block alongside other beta API unwraps, ensuring graceful handling when the module is not available.

335-345: LGTM! Using `_try_wrap` is the correct approach for beta APIs that may not be available in all OpenAI SDK versions. The module path `openai.resources.beta.realtime.realtime` matches the OpenAI SDK structure, and the pattern is consistent with other beta API wrappers in this file. The wrapper gracefully handles cases where the Realtime API classes are unavailable by catching `AttributeError` and `ModuleNotFoundError`.

packages/sample-app/sample_app/openai_realtime_example.py (2)

27-27: LGTM! API key is correctly loaded from an environment variable, following security best practices as per coding guidelines.

30-94: Well-structured demonstration of the Realtime API instrumentation. The function demonstrates session configuration, message sending, response streaming, and usage tracking. The event loop pattern is clear and educational.
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (3)
20-55: LGTM! The mock classes are well-designed and flexible, using `**kwargs` to allow setting arbitrary attributes needed for different test scenarios.

415-474: Comprehensive test for the full conversation flow. The test validates span creation, attributes, parent-child relationships, and finish reasons. Good coverage of the happy path scenario.

670-674: Acceptable test pattern for simulating connection errors. The static analysis hints (TRY301, TRY003) are false positives for test code where the goal is specifically to test error handling behavior.
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (4)
1-14: Excellent module-level documentation. The docstring clearly explains the key concepts and the span hierarchy, which helps maintainers understand the instrumentation design.

44-59: LGTM! Clean state management class with appropriate defaults. The `trace_context` field is key for establishing proper span parent-child relationships.

344-353: LGTM! Error handling properly distinguishes between API error events (ending only the response span) and connection errors (affecting both response and session spans). Exceptions are recorded appropriately.

Also applies to: 432-446
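The split praised in the 344-353 note can be pictured with a toy model (hypothetical names; the real code records exceptions on OpenTelemetry spans rather than flipping booleans):

```python
class ErrorRouter:
    """Toy illustration: API error events end only the response; connection
    errors end both the response and the session."""

    def __init__(self):
        self.response_open = True
        self.session_open = True

    def handle_error_event(self, message):
        # API-level error event arriving over the websocket: the session is
        # still usable, so only the in-flight response span would end.
        self.response_open = False
        return ("response", message)

    def handle_connection_error(self, exc):
        # Transport-level failure: nothing further can arrive, so both the
        # response span and the session span would end with ERROR status.
        self.response_open = False
        self.session_open = False
        return ("session", str(exc))

router = ErrorRouter()
router.handle_error_event("rate_limit_exceeded")
print(router.session_open)  # True: the session survives an API error event
router.handle_connection_error(ConnectionError("socket closed"))
print(router.session_open)  # False: connection errors take down the session
```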
947-958: LGTM! The `instance` parameter is part of the wrapt wrapper signature and required by the `@_with_tracer_wrapper` decorator, so the Ruff ARG001 warning is a false positive. The implementation correctly extracts the model and wraps the connection manager.
Important
Looks good to me! 👍
Reviewed 9fb7b2b in 38 seconds.
- Reviewed 41 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/sample-app/sample_app/openai_realtime_example.py:84
- Draft comment: Good refactor: removed unnecessary f-string. Using a normal string is cleaner as there's no interpolation.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
2. packages/sample-app/sample_app/openai_realtime_example.py:140
- Draft comment: Formatting improvement: converting inline array to multi-line improves readability.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
3. packages/sample-app/sample_app/openai_realtime_example.py:201
- Draft comment: Consistent formatting: multi-line formatting for 'content' in tool message enhances readability.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
Workflow ID: wflow_fEgG7Cv2cTOAtvXr
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 0c44434 in 1 minute and 6 seconds.
- Reviewed 34 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:358
- Draft comment: Explicitly setting the span status to OK before ending an existing response span clarifies completion. Ensure this behavior is intended even if the span might have had an error status.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:803
- Draft comment: Similarly, in the cancel method, marking the response span as OK before ending it ensures consistency. Confirm that cancellation events should be marked with an OK status.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:943
- Draft comment: Removing the return in aexit allows exceptions from the underlying connection manager to propagate naturally, which aligns with typical async context manager behavior.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
Workflow ID: wflow_ridHlMSu5uLvH6Oo
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
344-404: Past review feedback addressed, but code duplication remains. The past review concern about missing status on span end has been resolved (line 359 now sets `OK` status). However, `_start_response_span` logic is duplicated across `RealtimeConnectionWrapper` (lines 355-404), `RealtimeEventIterator` (lines 539-568), and `RealtimeResponseWrapper` (lines 803-851). Consider extracting the shared span creation logic into a helper method on `RealtimeSessionState`:

```python
# In RealtimeSessionState class
def start_response_span(self, input_text: str = None):
    """Start a new response span with common initialization."""
    if self.response_span is not None:
        self.response_span.set_status(Status(StatusCode.OK))
        self.response_span.end()
        self.reset_response_state()
    self.response_start_time = time.time_ns()
    ctx = self.trace_context or context_api.get_current()
    self.response_span = self.tracer.start_span(
        SPAN_NAME_RESPONSE,
        kind=SpanKind.CLIENT,
        start_time=self.response_start_time,
        context=ctx,
    )
    if self.response_span.is_recording():
        _set_span_attribute(self.response_span, GenAIAttributes.GEN_AI_SYSTEM, "openai")
        _set_span_attribute(self.response_span, GenAIAttributes.GEN_AI_REQUEST_MODEL, self.model)
        _set_span_attribute(self.response_span, SpanAttributes.LLM_REQUEST_TYPE, "realtime")
        if should_send_prompts() and (input_text or self.input_text):
            text = input_text or self.input_text
            _set_span_attribute(self.response_span, f"{GenAIAttributes.GEN_AI_PROMPT}.0.content", text)
            _set_span_attribute(self.response_span, f"{GenAIAttributes.GEN_AI_PROMPT}.0.role", "user")
```

Then replace all three implementations with calls to `self._state.start_response_span()`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (2)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
- SpanAttributes (64-245)

packages/opentelemetry-instrumentation-llamaindex/opentelemetry/instrumentation/llamaindex/dispatcher_wrapper.py (1)
- end (103-111)
🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
951-951: Unused function argument: instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (9)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (9)
1-42: LGTM! Module structure, imports, and constants are well-organized and appropriate for the instrumentation purpose.

44-60: LGTM! State tracking class is well-structured with clear attribute initialization for managing session and response lifecycle.

62-93: LGTM! Connection wrapper properly delegates attribute access and wraps nested objects to ensure complete instrumentation coverage.

94-135: LGTM! Event handling, async iteration, and context manager implementation are correct. Proper error propagation in `__aexit__` by returning `False`.

136-240: LGTM! Event processing logic comprehensively handles incoming and outgoing events with proper error isolation via the `@dont_throw` decorator.

241-343: LGTM! Response completion handling thoroughly captures all relevant telemetry data, including usage metrics, content, and tool calls, with proper semantic convention attributes.

690-861: LGTM on session, conversation, and response wrappers! The wrapper classes properly intercept method calls to track state:
- `RealtimeSessionWrapper.update` captures session config changes
- `RealtimeConversationItemWrapper.create` extracts input text
- `RealtimeResponseWrapper.create` starts the response span
- `RealtimeResponseWrapper.cancel` properly ends the span with cancellation status and attribute
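The interception pattern listed above (wrap a few methods for telemetry, delegate everything else) can be sketched generically. This is a minimal stand-in, not the PR's `RealtimeSessionWrapper`:

```python
class InstrumentedProxy:
    """Intercept selected calls for telemetry; delegate the rest untouched."""

    def __init__(self, wrapped, log):
        self._wrapped = wrapped
        self._log = log

    def update(self, **kwargs):
        self._log.append(("session.update", kwargs))  # record config changes
        return self._wrapped.update(**kwargs)

    def __getattr__(self, name):
        # Only called for attributes not defined above: plain delegation.
        return getattr(self._wrapped, name)

class FakeSession:
    def update(self, **kwargs):
        return "updated"

    def close(self):
        return "closed"

log = []
session = InstrumentedProxy(FakeSession(), log)
session.update(voice="alloy")
print(session.close())  # closed  (reached via __getattr__ delegation)
print(log)              # [('session.update', {'voice': 'alloy'})]
```

The `__getattr__` fallback is what keeps the wrapper transparent: any SDK attribute the instrumentation does not care about still works unchanged.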
863-948: Past review feedback addressed correctly. The previous issue with `return` inside a `finally` block has been resolved. Line 947 now correctly awaits the underlying `__aexit__` without returning its result, allowing exceptions to propagate naturally.

The connection manager wrapper properly:
- Creates session span on entry with trace context propagation (lines 886-898)
- Handles errors and sets appropriate span status (lines 931-938)
- Ensures session span ends in finally block (lines 941-942)
- Delegates to underlying manager without suppressing exceptions (line 947)
950-962: LGTM! Entry point wrapper correctly:
- Respects the instrumentation suppression flag
- Extracts the model parameter with a sensible default
- Returns a wrapped connection manager for full instrumentation

The static analysis warning about the unused `instance` parameter is a false positive: the parameter is required by the `wrapt` wrapper signature used by `@_with_tracer_wrapper`, even when not directly referenced.
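The fixed four-argument shape `(wrapped, instance, args, kwargs)` mentioned above is exactly why `instance` looks unused to the linter. A plain-Python sketch of a `_with_tracer_wrapper`-style helper (hypothetical; the real decorator lives in the instrumentation package and hooks into wrapt):

```python
def with_tracer_wrapper(func):
    """Bind a tracer into a wrapt-style (wrapped, instance, args, kwargs) wrapper."""
    def with_tracer(tracer):
        def wrapper(wrapped, instance, args, kwargs):
            # `instance` may go unused by `func`, but the 4-arg shape is
            # fixed by the wrapt protocol, hence the ARG001 false positive.
            return func(tracer, wrapped, instance, args, kwargs)
        return wrapper
    return with_tracer

@with_tracer_wrapper
def connect_wrapper(tracer, wrapped, instance, args, kwargs):
    result = wrapped(*args, **kwargs)
    return f"[{tracer}] {result}"

def connect(model):
    return f"connected:{model}"

instrumented = connect_wrapper("tracer-1")
print(instrumented(connect, None, ("gpt-realtime",), {}))
# [tracer-1] connected:gpt-realtime
```

Because the wrapper's signature is dictated by the protocol, renaming the argument to `_instance` (as suggested later in this review) is the usual way to satisfy linters without breaking the contract.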
Important
Looks good to me! 👍
Reviewed 27df93c in 2 minutes and 19 seconds.
- Reviewed 888 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 5 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:69
- Draft comment: Consider adding a type annotation for the 'event' parameter in process_event for improved clarity.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 20% vs. threshold = 50%). Looking at the diff, the `process_event` method was renamed from `_process_event` and moved from `RealtimeConnectionWrapper` to the new `RealtimeEventProcessor` class. The method signature itself (`def process_event(self, event):`) appears in the new code at line 69. This is technically a change since the method was refactored and moved. The comment is suggesting a code quality improvement (adding type annotations) which could be considered actionable. However, the rules state that comments should only be made if there is "clearly a code change required" and should not be "obvious or unimportant." Adding type hints is more of a style preference and not a required code change; the code works fine without the type annotation. While the method was moved/refactored in this diff, the lack of type annotation existed in the original code as well. The comment is about improving code style rather than fixing a bug or addressing a functional issue introduced by the changes. This comment should be deleted: it's a minor style suggestion about adding type hints, which is not a required code change, and it doesn't identify a bug or functional issue with the refactoring.
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:280
- Draft comment: When an active response span exists and end_existing is false, the method returns silently. Ensure this behavior is documented and intended.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the author to ensure that the behavior is documented and intended, which violates the rule against asking the author to confirm their intention. The comment does not provide a specific code suggestion or ask for a test to be written, which would have been acceptable.
3. packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py:117
- Draft comment: Tests directly invoke the protected _process_event method. Consider using the public API to reduce coupling to internal implementation details.
- Reason this comment was not posted: Comment was on unchanged code.
4. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py:452
- Draft comment: Use of dependency injection for passing the RealtimeEventProcessor (e.g., into RealtimeResponseWrapper and RealtimeEventIterator) improves testability. Ensure all error paths consistently call reset_response_state.
- Reason this comment was not posted: Comment was on unchanged code.
5. packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py:293
- Draft comment: Consider adding type annotations to test mocks and helper methods for improved maintainability and clarity in tests.
- Reason this comment was not posted: Comment was on unchanged code.
Workflow ID: wflow_iRcc66LVo8oiJTBw
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 2
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
718-730: Silence ARG001 by marking the unused `instance` parameter as intentional. Ruff flags `instance` as unused in `realtime_connect_wrapper`, but the parameter is required by the `_with_tracer_wrapper` protocol. You can satisfy the linter without changing behavior by renaming it to `_instance`:

```diff
-@_with_tracer_wrapper
-def realtime_connect_wrapper(tracer: Tracer, wrapped, instance, args, kwargs):
+@_with_tracer_wrapper
+def realtime_connect_wrapper(tracer: Tracer, wrapped, _instance, args, kwargs):
```

This makes the intent explicit and keeps the wrapper signature compatible with the instrumentation helper.
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (1)
664-686: Ruff TRY003/TRY301 warnings in tests are safe to ignore or locally silence. The explicit `raise ConnectionError("WebSocket disconnected")` is clear and idiomatic for this test. If you want a clean lint run, you can either:
- Extract the `raise` into a small helper used only in the test, or
- Add a local `# noqa: TRY003,TRY301` comment on that line.

Given this is test code and the message is short, keeping it as-is is also reasonable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1 hunks)
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
🧠 Learnings (1)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
SpanAttributes (64-245)
🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
684-684: Abstract raise to an inner function
(TRY301)
684-684: Avoid specifying long messages outside the exception class
(TRY003)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
719-719: Unused function argument: instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py (1)
44-171: Good separation of session state and shared event processing.

The `RealtimeSessionState` + `RealtimeEventProcessor` split gives a clear single source of truth for session/response state, and both `RealtimeConnectionWrapper.recv` and `RealtimeEventIterator.__anext__` correctly delegate through the same processor, avoiding the duplication noted in earlier iterations. The response lifecycle (`response.created` → deltas → `response.done`/`error`) is coherently handled here and matches the span behavior exercised in the tests.

packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py (1)

411-577: End-to-end realtime flow tests provide strong coverage.

The mocked full-flow tests (single-turn, multi-turn, function-call, error paths) do a good job asserting span naming, parent/child relationships, and key attributes, which should catch most regressions in the realtime wrappers' lifecycle logic.
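The "single processor, two entry points" pattern the review praises can be sketched in a few lines. The class and method names below mirror the review's description but are illustrative stand-ins, not the PR's actual code:

```python
# One event processor shared by both the blocking recv() path and the
# iterator path, so lifecycle state is updated in exactly one place.
class EventProcessor:
    def __init__(self):
        self.seen = []

    def process(self, event):
        # Single source of truth for response lifecycle state.
        self.seen.append(event["type"])
        return event


class ConnectionWrapper:
    def __init__(self, events, processor):
        self._events = iter(events)
        self._processor = processor

    def recv(self):
        return self._processor.process(next(self._events))

    def __iter__(self):
        return self

    def __next__(self):
        # The iterator path delegates through the same processor as recv().
        return self._processor.process(next(self._events))


proc = EventProcessor()
conn = ConnectionWrapper(
    [{"type": "response.created"}, {"type": "response.done"}], proc
)
first = conn.recv()
rest = [e["type"] for e in conn]
```

Because both paths funnel into `process`, adding a new event type requires touching only the processor, never the connection wrapper.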
```python
def _handle_error(self, error):
    """Handle connection errors."""
    if self._state.response_span and self._state.response_span.is_recording():
        self._state.response_span.set_status(
            Status(StatusCode.ERROR, str(error))
        )
        self._state.response_span.record_exception(error)
        self._state.response_span.end()
        self._processor.reset_response_state()

    if self._state.session_span and self._state.session_span.is_recording():
        self._state.session_span.set_status(
            Status(StatusCode.ERROR, str(error))
        )
        self._state.session_span.record_exception(error)

def _end_session_span(self):
    """End the session span."""
    if self._state.session_span and self._state.session_span.is_recording():
        self._state.session_span.set_status(Status(StatusCode.OK))
        self._state.session_span.end()
```
Preserve error status when ending the session span.

`_handle_error` sets an ERROR status on `self._state.session_span`, but `_end_session_span` unconditionally overwrites it with OK before ending the span. When a connection error occurs followed by connection close (the normal lifecycle via `on_error()` → `_handle_error()` then `on_close()` → `_end_session_span()`), the error status is masked as success on the session span.

Modify `_end_session_span` to only set OK status when the current status is UNSET, preserving any earlier error status:
```diff
 def _end_session_span(self):
     """End the session span."""
     if self._state.session_span and self._state.session_span.is_recording():
-        self._state.session_span.set_status(Status(StatusCode.OK))
+        # Preserve any existing non-UNSET status (e.g., ERROR set in _handle_error)
+        span = self._state.session_span
+        if not hasattr(span, 'status') or span.status is None or span.status.status_code is StatusCode.UNSET:
+            span.set_status(Status(StatusCode.OK))
         self._state.session_span.end()
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/realtime_wrappers.py
around lines 451 to 472, _end_session_span currently unconditionally sets the
session span status to OK, which overwrites any prior ERROR set in
_handle_error; change _end_session_span so that it only sets
Status(StatusCode.OK) if the span's current status is UNSET (preserving ERROR or
other statuses), then end the span as before while keeping the is_recording
check.
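The status-preservation rule can be modeled without the OpenTelemetry SDK. The classes below are minimal stand-ins for illustration only (real OTel spans expose status differently and `StatusCode` lives in `opentelemetry.trace`):

```python
from enum import Enum


class StatusCode(Enum):  # stand-in for opentelemetry.trace.StatusCode
    UNSET = 0
    OK = 1
    ERROR = 2


class FakeSpan:  # stand-in for an OTel span, tracking status and end state
    def __init__(self):
        self.status = StatusCode.UNSET
        self.ended = False

    def set_status(self, code):
        self.status = code

    def end(self):
        self.ended = True


def end_session_span(span):
    # Only promote to OK when no earlier status (e.g., ERROR) was recorded.
    if span.status is StatusCode.UNSET:
        span.set_status(StatusCode.OK)
    span.end()


# A span that saw an error keeps ERROR; a clean span is promoted to OK.
errored = FakeSpan()
errored.set_status(StatusCode.ERROR)
end_session_span(errored)

clean = FakeSpan()
end_session_span(clean)
```

The key design point is that `end_session_span` treats UNSET as "no opinion yet" and never downgrades an explicit ERROR.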
```python
# Verify finish_reason is "stop" for text response without tool calls
attrs = dict(response_span.attributes)
assert attrs.get("gen_ai.completion.0.finish_reason") == "stop"
```
Tests assume prompts/tool metadata are always captured; consider decoupling from should_send_prompts() config.
Assertions on gen_ai.completion.* attributes (finish_reason, role, tool_calls, etc.) rely on should_send_prompts() returning True at runtime. If prompt capture is disabled via configuration or environment, these tests will start failing even though the instrumentation is behaving as configured.
To make the tests robust, consider explicitly forcing prompt capture just for these cases (e.g., monkeypatching should_send_prompts to return True or temporarily adjusting the relevant config), or relaxing the assertions when prompt capture is disabled.
Also applies to: 620-627
🤖 Prompt for AI Agents
In packages/opentelemetry-instrumentation-openai/tests/traces/test_realtime.py
around lines 484-487 (and similarly 620-627), the test unconditionally asserts
presence of gen_ai.completion.* attributes which breaks when prompt capture is
disabled; update the tests to either force prompt capture for these cases (e.g.,
monkeypatch should_send_prompts() to return True or set the config/env variable
enabling prompt capture for the duration of the test) before creating the
response_span, or make the assertions conditional by checking
should_send_prompts() (or the same config flag) and only asserting
gen_ai.completion.* attributes when prompt capture is enabled; ensure you
restore the original behavior after the test.
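The gating approach the prompt describes can be sketched as below. `Config` and `should_send_prompts` here are local stand-ins for the instrumentation's config flag (in the real tests you would patch the package's own config module, whose exact path is not shown here):

```python
from unittest import mock


class Config:  # stand-in for the instrumentation's config object
    trace_content = False


def should_send_prompts():
    return Config.trace_content


def run_assertions(attrs):
    # Only assert on gen_ai.completion.* attributes when capture is enabled,
    # so the test stays green when prompt capture is configured off.
    if should_send_prompts():
        assert attrs.get("gen_ai.completion.0.finish_reason") == "stop"


# Force prompt capture for the duration of one test case, then restore it.
with mock.patch.object(Config, "trace_content", True):
    enabled = should_send_prompts()
    run_assertions({"gen_ai.completion.0.finish_reason": "stop"})

# Outside the patch, capture is disabled and the assertions are skipped.
disabled = should_send_prompts()
run_assertions({})
```

Using `mock.patch.object` (or pytest's `monkeypatch`) guarantees the flag is restored even if an assertion fails mid-test.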
Important
Adds support for OpenAI Realtime API with WebSocket instrumentation, comprehensive tests, and example usage.
- Realtime instrumentation in `realtime_wrappers.py`.
- Tests in `test_realtime.py` for session and response handling, error processing, and span attributes.
- Example `openai_realtime_example.py` demonstrating single/multi-turn conversations and tool-enabled workflows.

This description was created by
for 27df93c. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.