[Feature] Initialize OTEL tracer in scheduler for journey tracing (PR #1/9)#8
Merged
Update plan document to account for completed work:
- Document PR #0 (EngineCoreEvent removal) as completed prerequisite
- Clarify that do_tracing() is current OTEL mechanism (not legacy)
- Update PR #9 to keep RequestJourneyEvent dataclass (needed for Prometheus)
- Fix terminology: 'legacy' = EngineCoreEvent (removed), 'current' = RequestJourneyEvent
- Add PR #0 to dependencies, timeline, and progress tracking sections

Key corrections:
- do_tracing() will NOT be removed (it's the current system)
- RequestJourneyEvent dataclass will NOT be removed (needed for metrics)
- Only buffering LOGIC will be removed in PR #9

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add tracer initialization in Scheduler.__init__() to support dual-stream journey tracing architecture. This is the foundation for PR #2 which will create and manage core spans.

Changes:
- Add defensive SpanAttributes import with None fallback
- Initialize tracer when enable_journey_tracing=True and endpoint configured
- Add try/except with warning log for graceful degradation
- Add otlp_traces_endpoint parameter to test utilities
- Add 4 comprehensive tests with proper mocking

Safety guarantees:
- Zero per-request state (tracer is class-level only)
- Zero overhead when disabled (boolean + endpoint guard)
- No spans created (initialization only)
- No cleanup needed (shared tracer instance)
- Backward compatible (all parameters optional)

Test results: All 85 tests passing (81 existing + 4 new)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This was referenced Jan 27, 2026
sriumcp added a commit that referenced this pull request Jan 27, 2026

Updates to reflect PR #7 completion:
- PR sequence table: Mark #7 as COMPLETED with 12 tests
- Dependency chain: Mark #6 and #7 as COMPLETED
- PR #7 section: Add completion status with commit hashes
- Document deliverables: inject_trace_context(), tests, guarantees

Remaining: PRs #8 (API events), #9 (remove buffering)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
sriumcp added a commit that referenced this pull request Jan 27, 2026
…/9) (#15)

* [Feature] Add API↔Engine context propagation for journey tracing (PR #7/9)

This PR implements W3C Trace Context propagation from API spans to core spans, enabling parent-child linkage in distributed traces. Completes the handshake between PR #6 (API span lifecycle) and PR #2 (core span lifecycle).

Changes:
- Add inject_trace_context() helper to vllm/tracing.py
- Inject API span context into trace_headers after span creation
- Context flows to engine.generate() and scheduler for parent-child linkage
- Defensive error handling: injection failures never break requests
- Zero overhead when tracing disabled (early return)

Behavioral guarantees verified by tests:
- G1: Trace ID continuity (API and core spans share same trace_id)
- G2: W3C Trace Context format (traceparent header valid)
- G3: Trace continuation (trace_id preserved through Client→API→Core)
- G4: Graceful degradation (request continues on injection failure)
- G5: No exception propagation (injection failures caught)
- G6: Conditional injection (only when API span exists)

Invariants:
- I1: Backward compatibility (early return when tracing disabled)
- I2: Zero overhead when disabled (no propagator/allocation access)
- I3: No resource leaks (only modifies existing trace_headers dict)

Test coverage:
- 12 new tests (100% pass) covering all unit-testable properties
- 17 existing API span lifecycle tests pass (no regressions)
- Tests focus on behavioral properties, not implementation details

Safety properties:
- Zero new resources (only modifies existing dict)
- No cleanup obligations (dict managed by request lifecycle)
- Stateless transformation (span context → headers)
- Single injection point (strict ordering preserved)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* [Polish] Improve inject_trace_context docstring and strengthen test

Two quality improvements following code review:

1. Clarify inject_trace_context() docstring:
   - Previous: "or None if injection failed" (misleading)
   - Now: Explicitly documents when carrier is returned unchanged
   - Details all three early-return paths (OTEL unavailable, span None, exception)

2. Strengthen test_trace_id_preserved_through_chain():
   - Mock propagator now actually reads span.get_span_context()
   - Extracts trace_id and span_id from span context
   - Generates traceparent using those values (simulates real OTEL behavior)
   - Asserts get_span_context() was called
   - Better proves G1/G3 guarantees without requiring real OTLP exporter

Test results: All 29 tests pass (12 context propagation + 17 lifecycle)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* [Docs] Mark PR #7 as completed in journey tracing plan

Updates to reflect PR #7 completion:
- PR sequence table: Mark #7 as COMPLETED with 12 tests
- Dependency chain: Mark #6 and #7 as COMPLETED
- PR #7 section: Add completion status with commit hashes
- Document deliverables: inject_trace_context(), tests, guarantees

Remaining: PRs #8 (API events), #9 (remove buffering)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* removing PR7_summary

Signed-off-by: Srinivasan Parthasarathy <spartha@us.ibm.com>

---------

Signed-off-by: Srinivasan Parthasarathy <spartha@us.ibm.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
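To make the context-propagation commit above more concrete, here is a minimal, stdlib-only sketch of writing a W3C traceparent header from a span context while returning the carrier unchanged on the early-return paths. It is not the helper in vllm/tracing.py: the function name inject_trace_context_sketch, its signature, and the FakeSpan and FakeSpanContext classes are assumptions made for illustration. A real implementation would normally delegate to an OpenTelemetry propagator instead of formatting the header by hand.

```python
from dataclasses import dataclass


@dataclass
class FakeSpanContext:
    """Hypothetical stand-in for a span context (IDs are plain integers)."""
    trace_id: int
    span_id: int
    trace_flags: int = 1  # 0x01 means "sampled" in W3C Trace Context


class FakeSpan:
    """Hypothetical stand-in for a span; exposes only get_span_context()."""

    def __init__(self, ctx: FakeSpanContext) -> None:
        self._ctx = ctx

    def get_span_context(self) -> FakeSpanContext:
        return self._ctx


def inject_trace_context_sketch(span, carrier: dict) -> dict:
    """Write a traceparent header into carrier; never raise."""
    if span is None:
        # Conditional injection: nothing to propagate without a span.
        return carrier
    try:
        ctx = span.get_span_context()
        # W3C format: version-traceid-parentid-flags, all lowercase hex.
        header = f"00-{ctx.trace_id:032x}-{ctx.span_id:016x}-{ctx.trace_flags:02x}"
        carrier["traceparent"] = header
    except Exception:
        # Graceful degradation: the carrier is returned unchanged.
        pass
    return carrier
```

Because the header is built before the carrier is touched, a failure while reading the span context leaves the existing headers dict exactly as it was.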
sriumcp added a commit that referenced this pull request Jan 27, 2026

Implements journey tracing PR #8:
- Add EVENT_TS_MONOTONIC attribute for API event timestamps
- Emit HANDOFF_TO_CORE event after engine.generate()
- Emit FIRST_RESPONSE_FROM_CORE event on first response (streaming and non-streaming)
- Set request attributes on API spans (model, prompt tokens, sampling params)
- Add _update_first_response_time() helper to track first response timing
- All span operations wrapped defensively (G7 compliance)
- Zero overhead when span not recording (G6 compliance)
- 12 behavioral tests covering G1, G3-G7 (G2 verified by code inspection)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
sriumcp added a commit that referenced this pull request Jan 27, 2026
sriumcp added a commit that referenced this pull request Jan 27, 2026

* [Feature] Add API lifecycle events and request attributes (PR #8)

Implements journey tracing PR #8:
- Add EVENT_TS_MONOTONIC attribute for API event timestamps
- Emit HANDOFF_TO_CORE event after engine.generate()
- Emit FIRST_RESPONSE_FROM_CORE event on first response (streaming and non-streaming)
- Set request attributes on API spans (model, prompt tokens, sampling params)
- Add _update_first_response_time() helper to track first response timing
- All span operations wrapped defensively (G7 compliance)
- Zero overhead when span not recording (G6 compliance)
- 12 behavioral tests covering G1, G3-G7 (G2 verified by code inspection)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Update master plan: Mark PR #8 as completed

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
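The event emission described in the commit above (a monotonic timestamp attribute, defensive wrapping, and no work when the span is not recording) might look roughly like the sketch below. The helper name emit_api_event, the attribute key "event.ts.monotonic", and the RecordingSpan class are assumptions for illustration; only is_recording() and add_event(name, attributes=...) mirror the OpenTelemetry span interface.

```python
import time


class RecordingSpan:
    """Hypothetical in-memory span exposing just the two calls used below."""

    def __init__(self, recording: bool = True) -> None:
        self._recording = recording
        self.events = []

    def is_recording(self) -> bool:
        return self._recording

    def add_event(self, name, attributes=None) -> None:
        self.events.append((name, dict(attributes or {})))


def emit_api_event(span, name: str) -> None:
    """Attach a named event with a monotonic timestamp; never raise."""
    if span is None:
        return
    try:
        # Skip all attribute construction when the span is not recording.
        if not span.is_recording():
            return
        # The attribute key is an assumed name, not taken from the source.
        span.add_event(name, attributes={"event.ts.monotonic": time.monotonic()})
    except Exception:
        # Tracing problems must not affect the request itself.
        pass
```

A caller would invoke emit_api_event(span, "HANDOFF_TO_CORE") after handing the request to the engine and emit_api_event(span, "FIRST_RESPONSE_FROM_CORE") when the first response arrives.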
This was referenced Jan 28, 2026
Overview
This is the 1st of 9 PRs in the journey tracing dual-stream architecture implementation. This PR initializes an OpenTelemetry tracer in the scheduler without creating any per-request state or spans. It establishes the foundation for the next PR, which will create and manage core spans.
Branch: pr1ofjourney
Depends on: #7 (EngineCoreEvent removal - already merged)
Next: The next PR will use this tracer to create core spans with complete lifecycle management
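As a minimal sketch of the initialization this PR describes (a tracer created only when journey tracing is enabled and an endpoint is configured, with failures downgraded to a warning), consider the following. SchedulerSketch and the stub init_tracer() are hypothetical stand-ins so the example runs on its own; they are not the actual Scheduler class or the tracer factory in vLLM.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def init_tracer(name: str, endpoint: str) -> dict:
    """Hypothetical stub for a tracer factory; rejects obviously bad endpoints."""
    if not endpoint.startswith("http"):
        raise ValueError(f"bad endpoint: {endpoint!r}")
    return {"name": name, "endpoint": endpoint}


class SchedulerSketch:
    """Illustrative only: shows the guard, not the real scheduler."""

    def __init__(
        self,
        enable_journey_tracing: bool = False,
        otlp_traces_endpoint: Optional[str] = None,
    ) -> None:
        # One attribute for the whole scheduler; nothing is stored per request.
        self.tracer = None
        # When disabled, the only cost is this boolean + endpoint check.
        if enable_journey_tracing and otlp_traces_endpoint is not None:
            try:
                self.tracer = init_tracer("scheduler.sketch", otlp_traces_endpoint)
            except Exception as exc:
                # Graceful degradation: warn and continue without tracing.
                logger.warning("Tracer init failed; tracing disabled: %s", exc)
```

Both parameters default to the disabled state, so existing callers that pass neither argument behave as before.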
What This PR Does
Adds tracer initialization to Scheduler.__init__() that runs only when enable_journey_tracing=True AND otlp_traces_endpoint is configured.
Changes
Production Code (19 lines)
vllm/v1/core/sched/scheduler.py:
- Add defensive SpanAttributes import with None fallback
- Initialize tracer when enable_journey_tracing=True and endpoint configured
- Add try/except with warning log for graceful degradation
Test Changes (112 lines)
tests/v1/core/utils.py (2 lines):
- Add otlp_traces_endpoint: str | None = None parameter to create_scheduler()
- Pass it through to ObservabilityConfig
tests/v1/core/test_scheduler.py (110 lines):
- Add patch import for mocking
- test_tracer_init_when_endpoint_set() - Positive path
- test_tracer_none_when_endpoint_not_set() - Negative paths (3 cases)
- test_scheduler_init_succeeds_with_tracing_enabled() - Smoke test
- test_tracer_init_handles_failure_gracefully() - Error handling
Safety Guarantees
✅ No Per-Request State
- Only self.tracer is added (a single scheduler-level attribute, nothing stored per request)
✅ Zero Overhead When Disabled
- enable_journey_tracing=False → tracer stays None
- otlp_traces_endpoint is None → tracer stays None
✅ No Spans Created
- Initialization only
✅ Graceful Degradation
- SpanAttributes import wrapped in try/except (None fallback)
- init_tracer() wrapped in try/except (warning log on failure)
✅ Backward Compatible
- otlp_traces_endpoint defaults to None
✅ Existing Tracing Untouched
- RequestJourneyEvent buffering still works
- OutputProcessor.do_tracing() still functional
Test Results
All 85 tests passing (81 existing + 4 new):
pytest tests/v1/core/test_scheduler.py -v  # 85 passed, 16 warnings in 24.53s
Test Coverage:
Test Quality:
Code Review Notes
Issue identified during review: Test 3 initially called the real init_tracer().
Fix applied: Added @patch decorator for deterministic testing.
Result: All 4 tests are now properly mocked and consistent.
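The review fix above, replacing a real init_tracer() call with a patched one, can be illustrated with a small self-contained sketch. The Tracing holder class and SchedulerUnderTest are hypothetical names used only so the example runs without vLLM. The actual tests use the @patch decorator, while this sketch uses the equivalent patch.object context manager.

```python
from unittest.mock import MagicMock, patch


class Tracing:
    """Hypothetical holder for the tracer factory so it can be patched."""

    @staticmethod
    def init_tracer(name: str, endpoint: str):
        # The unpatched version stands for code that needs a live exporter.
        raise RuntimeError("would contact a real OTLP endpoint")


class SchedulerUnderTest:
    """Hypothetical class mirroring the guarded tracer initialization."""

    def __init__(self, endpoint=None) -> None:
        self.tracer = None
        if endpoint is not None:
            try:
                self.tracer = Tracing.init_tracer("scheduler.sketch", endpoint)
            except Exception:
                self.tracer = None


def test_tracer_init_when_endpoint_set() -> None:
    fake_tracer = MagicMock(name="tracer")
    # Patching makes the outcome independent of any exporter or network.
    with patch.object(Tracing, "init_tracer", return_value=fake_tracer) as mock_init:
        sched = SchedulerUnderTest(endpoint="http://localhost:4317")
    mock_init.assert_called_once_with("scheduler.sketch", "http://localhost:4317")
    assert sched.tracer is fake_tracer


def test_tracer_init_handles_failure_gracefully() -> None:
    with patch.object(Tracing, "init_tracer", side_effect=RuntimeError("boom")):
        sched = SchedulerUnderTest(endpoint="http://localhost:4317")
    # The failure is swallowed and the scheduler still constructs.
    assert sched.tracer is None
```

Mocking both the success and failure paths keeps the tests deterministic, which is the property the review asked for.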
Resource Safety Checklist
Architecture Context
This PR is part of the dual-stream journey tracing architecture (9 PRs total):
This PR establishes the foundation for core layer tracing by initializing the tracer that will be used in the next PR.
See JOURNEY_TRACING_PR_PLAN.md for the complete implementation roadmap.
Next Steps
The next PR (2nd of 9) will:
- Use self.tracer to create core spans in add_request()
- Add _core_spans: dict[str, Span] to track active spans
- Add _end_core_span_and_cleanup() for all termination paths
Related Documentation
- JOURNEY_TRACING_PR_PLAN.md (updated with completion status)
- JOURNEY_TRACING.md (no changes needed - internal only)
Reviewer Checklist
When reviewing, please verify:
Size: 4 files changed, 236 insertions(+), 32 deletions(-)
Review Time: ~10 minutes
Safe to merge: Yes - no per-request state, no spans, complete test coverage