[Feature] Emit journey events to core spans (PR #4/9) #12
Merged
Add journey event emission directly to OpenTelemetry spans, in parallel with existing buffering. Events (QUEUED, SCHEDULED, PREEMPTED, FIRST_TOKEN, FINISHED) are now emitted to core spans with full progress snapshots.

Changes:
- Extended _emit_journey_event() to accept optional span parameter
- Added span emission logic with defensive error handling
- Updated all 6 call sites to pass span from _core_spans dict
- Added FINISHED emission in natural completion path (update_from_output)
- Extended _compute_progress_snapshot() to support WAITING phase
- Changed QUEUED scheduler_step from None to counter (typically 0)
- Added 9 comprehensive tests covering all event types and edge cases

Safety properties:
- No new resources created (uses existing spans from PR #2)
- Defensive programming (try/except around all OTEL calls)
- Zero overhead when disabled (feature flag gate)
- Legacy buffering preserved (parallel operation until PR #9)

Tests: 9 new tests (328 lines), all passing
Size: ~113 lines production code, 328 lines test code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Updated JOURNEY_TRACING_PR_PLAN.md to reflect PR #4 completion:
- Updated PR sequence summary table (PR #4: COMPLETED)
- Updated PR dependencies diagram (PR #4: ✅ COMPLETED)
- Added detailed completion status to PR #4 section
- Listed all 9 tests implemented
- Documented actual sizes: ~113 lines production code, 328 lines test code
Summary
Extends journey tracing to emit events directly to OpenTelemetry spans, in parallel with existing buffering. This is PR #4 in the 9-PR journey tracing dual-stream architecture sequence.
Part of: Journey Tracing Dual-Stream Architecture (9-PR sequence)
Depends on: PR #3 (journey state cleanup) ✅ Merged
Next: PR #5 (API span tracking dict)
Changes
Core Implementation (vllm/v1/core/sched/scheduler.py)
- _emit_journey_event(): Accept optional span parameter
- Emit events via span.add_event() with full attributes
- Call sites pass span=self._core_spans.get(request_id)
- FINISHED emitted on natural completion in update_from_output()
- QUEUED scheduler_step changed from None to self.scheduler_step_counter (typically 0)
Event Emission Details
Events emitted with comprehensive attributes:
- Event name: journey.<EVENT_TYPE> (e.g., journey.QUEUED)
- Timestamp: time.time_ns() (epoch nanoseconds)
Call Sites Updated
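The call-site pattern named in this PR (span=self._core_spans.get(request_id)) can be sketched roughly as follows. This is an illustration, not the scheduler's actual code: FakeSpan and SchedulerSketch are hypothetical stand-ins, the attribute keys are invented for the example, and only add_event(name, attributes, timestamp) and is_recording() mirror the OpenTelemetry Span interface.

```python
import time
from typing import Any, Optional


class FakeSpan:
    """Hypothetical stand-in exposing the two Span methods used below."""

    def __init__(self) -> None:
        self.events: list = []

    def is_recording(self) -> bool:
        return True

    def add_event(self, name: str, attributes: Optional[dict] = None,
                  timestamp: Optional[int] = None) -> None:
        self.events.append((name, dict(attributes or {}), timestamp))


class SchedulerSketch:
    """Hypothetical reduction of the scheduler to the parts discussed here."""

    def __init__(self) -> None:
        self._core_spans: dict = {}
        self.scheduler_step_counter = 0

    def _emit_journey_event(self, request_id: str, event_type: str,
                            span: Optional[Any] = None) -> None:
        # A missing span (tracer disabled, or span already cleaned up) is a no-op.
        if span is None or not span.is_recording():
            return
        span.add_event(
            name=f"journey.{event_type}",
            # Attribute keys are assumptions made for this sketch.
            attributes={
                "request_id": request_id,
                "scheduler_step": self.scheduler_step_counter,
            },
            timestamp=time.time_ns(),  # epoch nanoseconds
        )

    def add_request(self, request_id: str) -> None:
        # Call site: dict.get() returns None instead of raising for unknown ids.
        self._emit_journey_event(
            request_id, "QUEUED", span=self._core_spans.get(request_id)
        )


sched = SchedulerSketch()
sched._core_spans["req-1"] = FakeSpan()
sched.add_request("req-1")  # event lands on the span for req-1
sched.add_request("req-2")  # no span registered, so nothing is emitted
```

Looking the span up with dict.get() keeps every call site to a single expression and leaves the None handling in one place, inside the emit helper.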
Why Safe
No New Resources ✅
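- Uses existing spans from PR #2 (_core_spans dict)
- Uses existing state (_first_token_emitted, _journey_prefill_hiwater)
Defensive Programming ✅
- try/except around all OTEL calls
- Span checked before emission (is_recording())
Performance ✅
- Zero overhead when enable_journey_tracing=False (single boolean check)
Backward Compatibility ✅
- Legacy buffering preserved (parallel operation until PR #9)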
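A minimal sketch of how the safety properties above could fit together: one feature-flag check, a guarded span emission, and the legacy buffer still being written in parallel. DualStreamEmitter, BrokenSpan, and legacy_buffer are hypothetical names for illustration, and gating both streams behind the same flag is an assumption of this sketch rather than something the PR states.

```python
import time
from typing import Any, Optional


class DualStreamEmitter:
    """Hypothetical emitter writing to a span and to a legacy buffer."""

    def __init__(self, enable_journey_tracing: bool) -> None:
        self.enable_journey_tracing = enable_journey_tracing
        self.legacy_buffer: list = []

    def emit(self, request_id: str, event_type: str,
             span: Optional[Any] = None) -> None:
        # Single boolean check: nothing else runs when tracing is disabled.
        if not self.enable_journey_tracing:
            return
        # New stream: emit to the span, never letting tracing break the request.
        if span is not None:
            try:
                if span.is_recording():
                    span.add_event(
                        f"journey.{event_type}",
                        attributes={"request_id": request_id},
                        timestamp=time.time_ns(),
                    )
            except Exception:
                pass  # tracing failures are swallowed by design in this sketch
        # Legacy stream: buffering continues regardless of span success.
        self.legacy_buffer.append((request_id, event_type))


class BrokenSpan:
    """Hypothetical span whose add_event always fails."""

    def is_recording(self) -> bool:
        return True

    def add_event(self, name, attributes=None, timestamp=None) -> None:
        raise RuntimeError("exporter unavailable")


enabled = DualStreamEmitter(enable_journey_tracing=True)
enabled.emit("req-1", "QUEUED", span=BrokenSpan())  # does not raise

disabled = DualStreamEmitter(enable_journey_tracing=False)
disabled.emit("req-1", "QUEUED", span=BrokenSpan())  # returns immediately
```

In this arrangement a failing span costs the request nothing beyond the swallowed exception, and the legacy consumer sees the same events it saw before the change.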
Resource Safety Checklist
Termination Paths Verified
All request termination paths emit FINISHED before cleanup:
- _end_core_span_and_cleanup() in finally blocks
Tests Added (9 tests, 328 lines)
- test_events_emitted_to_span() - Verify QUEUED, SCHEDULED emitted
- test_event_attributes_complete() - Verify all attributes present
- test_defensive_error_handling() - Verify request continues when add_event raises
- test_no_events_when_span_none() - Verify graceful handling when tracer=None
- test_legacy_buffering_still_works() - Verify parallel buffering unchanged
- test_first_token_dedup_set() - Verify FIRST_TOKEN deduplication
- test_first_token_transition_emitted() - Verify FIRST_TOKEN on 0→N transition
- test_finished_emitted_to_span() - Verify FINISHED emission on natural completion
- test_preempted_event_emitted() - Verify PREEMPTED event
Testing
Expected: All tests pass ✅ (9 new + 11 journey events = 20 total passing)
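For readers unfamiliar with the FIRST_TOKEN cases, a test in this style might look like the sketch below. maybe_emit_first_token is a hypothetical helper written for this example (the PR's real logic lives in the scheduler and may differ); the span is a unittest.mock.MagicMock, so no OpenTelemetry installation is needed.

```python
from unittest.mock import MagicMock


def maybe_emit_first_token(span, emitted: set, request_id: str,
                           prev_tokens: int, new_tokens: int) -> bool:
    """Hypothetical helper: emit FIRST_TOKEN once, on the 0 -> N transition."""
    if prev_tokens != 0 or new_tokens <= 0 or request_id in emitted:
        return False
    emitted.add(request_id)  # dedup set, in the spirit of _first_token_emitted
    if span is not None:
        span.add_event("journey.FIRST_TOKEN")
    return True


def test_first_token_emitted_once():
    span = MagicMock()
    emitted = set()
    # 0 -> 3 output tokens: the event fires.
    assert maybe_emit_first_token(span, emitted, "req-1", 0, 3)
    # A second 0 -> N transition for the same request is deduplicated.
    assert not maybe_emit_first_token(span, emitted, "req-1", 0, 5)
    # A later N -> M step is not a first-token transition.
    assert not maybe_emit_first_token(span, emitted, "req-1", 3, 4)
    span.add_event.assert_called_once_with("journey.FIRST_TOKEN")
```

Asserting on the mock's call count, rather than only on the return values, checks the property that matters downstream: the span receives exactly one FIRST_TOKEN event per request.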
Rollback Notes
Safe to revert: Yes, reverts to silent core spans (spans exist but no events emitted)
Impact of revert:
To revert:
git revert <commit-sha>
Size
Next Steps
After merge: