[Feature] Remove journey event buffering (PR #9/9) #17
Merged
Completes migration to OTEL-based journey tracing by removing all intermediate buffering and export mechanisms. Journey events are now emitted exclusively as OTEL spans in real-time, while Prometheus metrics capture timestamps directly on Request objects using monotonic time.

Changes:
- Remove journey event buffer dictionary and flushing logic from scheduler
- Remove journey event export from output processor
- Add direct timestamp capture (queued_ts, scheduled_ts) to Request
- Preserve backward compatibility with deprecated journey_events parameters
- Add 16 comprehensive tests verifying no buffering, span infrastructure, metrics independence, and backward compatibility

All 16 PR #9 tests pass. All existing scheduler tests pass.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
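The direct timestamp capture described in this commit can be illustrated with a minimal sketch. `SketchRequest`, `mark_queued`, `mark_scheduled`, and `queue_time` are hypothetical names chosen for illustration and are not vLLM's actual classes or functions; only the field names `queued_ts` and `scheduled_ts` and the use of `time.monotonic()` come from the PR text.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRequest:
    """Hypothetical stand-in for a request object (not vLLM's Request)."""
    request_id: str
    queued_ts: Optional[float] = None
    scheduled_ts: Optional[float] = None


def mark_queued(req: SketchRequest) -> None:
    # Stamp the request when it enters the waiting queue.
    req.queued_ts = time.monotonic()


def mark_scheduled(req: SketchRequest) -> None:
    # Set-once: a request that is scheduled again keeps its first timestamp.
    if req.scheduled_ts is None:
        req.scheduled_ts = time.monotonic()


def queue_time(req: SketchRequest) -> Optional[float]:
    # Monotonic readings are only meaningful as differences between them.
    if req.queued_ts is None or req.scheduled_ts is None:
        return None
    return req.scheduled_ts - req.queued_ts
```

A monotonic clock is a natural fit here because the values are only ever subtracted from each other, so wall-clock adjustments cannot produce a negative queue time.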
…tats

Update plan document to reflect actual implementation results vs estimates:

Changes:
- Update total line counts: ~7,528 added / ~1,116 removed (was ~618/~280)
- Update PR #9 stats: 16 tests, ~478 added / ~389 removed (was 4-5 tests)
- Update total test count: 27+ journey tracing tests (was 77)
- Add implementation timeline: Jan 23-27, 2026
- Add "Implementation Status" section with all completed PRs
- Update PR #0 description to clarify metrics restoration evolution
- Add timestamp propagation path diagram for PR #9
- Clarify that journey event buffering removed in PR #9

All stats now match actual git history and test counts.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Additional documentation improvements:
- Add "Implementation Status" section with all completed PRs (PR #0-9) with commit hashes and PR numbers
- Add timestamp propagation path diagram showing Request → EngineCoreOutput → OutputProcessor → req_state.stats flow
- Update PR #0 description to clarify metrics restoration evolution (journey events were interim, replaced by direct capture in PR #9)
- Clarify timeline and completion status

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
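The propagation path named in this commit (Request → EngineCoreOutput → OutputProcessor → req_state.stats) might look roughly like the sketch below. Every class and function here is a hypothetical stand-in written for illustration; the real vLLM types have more fields and different constructors, and only the two timestamp field names are taken from the PR.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SketchRequest:
    """Stand-in for the scheduler-side request (assumed shape)."""
    request_id: str
    queued_ts: float = 0.0
    scheduled_ts: float = 0.0


@dataclass
class SketchCoreOutput:
    """Stand-in for the per-request output that leaves the engine core."""
    request_id: str
    queued_ts: float
    scheduled_ts: float


@dataclass
class SketchStats:
    """Stand-in for the per-request stats read by the metrics layer."""
    queued_ts: float = 0.0
    scheduled_ts: float = 0.0


def make_output(req: SketchRequest) -> SketchCoreOutput:
    # Copy the timestamps from the request onto the outgoing message.
    return SketchCoreOutput(req.request_id, req.queued_ts, req.scheduled_ts)


@dataclass
class SketchOutputProcessor:
    stats_by_request: Dict[str, SketchStats] = field(default_factory=dict)

    def process(self, out: SketchCoreOutput) -> SketchStats:
        # Copy the timestamps from the message into the per-request stats.
        stats = self.stats_by_request.setdefault(out.request_id, SketchStats())
        stats.queued_ts = out.queued_ts
        stats.scheduled_ts = out.scheduled_ts
        return stats
```

In this sketch the timestamps travel as plain fields on the output message, so no separate event buffer is needed between the scheduler and the stats consumer.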
Summary
Completes the migration to OTEL-based journey tracing by removing all intermediate buffering and export mechanisms introduced in earlier PRs. This is the final PR in the 9-part journey tracing implementation series.
Changes:
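- Remove journey event buffering and flushing from `update_from_output()`
- Remove journey event export from `do_tracing()`
- Add direct timestamp capture (`queued_ts`, `scheduled_ts`) to Request object using `time.monotonic()`
- Propagate timestamps through `EngineCoreOutput` for Prometheus metrics
- Preserve backward compatibility with deprecated `journey_events` parameters

Key Design Decisions: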
- `scheduled_ts` set only once (never overwritten)

Test Plan
Added 16 comprehensive tests in `tests/v1/core/test_pr9_no_buffering.py`:

✅ TestNoBuffering (3 tests)
- `EngineCoreOutputs.journey_events` always None

✅ TestSpanInfrastructure (2 tests)

✅ TestMetricsIndependence (3 tests)

✅ TestBackwardCompatibility (2 tests)
- `journey_events` parameter accepted
- `EngineCoreOutputs.journey_events` field exists

✅ TestTimestampCapture (4 tests)
- `queued_ts` captured on add_request
- `scheduled_ts` captured on first schedule
- `scheduled_ts` not overwritten on subsequent schedules
- `log_stats=False`

✅ TestZeroOverheadWhenDisabled (2 tests)
Test Results: All 16 PR #9 tests pass ✓
Regression Check: All existing scheduler tests pass ✓
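As a rough illustration of what the backward-compatibility and no-buffering checks assert, the sketch below keeps a `journey_events` field and parameter but never populates them. `SketchOutputs` and `build_outputs` are hypothetical names, not the PR's code, and the `DeprecationWarning` is an assumption of this sketch: the PR text does not say whether the real code warns.

```python
import warnings
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SketchOutputs:
    """Hypothetical stand-in; the field stays so older readers do not break."""
    request_ids: List[str]
    journey_events: Optional[List[Any]] = None  # retained, never populated


def build_outputs(request_ids: List[str],
                  journey_events: Optional[List[Any]] = None) -> SketchOutputs:
    # The deprecated argument is still accepted, but its value is dropped.
    if journey_events is not None:
        # Assumption: warning on use is this sketch's choice, not the PR's.
        warnings.warn("journey_events is deprecated and ignored",
                      DeprecationWarning, stacklevel=2)
    return SketchOutputs(request_ids=list(request_ids))
```

Callers written against the older interface keep working in this sketch, while anything that reads `journey_events` from the result always sees `None`.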
Dependencies
This PR depends on PRs #1-8 in the journey tracing series.
Files Modified
- `vllm/v1/core/sched/scheduler.py` - Remove buffering, add direct timestamp capture
- `vllm/v1/engine/output_processor.py` - Remove export, propagate timestamps
- `vllm/v1/engine/async_llm.py` - Remove journey event distribution
- `vllm/v1/request.py` - Add `queued_ts`, `scheduled_ts` fields
- `vllm/v1/engine/__init__.py` - Add timestamp fields to `EngineCoreOutput`
- `tests/v1/core/test_pr9_no_buffering.py` - New comprehensive test suite
- `JOURNEY_TRACING_PR_PLAN.md` - Update PR #9 design specification

🤖 Generated with Claude Code