feat(session): add run incident diagnostics#812
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThis PR introduces a comprehensive run-incident framework that captures, analyzes, and reports terminal conditions during session execution. It derives incidents from collected observability evidence, computes recovery recommendations and user-facing summaries, sanitizes evidence with redaction rules, integrates incident collection into the run-observability recorder during execution, exports incidents as top-level snapshot diagnostics, and refines observability instrumentation in the event processor. ChangesRun Incident Framework and Observability Integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive 'Run Incident' diagnostic system to track and report failures during session execution. It adds granular tracking for tool input generation and materialization, a new RunIncident module for deriving and sanitizing incident data, and logic for recovery recommendations. Feedback includes restoring state-downgrade guards in the recorder, removing redundant methods, preserving sanitized tool names in evidence for better diagnostics, and ensuring all defined stream phases are handled in the derivation logic.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
packages/opencode/src/session/run-observability/recorder.ts (1)
44-44: 💤 Low valueUnused variable:
toolExecutionCompletedis set but never read.This variable is assigned in
recordToolCompleted(line 256) but is never used infinalize()or elsewhere. The completion state is already captured in evidence events, so this flag may be dead code.Consider removing if unneeded, or add a TODO if it's intended for future use.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/session/run-observability/recorder.ts` at line 44, The variable toolExecutionCompleted is assigned but never read—remove the dead variable declaration and its assignments (references in recordToolCompleted) since completion is already captured by evidence events, or if you intend to keep it for future use add a clear TODO comment where toolExecutionCompleted is declared and where it’s set in recordToolCompleted and ensure finalize() or other consumers will use it; update/remove any related tests or code paths that reference toolExecutionCompleted to keep the module consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/opencode/src/session/run-incident/derive.ts`:
- Around line 140-150: streamPhase mapping mislabels provider progress and omits
"reasoning_generation"; update the ternary in phaseFor so
input.facts.provider_progress_seen maps to "reasoning_generation" (replace the
current "before_first_provider_progress" value) and keep the other branches the
same; reference the streamPhase variable and input.facts.provider_progress_seen
when making this single-value change so incident stream diagnostics include
"reasoning_generation".
In `@packages/opencode/src/session/run-incident/sanitize.ts`:
- Around line 27-33: selectedEvents is naively sliced from the front which can
drop explicitly-selected terminal/cleanup anchors; change the capping logic so
that when selectedEvents.length > MAX_EXPORTED_EVIDENCE you always preserve
events whose anchor indicates terminal/cleanup: compute anchors =
selectedEvents.filter(e => e.anchor === 'terminal' || e.anchor === 'cleanup'),
then build the final selectedEvents by taking the newest non-anchor items (from
the end) up to cap - anchors.length and append the anchors (dedupe by order),
update selectedOrders/omitted accordingly so omitted only contains truly
unselected events; use the existing symbols selectedEvents,
MAX_EXPORTED_EVIDENCE, selectedOrders, omitted and evidence to implement this
replacement.
In `@packages/opencode/test/session/run-observability.test.ts`:
- Around line 479-487: The assertion for summary.incident?.phase is
inconsistent: since the terminal attempt (second.attemptID) has provider
progress recorded earlier, change the expected stream_phase from
"before_first_provider_progress" to a phase that reflects progress during the
terminal attempt (e.g., "during_provider_progress") so the phase assertion
matches the recorded terminal-attempt events; keep terminal_attempt_id:
second.attemptID unchanged and ensure the new value aligns with the event names
used elsewhere in the test.
---
Nitpick comments:
In `@packages/opencode/src/session/run-observability/recorder.ts`:
- Line 44: The variable toolExecutionCompleted is assigned but never read—remove
the dead variable declaration and its assignments (references in
recordToolCompleted) since completion is already captured by evidence events, or
if you intend to keep it for future use add a clear TODO comment where
toolExecutionCompleted is declared and where it’s set in recordToolCompleted and
ensure finalize() or other consumers will use it; update/remove any related
tests or code paths that reference toolExecutionCompleted to keep the module
consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 28cde99c-d83a-4c26-bb1d-9b248b5c754c
📒 Files selected for processing (12)
packages/opencode/src/session/export.tspackages/opencode/src/session/processor.tspackages/opencode/src/session/run-incident/derive.tspackages/opencode/src/session/run-incident/index.tspackages/opencode/src/session/run-incident/policy.tspackages/opencode/src/session/run-incident/presentation.tspackages/opencode/src/session/run-incident/sanitize.tspackages/opencode/src/session/run-incident/types.tspackages/opencode/src/session/run-observability/recorder.tspackages/opencode/src/session/run-observability/types.tspackages/opencode/test/session/export.test.tspackages/opencode/test/session/run-observability.test.ts
Summary
Adds PR1 for #808's Run Incident Framework: a structured
run-incidentmodel, append-only evidence derivation, and sanitized incident export beside existing run observability diagnostics.Why
#803 showed a concrete misclassification: a provider transport disconnect during partial tool-input generation could be summarized as
tool_failure.tool_execution_failedeven though no tool execution started. This PR separates tool-call generation from execution and derives the terminal cause from ordered evidence so later cleanup/finalizer records cannot overwrite the initiating failure.Related Issue
Part of #808 Phase 1 / PR1 and directly addresses #803. Builds on #788 and #794. Does not implement #802's full lifecycle causal spine or #804 recovery UX.
Human Review Status
Pending
Review Focus
RunIncidentfields stay derived from append-only evidence rather than mutable summary overwrites.provider_transport_disconnectand not tool execution failure.Risk Notes
classification,summary_key, andretry_safetyremain for compatibility; new export surfacesrun_incidentsas the authoritative structured path when present.How To Verify
Screenshots or Recordings
Not applicable — no visible UI changes.
Checklist
bug,enhancement,task,documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.app,ui,platform,harness,ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.P0,P1,P2,P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.Pending,Approved by @<reviewer>, orNot required: <reason>(default isPending; "not required" is restricted to bot-authored low-risk PRs).dev, and my PR title and commit messages use Conventional Commits in English.Summary by CodeRabbit
Release Notes