feat(session): add llm stream failure diagnostics#771
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (8)
🚧 Files skipped from review as they are similar to previous changes (7)
📝 WalkthroughWalkthroughThis PR implements end-to-end LLM stream failure diagnostics by adding boundary classification utilities, safe error/correlation fingerprinting, stream lifecycle tracking in the LLMTrace recorder, integration into stream timeout/error handling, export snapshot sanitization, and comprehensive test coverage. ChangesLLM Stream Failure Diagnostics
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labelsenhancement Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/opencode/src/session/llm-trace/types.ts (1)
160-160: 💤 Low valueConsider using a stricter schema for the
streamfield.Using
z.any().optional()bypasses Zod validation entirely. Given thatStreamDiagnosticsis a well-defined type, you could create a corresponding Zod schema (or usez.unknown().optional()with a type assertion) to catch malformed data at parse time rather than silently accepting anything.If this is intentional to allow schema evolution without breaking older parsers, a comment explaining the rationale would help.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/session/llm-trace/types.ts` at line 160, The `stream` field currently uses `z.any().optional()` which skips validation; replace it with a stricter schema that matches the defined `StreamDiagnostics` shape (create a `StreamDiagnosticsSchema` and use `stream: StreamDiagnosticsSchema.optional()`), or at minimum use `z.unknown().optional()` if you want validation-lite plus a TypeScript assertion; if the permissive `z.any()` was intentional for schema evolution, add a brief inline comment on the `stream` field explaining that rationale. Ensure you reference the existing `StreamDiagnostics` type when creating the Zod schema so parsing catches malformed data at parse time.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/opencode/src/session/export.ts`:
- Around line 1019-1031: The export sanitizer currently spreads rawError into
safeError (via "...rawError") which preserves any unexpected/sensitive keys from
stream.error; change the construction of safeError in export.ts so it does NOT
spread rawError — instead build safeError solely from the output of
safeErrorFingerprint(rawError) (or explicitly pick only approved fields) and
pass that through compactObject, ensuring no arbitrary keys from rawError are
forwarded; keep the existing rawProvider -> safeProviderCorrelation logic
unchanged.
In `@packages/opencode/src/session/llm-trace/stream-diagnostics.ts`:
- Around line 135-144: The safeMessage function is missing Linux home directory
redaction; update the function (safeMessage) to also replace Linux home paths by
adding a regex like /\/home\/[^\s)]+/g with "[redacted:path]" alongside the
existing /Users/ and \\Users\\ replacements, keeping the same truncation
behavior (slice 0,1024) and placement near the other path redaction lines so
Linux usernames are scrubbed the same way.
---
Nitpick comments:
In `@packages/opencode/src/session/llm-trace/types.ts`:
- Line 160: The `stream` field currently uses `z.any().optional()` which skips
validation; replace it with a stricter schema that matches the defined
`StreamDiagnostics` shape (create a `StreamDiagnosticsSchema` and use `stream:
StreamDiagnosticsSchema.optional()`), or at minimum use `z.unknown().optional()`
if you want validation-lite plus a TypeScript assertion; if the permissive
`z.any()` was intentional for schema evolution, add a brief inline comment on
the `stream` field explaining that rationale. Ensure you reference the existing
`StreamDiagnostics` type when creating the Zod schema so parsing catches
malformed data at parse time.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: eb00b8b3-65d9-4df4-8cf7-92902506dd92
📒 Files selected for processing (10)
packages/opencode/src/session/export.tspackages/opencode/src/session/llm-trace/index.tspackages/opencode/src/session/llm-trace/recorder.tspackages/opencode/src/session/llm-trace/stream-diagnostics.tspackages/opencode/src/session/llm-trace/types.tspackages/opencode/src/session/llm.tspackages/opencode/src/session/processor.tspackages/opencode/test/session/export.test.tspackages/opencode/test/session/llm-trace.test.tspackages/opencode/test/session/llm.test.ts
There was a problem hiding this comment.
Code Review
This pull request implements a robust LLM stream diagnostics and tracing framework, featuring failure classification, monotonic timing, and a sanitization layer to redact sensitive information like URLs, secrets, and local paths. The review feedback suggests several enhancements: expanding the path redaction regex to support Linux home directories, removing redundant logic in error fingerprinting, refining type signatures for improved accuracy, and implementing guards to prevent overwriting high-confidence error states in the trace recorder.
|
Handled the P2 follow-ups as follows:
Local verification after the fixes:
|
Summary
Add nested LLM stream diagnostics to assistant traces so failed stream turns preserve safe, structured evidence about the surfaced failure boundary.
Changes include:
llm_trace.streamwith watchdog, timeline, provider, error, and legacy v1 counter semantics;diagnostics.llm_tracesand session-tree assistant diagnostics.Why
#754 showed raw
terminatedafter provider progress, while #755 showed pre-progress connect timeout. Existing v1 trace counters could distinguish those symptoms but could not show whether the failure surfaced at the watchdog, local abort, SDK/transport reader, provider stream, or unknown boundary.Related Issue
Refs #760.
Refs #754, #755, #721.
This PR is diagnostic-only and should not close #754/#755/#721 by itself.
Human Review Status
Pending
Review Focus
Please review against the #760 V6 checklist, especially:
Risk Notes
No visible UI/copy changes. No platform/packaging surface. No docs, dependencies, migrations, generated files, permissions, credentials, deletion behavior, or release-note changes are included.
Skipped conditional checklist items:
How To Verify
Screenshots or Recordings
Not applicable; no visible UI changes.
Checklist
bug,enhancement,task,documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.app,ui,platform,harness,ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.P0,P1,P2,P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.Pending,Approved by @<reviewer>, orNot required: <reason>(default isPending; "not required" is restricted to bot-authored low-risk PRs).dev, and my PR title and commit messages use Conventional Commits in English.Summary by CodeRabbit
New Features
Tests