Goal
Add structured LLM stream failure diagnostics to PawWork session exports so the next stream failure can be attributed to the correct boundary instead of requiring per-incident guesswork.
When this task is done, a failed assistant turn should tell us whether the failure most likely came from local cancellation, PawWork's stream watchdog, SDK / transport stream reading, provider / gateway closure, or an unknown boundary with enough evidence to continue investigation.
This is a diagnostic foundation task, not a user-visible behavior fix.
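One way to picture the attribution described above is as a small precedence rule over recorded evidence. The sketch below is illustrative only: the type names, field names, and the list of transport codes are assumptions rather than existing PawWork or SDK identifiers, and the real classification rules are for the plan to decide.

```typescript
// Hypothetical names throughout; nothing here is taken from the PawWork codebase.
type StreamBoundary =
  | "local_cancellation"
  | "stream_watchdog"
  | "sdk_transport"
  | "provider_gateway"
  | "unknown";

interface FailureEvidence {
  abortedAtFailure: boolean; // was the LLM abort signal already aborted?
  watchdogFired: boolean;    // did the stream watchdog trigger the failure?
  errorCode?: string;        // transport-level code on the thrown error, if any
  providerStatus?: number;   // HTTP status reported by the provider, if any
}

// Assumed examples of transport-level codes; the real list needs investigation.
const TRANSPORT_CODES = new Set(["ECONNRESET", "EPIPE", "UND_ERR_SOCKET"]);

function classifyBoundary(e: FailureEvidence): StreamBoundary {
  // Checked first on the assumption that a watchdog timeout may itself abort
  // the signal, which would otherwise look like a local cancellation.
  if (e.watchdogFired) return "stream_watchdog";
  if (e.abortedAtFailure) return "local_cancellation";
  if (e.providerStatus !== undefined && e.providerStatus >= 500) return "provider_gateway";
  if (e.errorCode !== undefined && TRANSPORT_CODES.has(e.errorCode)) return "sdk_transport";
  // Not enough evidence: keep the raw fingerprint so investigation can continue.
  return "unknown";
}
```

The ordering is the main design choice here: the most specific local evidence wins, and anything that matches no rule falls through to "unknown" instead of being guessed.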
Scope
In scope:
- Extend the existing llm_trace / session export diagnostics with a schema-versioned v2 shape or compatible v1 extension.
- Capture a compact stream phase timeline: request creation, SDK stream returned, watchdog armed, first event, first provider-progress event, last provider-progress event, failure/completion.
- Capture watchdog configuration and state: connectTimeoutMs, streamTimeoutMs, provider-progress state, and timeout/failure phase.
- Capture a sanitized error fingerprint for stream failures: constructor name, error name, message, code, cause name/message/code, and at most a safe stack/module hint.
- Capture abort state at failure time: whether the LLM abort signal was already aborted, plus any available abort provenance.
- Capture safe provider correlation data when available, such as request id / response id / non-sensitive response headers.
- Keep exports safe: do not include auth headers, cookies, prompt text, tool args, raw provider response body, or arbitrary URLs.
- Preserve current runtime behavior. This task should improve diagnosis only.
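The sanitized error fingerprint item above could be approached roughly as follows. This is a minimal sketch under stated assumptions: ErrorFingerprint, fingerprintError, redact, and both limits are hypothetical names and values, the redaction patterns are examples and not a complete policy, and the stack/module hint is left out because what counts as a safe hint still needs to be decided.

```typescript
// Hypothetical shape; field names mirror the list above, not an existing type.
interface ErrorFingerprint {
  constructorName: string;
  name: string;
  message: string;
  code?: string;
  cause?: ErrorFingerprint;
}

const MAX_MESSAGE_LENGTH = 200; // assumed cap on exported message length
const MAX_CAUSE_DEPTH = 2;      // assumed limit on how far the cause chain is followed

// Example redactions only: strip URLs and bearer tokens, then truncate.
function redact(text: string): string {
  return text
    .replace(/https?:\/\/\S+/g, "[url]")
    .replace(/Bearer\s+\S+/gi, "Bearer [redacted]")
    .slice(0, MAX_MESSAGE_LENGTH);
}

function fingerprintError(err: unknown, depth = 0): ErrorFingerprint {
  if (!(err instanceof Error)) {
    // Thrown non-Error values still get a fingerprint instead of being dropped.
    return { constructorName: typeof err, name: "NonError", message: redact(String(err)) };
  }
  const extended = err as Error & { code?: unknown; cause?: unknown };
  const fp: ErrorFingerprint = {
    constructorName: err.constructor.name,
    name: err.name,
    message: redact(err.message),
  };
  if (typeof extended.code === "string") fp.code = extended.code;
  if (extended.cause !== undefined && depth < MAX_CAUSE_DEPTH) {
    fp.cause = fingerprintError(extended.cause, depth + 1);
  }
  return fp;
}

// Usage: a wrapped error keeps the outer class and the inner code.
const inner = Object.assign(new Error("socket closed"), { code: "UND_ERR_SOCKET" });
const outer = Object.assign(new TypeError("terminated"), { cause: inner });
console.log(JSON.stringify(fingerprintError(outer)));
```

Only named fields are copied, so unexpected properties on the thrown object (request bodies, headers) never reach the export by accident.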
Out of scope:
- Translating terminated into user-facing copy.
- Changing retry behavior or timeout policy.
- Reworking the full watchdog architecture.
- Recording every stream chunk or provider packet.
- Building a general observability or telemetry platform.
- Provider-specific error taxonomy beyond safe raw fingerprints and phase classification.
Relevant files or context
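Related issues / PRs:
- #754 (terminated from upstream stream close leaks to assistant message without translation): raw terminated leaks from an upstream / transport stream close; current logs prove the symptom but not the failing boundary.
- Tool execution aborted; PR #710 (fix: preserve abort interrupt provenance) improved abort provenance and showed the value of targeted diagnostics.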
Likely files:
packages/opencode/src/session/llm.ts
packages/opencode/src/session/llm-trace/types.ts
packages/opencode/src/session/llm-trace/recorder.ts
packages/opencode/src/session/processor.ts
packages/opencode/src/session/export.ts
packages/opencode/src/session/message-v2.ts
packages/opencode/test/session/llm.test.ts
packages/opencode/test/session/export.test.ts
Observed gap from #754's second reproduction:
- The export shows UnknownError with data.message = "terminated" and flags.stream_error = true.
- The trace proves provider progress happened before failure.
- The trace does not show whether the abort signal was already aborted, whether a watchdog fired, what raw error class/code/cause was thrown by undici/SDK, or whether provider request correlation is available.
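To illustrate the first missing piece above, abort state can be read from a standard AbortSignal at the moment the failure is caught. The sketch assumes the LLM call is driven by an AbortSignal; AbortSnapshot and snapshotAbort are hypothetical names, and a real implementation would likely also sanitize the reason message before export.

```typescript
// Hypothetical record of abort state at failure time.
interface AbortSnapshot {
  aborted: boolean;
  reasonName?: string;    // e.g. "AbortError" or a custom error name
  reasonMessage?: string; // only set when the reason is an Error
}

function snapshotAbort(signal: AbortSignal): AbortSnapshot {
  if (!signal.aborted) return { aborted: false };
  // signal.reason is whatever was passed to abort(); it is not always an Error.
  const reason: unknown = signal.reason;
  if (reason instanceof Error) {
    return { aborted: true, reasonName: reason.name, reasonMessage: reason.message };
  }
  return { aborted: true, reasonName: typeof reason };
}

// Usage: take the snapshot in the catch path, before any cleanup runs.
const controller = new AbortController();
const before = snapshotAbort(controller.signal);
controller.abort(new Error("user interrupt"));
const after = snapshotAbort(controller.signal);
console.log(before.aborted, after.aborted, after.reasonMessage);
```

Taking the snapshot at catch time matters because an abort triggered later by cleanup would otherwise be indistinguishable from one that caused the failure.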
Verification
- Add focused unit tests for stream failure diagnostics on:
  - connect timeout before first provider-progress event;
  - mid-stream external iterator error after provider progress;
  - local abort / interrupt path if currently observable in the test surface.
- Add export tests proving diagnostics are included and sanitized.
- Add regression tests ensuring sensitive data is not exported: auth headers, cookies, raw prompt text, raw tool args, and response bodies.
- Run targeted checks, likely:
bun --cwd packages/opencode test test/session/llm.test.ts test/session/export.test.ts --timeout 30000
bun --cwd packages/opencode typecheck
git diff --check
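The sanitization regression tests listed above might take a shape like the following. It is a sketch, not the project's test code: pickSafeHeaders, assertNoSecrets, and the header allowlist are hypothetical, and real tests would presumably run against actual export output inside the existing test files instead of using a standalone helper.

```typescript
// Assumed allowlist; which headers count as safe is a decision for the plan.
const SAFE_HEADERS = new Set(["x-request-id", "request-id", "retry-after"]);

// Allowlist instead of denylist: unknown headers are dropped by default.
function pickSafeHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(headers)) {
    const name = key.toLowerCase();
    if (SAFE_HEADERS.has(name)) out[name] = value;
  }
  return out;
}

// Serializes the export and fails if any known secret string appears in it.
// Note: this is a plain substring check on JSON, so secrets containing
// characters that JSON escapes (quotes, backslashes) need separate handling.
function assertNoSecrets(exported: unknown, secrets: string[]): void {
  const json = JSON.stringify(exported);
  secrets.forEach((secret, index) => {
    if (json.includes(secret)) {
      // Report the index, not the value, so the failure itself stays safe to log.
      throw new Error(`export leaked secret #${index}`);
    }
  });
}

// Usage: seed the fixture with recognizable secrets, then check the export.
const exportedHeaders = pickSafeHeaders({
  "X-Request-Id": "req_123",
  Authorization: "Bearer sk-secret",
  Cookie: "sid=abc",
});
assertNoSecrets({ headers: exportedHeaders }, ["sk-secret", "sid=abc"]);
```

Seeding fixtures with distinctive marker strings for auth headers, cookies, prompt text, tool args, and response bodies lets one assertion cover every export path.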
Execution mode
Investigate and propose a plan first — the agent must post the plan as an issue comment and wait for an explicit "approved" comment before writing code or opening a PR.