[Feature] Add lifecycle causality diagnostics for interrupted runs #802

@Astro-Han

Goal

Make suspended-session exports explain the causal chain behind local run interruptions, not just the failing layer.

A post-#794 field export showed the new run-level diagnostics correctly classified an interrupted run as local_instance_reload, with watchdog.fired: false, provider progress present, and tool execution not yet started. That moved suspicion away from the LLM, timeout, and todowrite; however, the export still could not answer the next required question: what triggered the InstanceStore.reload that closed the active run scope?

This change is done when future exports can answer, for every local lifecycle close that interrupts an active run:

  • what lifecycle operation happened (reload, dispose, disposeDirectory, disposeAll),
  • what initiated it (API route, server handler, renderer action, config invalidation, context mismatch, shutdown, or explicit unknown),
  • which directory/session/run was affected,
  • whether the provenance is complete or exactly which fields are missing,
  • and whether retry/recovery decisions are safe based on visible output and tool side-effect facts.
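
As a rough illustration of the completeness question above, a provenance record could be checked field by field so the export reports exactly what is missing. This is only a sketch: `LifecycleOrigin`, `ProvenanceReport`, `assessProvenance`, and the field names are hypothetical and not taken from the codebase, and treating an `unknown` initiator as incomplete is an assumption.

```typescript
// Hypothetical shapes for illustration; none of these names are confirmed by the codebase.
type LifecycleOperation = "reload" | "dispose" | "disposeDirectory" | "disposeAll";

type Initiator =
  | "api_route"
  | "server_handler"
  | "renderer_action"
  | "config_invalidation"
  | "context_mismatch"
  | "shutdown"
  | "unknown";

interface LifecycleOrigin {
  operation?: LifecycleOperation;
  initiator?: Initiator;
  directory?: string;
  sessionID?: string;
  runID?: string;
}

interface ProvenanceReport {
  status: "complete" | "missing_provenance";
  missing: string[];
}

const REQUIRED_FIELDS = ["operation", "initiator", "directory", "sessionID", "runID"] as const;

// Lists every required field that is absent. An explicit "unknown" initiator is
// reported as missing too (an assumption), so it never looks like a diagnosis.
function assessProvenance(origin: LifecycleOrigin | undefined): ProvenanceReport {
  const missing: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    const value = origin?.[field];
    if (value === undefined || value === "unknown") missing.push(field);
  }
  return { status: missing.length === 0 ? "complete" : "missing_provenance", missing };
}
```

With this shape, a #794-style record that only knows the operation would be reported as `missing_provenance` together with the names of the absent fields, rather than as a finished diagnosis.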

Confidence discipline

Do not claim this will catch every future bug in the abstract. That would be false. The target is stricter and checkable:

  1. 100% coverage for every currently known code path that can locally close an active session run scope.
  2. No silent attribution gaps: if a future or missed path interrupts a run without origin metadata, the export must explicitly classify it as unknown/missing provenance rather than looking diagnosed.
  3. No PR until a plan has been posted here, externally reviewed with GPT Pro or an equivalent independent reviewer, and explicitly approved in an issue comment.

Scope

In scope:

  • Extend lifecycle close provenance beyond #794's action kind into a causal origin record.
  • Add or thread correlation IDs across renderer actions, API requests, server handlers, InstanceStore lifecycle actions, SessionRunState interrupts, run observability summaries, and session export.
  • Add an incident-chain summary to export diagnostics so a suspended session can show the nearest causal chain instead of requiring manual grep.
  • Add missing-provenance reporting for any scope close without a complete lifecycle origin.
  • Add tests that deliberately interrupt active runs through each current local lifecycle path.
  • Redact or hash path/user data where needed; do not dump prompts, tool inputs, secrets, or raw request bodies into diagnostics.

Out of scope:

Recommended design direction

Use a small causal spine instead of more ad hoc log fields:

  1. ClientActionContext in the renderer for user-visible actions that can call backend APIs.
  2. RequestContext in server middleware for every local API request, even when no renderer action ID exists.
  3. LifecycleAction in InstanceStore, carrying operation kind plus origin metadata.
  4. InterruptMeta in SessionRunState, carrying the lifecycle action and active runner/session context.
  5. RunObservability final summary, classifying the run and embedding sanitized lifecycle origin.
  6. ExportIncidentChain, derived during export, joining renderer diagnostics + request/lifecycle/run records where available and reporting explicit missing_provenance where not.
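
The spine above could be joined at export time roughly as follows. This is a minimal sketch under stated assumptions: only the record names come from this issue, while every field (`actionID`, `requestID`, `lifecycleID`, and so on) and the `deriveIncidentChain` function are hypothetical. It also treats every unlinked hop as missing; a real implementation would need to distinguish "not applicable" (for example, a shutdown with no request) from a genuine gap.

```typescript
// Hypothetical record shapes; only the record names appear in the issue text.
interface ClientActionContext { actionID: string; kind: string }
interface RequestContext { requestID: string; route: string; actionID?: string }
interface LifecycleAction { lifecycleID: string; operation: string; requestID?: string }
interface InterruptMeta { runID: string; sessionID: string; lifecycleID?: string }

interface ExportIncidentChain {
  runID: string;
  links: string[];              // nearest cause last
  missing_provenance: string[]; // hops that could not be joined
}

// Walks run -> lifecycle action -> request -> client action by correlation ID
// and stops at the first hop that cannot be joined.
function deriveIncidentChain(
  interrupt: InterruptMeta,
  lifecycle: LifecycleAction[],
  requests: RequestContext[],
  actions: ClientActionContext[],
): ExportIncidentChain {
  const chain: ExportIncidentChain = {
    runID: interrupt.runID,
    links: [`run:${interrupt.runID}`],
    missing_provenance: [],
  };

  const lifecycleID = interrupt.lifecycleID;
  const lc = lifecycleID === undefined ? undefined : lifecycle.find((a) => a.lifecycleID === lifecycleID);
  if (!lc) {
    chain.missing_provenance.push("lifecycle_action");
    return chain;
  }
  chain.links.push(`lifecycle:${lc.operation}`);

  const requestID = lc.requestID;
  const req = requestID === undefined ? undefined : requests.find((r) => r.requestID === requestID);
  if (!req) {
    chain.missing_provenance.push("request");
    return chain;
  }
  chain.links.push(`request:${req.route}`);

  const actionID = req.actionID;
  const act = actionID === undefined ? undefined : actions.find((a) => a.actionID === actionID);
  if (!act) {
    chain.missing_provenance.push("client_action");
    return chain;
  }
  chain.links.push(`action:${act.kind}`);
  return chain;
}
```

The point of the sketch is the failure mode: when a join fails, the chain stops and names the missing hop instead of guessing at it.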

Reject the weaker alternatives:

  • Merely adding reload_origin to InstanceStore.reload is too narrow and will miss dispose/config/shutdown paths.
  • Dumping more raw logs is noisy, privacy-risky, and still leaves humans to reconstruct causality.

Known loopholes to close before PR

  • Direct scope finalizers that close a run without going through InstanceStore.
  • InstanceStore.load calling reload(input) due to context mismatch without recording the mismatch reason.
  • project.git.init and any future API route causing reload without a request/route origin.
  • Config invalidation or server shutdown causing broad disposeAll without an origin.
  • Race where lifecycle action context is popped before run-state finalizers record the interrupt.
  • Overlapping lifecycle actions for the same directory; the nearest action must be correct and stack-safe.
  • Path normalization mismatch between renderer directory, API header/query, InstanceStore, and SessionRunState.
  • Workspace/proxy/remote cases where the local app cannot know the remote origin; this must be marked as forwarded/remote/unknown instead of guessed.
  • Process crash or OS kill where no finalizer runs; startup/export should report prior incomplete runs as crash/restart evidence, not lifecycle-close evidence.
  • Old exports/builds without the new schema; export readers must remain backwards-compatible and label missing fields accurately.
  • Privacy and volume: no prompts, tool inputs, raw request bodies, secrets, or unbounded path lists in diagnostic records.
  • Tool side-effect facts must stay separate from lifecycle causality so retry safety is not accidentally weakened.
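
Two of the loopholes above, overlapping actions on one directory and path normalization mismatch, can be illustrated with a small per-directory registry. This is a sketch only: `LifecycleActionStack` and its methods are hypothetical, and the normalization is deliberately simplified (it ignores symlinks and case-insensitive file systems, which a real implementation would have to consider).

```typescript
// Hypothetical helper; not an existing class in the codebase.
interface ActiveAction { id: number; operation: string; origin: string }

class LifecycleActionStack {
  private nextID = 1;
  private byDirectory = new Map<string, ActiveAction[]>();

  // Simplified key normalization: unify separators, collapse repeats, drop a trailing slash.
  static normalize(directory: string): string {
    const unified = directory.replace(/\\/g, "/").replace(/\/+/g, "/");
    return unified.length > 1 ? unified.replace(/\/$/, "") : unified;
  }

  begin(directory: string, operation: string, origin: string): ActiveAction {
    const key = LifecycleActionStack.normalize(directory);
    const action: ActiveAction = { id: this.nextID++, operation, origin };
    const stack = this.byDirectory.get(key) ?? [];
    stack.push(action);
    this.byDirectory.set(key, stack);
    return action;
  }

  // Removes by ID rather than popping, so out-of-order completion cannot
  // discard a different, still-active action.
  end(directory: string, id: number): void {
    const key = LifecycleActionStack.normalize(directory);
    const stack = this.byDirectory.get(key);
    if (!stack) return;
    const remaining = stack.filter((a) => a.id !== id);
    if (remaining.length === 0) this.byDirectory.delete(key);
    else this.byDirectory.set(key, remaining);
  }

  // Most recently started action that is still active for this directory.
  nearest(directory: string): ActiveAction | undefined {
    const stack = this.byDirectory.get(LifecycleActionStack.normalize(directory));
    return stack ? stack[stack.length - 1] : undefined;
  }
}
```

For the finalizer race, one option is for the interrupt path to copy the result of `nearest(...)` into its own metadata at the moment it fires, so a later `end(...)` cannot erase the attribution before the finalizer records it.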

Relevant files or context

Likely files:

  • packages/opencode/src/project/instance-store.ts
  • packages/opencode/src/session/lifecycle-provenance.ts
  • packages/opencode/src/session/run-state.ts
  • packages/opencode/src/session/processor.ts
  • packages/opencode/src/session/run-observability/*
  • packages/opencode/src/session/export.ts
  • packages/opencode/src/server/instance/*
  • packages/opencode/src/server/routes/instance/middleware.ts
  • renderer diagnostics/action logging under packages/app/src/pages/session and packages/app/src/context/*

Verification

Minimum verification before PR review:

  • Targeted unit/integration tests for InstanceStore.reload, dispose, disposeDirectory, disposeAll, load -> reload(context mismatch), and project.git.init interrupting an active run.
  • Export tests proving the incident chain includes lifecycle origin when present.
  • Export tests proving missing origin is explicit (missing_provenance) when absent.
  • Race/overlap tests for nested lifecycle actions on the same directory.
  • Path normalization tests for directory keys.
  • Redaction tests proving diagnostics do not include prompts, raw tool inputs, secrets, or raw request bodies.
  • Backwards-compatibility test for a #794-era export that has local_instance_reload but no origin.
  • bun run typecheck for affected packages.
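
The redaction check in the list above could be exercised against an allowlist-based sanitizer along these lines. This is a sketch under assumptions: `sanitizeDiagnostic`, the allowed key names, and the 128-character bound are all hypothetical, and directory handling (hashing or redacting paths) is intentionally left out.

```typescript
// Hypothetical allowlist; a real list would come from the approved plan.
const ALLOWED_KEYS = new Set(["operation", "initiator", "route", "requestID", "sessionID", "runID"]);
const MAX_VALUE_LENGTH = 128; // assumed bound to keep records small

// Copies only allowlisted string fields, so prompts, tool inputs, secrets, and
// raw request bodies are dropped by default instead of filtered by pattern.
function sanitizeDiagnostic(record: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(record)) {
    if (!ALLOWED_KEYS.has(key)) continue;
    if (typeof value !== "string") continue;
    out[key] = value.slice(0, MAX_VALUE_LENGTH);
  }
  return out;
}

// Example: only the allowlisted fields survive.
const sanitized = sanitizeDiagnostic({
  operation: "reload",
  route: "project.git.init",
  prompt: "please refactor my code",
  toolInput: { command: "rm -rf build" },
  body: '{"token":"example-secret"}',
});
console.log(JSON.stringify(sanitized));
```

An allowlist fails closed: a field added later is excluded until someone deliberately allows it, which is easier to test than a denylist of sensitive patterns.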

GPT Pro review prompt

Before implementation, paste this issue plus the draft plan into GPT Pro and ask:

We are adding causal diagnostics for interrupted AI session runs in a desktop app. The current system can classify a run as interrupted by local InstanceStore.reload, but cannot identify the initiator. Review this design for every loophole that could still leave an interrupted run without a trustworthy causal chain. Focus on races, uninstrumented lifecycle paths, privacy leaks, over-attribution, remote/workspace cases, backward compatibility, and verification gaps. Do not optimize for more logs; optimize for evidence-grade causality and explicit unknowns. Return P0/P1 blockers first, then P2 improvements, and say which tests would make you confident.

Implementation must not begin until the review output is considered and the final plan is explicitly approved here.

Execution mode

Investigate and propose a plan first — the agent must post the plan as an issue comment and wait for an explicit "approved" comment before writing code or opening a PR.

Metadata

Assignees: No one assigned

Labels: P1 (High priority), app (Application behavior and product flows), enhancement (New feature or request), harness (Model harness, prompts, tool descriptions, and session mechanics)
