No recovery for sessions/tools stuck after abort signal failure #20097

@ESRE-dev

Problem

When a session or tool stays stuck despite the abort signal (e.g., the abort handler fails, the shell hard-stop does not kill the process, or a provider ignores cancellation), nothing detects the hang or recovers from it.

Orphaned tool parts on crash

If the process exits (crash or clean shutdown) while tool executions are in-flight, their database state remains "running" forever. On restart, these orphaned parts are never cleaned up.
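
A minimal sketch of the startup cleanup this implies. The `ToolPart` shape, its field names, and `recoverOrphanedParts` are hypothetical; the issue does not describe the actual schema or storage API, so the sketch works on an in-memory array.

```typescript
// Hypothetical part shape; the real database schema is not given in the issue.
type ToolState = "pending" | "running" | "completed" | "error";

interface ToolPart {
  id: string;
  sessionID: string;
  state: ToolState;
  error?: string;
}

// Run once at startup, before any new tool executions begin: every part still
// marked "running" must belong to the previous process, so mark it errored.
// Returns the ids of the parts that were changed.
function recoverOrphanedParts(parts: ToolPart[]): string[] {
  const recovered: string[] = [];
  for (const part of parts) {
    if (part.state !== "running") continue;
    part.state = "error";
    part.error = "Tool execution was interrupted by a process exit";
    recovered.push(part.id);
  }
  return recovered;
}
```

The sketch relies on running before any new work starts, so that no live execution can be mistaken for an orphan.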

Stuck tools during runtime

A tool part that enters "running" but never transitions to "completed" or "error" (due to a missed abort signal, a deadlocked child process, or a provider that never responds) is invisible to the system. No periodic check detects or recovers it.

Idle sessions

Sessions where no activity occurs for an extended period (e.g., a subagent waiting on a prompt that will never come) are never detected or cancelled.
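
Idle detection could be as simple as comparing a last-activity timestamp against a threshold. In this sketch `SessionActivity`, `lastActivityAt`, and `findIdleSessions` are hypothetical names, and timestamps are assumed to be epoch milliseconds.

```typescript
// Hypothetical record of when a session last produced any activity.
interface SessionActivity {
  id: string;
  lastActivityAt: number; // epoch milliseconds (assumed unit)
}

// Returns the ids of sessions whose last activity is older than the threshold.
function findIdleSessions(
  sessions: SessionActivity[],
  now: number,
  idleTimeoutMs: number,
): string[] {
  return sessions
    .filter((s) => now - s.lastActivityAt > idleTimeoutMs)
    .map((s) => s.id);
}
```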

Expected behavior

  • On startup, orphaned "running" tool parts from the previous process should be marked as errored
  • A periodic watchdog should detect tool parts stuck beyond a configurable timeout and cancel their sessions
  • Leaf-level filtering: the watchdog should force-error only tools that are actually stuck, not task tools that are waiting on child sessions (normal error propagation handles those)
  • Idle sessions with no activity beyond a configurable threshold should be cancelled
  • Configurable via experimental.tool_timeout, experimental.task_timeout, experimental.idle_timeout
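
The watchdog check with leaf-level filtering might look roughly like the sketch below. Everything named here is an assumption: the `RunningPart` shape, the `"task"` tool name, the millisecond units, and in particular the reading that `task_timeout` applies to task tools that are no longer waiting on a live child session.

```typescript
// Hypothetical config mirroring experimental.tool_timeout and
// experimental.task_timeout; units are assumed to be milliseconds.
interface WatchdogConfig {
  toolTimeoutMs: number;
  taskTimeoutMs: number;
}

// Hypothetical view of a tool part currently in the "running" state.
interface RunningPart {
  id: string;
  sessionID: string;
  tool: string;
  startedAt: number; // epoch milliseconds
  childSessionID?: string; // set only for task tools that spawned a child
}

// Returns the parts the watchdog should force-error on this sweep.
function findStuckParts(
  parts: RunningPart[],
  activeSessions: Set<string>,
  now: number,
  cfg: WatchdogConfig,
): RunningPart[] {
  return parts.filter((p) => {
    const isTask = p.tool === "task";
    // Leaf-level filtering: a task tool whose child session is still active
    // is skipped, so the child's own error can propagate normally.
    const waitingOnChild =
      isTask &&
      p.childSessionID !== undefined &&
      activeSessions.has(p.childSessionID);
    if (waitingOnChild) return false;
    const limit = isTask ? cfg.taskTimeoutMs : cfg.toolTimeoutMs;
    return now - p.startedAt > limit;
  });
}
```

A periodic timer would call `findStuckParts` with the current time and cancel the session of each returned part; the sweep interval is left open here, as the issue does not specify one.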

Relationship

This is a safety-net complement to #20096 (tool timeout). While #20096 prevents new hangs, this catches cases where the timeout mechanism itself is bypassed.

Labels

  • core: Anything pertaining to core functionality of the application (opencode server stuff)
  • needs:compliance: This means the issue will auto-close after 2 hours.