subagents tool should expose 'kill' and 'reap-stale' actions #87445

@ssdatye

Problem

The subagents tool currently exposes only action='list'. When a child run goes ghost (the parent is waiting on a child that will never complete), there is no in-tool way to terminate it. The only recovery path is a full gateway restart, which is destructive: it kills every healthy subagent and session along with the stuck one.

Requested actions

action='kill'

Args: runId (or sessionKey, or taskName as a stable alias). Effects:

  1. Terminate the underlying child process / session.
  2. Emit a synthetic completion event to the parent so sessions_yield unparks.
  3. Remove the run from the active list.
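
The three effects above can be sketched as follows. This is a minimal illustration, not the gateway's actual implementation: the `Run` record, the `SubagentRegistry` class, the `notify_parent` callback, and the shape of the completion event are all hypothetical names invented for this example. It also assumes that a failing terminate step should not block the parent from being unparked, since a ghost child may already be gone.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Run:
    """Hypothetical record for one active subagent run."""
    run_id: str
    session_key: str
    task_name: str
    parent_id: str
    terminate: Callable[[], None]  # stops the child process / session


class SubagentRegistry:
    """Hypothetical in-memory stand-in for the gateway's active-run list."""

    def __init__(self, notify_parent: Callable[[str, dict], None]) -> None:
        self._runs: Dict[str, Run] = {}
        self._notify_parent = notify_parent

    def add(self, run: Run) -> None:
        self._runs[run.run_id] = run

    def active_ids(self) -> List[str]:
        return sorted(self._runs)

    def _resolve(self, target: str) -> Optional[Run]:
        # Try runId first, then fall back to sessionKey / taskName aliases.
        if target in self._runs:
            return self._runs[target]
        for run in self._runs.values():
            if target in (run.session_key, run.task_name):
                return run
        return None

    def kill(self, target: str) -> dict:
        run = self._resolve(target)
        if run is None:
            return {"ok": False, "error": f"no active run matches {target!r}"}
        terminate_error = None
        try:
            run.terminate()  # effect 1: terminate the child
        except Exception as exc:  # assumption: still unpark the parent
            terminate_error = str(exc)
        # Effect 2: synthetic completion event so the parent stops waiting.
        self._notify_parent(
            run.parent_id,
            {"runId": run.run_id, "status": "killed", "synthetic": True},
        )
        # Effect 3: remove the run from the active list.
        del self._runs[run.run_id]
        return {"ok": True, "runId": run.run_id, "terminateError": terminate_error}


# Usage: kill a run by its taskName alias.
events: List[tuple] = []
stopped: List[str] = []
registry = SubagentRegistry(lambda parent, event: events.append((parent, event)))
registry.add(Run("run-1", "sess-a", "build-docs", "parent-1",
                 lambda: stopped.append("run-1")))
result = registry.kill("build-docs")
```

Resolving runId before the aliases keeps the lookup unambiguous if a taskName ever collides with a run identifier.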

action='reap-stale'

Optional helper. Args: olderThanMs (default 600000). Effects:

  1. Iterate active runs whose last activity is older than olderThanMs.
  2. Apply 'kill' semantics to each.
  3. Return a summary { reaped: [...], skipped: [...] }.
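
A rough sketch of the bulk helper, under stated assumptions: `ActiveRun`, its `last_activity_ms` field, and the injected `kill` callable are hypothetical, and "skipped" is interpreted here as stale runs whose kill attempt failed (the request itself does not define it).

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ActiveRun:
    """Hypothetical view of one entry in the active-run list."""
    run_id: str
    last_activity_ms: int  # epoch milliseconds of the last observed activity


def reap_stale(
    runs: Dict[str, ActiveRun],
    kill: Callable[[str], None],
    now_ms: int,
    older_than_ms: int = 600_000,
) -> dict:
    """Apply kill semantics to every run idle for longer than older_than_ms."""
    reaped = []
    skipped = []
    # Iterate over a copy because kill() may remove entries from `runs`.
    for run in list(runs.values()):
        if now_ms - run.last_activity_ms <= older_than_ms:
            continue  # still fresh, leave it alone
        try:
            kill(run.run_id)
            reaped.append(run.run_id)
        except Exception as exc:
            # Assumption: a failed kill is reported instead of aborting the sweep.
            skipped.append({"runId": run.run_id, "reason": str(exc)})
    return {"reaped": reaped, "skipped": skipped}


# Usage: two stale runs (one cannot be killed) and one fresh run.
runs = {
    "a": ActiveRun("a", last_activity_ms=100_000),
    "b": ActiveRun("b", last_activity_ms=900_000),
    "c": ActiveRun("c", last_activity_ms=0),
}


def fake_kill(run_id: str) -> None:
    if run_id == "c":
        raise RuntimeError("permission denied")
    del runs[run_id]


summary = reap_stale(runs, fake_kill, now_ms=1_000_000)
```

Passing `now_ms` explicitly keeps the staleness check deterministic and easy to test.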

Motivation

Today (2026-05-27) two stuck subagent chains in the same workspace forced two gateway restarts (TSK-20260527-0006 and TSK-20260527-0010 v1). The restart disrupts other healthy work, breaks session continuity for unrelated channels, and is the wrong granularity of recovery.

Related: upstream issue for gateway-enforced runTimeoutSeconds (filed separately) would prevent most ghost cases; subagents kill is the manual escape hatch when prevention isn't enough.

Acceptance

  • subagents action='kill' target=<runId|sessionKey|taskName> terminates the target and emits a completion event to its parent.
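  • subagents action='reap-stale' (optional) applies the same kill semantics to every active run whose last activity is older than olderThanMs and returns a { reaped, skipped } summary.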
  • Tool schema documents both actions in the allowed values enum.
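
One possible shape for the extended schema, written as a JSON-Schema-style dictionary with a small validator. This is only a sketch: the real tool's schema format is not shown in this issue, so `SUBAGENTS_SCHEMA` and `validate_call` are hypothetical names, and the property layout is an assumption based on the arguments listed above.

```python
# Hypothetical parameter schema; only action='list' exists today.
SUBAGENTS_SCHEMA = {
    "type": "object",
    "properties": {
        "action": {
            "type": "string",
            "enum": ["list", "kill", "reap-stale"],
        },
        "target": {
            "type": "string",
            "description": "runId, sessionKey, or taskName (used by 'kill')",
        },
        "olderThanMs": {
            "type": "integer",
            "minimum": 0,
            "default": 600000,
            "description": "Staleness threshold (used by 'reap-stale')",
        },
    },
    "required": ["action"],
}


def validate_call(args: dict) -> dict:
    """Check a tool call against the sketch schema and fill in defaults."""
    props = SUBAGENTS_SCHEMA["properties"]
    action = args.get("action")
    if action not in props["action"]["enum"]:
        raise ValueError(f"unknown action: {action!r}")
    if action == "kill" and not args.get("target"):
        raise ValueError("action='kill' requires a target")
    normalized = dict(args)
    if action == "reap-stale":
        normalized.setdefault("olderThanMs", props["olderThanMs"]["default"])
    return normalized


# Usage: the default threshold is applied when olderThanMs is omitted.
call = validate_call({"action": "reap-stale"})
```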

Metadata

    Labels

    • P2: Normal backlog priority with limited blast radius.
    • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
    • clawsweeper:needs-maintainer-review: ClawSweeper marked this issue as needing maintainer review before automation.
    • clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
    • clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
    • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
    • impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
    • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
