You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Today delegate_task is a synchronous tool call that returns only the final message from the child. The parent orchestrator has no visibility into what the child is doing mid-flight, and no way to interrupt a clearly-off-track subagent until the child ends or a hard timeout fires.
This shows up in two failure modes:
Child walked off-road — parent asked for "fix the import in src/X.py", child spent 90s refactoring unrelated modules. Parent can only find out after paying the full wall-time and token cost.
Child stuck on rate-limited upstream — during MiniMax peak hour (15:00-17:30 Asia/Shanghai), 529 retries can stall a single claude -p subprocess for 30-90s per call. Parent has no signal the child is throttled vs. genuinely working.
Proposal
Two-part addition, minimal enough to not disturb the existing blocking contract:
Streams the child's token output, tool calls, and tool results to the parent as they happen. Parent can interrupt by closing the generator.
2. Interrupt contract
On cancellation:
Child's in-flight tool call is allowed to finish (file writes shouldn't be half-applied).
Child's subprocess receives a structured "stop" message — graceful shutdown, not SIGKILL.
Child's partial work + any tool-result side effects are returned to the parent as a final event so the orchestrator can decide what to keep.
Why it matters
Wasted-work recovery — a parent that can see the first 200 tokens of reasoning can often tell if delegation is about to fail, and reroute before paying full cost. In the delegation benchmark this was the single clearest signal that could have saved the 234s task10 trial.
Rate-limit awareness — streaming naturally exposes "child is retrying on 529" vs. "child is generating tokens," which today look identical from the parent's side.
Doesn't require the child to literally stream from the model provider — even a line-buffered relay of the child's tool-call log would cover 80% of the value.
Interrupt behavior on the claude -p proxy shim can land in two steps: first emit "stop" via the proxy's HTTP API so the session ends cleanly; later add true in-flight token cancellation.
Evidence
During internal peak-hour testing of a skill that delegates through the Anthropic-compat shim, a stuck-but-retrying claude -p call held the orchestrator for ~130s with zero visibility; the parent eventually timed out and lost the partial work. Streaming + cancel would have let the parent bail at 20s and reroute.
The delegation benchmark (30 runs) shows strategy B (uniform delegation) losing wall-time mostly because it can't unchoose a bad delegation mid-flight; a streaming + interrupt primitive is what would let a hybrid strategy recover.
Problem
Today
delegate_taskis a synchronous tool call that returns only the final message from the child. The parent orchestrator has no visibility into what the child is doing mid-flight, and no way to interrupt a clearly-off-track subagent until the child ends or a hard timeout fires.This shows up in two failure modes:
Child walked off-road — parent asked for "fix the import in src/X.py", child spent 90s refactoring unrelated modules. Parent can only find out after paying the full wall-time and token cost.
Child stuck on rate-limited upstream — during MiniMax peak hour (15:00-17:30 Asia/Shanghai), 529 retries can stall a single
claude -psubprocess for 30-90s per call. Parent has no signal the child is throttled vs. genuinely working.Proposal
Two-part addition, minimal enough to not disturb the existing blocking contract:
1.
delegate_task_stream(new)Streams the child's token output, tool calls, and tool results to the parent as they happen. Parent can interrupt by closing the generator.
2. Interrupt contract
On cancellation:
Why it matters
Non-goals / scope
claude -pproxy shim can land in two steps: first emit "stop" via the proxy's HTTP API so the session ends cleanly; later add true in-flight token cancellation.Evidence
claude -pcall held the orchestrator for ~130s with zero visibility; the parent eventually timed out and lost the partial work. Streaming + cancel would have let the parent bail at 20s and reroute.