feat(auto-reply): run-generation fence for stronger interruptibility (refs #70319)#70363
feat(auto-reply):
run-generation fence for stronger interruptibility (refs #70319)#70363darconadalabarga wants to merge 3 commits into
Conversation
Introduce a per-session generation counter that complements the existing AbortController-based cancellation in replyRunRegistry. Each ReplyOperation now captures the session's current generation at run begin and exposes isCurrent() so downstream emission and tool-dispatch sites can fence stale side effects after an abort or new-message takeover. - src/auto-reply/reply/run-generation.ts: new registry (get/increment/ isCurrent/forget), global-singleton backed so it survives split chunks. - src/auto-reply/reply/reply-run-registry.ts: capture generation on createReplyOperation; bump generation in abortWithReason so any abortByUser / abortForRestart path invalidates the captured value. - src/auto-reply/reply/abort.ts: bump generation in tryFastAbortFromMessage so /stop fences late output even if the registry lookup misses. - src/auto-reply/reply/run-generation.test.ts: 16 tests covering the registry, registry wiring, and the stale-output / pre-tool / typing fence patterns described in CLAUDE_CODE_PROMPT.md tests 1, 2, 3, 5. - src/auto-reply/reply/agent-runner-execution.test.ts: update the ReplyOperation mock to include the two new fields. Piece D (SIGTERM->SIGKILL escalation) is already covered by the existing src/process/kill-tree.test.ts suite; no change needed. Pieces B (pre-tool gate), C (emission-site fences) and E (new-message takeover) are deferred to follow-up commits: each invasive emission site needs dedicated wiring that deserves its own focused review. Inspired by patterns from Hermes (NousResearch/hermes-agent). AI-assisted: yes (Claude Code / Opus 4.7)
Adds an opt-in stale-output fence parameter to createBlockReplyDeliveryHandler so callers can bail on late block replies from superseded runs. Pass `() => replyOperation.isCurrent()` to wire the fence to the run-generation registry introduced in the previous commit. Pure opt-in — existing callers that don't pass isRunCurrent keep current behavior. New test covers that the fence drops a post-abort delivery.
Pass `() => replyOperation.isCurrent()` into createBlockReplyDeliveryHandler so the stale-output fence is now active in the real runtime, not just in tests. A block reply from a superseded run (after /stop or a new user message) is silently dropped before hitting the channel. This is the first site where the generation registry actually changes runtime behavior. All 1174 auto-reply tests continue to pass.
|
Thanks for the context here. I swept through the related work, and this is now duplicate or superseded. Keep this PR open: it contains meaningful interruptibility work, but it is still draft/dirty, lacks real behavior proof, and now needs to be reconciled with current main's adjacent foreground-delivery fence before merge. Canonical path: Close this stale PR. The latest review rated it F, the branch still lacks merge-ready proof, and there has been no human follow-up after the durable review. So I’m closing this here because the remaining work is already tracked in the canonical issue. Review detailsBest possible solution: Close this stale PR. The latest review rated it F, the branch still lacks merge-ready proof, and there has been no human follow-up after the durable review. Do we have a high-confidence way to reproduce the issue? No. Source inspection shows current main lacks this PR's run-scoped block-reply fence, but it also has a newer foreground delivery fence, so the remaining stale-output gap still needs live reproduction or after-fix proof. Is this the best way to solve the issue? Unclear. The run-generation foundation is plausible, but current main already has dispatch-level foreground delivery fencing and this PR only wires block replies, so maintainers should align the contracts or require proof that a separate run-level counter is necessary. Security review: Security review cleared: The diff adds in-memory cancellation bookkeeping and tests without changing dependencies, CI, secrets, auth, downloaded artifacts, or package execution paths. AGENTS.md: found and applied where relevant. What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against fa614d0907e8. |
|
ClawSweeper PR egg 🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat. Where did the egg go?
|
|
This pull request has been automatically marked as stale due to inactivity. |
|
ClawSweeper applied the proposed close for this PR.
|
Closes (partially) #70319.
Summary
Introduces a per-session run-generation counter that layers on top of the existing abort primitives (
replyRunRegistry,abortEmbeddedPiRun,chat.abort,cancelScope) to provide a unified invalidation signal downstream code can consult before producing side effects.Goal from the issue: after
/stopor a new user message, the superseded run must not produce more output — tool calls, deltas, typing, final replies.This PR ships the foundation (Pieces A + partial C from the issue's implementation sketch) and wires one real emission site so the fence is live.
What's in this PR
Piece A — Run generation registry (new)
src/auto-reply/reply/run-generation.ts:getCurrentGeneration,incrementGeneration,isCurrentGeneration,forgetGeneration. Global-singleton backed.Piece A wiring —
ReplyOperationcarries the generationReplyOperation.runGenerationcaptured atcreateReplyOperation.ReplyOperation.isCurrent()convenience wrapper overisCurrentGeneration(key, runGeneration).abortWithReasonin the registry bumps the generation, so anyabortByUser/abortForRestartpath invalidates the captured value.Fast-abort integration
tryFastAbortFromMessageinabort.tsalso bumps generation up-front so/stopfences late output even when the registry lookup misses (race between end-of-run and the stop message).Piece C — Stale-output fence (partial, opt-in)
createBlockReplyDeliveryHandleraccepts an optionalisRunCurrent?: () => booleanparameter. When provided and returns false, the block reply is silently dropped before hitting the channel. Callers pass() => replyOperation.isCurrent().Live wiring
agent-runner-execution.tspasses the fence callback so post-abort block replies are actually dropped in the real runtime.Piece D — SIGTERM→SIGKILL escalation
Already covered by
src/process/kill-tree.test.ts. No change needed.What's explicitly deferred
dispatch.ts/queue.tsintegration. The primitive (incrementGeneration) is in place.typing.ts, reply-delivery for non-block path,followup-delivery.ts. Same opt-in shape.Tests
src/auto-reply/reply/run-generation.test.ts— 17 tests covering:ReplyOperationgeneration wiring (capture at begin, flip on abort, next run captures new gen).createBlockReplyDeliveryHandler.pnpm buildalso passes clean.Constraints honored
AGENTS.md/CONTRIBUTING.md/CLAUDE.md.any.Happy to break into smaller commits or reshape if the approach lands differently than maintainers prefer.