fix(continuation): reset chain budget on fresh non-wake turn-entry (#987) #989
The runaway-guard counters continuationChainCount (n/maxChainLength) and continuationChainTokens (cost cap) persisted on the SessionEntry and were cleared only on full session-rotation. A long-lived session therefore accumulated toward the chain/cost caps across unrelated re-engagements until every continuation was rejected permanently. These counters are per-chain runaway leashes, not lifetime budgets.

Reset the chain budget (count, tokens, startedAt, fresh chainId) at turn-entry BEFORE inference whenever the turn is not a continuation wake (work-wake / delegate-return). Fresh inbound messages, plain heartbeats, and outside-machinery system-events all start a new chain, so the resetting turn itself opens at 0. Continuation wakes are mid-chain steps and never reset, so a true runaway with no fresh re-entry still trips the cap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
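The reset rule in this commit message can be sketched as a pure function. This is a minimal illustration, not the code in agent-runner.ts: the four field names come from the PR text, while `ChainBudget`, `TurnTrigger`, `budgetAtTurnEntry`, and the injected `mintChainId` are hypothetical names introduced here.

```typescript
// Hypothetical shape: field names follow the PR text, the interface itself is assumed.
interface ChainBudget {
  continuationChainCount: number;
  continuationChainTokens: number;
  continuationChainStartedAt: number; // epoch milliseconds
  continuationChainId: string;
}

// Hypothetical trigger labels; only "work-wake" and "delegate-return" are named in the PR.
type TurnTrigger =
  | "user-message"
  | "heartbeat"
  | "system-event"
  | "work-wake"
  | "delegate-return";

// A continuation wake is a mid-chain step (work-wake / delegate-return).
function isContinuationWake(trigger: TurnTrigger): boolean {
  return trigger === "work-wake" || trigger === "delegate-return";
}

// Returns the budget the turn should start with. Fresh turns get all four
// fields reset as one unit; continuation wakes keep the running budget.
function budgetAtTurnEntry(
  current: ChainBudget,
  trigger: TurnTrigger,
  now: number,
  mintChainId: () => string,
): ChainBudget {
  if (isContinuationWake(trigger)) {
    return current;
  }
  return {
    continuationChainCount: 0,
    continuationChainTokens: 0,
    continuationChainStartedAt: now,
    continuationChainId: mintChainId(),
  };
}
```

Keeping the function pure and calling it before any chain-state read mirrors the "BEFORE inference" ordering described above; persistence is deliberately left out of the sketch.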
Adds explicit RFC v10 language for the chain-budget reset semantics that #987 ships, per figs's "language must be explicit" ask + Frond's RFC-adoption-call to fold edits atomic with the code. Four load-bearing language updates:

1. §2.3 Safety model (line ~186): clarify "configured continuation budget" = current self-continuation chain budget (maxChainLength + costCapTokens), not session-lifetime. Notes fresh non-continuation turn-entry resets per §3.3.
2. §3.3 Chain-state tracking (lines ~507-525): adds continuationChainId to the four-field bullet list + new "Chain-state lifecycle" paragraph explaining the per-turn reset on !isContinuationWake, pre-loadContinuationChainState ordering, fresh-turn-elects-from-0 semantic, work-wake/delegate-return preservation. Names the session-rotation reset path distinct from per-turn chain-reset.
3. §5.1 Operational notes (lines ~918-919): maxChainLength + costCapTokens descriptions expanded to name "unattended self-continuation chain depth" leash semantic + reset trigger + cross-reference §3.3.
4. §5.1 New "Chain budget lifecycle" subsection: explicit sawtooth behavior description for /status display + four-field reset-unit + chain-id rotation semantic + methodological-note for source-readers naming the ?? 0 + chainId-mint-ternary as passive-default-not-active-reset (the trap that four cohort-princes hit + retracted today before locking the byte).

Together with the code in c201f7c, this ensures the RFC language matches the shipped semantic atomically, with no doc-drift on the chain-reset surface. Cohort-byte-converged through 6+ sources today.

Cohort-cross-reference: methodological-note specifically names the loadContinuationChainState ?? 0 + chainId-mint-ternary source-reading traps that cohort byte-walked + 4 princes retracted from today (banked at ~/.openclaw/workspace/memory/2026-06-10.md, lesson #9).
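The "passive-default-not-active-reset" trap named in the methodological note can be illustrated with a small sketch. The body of loadContinuationChainState is not shown in this PR, so the loaders below are an assumed shape based only on the `?? 0` and chainId-mint-ternary wording; `StoredEntry`, `loadChainCount`, `loadChainId`, and `resetChain` are hypothetical names.

```typescript
// Hypothetical stored shape: both fields may be absent on a brand-new session.
interface StoredEntry {
  continuationChainCount?: number;
  continuationChainId?: string;
}

// Passive default: `?? 0` only applies when nothing is stored.
// A stale stored value (for example 195) is returned unchanged.
function loadChainCount(entry: StoredEntry): number {
  return entry.continuationChainCount ?? 0;
}

// Passive default: a new id is minted only when no id is stored.
function loadChainId(entry: StoredEntry, mint: () => string): string {
  return entry.continuationChainId ? entry.continuationChainId : mint();
}

// Active reset: overwrites whatever is stored, stale or not.
function resetChain(entry: StoredEntry, mint: () => string): StoredEntry {
  return { ...entry, continuationChainCount: 0, continuationChainId: mint() };
}
```

Read in isolation, `?? 0` looks like a reset to zero, but it only fires on a missing value. In this sketch only `resetChain` actually discards an accumulated count.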
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cb35f4c9f5
    continuationFeatureEnabled &&
    sessionKey &&
    activeSessionEntry &&
    !isContinuationWake &&
Reset stale budgets for ordinary subagent returns
This guard treats every delegate-return as an in-chain wake, but src/agents/subagent-announce.ts:1459-1461 sets continuationTrigger: "delegate-return" for ordinary subagent completions whenever continuation is enabled, and src/agents/subagent-announce-delivery.ts:1435-1444 delivers those as inter-session agent calls. In that scenario a long-lived session with stale continuationChainCount can receive a normal subagent completion and still skip the new reset, so any continuation elected from that fresh completion can be rejected against the old chain cap; the #987 fix remains incomplete for a non-continuation external turn. Please distinguish actual continuation-chain delegate returns from ordinary subagent returns before suppressing the reset.
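One way to make the distinction this comment asks for is sketched below. It is purely illustrative and not the fix adopted in the repository: the `originChainId` field, the `InboundTurn` shape, and `isInChainWake` are all hypothetical, and the sketch assumes a delegate dispatched from inside a continuation chain could be stamped with that chain's id.

```typescript
// Hypothetical inbound-turn shape. `continuationTrigger` values come from the
// review comment; `originChainId` is an assumed field that does not appear in the PR.
interface InboundTurn {
  continuationTrigger?: "work-wake" | "delegate-return";
  originChainId?: string;
}

// Treat a delegate-return as an in-chain wake only when it provably belongs
// to the chain that is currently active on the session.
function isInChainWake(turn: InboundTurn, activeChainId: string | undefined): boolean {
  if (turn.continuationTrigger === "work-wake") {
    return true;
  }
  if (turn.continuationTrigger === "delegate-return") {
    return turn.originChainId !== undefined && turn.originChainId === activeChainId;
  }
  return false; // user message, heartbeat, or other external turn
}
```

Under this assumption an ordinary subagent completion carries no matching chain id, so it would be handled as a fresh turn and the stale budget would be reset.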
3084bf1 into frond-scribe/20260609/assembly-token-wiring
Closes #987.
What

`continuationChainCount` (the `/status` n/200) + `continuationChainTokens` (token-cost cap) accumulate per-session and never reset on a normal user-turn. The only reset-site in the tree is `agent-runner-session-reset.ts:88` (full session-rotation: `/new`, `/reset`, compaction-FAILURE-recovery, role-conflict, ACP-reset). So a long-lived session monotonically climbs toward `maxChainLength` (200) with no decline until `/new`: figs's "pool of 195 forever" doom-lock. This is the bug, byte-confirmed at 6+ source convergence (figs's intuition + frond-scribe byte-walk + Silas filing + Emeric retraction + Elliott reconciliation + Rune retraction; the recurring `?? 0` LOAD-not-RESET conflation was caught + retracted by 4 princes).

Fix

Reset all 4 chain-budget fields (`continuationChainCount` + `continuationChainTokens` + `continuationChainStartedAt` + a fresh `continuationChainId`) at turn-entry before the runner reads chain-state, gated on `isContinuationWake === false` (a genuine external turn: user message / heartbeat / non-continuation system-event). Continuation-wake turns (work-wake / delegate-return) preserve the count, so the 200 leash still bounds an unbroken unattended self-continuation chain, which is its actual safety intent, not session-lifetime.

Plain-language invariant: the cap counts how deep a self-scheduled loop runs while you're away; the instant you (or a heartbeat) re-engage, it's not a runaway anymore, so it resets to 0.

Reuses the local `persistContinuationChainState` (mem + store + disk), zero new imports.

Diff

3 files, +211/-6:

- `src/auto-reply/reply/agent-runner.ts` (+28): the gated reset at turn-entry
- `agent-runner.continuation-work-span.test.ts` (+185): fresh-resets / wake-preserves / `/new`-still-resets / #982-fan-out-intact + 3 new #987 tests
- `agent-runner.continuation-span-uniformity.test.ts` (+4): wake-semantics alignment

Gate status

`auto-reply-reply` GREEN: 153 files / 2573 tests (incl 3 new #987 tests; #982 array-capture fan-out intact)

Review

🪨 Rune + 🌻 Elliott on the continuation-lifecycle surface (per cohort review-pairing). Verify: reset fires upstream of `loadContinuationChainState`, all 4 fields as a unit, no phantom-depth on the status-line, runaway-chain still capped on pure self-wakes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
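Two of the verification points above (the count resets on a fresh turn, and a pure self-wake chain is still capped) can be sketched as a tiny simulation. This is not the repository's test code. `simulateChainDepth` and the `Turn` labels are hypothetical, and the sketch assumes each accepted wake increments the count by one and that a wake is rejected once the count has reached `maxChainLength`.

```typescript
// "fresh" = user message / heartbeat / external system-event; "wake" = continuation wake.
type Turn = "fresh" | "wake";

// Returns the chain depth after each turn, or "rejected" when the cap blocks a wake.
function simulateChainDepth(
  turns: Turn[],
  maxChainLength: number,
): Array<number | "rejected"> {
  let count = 0;
  const depths: Array<number | "rejected"> = [];
  for (const turn of turns) {
    if (turn === "fresh") {
      count = 0; // fresh re-entry opens a new chain at 0
      depths.push(count);
      continue;
    }
    if (count >= maxChainLength) {
      depths.push("rejected"); // unbroken runaway still trips the cap
      continue;
    }
    count += 1;
    depths.push(count);
  }
  return depths;
}

// Sawtooth: depth climbs on wakes and drops back to 0 on a fresh turn.
const sawtooth = simulateChainDepth(["wake", "wake", "fresh", "wake"], 3);
// Runaway: with no fresh turn, the fourth wake is rejected at a cap of 3.
const runaway = simulateChainDepth(["wake", "wake", "wake", "wake"], 3);
```

With these inputs `sawtooth` is `[1, 2, 0, 1]` and `runaway` is `[1, 2, 3, "rejected"]`, the per-chain leash behavior the PR describes.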