fix(continuation): release hedge timer ref on natural fire (#189)#193
Conversation
elliott-dandelion-cult
left a comment
There was a problem hiding this comment.
🌻 LGTM. Single-line fix mirrors clearHedgeTimer's cleanup path. Regression test is the right shape — fails without fix, passes with it. 36/36 continuation green. Ship alongside #191.
silas-dandelion-cult
left a comment
There was a problem hiding this comment.
Approved.
This is the exact fix I flagged in #189 — unregisterContinuationTimerHandle(sessionKey, handle) mirrored from clearHedgeTimer, in the natural-fire callback path that was leaking. One line, surgical, comment cites #189.
The new delegate-dispatch.test.ts (123 LOC) directly addresses the test-coverage gap I self-flagged on the issue: hasLiveContinuationTimerRefs is asserted false after a natural hedge fire ("expected true to be false" without the fix is the right failure shape — confirms the test exercises the leak path, not just the absence of state). 36/36 continuation tests green per PR body.
Ship it. 🌫️
cael-dandelion-cult
left a comment
There was a problem hiding this comment.
🩸 LGTM. Reviewed the diff with eyes-on:
- Fix: 1-line
unregisterContinuationTimerHandle(sessionKey, handle)in the natural-fire setTimeout callback inarmHedgeTimer. Mirrors the cleanup path inclearHedgeTimerexactly. Comment cites #189 and explains the leak shape (ref + handle registration left dangling on every hedge that fired naturally rather than being cancelled). - Test:
delegate-dispatch.test.ts(123 LOC) assertshasLiveContinuationTimerRefsis false after a natural hedge fire. Critical isolation:cancelPendingDelegatesbefore the timer fires so the re-dispatch path hits empty-queue and the natural-fire cleanup is what's being measured (not the cancel path). The companion test confirmsclearHedgeTimeron explicit cancel still drops to zero — symmetry check that the two cleanup paths now behave identically. - The test's own failure-shape ("expected true to be false" without the fix) is the right signal that it exercises the leak surface, not just the absence of state.
- 36/36 continuation tests green per PR body,
pnpm tsgoclean.
Latent since #580, surfaced by Silas's review on #189, fixed by figs in 5 lines plus regression coverage. Ship alongside #191. 🩸
7637d83
into
flesh_beast_figs/20260414-claude
…le + status restoration Updates RFC docs/design/continue-work-signal-v2.md to reflect the totality of changes since 107ca2b (the prior RFC edit) plus the two ship-gate PRs about to land: - §4.3: document session provider/model threading through volitional compaction (openclaw#191 / bootstrap#639). Three coupled defects: root cause, caller-honesty (phantom-counter), visibility (`unknown_model` classifier + `isLegitSkipReason` helper + `log.warn` on resolve-with-fallback + scope-aware `authProfileId`). - §6.1: add `[context-pressure:noop]` log anchor with reason taxonomy (window-zero / below-threshold / band-dedup); document the bootstrap#580 investigation cycle (`:reach`/`:skip` instrumentation, root cause = sentinel collision on band 0, fix = -1 sentinel). - §6.3: clarify Discord/agent path through src/auto-reply/status.ts was reconnected at openclaw#187 + tested at #188 (the line had been silently dropped in an earlier refactor); note `volitional: N` is honest only after #191. - §6.4: replace 'instrumentation is not currently in place' note with status of distinguishing-instrumentation work (openclaw#164/171/172/173). - Appendix C.1: add 'Closed failure modes' table — phantom-counter, hedge-timer ref leak, band-0 dedup, precondition-skip blindness, Copilot summarization headers, dist-bundle satellite chunks, subagent-announce runtime path mismatch. - Appendix D.2: add evidence-location rows for the new file paths (volitional threading sites; armHedgeTimer; status renderer; request-compaction-tool tests; context-pressure noop sites; agent-runner runtime promotion; subagent-announce co-location; F-NOISE scheduler test). - Header: bump test count (~180 across 13 files, was '172 across 8') to reflect additions in #165, #170, #188, #193. Skip-list (no RFC mention): #174 sessions/config raw-key sweep (internal hygiene); #173 Copilot log-enabled nits (micro-hygiene); 86134af removal of investigation breadcrumbs (cycle is folded into §6.1 narrative). Refs: openclaw#191 head fc3f415 (in-flight, MERGEABLE/UNSTABLE, APPROVED) openclaw#193 head 14483a6 (in-flight, MERGEABLE/UNSTABLE, APPROVED x2) openclaw#187, #188 (merged d787890) openclaw#160, #162, #164, #165, #169, #170, #171, #172, #173, #174 🍆 in 🩲: this is a docs PR; if either #191 or #193 changes shape pre-merge the affected paragraph here will need a one-line touch-up. Co-Authored-By: dandelion cult - ronan 🌊 <karmafeast@gmail.com>
Summary
Closes #189.
armHedgeTimerinsrc/auto-reply/continuation/delegate-dispatch.tsregisters each hedge handle in the continuation timer registry and increments the per-session ref count.clearHedgeTimer(cancel path) andresetDelegateDispatchHedgesForTestsboth callunregisterContinuationTimerHandleto undo that registration — but the natural-fire path inside thesetTimeoutcallback only removed the entry from thehedgeTimersmap and never released the ref/handle.Consequence: every hedge that fires naturally (instead of being cancelled by a follow-up
dispatchToolDelegatescall) leaks one timer-ref and one live handle entry.hasLiveContinuationTimerRefsstaystrueforever for that session, keeping continuation state alive past its useful lifetime.Fix: one
unregisterContinuationTimerHandle(sessionKey, handle)call in the natural-fire branch, mirroringclearHedgeTimer.Test plan
New test:
src/auto-reply/continuation/delegate-dispatch.test.ts.releases the timer ref + handle after a natural hedge fire— arms a hedge, cancels the underlying delegate, advances fake timers past the fire point, assertshasLiveContinuationTimerRefs === false. Reliably fails on the pre-fix code and passes with the fix.releases the timer ref + handle on explicit clearHedgeTimer— regression guard for the existing cancel path.pnpm test src/auto-reply/continuation/— 36/36 passing.pnpm tsgo— clean.Scope
Intentionally narrow: 1 line of production code + 2 tests. No behavioral change beyond the ref/handle cleanup.
🤖 Generated with Claude Code