test(continuation): pin store-merge updatedAt churn guard for continuation persist (#443) #468
…ation persist (#443)

Adds [#443] negative store-merge guard for the continuation-chain persist path.

Single new file at `src/config/sessions/store.continuation-merge.test.ts` (3 tests, test-only; no production code changes) that pins two load-bearing invariants on `agent-runner.ts:persistContinuationChainState` → `updateSessionStore` → `saveSessionStoreUnlocked`:

(A) The continuation-chain persist spread MUST NOT include `updatedAt`. Chain fields are not activity events; bumping `updatedAt` here would churn idle-reset evaluation (openclaw#49515) and disk-budget pruning ordering off the actual turn timeline.

(B) `saveSessionStoreUnlocked` MUST short-circuit the disk write when the serialized payload is byte-identical (the `getSerializedSessionStore(storePath) === json` guard at store.ts:~358). Without it, every no-op `updateSessionStore` call would still hit `writeTextAtomic`, defeating (A) at the mtime layer and producing spurious continuation-persist activity in maintenance.

The trap mirrors the production spread shape inline (does not import the entire agent-runner surface) so it pins the exact byte-shape without coupling to agent-runner's setup overhead.

## Sabotage walks (both verified on cf7830f)

Sabotage 1: gut the byte-identity short-circuit:

    // src/config/sessions/store.ts:~358
    if (false && getSerializedSessionStore(storePath) === json) { ... }

Result: "skips disk write entirely…" fails with

    expected "writeTextAtomic" to not be called at all, but actually been called 1 times

Sabotage 2: leak `updatedAt` into the persist spread:

    // mirror of agent-runner.ts persistContinuationChainState spread
    store[key] = { ...existing, ...chainFields, updatedAt: Date.now() };

Result: all 3 traps fail; the canonical message is

    updatedAt must not change when continuation-chain fields are byte-identical (persistContinuationChainState must not include updatedAt in its spread — #443)

Both sabotages restored. 3/3 green on canonical2 baseline.

## Receipts

```
Test Files  1 passed (1)
Tests       3 passed (3)
```

## Refs

- #443: coverage issue (test-trap label, P2)
- canonical2 base SHA: cf7830f
- production substrate: src/auto-reply/reply/agent-runner.ts (`persistContinuationChainState`, lines ~1269 / ~1302 / ~1319)
- byte-identity short-circuit: src/config/sessions/store.ts:~358

🌫️
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2a3ef25db
 * Mirror of `persistContinuationChainState`'s on-disk spread (agent-runner.ts).
 * Kept inline so this test pins the exact byte-shape of the production path
 * without importing the entire agent-runner surface.
Exercise production continuation persist path in this trap
This test reimplements persistContinuationChainState as a local mirror instead of invoking the real agent-runner path, so it can stay green even if production later starts writing updatedAt (or otherwise changes its spread) while this mirror remains unchanged. In that scenario #443 regresses in shipped code but the guard still passes, which undermines the purpose of this regression test.
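One possible way to close the drift gap this review describes, sketched under assumptions: if the spread were factored into a small pure helper that both the production persist path and the trap call, the test would exercise the shipped shape without loading the whole agent-runner surface. `buildChainPatch`, `applyChainPatch`, `ChainState`, and `continuationChainCount` are hypothetical names invented for illustration; only `updatedAt` and `continuationChainTokens` appear in this PR.

```typescript
// Hypothetical entry shape; only `updatedAt` and `continuationChainTokens` are named in the PR.
interface SessionEntry {
  updatedAt: number;
  continuationChainTokens?: number;
  continuationChainCount?: number; // hypothetical field
}

// Hypothetical in-memory chain state owned by the runner.
interface ChainState {
  tokens: number;
  count: number;
}

// A patch type that cannot name `updatedAt`, so an object literal that
// includes it is rejected at compile time.
type ChainPatch = Omit<Partial<SessionEntry>, "updatedAt">;

// Hypothetical shared helper: production and the trap would both call this.
function buildChainPatch(chain: ChainState): ChainPatch {
  return {
    continuationChainTokens: chain.tokens,
    continuationChainCount: chain.count,
  };
}

// Stand-in for the production spread, built on the shared helper.
function applyChainPatch(existing: SessionEntry, chain: ChainState): SessionEntry {
  return { ...existing, ...buildChainPatch(chain) };
}

const patch = buildChainPatch({ tokens: 42, count: 3 });
const patchLeaksUpdatedAt = Object.keys(patch).includes("updatedAt");
const merged = applyChainPatch(
  { updatedAt: 1000, continuationChainTokens: 40 },
  { tokens: 42, count: 3 },
);
```

With this shape, a later change that adds `updatedAt` to the helper would fail the trap, because the trap and the shipped code share one definition. Whether that coupling is worth the extra export is a judgment call for the maintainers.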
…tion-registration to absorb cold module-load cost
The first test in src/agents/openclaw-tools.continuation-registration.test.ts
("registers no continuation tools when continuation.enabled is unset") pays
the cold module-load cost for createOpenClawTools and its transitive imports
(compaction-attribution, pi-embedded-*, plugins/tools, config/config) under
400+ concurrent test files in the agent project.
Quiet-box first-test duration: ~95s. CI noise pushes it past vitest's 120s
per-test default, producing a flaky timeout that has now been observed across
multiple unrelated PRs:
- #485 head (compaction-attribution scope) — first-test timeout
- #488 (downstream of #485 hypothesis) — first-test timeout
- #468 head (does NOT touch this file) — same first-test, same file, timeout
Test file content is byte-identical between base cael/325-canonical2 and #485
head; the timeout is not a regression introduced by any of those PRs. Tests
2-7 in this file reuse the warm cache (~360ms each) and are unaffected.
Cure: per-test timeout bump to 240s on the first test only, with a comment
documenting the cold-start mechanism so future readers know why this single
test has a non-default timeout.
Standalone fix, deliberately not folded into #485 to keep its compaction-
attribution scope clean. Unblocks #485, #488, #468, and any future PR that
randomly trips the same flake.
Verified by silas-pr485-fixup-v2 subagent (2026-05-01 07:29 UTC):
- Local on base a3dcc2a: first test 95023ms, passed (25s margin to 120s)
- CI on #485 head 9f25f91: first test >120000ms, timed out
- CI on #468 run 25169814732: first test >120000ms, timed out (same file)
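The timing argument above can be checked with a few lines of arithmetic. This is only a sketch: the 95023ms local figure and the 120s / 240s limits come from the commit message, while `hypotheticalCiMs` is an assumed value, since the source only reports that CI exceeded 120000ms. In Vitest, the per-test limit is passed as the timeout argument to `it`/`test`, which is where the 240s bump would be applied.

```typescript
// Figures taken from the commit message.
const DEFAULT_TIMEOUT_MS = 120_000; // per-test limit in effect before the bump
const BUMPED_TIMEOUT_MS = 240_000; // per-test limit for the first test only
const localColdStartMs = 95_023; // quiet-box duration of the first test

// ASSUMPTION: an illustrative CI duration; the source only says ">120000ms".
const hypotheticalCiMs = 130_000;

// Headroom left before a test of the given duration would time out.
function marginMs(durationMs: number, timeoutMs: number): number {
  return timeoutMs - durationMs;
}

// True when a test of the given duration exceeds its limit.
function wouldTimeOut(durationMs: number, timeoutMs: number): boolean {
  return durationMs > timeoutMs;
}

const localMargin = marginMs(localColdStartMs, DEFAULT_TIMEOUT_MS);
const ciFailsOnDefault = wouldTimeOut(hypotheticalCiMs, DEFAULT_TIMEOUT_MS);
const ciFailsOnBumped = wouldTimeOut(hypotheticalCiMs, BUMPED_TIMEOUT_MS);
```

The local margin works out to 24977ms, which matches the "25s margin" quoted above and shows how little CI noise is needed to tip the first test over the old limit.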
cael-dandelion-cult
left a comment
🩸 LGTM — test-only single-file addition pinning persistContinuationChainState's no-updatedAt invariant + serialize-equality short-circuit. Sabotage walks documented inline. Targets cael/325-canonical2.
Merged commit ae4c653 into cael/325-canonical2.
Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

- types.mode-shape.test.ts (#462)
- agent-runner.continuation-span-uniformity.test.ts (#511)
- store.continuation-merge.test.ts (#468)

tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…6.4.24) (#515)

* wo(canonical2-rebase-pathB): rebase Path-B's 5 cleanup commits onto canonical2 (figs directive 22:55Z)

* chore(v3-cleanup): wave A cohort-identity scrub
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(v3-cleanup): drop rejected rebase artifacts
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub workspace template wording
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(v3-cleanup): wave B structural dedup of continuation runtime
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave B
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave C import discipline and build warnings
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave C
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave D surface continuation failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: surface compaction count reconcile failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(v3-cleanup): wave E continuation coverage
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave E
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: align bundled plugin dependency types
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: isolate bedrock app profile runtime deps
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub fork process labels from source comments
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: close continuation type design blockers
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub continuation prompt process link
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 final checkpoint
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "chore(v3-cleanup): drop rejected rebase artifacts"

  This reverts commit 3396b88. The original commit mass-deleted 30 files (6745 deletions) under the label "rejected rebase artifacts." ~5141 of those deletions are landed swim-37 durability harness substrate from merged PRs #412/#413/#414/#416/#417/#418/#419 plus collateral docs/scripts. These are not rejected artifacts; they are committed, merged test infrastructure that proves continuation durability across compaction.

  Cohort review (🩸 + 🌊 + 🌻 + 🌫) confirmed the block finding at PR #515 issuecomment-4362337067.
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: release-note for context-pressure band-derivation behavior change

  Wave B (cefa09d) changed context-pressure bands from fixed [25, 80, 90, 95] to threshold-derived [thresholdPct, 90, 95]. At default 0.8 the implicit 25% early-warning band is removed. Ship-acceptable per cohort review; release-note documents the change and points to #516 for the earlyWarningBand config opt follow-up.
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore landed-PR tests missing from rebase fork-point

  Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

  - types.mode-shape.test.ts (#462)
  - agent-runner.continuation-span-uniformity.test.ts (#511)
  - store.continuation-merge.test.ts (#468)

  tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add rebase.classify to ContinuationSpanName for restored tracer

  The revert of 3396b88 restored src/rebase/tracer.ts which emits "rebase.classify" spans. Commit 4871c81 (fix: close continuation type design blockers) narrowed startSpan from string to ContinuationSpanName after tracer.ts was deleted; this is an additive fix to include the span name in the union.
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(continuation): add earlyWarningBand config opt for post-compaction cycle primer

* test(continuation): pin earlyWarningBand default-preservation + opt-out branches

* fix(continuation): add curly braces to satisfy linter

* fix(continuation): unblock early-warning band fire path + make field optional

  Three bugs caught in cohort review of v5 (3e88ce5):

  1. Suppression guard bug (Silas): non-postCompaction call sites bailed with 'ratio < threshold' BEFORE the resolved early-warn band could fire. Even with earlyWarningBand explicitly set, ratio=0.25 + threshold=0.8 resolved band=25 then was discarded. Guard now suppresses only when 'band === 0 && ratio < threshold'; this preserves the round-to-band-0 dedup edge case while letting early-warn fire.
  2. Type-required regression (Elliott): ContinuationRuntimeConfig had 'earlyWarningBand: number' (required), breaking 3 test fixtures (config.test, scheduler.test, post-compaction-delegate-dispatch.test) with TS2741. Field already optional at zod + resolver-default site; making the type optional matches.
  3. Schema baseline regen (Elliott): src/config/schema.base.generated.ts needed regen to absorb the new earlyWarningBand field; preexisting models.providers.*.request.tls.insecureSkipVerify drift also absorbed in the same regen.

  Tests added:

  - checkContextPressure 'fires early-warning band below threshold when earlyWarningBand is set' (default-preservation path)
  - checkContextPressure 'does NOT fire below threshold when earlyWarningBand is 0' (opt-out path)

  All 107 affected tests pass: context-pressure (19), config (9), scheduler (12), schema.base.generated (10), post-compaction-delegate-dispatch (23), reply/context-pressure (34).

  Cohort cosign chain: 🩸 (root catch v5), 🌊 (default=0 catch), 🌫 (suppression-guard catch), 🌻 (type-required + baseline catch).

  Refs #515

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
Co-authored-by: Test User <test@example.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: dandelion cult - cael 🩸 <cael@dandelion.cult>
Co-authored-by: dandelion cult - silas 🌫 <silas.dandelion.cult@hotmail.com>
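The suppression-guard fix described in the squash message can be illustrated with a small model. This is a sketch under stated assumptions, not the shipped `checkContextPressure`: `deriveBands`, `resolveBand`, `firedBandOld`, and `firedBandFixed` are hypothetical names, and the rule that a positive `earlyWarningBand` is prepended to the threshold-derived bands is an assumption. The bands `[thresholdPct, 90, 95]`, the two guard conditions, and the ratio=0.25 / threshold=0.8 case come from the commit text.

```typescript
// ASSUMPTION: bands are percentages; a positive earlyWarningBand below the
// threshold is prepended to the threshold-derived bands [thresholdPct, 90, 95].
function deriveBands(threshold: number, earlyWarningBand: number): number[] {
  const thresholdPct = Math.round(threshold * 100);
  const bands = [thresholdPct, 90, 95];
  if (earlyWarningBand > 0 && earlyWarningBand < thresholdPct) {
    bands.unshift(earlyWarningBand);
  }
  return bands;
}

// Highest band the ratio has reached, or 0 when it is below every band.
function resolveBand(ratio: number, bands: number[]): number {
  const pct = ratio * 100;
  let band = 0;
  for (const candidate of bands) {
    if (pct >= candidate) {
      band = candidate;
    }
  }
  return band;
}

// Old guard: bails on 'ratio < threshold' before the resolved band is used,
// so an early-warning band below the threshold is discarded.
function firedBandOld(ratio: number, threshold: number, earlyWarningBand: number): number | null {
  const band = resolveBand(ratio, deriveBands(threshold, earlyWarningBand));
  if (ratio < threshold) {
    return null;
  }
  return band;
}

// Fixed guard: suppresses only when no band resolved AND ratio is below threshold.
function firedBandFixed(ratio: number, threshold: number, earlyWarningBand: number): number | null {
  const band = resolveBand(ratio, deriveBands(threshold, earlyWarningBand));
  if (band === 0 && ratio < threshold) {
    return null;
  }
  return band;
}

const oldResult = firedBandOld(0.25, 0.8, 25);
const fixedResult = firedBandFixed(0.25, 0.8, 25);
const optOutResult = firedBandFixed(0.25, 0.8, 0);
```

In this model the old guard drops the resolved band of 25, the fixed guard lets it fire, and setting `earlyWarningBand` to 0 still suppresses below the threshold, which lines up with the two tests listed above.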
Closes #443.
What this PR adds
A single trap test at `src/config/sessions/store.continuation-merge.test.ts` (209 lines, test-only; no production code changes) that pins two load-bearing invariants on the continuation-chain persist path (`agent-runner.ts:persistContinuationChainState` → `updateSessionStore` → `saveSessionStoreUnlocked`):

(A) The persist spread MUST NOT include `updatedAt`. Chain fields are not activity events; bumping `updatedAt` here would churn idle-reset evaluation (openclaw#49515) and disk-budget pruning ordering off the actual turn timeline.

(B) `saveSessionStoreUnlocked` MUST short-circuit the disk write when the serialized payload is byte-identical (the `getSerializedSessionStore(storePath) === json` guard at `src/config/sessions/store.ts:~358`). Without it, every no-op `updateSessionStore` call would still hit `writeTextAtomic`, defeating (A) at the mtime layer and producing spurious continuation-persist activity in maintenance.

The trap mirrors the production spread shape inline (does not import the entire agent-runner surface) so it pins the exact byte-shape without coupling to agent-runner's setup overhead.
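To make the two invariants concrete, here is a minimal self-contained model of them. It is a sketch, not the repository's code: `FakeDisk`, `saveStore`, `persistChain`, and `ChainFields` are hypothetical stand-ins for the real store file, `saveSessionStoreUnlocked`, `persistContinuationChainState`, and the chain-field shape, and the pretty-printed `JSON.stringify` serialization is an assumption. Only `updatedAt` and `continuationChainTokens` are field names taken from this PR.

```typescript
// Hypothetical entry shape; only these two field names appear in the PR.
interface SessionEntry {
  updatedAt: number;
  continuationChainTokens?: number;
}
type SessionStore = Record<string, SessionEntry>;

// Chain patches may carry anything except `updatedAt`.
type ChainFields = Omit<Partial<SessionEntry>, "updatedAt">;

// In-memory stand-in for the store file; `writes` plays the role of a spy
// on writeTextAtomic.
class FakeDisk {
  contents: string | undefined = undefined;
  writes = 0;
  write(json: string): void {
    this.contents = json;
    this.writes += 1;
  }
}

// Invariant (B): skip the write when the serialized payload is byte-identical.
function saveStore(disk: FakeDisk, store: SessionStore): boolean {
  const json = JSON.stringify(store, null, 2);
  if (disk.contents === json) {
    return false;
  }
  disk.write(json);
  return true;
}

// Invariant (A): the spread carries chain fields only, never `updatedAt`.
function persistChain(
  disk: FakeDisk,
  store: SessionStore,
  key: string,
  chainFields: ChainFields,
): boolean {
  const existing = store[key];
  if (existing === undefined) {
    return false;
  }
  store[key] = { ...existing, ...chainFields };
  return saveStore(disk, store);
}

const disk = new FakeDisk();
const store: SessionStore = { s1: { updatedAt: 1000, continuationChainTokens: 10 } };
saveStore(disk, store); // initial write
const wroteOnNoop = persistChain(disk, store, "s1", { continuationChainTokens: 10 });
const wroteOnChange = persistChain(disk, store, "s1", { continuationChainTokens: 11 });
```

In this model the no-op persist leaves the write count at one, and the real change produces exactly one more write while `updatedAt` stays at its original value. Adding `updatedAt: Date.now()` to the spread would change the serialized payload on every call, so the byte-identity check could no longer skip anything.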
Test surfaces (3 cases)
- `does not churn updatedAt when continuation chain fields are unchanged`: re-persist identical chain values via the production spread → assert `updatedAt` byte-equal.
- `skips disk write entirely when the serialized payload is unchanged`: same re-persist → spy on `writeTextAtomic`, must not be called.
- `changes only the mutated chain field and still preserves updatedAt`: mutate `continuationChainTokens`, assert `writeTextAtomic` is called exactly once AND `updatedAt` is still preserved (the spread carries chain fields only, never `updatedAt`).

Verified load-bearing (sabotage walks)

Both sabotages run on canonical2 `cf7830ffb3702bf7d826d70838893e2e41709f12`. Anyone can re-run.

Sabotage 1: gut the byte-identity short-circuit in `src/config/sessions/store.ts:~358`:

    if (false && getSerializedSessionStore(storePath) === json) { ... }

Result: `skips disk write entirely…` fails with the canonical message:

    expected "writeTextAtomic" to not be called at all, but actually been called 1 times

Sabotage 2: leak `updatedAt` into the persist spread (mirror of `agent-runner.ts:persistContinuationChainState`):

    store[key] = { ...existing, ...chainFields, updatedAt: Date.now() };

Result: all 3 traps fail; the canonical message is:

    updatedAt must not change when continuation-chain fields are byte-identical (persistContinuationChainState must not include updatedAt in its spread — #443)

Both sabotages restored. 3/3 green on canonical2 baseline.
Receipts
Gate honesty
- `git diff --stat` against canonical2 → 1 untracked test file added, zero `src/` modifications outside that file.
- `cael/325-canonical2` per the sharpened rule (base = branch containing the fork-point commit `cf7830ffb3`). NOT main, NOT ship-candidate.

Refs

- canonical2 base SHA: `cf7830ffb3702bf7d826d70838893e2e41709f12`
- production substrate: `src/auto-reply/reply/agent-runner.ts` (`persistContinuationChainState`, lines ~1269 / ~1302 / ~1319)
- byte-identity short-circuit: `src/config/sessions/store.ts:~358`

🌫️