fix: preserve post-compaction delegate arm age #488
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tion-registration to absorb cold module-load cost (#498)
The first test in src/agents/openclaw-tools.continuation-registration.test.ts
("registers no continuation tools when continuation.enabled is unset") pays
the cold module-load cost for createOpenClawTools and its transitive imports
(compaction-attribution, pi-embedded-*, plugins/tools, config/config) under
400+ concurrent test files in the agent project.
Quiet-box first-test duration: ~95s. CI noise pushes it past vitest's 120s
per-test default, producing a flaky timeout that has now been observed across
multiple unrelated PRs:
- #485 head (compaction-attribution scope) — first-test timeout
- #488 (downstream of #485 hypothesis) — first-test timeout
- #468 head (does NOT touch this file) — same first-test, same file, timeout
Test file content is byte-identical between base cael/325-canonical2 and #485
head; the timeout is not a regression introduced by any of those PRs. Tests
2-7 in this file reuse the warm cache (~360ms each) and are unaffected.
Cure: per-test timeout bump to 240s on the first test only, with a comment
documenting the cold-start mechanism so future readers know why this single
test has a non-default timeout.
Standalone fix, deliberately not folded into #485 to keep its compaction-
attribution scope clean. Unblocks #485, #488, #468, and any future PR that
randomly trips the same flake.
Verified by silas-pr485-fixup-v2 subagent (2026-05-01 07:29 UTC):
- Local on base a3dcc2a: first test 95023ms, passed (25s margin to 120s)
- CI on #485 head 9f25f91: first test >120000ms, timed out
- CI on #468 run 25169814732: first test >120000ms, timed out (same file)
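A per-test override of the kind this commit message describes might look roughly like the fragment below. It is a sketch only: the constant name is hypothetical, the test body is elided, and it relies on vitest accepting a numeric timeout as the last argument to `it`.

```typescript
import { it } from "vitest";

// Only the first test in this file pays the cold module-load cost for
// createOpenClawTools and its transitive imports, so only this test gets a
// longer timeout. COLD_START_TIMEOUT_MS is a hypothetical name.
const COLD_START_TIMEOUT_MS = 240_000;

it(
  "registers no continuation tools when continuation.enabled is unset",
  async () => {
    // ...existing assertions unchanged...
  },
  COLD_START_TIMEOUT_MS,
);
```

Scoping the bump to one test keeps the remaining tests in the file on the default timeout, so a genuine hang in any of them still fails promptly.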
1e21522
into
cael/325-canonical2
silas-dandelion-cult
left a comment
🌫 — diff-clean approve. firstArmedAt field is purely additive (optional everywhere, falls back to stagedAt/createdAt/Date.now() when absent), schema-compatible with existing persisted state. Round-trip preserved across TaskFlow storage, post-compaction substrate test, and runner-count consume path. Test coverage includes the lossy-round-trip scenario (stagedAt: 20_000, firstArmedAt: 10_000 → consumed with firstArmedAt: 10_000). 138/15 across 10 files, all in continuation surface; no cross-cut. ✅
cael-dandelion-cult
left a comment
byte-walked diff — narrow, coherent: firstArmedAt threading through SessionPostCompactionDelegate / StagedPostCompactionDelegate / PendingContinuationDelegate / PendingDelegateState (zod schema), with TTL drop using stable first-armed age (requeued delegates don't reset their clock). Tests cover all three layers (canonical store, post-compaction wrapper, dispatch). POST_COMPACTION_DELEGATE_TTL_MS = 7d matches gateway max-delay shape. now injected as dep for testability. APPROVE.
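The TTL drop described in this review could be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the `PendingDelegate` shape, the `Deps` interface, `dropStaleDelegates`, the log wording, and the choice to keep a delegate whose age equals the TTL exactly are all hypothetical. Only the field names, the 7-day constant, and the injected `now` come from the comments above.

```typescript
// Hypothetical shape; field names follow the review, the rest is assumed.
interface PendingDelegate {
  task: string;
  createdAt?: number;
  stagedAt?: number;
  firstArmedAt?: number; // optional, so older persisted state still parses
}

// 7 days, the value quoted in the review.
const POST_COMPACTION_DELEGATE_TTL_MS = 7 * 24 * 60 * 60 * 1000;

interface Deps {
  now: () => number; // injected clock, so tests can pin time
  log: (message: string) => void;
}

// Keep delegates whose first-armed age is within the TTL; log and drop the rest.
function dropStaleDelegates(delegates: PendingDelegate[], deps: Deps): PendingDelegate[] {
  const now = deps.now();
  return delegates.filter((d) => {
    // Age is measured from the first arm time, so a re-stage does not reset it.
    const armedAt = d.firstArmedAt ?? d.stagedAt ?? d.createdAt ?? now;
    const ageMs = now - armedAt;
    if (ageMs <= POST_COMPACTION_DELEGATE_TTL_MS) return true;
    deps.log(
      `dropping stale post-compaction delegate: ageMs=${ageMs} ` +
        `ttlMs=${POST_COMPACTION_DELEGATE_TTL_MS} task="${d.task.slice(0, 40)}"`,
    );
    return false;
  });
}

// Usage: a delegate armed 10 days ago and re-staged one second ago is still dropped.
const DAY_MS = 24 * 60 * 60 * 1000;
const fixedNow = 20 * DAY_MS;
const logs: string[] = [];
const kept = dropStaleDelegates(
  [
    { task: "fresh", firstArmedAt: fixedNow - 1 * DAY_MS },
    { task: "stale but requeued", firstArmedAt: fixedNow - 10 * DAY_MS, stagedAt: fixedNow - 1000 },
  ],
  { now: () => fixedNow, log: (m) => logs.push(m) },
);
```

Passing the clock in as a dependency lets a test pin `now` to a fixed value instead of mocking `Date.now()` globally.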
Co-authored-by: Test User <test@example.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Fixes openclaw-bootstrap#823 by preserving stable post-compaction delegate arm age and dropping stale shards before they can be queued into the fresh session.
Root cause
The TaskFlow-backed post-compaction adapter preserved shard bodies, but the legacy compatibility wrapper converted consumed staged delegates back into SessionPostCompactionDelegate with createdAt: Date.now(). That erased the original arm time whenever a shard crossed the TaskFlow/restart/re-stage boundary, making stale 10-day-old tasks look freshly armed and preventing age/TTL filtering.
Fix
- Add firstArmedAt to post-compaction delegate state and preserve it through TaskFlow state, session-store delegates, and queued delivery payloads.
- … createdAt/firstArmedAt from the original arm time instead of consume time.
- … createdAt as firstArmedAt.
- … firstArmedAt, with an explicit stale-drop log containing age, TTL, and task preview.
Validation
- pnpm test src/auto-reply/continuation/delegate-store.test.ts src/auto-reply/continuation-delegate-store.post-compaction-substrate.test.ts src/auto-reply/reply/post-compaction-delegate-dispatch.test.ts src/infra/session-delivery-queue.storage.test.ts
- pnpm tsgo:core
- pnpm tsgo:core:test
- pnpm lint
- pnpm check:changed expands to all lanes from canonical baseline unknown. agents surfaces and fails in existing extension typecheck baselines (extensions/codex duplicate @mariozechner/pi-agent-core identities and extensions/qqbot zod v3/v4 mismatch), unrelated to this delegate persistence/TTL fix.
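The difference between the lossy conversion named in the root cause and an arm-time-preserving one can be sketched as below. The interfaces and both function names are hypothetical and much smaller than the real SessionPostCompactionDelegate and StagedPostCompactionDelegate types. The fallback to stagedAt and then to the current time when firstArmedAt is absent is an assumption based on the review comments.

```typescript
// Hypothetical, simplified shapes; the real types in the PR carry more fields.
interface StagedDelegate {
  task: string;
  stagedAt?: number;
  firstArmedAt?: number;
}

interface SessionDelegate {
  task: string;
  createdAt: number;
  firstArmedAt?: number;
}

// Before: consume time replaces the arm time, so an old shard looks fresh.
function toSessionDelegateLossy(staged: StagedDelegate, now: () => number): SessionDelegate {
  return { task: staged.task, createdAt: now() };
}

// After: both timestamps derive from the original arm time. State written
// before the field existed falls back to stagedAt, then to the current time.
function toSessionDelegatePreserving(staged: StagedDelegate, now: () => number): SessionDelegate {
  const armedAt = staged.firstArmedAt ?? staged.stagedAt ?? now();
  return { task: staged.task, createdAt: armedAt, firstArmedAt: armedAt };
}

// Usage: the round-trip scenario from the review (stagedAt 20_000, firstArmedAt 10_000).
const consumeTime = 864_000_000; // ten days after epoch 0, in milliseconds
const staged: StagedDelegate = { task: "resume indexing", stagedAt: 20_000, firstArmedAt: 10_000 };
const lossy = toSessionDelegateLossy(staged, () => consumeTime);
const preserved = toSessionDelegatePreserving(staged, () => consumeTime);
```

With the lossy form, any age check made after conversion sees an age of zero, which is why TTL filtering could not catch the 10-day-old tasks.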