fix(agents): correlate compaction attribution events (#825)#485
fix(agents): correlate compaction attribution events (#825)#485karmafeast wants to merge 1 commit intocael/325-canonical2from
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f25f9116f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| provider, | ||
| model, | ||
| authProfileId: compactionAuthProfileId, | ||
| trigger: request.trigger, |
There was a problem hiding this comment.
Keep volitional compaction on the manual execution path
Passing trigger: request.trigger changes runtime behavior for request_compaction: this call previously omitted trigger, so compactEmbeddedPiSession defaulted to manual semantics, but now trigger="volitional" bypasses all params.trigger === "manual" branches (notably forced compaction target/force selection in compact.queued.ts and manual-boundary hardening in compact.ts). In sessions where context pressure is high enough for the tool guard but below automatic compaction thresholds, volitional requests can now be skipped or produce different post-compaction boundaries, which is a functional regression (the same pattern is also present in followup-runner.ts).
Useful? React with 👍 / 👎.
|
Superseded by #502 — rebased onto canonical2 post-#500 with union resolution against #486 (cooldown-arm collision at Cohort context: 🩸 took (1) "drive-it" call in #sprites-of-thornfield ~13:50Z; #485 not gating SUT cut. Closing in favor of #502. 🌻🩸🌫🌊🍖🩲 |
…tion-registration to absorb cold module-load cost
The first test in src/agents/openclaw-tools.continuation-registration.test.ts
("registers no continuation tools when continuation.enabled is unset") pays
the cold module-load cost for createOpenClawTools and its transitive imports
(compaction-attribution, pi-embedded-*, plugins/tools, config/config) under
400+ concurrent test files in the agent project.
Quiet-box first-test duration: ~95s. CI noise pushes it past vitest's 120s
per-test default, producing a flaky timeout that has now been observed across
multiple unrelated PRs:
- #485 head (compaction-attribution scope) — first-test timeout
- #488 (downstream of #485 hypothesis) — first-test timeout
- #468 head (does NOT touch this file) — same first-test, same file, timeout
Test file content is byte-identical between base cael/325-canonical2 and #485
head; the timeout is not a regression introduced by any of those PRs. Tests
2-7 in this file reuse the warm cache (~360ms each) and are unaffected.
Cure: per-test timeout bump to 240s on the first test only, with a comment
documenting the cold-start mechanism so future readers know why this single
test has a non-default timeout.
Standalone fix, deliberately not folded into #485 to keep its compaction-
attribution scope clean. Unblocks #485, #488, #468, and any future PR that
randomly trips the same flake.
Verified by silas-pr485-fixup-v2 subagent (2026-05-01 07:29 UTC):
- Local on base a3dcc2a: first test 95023ms, passed (25s margin to 120s)
- CI on #485 head 9f25f91: first test >120000ms, timed out
- CI on #468 run 25169814732: first test >120000ms, timed out (same file)
…tion-registration to absorb cold module-load cost (#498) The first test in src/agents/openclaw-tools.continuation-registration.test.ts ("registers no continuation tools when continuation.enabled is unset") pays the cold module-load cost for createOpenClawTools and its transitive imports (compaction-attribution, pi-embedded-*, plugins/tools, config/config) under 400+ concurrent test files in the agent project. Quiet-box first-test duration: ~95s. CI noise pushes it past vitest's 120s per-test default, producing a flaky timeout that has now been observed across multiple unrelated PRs: - #485 head (compaction-attribution scope) — first-test timeout - #488 (downstream of #485 hypothesis) — first-test timeout - #468 head (does NOT touch this file) — same first-test, same file, timeout Test file content is byte-identical between base cael/325-canonical2 and #485 head; the timeout is not a regression introduced by any of those PRs. Tests 2-7 in this file reuse the warm cache (~360ms each) and are unaffected. Cure: per-test timeout bump to 240s on the first test only, with a comment documenting the cold-start mechanism so future readers know why this single test has a non-default timeout. Standalone fix, deliberately not folded into #485 to keep its compaction- attribution scope clean. Unblocks #485, #488, #468, and any future PR that randomly trips the same flake. Verified by silas-pr485-fixup-v2 subagent (2026-05-01 07:29 UTC): - Local on base a3dcc2a: first test 95023ms, passed (25s margin to 120s) - CI on #485 head 9f25f91: first test >120000ms, timed out - CI on #468 run 25169814732: first test >120000ms, timed out (same file)
…edes #485) (#502) * fix(agents): correlate compaction attribution events (rebased; supersedes #485) Adds [request_compaction:resolved-success] log line for volitional compaction-success path so attribution events can be correlated with dispatch trigger (sessionKey + runId + diagId + outcome). Rebased onto canonical2 post-#500. Adjacent-but-orthogonal collision with #486 (cooldown-arm at request-compaction-tool.ts:223-225) resolved as union: cooldown-arm → log-emit → counter-increment. Original work in #485 by @karmafeast; superseded by this rebased commit because canonical2 now contains #486 cooldown-arm that #485 did not anticipate. Co-authored-by: karmafeast <karmafeast@gmail.com> * test(plugins): use objectContaining for compaction emit assertions compaction-attribution work in this PR added trigger/sessionKey/compactionCount{Before,After,Delta} fields to the compaction emitAgentEvent payloads in pi-embedded-subscribe.handlers.compaction.ts. The wired-hooks-compaction.test.ts assertions used exact-match shapes that did not include the new fields, causing 3 tests to fail in agentic-plugins shard. Switching to expect.objectContaining preserves the original behavioral assertion (phase + willRetry + completed) while tolerating the additive attribution fields. --------- Co-authored-by: karmafeast <karmafeast@gmail.com>
…tion-registration to absorb cold module-load cost (#498) The first test in src/agents/openclaw-tools.continuation-registration.test.ts ("registers no continuation tools when continuation.enabled is unset") pays the cold module-load cost for createOpenClawTools and its transitive imports (compaction-attribution, pi-embedded-*, plugins/tools, config/config) under 400+ concurrent test files in the agent project. Quiet-box first-test duration: ~95s. CI noise pushes it past vitest's 120s per-test default, producing a flaky timeout that has now been observed across multiple unrelated PRs: - #485 head (compaction-attribution scope) — first-test timeout - #488 (downstream of #485 hypothesis) — first-test timeout - #468 head (does NOT touch this file) — same first-test, same file, timeout Test file content is byte-identical between base cael/325-canonical2 and #485 head; the timeout is not a regression introduced by any of those PRs. Tests 2-7 in this file reuse the warm cache (~360ms each) and are unaffected. Cure: per-test timeout bump to 240s on the first test only, with a comment documenting the cold-start mechanism so future readers know why this single test has a non-default timeout. Standalone fix, deliberately not folded into #485 to keep its compaction- attribution scope clean. Unblocks #485, #488, #468, and any future PR that randomly trips the same flake. Verified by silas-pr485-fixup-v2 subagent (2026-05-01 07:29 UTC): - Local on base a3dcc2a: first test 95023ms, passed (25s margin to 120s) - CI on #485 head 9f25f91: first test >120000ms, timed out - CI on #468 run 25169814732: first test >120000ms, timed out (same file)
…edes #485) (#502) * fix(agents): correlate compaction attribution events (rebased; supersedes #485) Adds [request_compaction:resolved-success] log line for volitional compaction-success path so attribution events can be correlated with dispatch trigger (sessionKey + runId + diagId + outcome). Rebased onto canonical2 post-#500. Adjacent-but-orthogonal collision with #486 (cooldown-arm at request-compaction-tool.ts:223-225) resolved as union: cooldown-arm → log-emit → counter-increment. Original work in #485 by @karmafeast; superseded by this rebased commit because canonical2 now contains #486 cooldown-arm that #485 did not anticipate. Co-authored-by: karmafeast <karmafeast@gmail.com> * test(plugins): use objectContaining for compaction emit assertions compaction-attribution work in this PR added trigger/sessionKey/compactionCount{Before,After,Delta} fields to the compaction emitAgentEvent payloads in pi-embedded-subscribe.handlers.compaction.ts. The wired-hooks-compaction.test.ts assertions used exact-match shapes that did not include the new fields, causing 3 tests to fail in agentic-plugins shard. Switching to expect.objectContaining preserves the original behavioral assertion (phase + willRetry + completed) while tolerating the additive attribution fields. --------- Co-authored-by: karmafeast <karmafeast@gmail.com>
Summary
request_compaction,[compaction-diag], and session-store counter signals without enough shared attribution to join trigger, outcome, and count deltas.compactionCountmove without knowing which terminal compaction event caused it, or whether a volitional enqueue reached the matching runtime compaction attempt.request_compactionnow creates a shareddiagId, logsrunId/diagId/trigger=volitionalat enqueue and terminal resolution, and threads that attribution intocompactEmbeddedPiSession; embedded compaction events now expose/log normalized trigger plus count before/after/delta; session-store count reconcile logs the same run/trigger/outcome attribution.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
request_compactionlogged only session/usage/reason, the runtime compaction diagnostic owned its owndiagId, and the session-store compaction count reconcile updated state without logging a joinable terminal attribution record.reasonbut no OpenClaw correlation id, so OpenClaw must add run/trigger/count attribution at the subscribe boundary.Regression Test Plan (if applicable)
src/agents/tools/request-compaction-tool.volitional-threading.test.tsandsrc/agents/pi-embedded-subscribe.handlers.compaction.test.tstriggerCompaction, result details, and enqueue/terminal logs; compaction-end handling emits trigger/count-before/count-after/count-delta attribution.User-visible / Behavior Changes
None for ordinary users. Diagnostic logs/events now include correlation fields (
runId,diagId,trigger, and compaction count deltas) for compaction attribution.Diagram (if applicable)
Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation: N/ARepro + Verification
Environment
Steps
request_compactionwith continuation enabled and a live run id.request_compactionlogs.Expected
runIdanddiagIdwhere applicable.Actual
Evidence
Verified after the fix:
pnpm test src/agents/tools/request-compaction-tool.volitional-threading.test.ts src/agents/tools/request-compaction-tool.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.tspnpm test src/agents/openclaw-tools.continuation-registration.test.ts src/agents/tools-effective-inventory.test.ts src/auto-reply/reply/commands-system-prompt.test.tspnpm tsgo:corepnpm tsgo:core:testpnpm lintGate notes:
pnpm check/pnpm tsgo:extensionsfail in untouched extension baselines:extensions/codexduplicate@mariozechner/pi-agent-coretype identities andextensions/qqbotzod v3/v4 mismatch.pnpm check:test-typespasses core test types, then fails in the same untouched extension type baseline.pnpm check:changedexpands to all lanes from canonical-branch baseline surfaces and stops at the same extension type failures.pnpm format:checkreports broad pre-existing formatting drift outside this PR;scripts/committerformatted the staged PR files.Human Verification (required)
budget;willRetry=truesuccessful overflow compaction still records a count delta.Review Conversations
Compatibility / Migration
Yes/No) YesYes/No) NoYes/No) NoRisks and Mitigations