Skip to content

fix: move compaction planning off the event loop#88132

Merged
steipete merged 2 commits into
mainfrom
fix/compaction-planning-worker
May 29, 2026
Merged

fix: move compaction planning off the event loop#88132
steipete merged 2 commits into
mainfrom
fix/compaction-planning-worker

Conversation

@steipete

Copy link
Copy Markdown
Contributor

Summary

Fixes #86358.

This replaces the yield-only mitigation with a real worker-backed compaction planning path:

  • Extracts the CPU-heavy, pure compaction planning logic into src/agents/compaction-planning.ts: token estimates, safe transcript projection, token-share splitting, max-token chunking, oversized fallback planning, adaptive chunk ratio, and history pruning.
  • Adds src/agents/compaction-planning.worker.ts as the worker entry and src/agents/compaction-planning-worker.ts as the main-thread runner.
  • Keeps all provider/model/API calls on the main thread. The worker only prepares plans and sanitized message chunks; summarization requests still run through the existing generateSummary/provider path.
  • Sanitizes worker inputs before workerData structured clone so toolResult.details and runtime-context custom messages do not cross the thread boundary or get cloned into worker memory.
  • Uses the worker only for starvation-sized histories (>=64 messages). Small histories stay synchronous to avoid worker startup overhead and to preserve cheap test/mocked behavior.
  • Falls back to the old in-process planner if the worker runtime is unavailable or a packaged/source worker fails to load; worker timeouts and worker-reported planning failures still fail explicitly.
  • Threads worker-backed planning into both the OpenClaw safeguard path and summarizeInStages: history pruning, adaptive chunk-ratio calculation, stage splitting, summary chunking, and oversized-message fallback planning.
  • Adds the new packaged worker to tsdown, the tsdown config test, release packaging requirements, and release-check fixtures.

The important architecture point: this does not put sessions, auth, plugin providers, model calls, or hooks in a worker. It moves the deterministic CPU planning that was starving the Node event loop while leaving runtime side effects on the main thread.

Verification

  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_FS_MODULE_CACHE_PATH=/tmp/openclaw-vitest-cache-86358-worker2 /usr/bin/time -l pnpm test src/agents/compaction-planning-worker.test.ts -- --reporter=verbose
    • Passed: 5 tests.
    • Includes real worker execution, missing-worker unavailable classification, and a timer responsiveness regression where a zero-delay timer wins while a large stage-split plan runs in the worker.
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_FS_MODULE_CACHE_PATH=/tmp/openclaw-vitest-cache-86358-sanitize /usr/bin/time -l pnpm test src/agents/compaction.token-sanitize.test.ts src/agents/compaction-planning-worker.test.ts src/agents/compaction-partial-summary.test.ts src/agents/compaction.identifier-preservation.test.ts src/agents/compaction.summarize-fallback.test.ts src/agents/compaction.tool-result-details.test.ts src/agents/agent-hooks/compaction-safeguard.test.ts test/release-check.test.ts -- --reporter=verbose
    • Passed: 3 shards, 116 tests.
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_FS_MODULE_CACHE_PATH=/tmp/openclaw-vitest-cache-86358-compaction3 /usr/bin/time -l pnpm test src/agents/compaction.test.ts src/agents/compaction.token-sanitize.test.ts src/agents/compaction-planning-worker.test.ts src/infra/tsdown-config.test.ts -- --reporter=verbose
    • Passed: 3 shards, 38 tests.
  • OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_FS_MODULE_CACHE_PATH=/tmp/openclaw-vitest-cache-86358-release /usr/bin/time -l pnpm test test/release-check.test.ts -- --reporter=verbose
    • Passed: 52 tests.
  • /usr/bin/time -l pnpm build
    • Passed after adding the worker dist entry.
  • /Users/steipete/Projects/agent-skills/skills/autoreview/scripts/autoreview --mode local
    • Final run clean: autoreview clean: no accepted/actionable findings reported.
  • git diff --check
    • Passed.
  • OPENCLAW_TESTBOX=0 /usr/bin/time -l pnpm check:changed
    • Guard steps passed through runtime sidecar loader guard.
    • Failed in unrelated current-main typecheck errors in src/config/sessions.cache.test.ts at lines 743, 772, 791, and 796. This patch does not touch that file. The same failure was observed before this final patch while validating the issue.

Attempted remote changed-gate proof:

Real behavior proof

Behavior addressed: Large compaction planning no longer monopolizes the main Node event loop before summarization requests. Telegram/MCP timers should get main-thread turns while CPU-heavy compaction planning runs in a worker.

Real environment tested: Local Node/OpenClaw source checkout plus actual Node worker-thread execution in the worker regression test. No live Telegram account or production MCP backend was used.

Exact steps or command run after this patch: See the verification commands above, especially src/agents/compaction-planning-worker.test.ts and pnpm build.

Evidence after fix: The worker regression schedules a zero-delay timer and a large stage-split planning job at the same time; the timer resolves first while the worker continues planning, proving the planning CPU is not blocking the main event loop in that path.

Observed result after fix: Worker planning returns the same summary chunk/stage planning shape, missing worker runtime is classified as unavailable for fallback, sanitized worker inputs omit toolResult.details and runtime-context custom messages, and existing compaction fallback/identifier/safeguard tests continue to pass.

What was not tested: Live Telegram timeout behavior against a real bot session was not exercised in this PR; the proof is focused worker-thread/event-loop behavior plus existing compaction unit coverage.

@steipete steipete requested a review from a team as a code owner May 29, 2026 20:37
@openclaw-barnacle openclaw-barnacle Bot added scripts Repository scripts agents Agent runtime and tooling size: XL maintainer Maintainer-authored PR labels May 29, 2026
@clawsweeper

clawsweeper Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 4:46 PM ET / 20:46 UTC.

Summary
The PR extracts deterministic compaction planning into a worker-backed path, wires it into compaction and the safeguard hook, and adds packaging/release-check coverage for the new worker.

PR surface: Source +662, Tests +125, Other +2. Total +789 across 11 files.

Reproducibility: yes. for the source-level failure shape: current main and the latest release run token estimation, splitting, and pruning synchronously in compaction, matching the linked timer-delay report. This review did not run a live Telegram reproduction.

Review metrics: 1 noteworthy metric.

  • Packaged worker surface: 1 dist worker entry added; 1 release-check required path added. The new runtime path depends on the worker file being present in packaged installs, so package proof matters before merge.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Add packaged-install or live transport proof before landing if maintainers want confidence beyond the Node worker/timer regression.

Mantis proof suggestion
A live Telegram compaction run would directly cover the PR body's untested reported symptom. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify a large Telegram session compacts without delaying Telegram fetch timers or losing the reply.

Risk before merge

  • [P2] The PR body proves the Node worker/timer path locally but does not exercise the reported live Telegram or MCP timeout path, so the production symptom remains a maintainer validation point before landing.
  • [P1] Worker startup failures fall back to main-thread planning, but worker timeouts and worker-reported planning failures now fail compaction explicitly; maintainers should accept that operational behavior for large histories.

Maintainer options:

  1. Accept With Targeted Proof (recommended)
    If CI, build, and release-package checks stay green, maintainers can accept the remaining live-transport gap because the PR directly tests real Node worker timer responsiveness and keeps unavailable-worker fallback.
  2. Request Package Or Transport Proof
    Before merge, ask for a packaged install or live Telegram/MCP compaction run that shows timers stay responsive during a large session.
  3. Pause For Failure Semantics
    If compaction timeouts must preserve the old in-process completion behavior, pause for an explicit compatibility decision before landing this branch.

Next step before merge

  • [P2] The protected maintainer label and core compaction worker merge risk call for human maintainer review rather than an automated repair lane.

Security
Cleared: No concrete security or supply-chain issue was found; the diff adds no dependencies and strips toolResult.details and runtime-context messages before worker cloning.

Review details

Best possible solution:

Land after maintainer review accepts the worker failure semantics and package proof, or add packaged/live compaction proof first if production-path confidence is required.

Do we have a high-confidence way to reproduce the issue?

Yes for the source-level failure shape: current main and the latest release run token estimation, splitting, and pruning synchronously in compaction, matching the linked timer-delay report. This review did not run a live Telegram reproduction.

Is this the best way to solve the issue?

Yes, moving deterministic planning off the main event loop is the narrow maintainable fix for the reported starvation. The remaining question is proof depth, not a different implementation direction.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 61e7b042b633.

Label changes

Label changes:

  • add P1: The linked report describes event-loop starvation during compaction causing user-visible Telegram/fetch timeouts and possible runtime interruption.
  • add merge-risk: 🚨 session-state: The diff moves history pruning, chunk planning, and staged summary planning for session compaction into a new async worker path.
  • add merge-risk: 🚨 availability: Large-history compaction can now fail on worker timeout or worker-reported planning failure, which may stop an active runtime flow.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal verification of actual Node worker-thread execution and a timer-responsiveness regression; live Telegram was explicitly not tested.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal verification of actual Node worker-thread execution and a timer-responsiveness regression; live Telegram was explicitly not tested.

Label justifications:

  • P1: The linked report describes event-loop starvation during compaction causing user-visible Telegram/fetch timeouts and possible runtime interruption.
  • merge-risk: 🚨 session-state: The diff moves history pruning, chunk planning, and staged summary planning for session compaction into a new async worker path.
  • merge-risk: 🚨 availability: Large-history compaction can now fail on worker timeout or worker-reported planning failure, which may stop an active runtime flow.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal verification of actual Node worker-thread execution and a timer-responsiveness regression; live Telegram was explicitly not tested.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal verification of actual Node worker-thread execution and a timer-responsiveness regression; live Telegram was explicitly not tested.
Evidence reviewed

PR surface:

Source +662, Tests +125, Other +2. Total +789 across 11 files.

View PR surface stats
Area Files Added Removed Net
Source 5 981 319 +662
Tests 4 125 0 +125
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 2 2 0 +2
Total 11 1108 319 +789

What I checked:

  • Repository policy: Root and scoped src/agents and scripts policies were read before reviewing the touched surfaces. (AGENTS.md:1, a509c48f0ea2)
  • Current behavior: Current main still performs staged compaction token estimation, splitting, and history pruning synchronously in summarizeInStages and pruneHistoryForContextShare, matching the reported event-loop starvation shape. (src/agents/compaction.ts:506, 61e7b042b633)
  • Shipped behavior: The latest release tag contains synchronous compaction token estimation and splitting helpers, so the central problem is not already shipped-fixed. (src/agents/compaction.ts:122, 27ae826f6525)
  • Worker implementation: The PR adds a compaction planning worker runner with a 64-message threshold, sanitized worker inputs, fallback only for unavailable workers, and explicit timeout/failure propagation. (src/agents/compaction-planning-worker.ts:22, d1531136a800)
  • Compaction wiring: The PR routes chunk planning, oversized fallback planning, stage splitting, history pruning, and adaptive ratio calculation through the worker-backed planner while leaving summary generation on the main path. (src/agents/compaction.ts:158, d1531136a800)
  • Regression proof: The added worker test schedules a zero-delay timer alongside a large stage-split plan and expects the timer to win, proving the reviewed worker path does not monopolize the main event loop in that scenario. (src/agents/compaction-planning-worker.test.ts:69, d1531136a800)

Likely related people:

  • steipete: Blame shows Peter Steinberger on the current compaction and safeguard implementation, and this PR author also has recent related commits in the area. (role: recent area contributor; confidence: high; commits: 4e2d9b0b760d, ae0741a3466c, 27ae826f6525; files: src/agents/compaction.ts, src/agents/agent-hooks/compaction-safeguard.ts, tsdown.config.ts)
  • Val Alexander: Recent history includes a compaction prompt-too-long fix touching the same compaction module, making this a plausible adjacent routing candidate. (role: adjacent bug-fix contributor; confidence: medium; commits: b703ea3675d6; files: src/agents/compaction.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. labels May 29, 2026
@steipete steipete force-pushed the fix/compaction-planning-worker branch from d153113 to 1d566e3 Compare May 29, 2026 20:55
@steipete steipete force-pushed the fix/compaction-planning-worker branch from 1d566e3 to 15320f2 Compare May 29, 2026 20:58
@steipete

Copy link
Copy Markdown
Contributor Author

Pre-merge verification for this PR:

  • Local: pnpm exec oxlint src/agents/compaction-planning.worker.ts
  • Local: node scripts/run-vitest.mjs src/agents/compaction-planning-worker.test.ts src/agents/compaction.token-sanitize.test.ts
  • Local earlier on this branch: pnpm build, targeted compaction/release tests, git diff --check, autoreview clean.
  • CI on head 15320f25de112c9951e51d203d9ef7a157f2e707: build-artifacts, security, actionlint, changed-path scan, and the compaction-adjacent node shards passed.

Known CI gaps: latest main is already red. Remaining failures on this PR are unrelated to the compaction worker changes: openai-tool-schema.ts / codex-supervisor lint, shrinkwrap/prod/test type failures, an existing architecture cycle, extension/package-boundary failures, src/commands/agent.test.ts session-key expectation, and src/wizard/setup.official-plugins.test.ts tokenjuice catalog expectation. None of the failed annotations are in files touched by this PR.

@steipete steipete merged commit 6443d06 into main May 29, 2026
89 of 99 checks passed
@steipete steipete deleted the fix/compaction-planning-worker branch May 29, 2026 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. scripts Repository scripts size: XL status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Event-loop starvation during context compaction causes fetch timeouts (16.9s timer delay)

1 participant