fix(agents): defer bootstrap context-engine maintenance to background by dripsmvcp · Pull Request #90199 · openclaw/openclaw

dripsmvcp · 2026-06-04T06:25:58Z

Summary

Bootstrap/reconcile context-engine maintenance runs in the foreground, where deferred compaction debt cannot execute (allowDeferredCompactionExecution is background-only) and no background follow-up is scheduled (only turns defer). Debt created when bootstrap imports tail messages past the leaf trigger is therefore stranded, leaving sessions repeating "deferred compaction still needed" (issue #67716, "[Bug]: bootstrap/reconcile and hot-cache policy can leave deferred compaction debt stranded", Case 1).
This PR lets reason="bootstrap" schedule the same background debt consumer that turns already use, for engines that opt into background maintenance (turnMaintenanceMode === "background").
Foreground bootstrap is unchanged for engines without background maintenance; the plugin-owned (Lossless-Claw) hot-cache sticky-debt path (Case 2) is intentionally out of scope.
Reviewers should focus on: deferring bootstrap is safe because turns already defer for these engines, the deferred run dedups/coalesces per session, and it runs in background mode (so it can actually pay the debt).
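
A minimal sketch of the gate change described above, not the actual code in context-engine-maintenance.ts. The type and function names (`MaintenanceReason`, `EngineInfo`, `shouldDeferBefore`, `shouldDeferAfter`) are hypothetical, and the `"compaction"` and `"foreground"` string values are assumptions; only `"turn"`, `"bootstrap"`, and `turnMaintenanceMode === "background"` come from this PR.

```typescript
// Hypothetical names for illustration only.
// "compaction" is an assumed reason string; "turn" and "bootstrap" come from the PR.
type MaintenanceReason = "bootstrap" | "turn" | "compaction";

interface EngineInfo {
  // Engines opt into background maintenance with "background".
  // The "foreground" value is an assumption made for this sketch.
  turnMaintenanceMode?: "foreground" | "background";
}

// Before the fix: only turns were eligible for the deferred background worker.
function shouldDeferBefore(reason: MaintenanceReason, engine: EngineInfo): boolean {
  return engine.turnMaintenanceMode === "background" && reason === "turn";
}

// After the fix: bootstrap joins the same gate, still limited to opted-in engines.
function shouldDeferAfter(reason: MaintenanceReason, engine: EngineInfo): boolean {
  return (
    engine.turnMaintenanceMode === "background" &&
    (reason === "turn" || reason === "bootstrap")
  );
}
```

Under these assumptions, an engine without background maintenance keeps running bootstrap inline in both versions, which is the "foreground bootstrap is unchanged" claim above.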

Linked context

Related #66820 — deferred-maintenance token budget (a different aspect of the same subsystem; not this scheduling gap).

Not maintainer-requested; selected from the clawsweeper:queueable-fix backlog. ClawSweeper's review of #67716 recommended exactly this OpenClaw-scoped fix: "extend the existing deferred-maintenance lifecycle so bootstrap/reconcile can schedule a valid background debt consumer," keeping plugin hot-cache/dedup policy out.

Real behavior proof (required for external PRs)

Behavior or issue addressed: bootstrap/reconcile created deferred compaction debt that could not execute because bootstrap maintenance runs in the foreground (issue #67716, Case 1).
Real environment tested: Linux / Node 24, exercising the real production runContextEngineMaintenance lifecycle — the deferred background worker, the task registry/queue, and the session lane — not mocked.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/agents/embedded-agent-runner/context-engine-maintenance.test.ts
Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied terminal output. The added regression fails on unfixed main (bootstrap ran foreground and returned a maintenance result) and passes after the fix (bootstrap defers and the background worker runs):

unfixed main: × defers bootstrap maintenance to a background debt consumer for background-mode engines
              AssertionError: expected { Object (changed, bytesFreed, ...) } to be undefined

with the fix: Test Files  1 passed (1)
              Tests       23 passed (23)

Observed result after fix: for a background-mode engine, bootstrap maintenance schedules a background context_engine_turn_maintenance task instead of running inline, and the deferred worker runs maintain() with allowDeferredCompactionExecution: true (the debt can now be paid).
What was not tested: a live gateway bootstrap/reconcile overflow on a running OpenClaw deployment.
Proof limitations or environment constraints: this is unit/integration-level proof against the real lifecycle functions, which CONTRIBUTING.md treats as supplemental rather than a live-setup capture. I could not drive a live gateway into a bootstrap overflow in this environment (needs a full running OpenClaw + provider + forced overflow; the sandbox network also blocked a clean install and the crabbox check:changed gate). A maintainer with a live setup can confirm end-to-end, or apply proof: override for this logic-scoped scheduling fix.
Before evidence (optional but encouraged): the failing assertion above (expected { ... } to be undefined) is the before-state from unfixed main, where bootstrap ran foreground.

How to capture live behavior proof on a real setup (for a maintainer or the reporter)

This needs the background-maintenance context engine (the Lossless-Claw / "LCM" engine from the issue, which sets turnMaintenanceMode: "background" and emits the LCM compaction leaf pass lines) plus a real provider — neither of which I can run in CI/sandbox. Steps:

Build this branch: pnpm install && pnpm build; run the gateway with the LCM-configured agent and tail logs (./scripts/clawlog.sh).
Restart the gateway on a session whose reconcile tail-import pushes rawTokensOutsideTail past the leaf trigger (issue #67716, Case 1).
Grep the bootstrap window (redact session keys/content):

grep -E "reconcileSessionTail|deferred turn maintenance (queued|resuming)|deferred compaction (debt pending|skipped|completed)|allowDeferredCompactionExecution|reason=compacted|compactLeafAsync" <gateway-log>

Before this fix the bootstrap window shows deferred compaction debt pending ... allowDeferredCompactionExecution is disabled → deferred compaction skipped ... reason=deferred compaction still needed (stranded). After this fix it shows the core line [context-engine] deferred turn maintenance queued ... lane=context-engine-turn-maintenance:<key> during bootstrap (emitted at context-engine-maintenance.ts:619; previously only on turns) followed by compactLeafAsync start → LCM compaction leaf pass → deferred compaction completed ... reason=compacted (paid). I will paste the redacted excerpt here once it is captured.
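
As a rough aid for reading the grepped output, the before/after distinction above could be checked mechanically. This is a sketch under assumptions: the function and type names are hypothetical, the markers are the log fragments quoted in this PR, and the full layout of a real log line is not assumed.

```typescript
// Hypothetical helper; markers are copied from the fragments quoted in this PR.
type BootstrapOutcome = "paid" | "stranded" | "queued-only" | "unknown";

const QUEUED = /deferred turn maintenance queued/;
const PAID = /deferred compaction completed|reason=compacted/;
const STRANDED = /deferred compaction skipped|reason=deferred compaction still needed/;

// Classifies a window of already-redacted log lines.
// "paid" wins over "stranded" so that a window where debt was first skipped
// and later consumed is reported by its final state.
function classifyBootstrapWindow(lines: readonly string[]): BootstrapOutcome {
  const has = (re: RegExp): boolean => lines.some((line) => re.test(line));
  if (has(PAID)) return "paid";
  if (has(STRANDED)) return "stranded";
  if (has(QUEUED)) return "queued-only";
  return "unknown";
}
```

A "queued-only" result would mean the bootstrap deferral was scheduled but the window ended before the worker ran, so a longer log window is needed before drawing a conclusion.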

Tests and validation

node scripts/run-vitest.mjs src/agents/embedded-agent-runner/context-engine-maintenance.test.ts -> 23 passed (1 new).
Acceptance set (from the issue review, paths updated for the embedded-agent-runner rename): attempt.spawn-workspace.context-engine.test.ts -> 56 passed; extensions/codex/src/app-server/run-attempt.context-engine.test.ts -> 24 passed; src/agents/harness/context-engine-lifecycle.test.ts -> 9 passed.
tsgo -p tsconfig.core.json and core test types: no errors in the changed files (two unrelated pre-existing errors in src/config/io.ts and src/secrets/config-io.ts are present on main).
Regression coverage added: "defers bootstrap maintenance to a background debt consumer for background-mode engines" (fails first, passes after).

Risk checklist

Did user-visible behavior change? No — internal startup maintenance scheduling.
Did config, environment, or migration behavior change? No.
Did security, auth, secrets, network, or tool execution behavior change? No.
Highest-risk area: deferring bootstrap maintenance to background (session-state).
How is that risk mitigated? It only defers for engines that already opt into background turn maintenance (foreground is unchanged for all others); it reuses the existing per-session, dedup/coalesced deferred-maintenance lane that turns use; and it is covered by the new regression plus the existing 22 maintenance tests and the spawn-workspace / codex / lifecycle suites.

Current review state

Next action: maintainer / ClawSweeper review. The code itself is accepted by ClawSweeper ("no narrow code repair is needed"); the only open item is the real-behavior-proof gate (status: 📣 needs proof).
Scope: this addresses the OpenClaw-owned Case 1 only. Case 2 (hot-cache sticky debt / duplicate ledger) is plugin-owned (Lossless-Claw) per ClawSweeper's review and is intentionally excluded; the broader duplicate-transcript umbrella can stay open if maintainers prefer.
On proof: the change is verified at the unit/integration level against the real lifecycle functions, but the live-overflow capture requires the external Lossless-Claw engine + a provider, which I can't run here. The exact live-capture steps are in the "How to capture live behavior proof" block above; I will add the redacted excerpt as soon as anyone with a live LCM setup runs it. Requesting a maintainer proof: override for this logic-scoped core scheduling fix in the meantime — ClawSweeper offered that path for exactly this case.

Bootstrap/reconcile context-engine maintenance runs foreground, where deferred compaction debt cannot execute (allowDeferredCompactionExecution is background-only) and no background follow-up is scheduled — only turns defer. So debt created when bootstrap imports tail messages past the leaf trigger is stranded, leaving sessions repeating "deferred compaction still needed" (issue openclaw#67716, Case 1). Extend the deferred-maintenance gate so reason="bootstrap" also schedules the existing background debt consumer for engines that opt into background maintenance (turnMaintenanceMode === "background"). Foreground bootstrap is unchanged for engines without background maintenance, and the plugin-owned hot-cache sticky-debt path (Case 2) is intentionally left out of scope. Closes openclaw#67716

clawsweeper · 2026-06-04T06:32:41Z

Codex review: needs real behavior proof before merge. Reviewed June 4, 2026, 8:00 AM ET / 12:00 UTC.

Summary
This PR changes runContextEngineMaintenance so background-mode engines defer bootstrap maintenance into the existing background maintenance worker and adds a regression test for that scheduling path.

PR surface: Source +4, Tests +61. Total +65 across 2 files.

Reproducibility: yes, from source. Current main runs bootstrap maintenance in the foreground while only background execution sets allowDeferredCompactionExecution, and bootstrap is not currently eligible for the deferred background worker. I did not reproduce the full live LCM overflow; the remaining proof gap is live behavior, not source traceability.

Review metrics: none identified.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
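
The "weaker of the two" rule can be sketched as a minimum over an ordered rank list. This is an illustration only: the ordering is inferred from the rank legend in this review, the names `RANKS` and `overallRank` are hypothetical, and ClawSweeper's real implementation is not shown in this PR.

```typescript
// Ranks ordered weakest to strongest, following the rank legend in this review.
// "off-meta tidepool" is omitted because it means the rating does not apply.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is whichever of the two inputs sits lower in the ordering.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}
```

With the ratings above, `overallRank("silver shellfish", "platinum hermit")` yields "silver shellfish", which matches the Overall line.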

Rank-up moves:

[P1] Add redacted live gateway/bootstrap logs or a linked artifact showing bootstrap-created debt queued and consumed after the patch, or obtain an explicit maintainer proof override.

Proof guidance:

[P1] Needs real behavior proof before merge: The PR body provides copied Vitest output only; before merge it needs redacted live logs/terminal output or a linked artifact showing the real bootstrap/reconcile background-maintenance path, or an explicit proof override. Redact private details before posting. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

[P1] The supplied after-fix proof is copied Vitest output, not a live OpenClaw bootstrap/reconcile overflow or redacted runtime log showing bootstrap-created debt being consumed.
[P1] Merging changes session-state timing for background-mode engines: bootstrap maintenance that used to run inline now runs on the deferred session-lane consumer.

Maintainer options:

Require live bootstrap proof (recommended)
Ask for redacted live logs, terminal output, or a linked artifact showing bootstrap-created deferred debt queued and consumed after this patch before merge.
Accept a maintainer proof override
A maintainer with enough subsystem confidence can explicitly override the proof gate and own the session-state timing risk for this small scheduling change.
Pause until LCM can be exercised
If no one can run the background-maintenance engine in a real setup, keep the PR open rather than merging a session-state timing change on test proof alone.

Next step before merge

[P1] Manual review remains because the only blocker is contributor live proof or a maintainer proof override; I found no narrow code repair for ClawSweeper to apply.

Security
Cleared: The diff only changes internal TypeScript scheduling logic and a focused Vitest test; it does not touch dependencies, CI, secrets, permissions, network calls, or package metadata.

Review details

Best possible solution:

Land the narrow bootstrap deferral only after redacted live bootstrap/reconcile logs or an explicit maintainer proof override confirms the background worker consumes the stranded debt; keep the plugin-owned hot-cache policy case out of this PR.

Do we have a high-confidence way to reproduce the issue?

Yes from source: current main runs bootstrap maintenance in foreground while only background execution sets allowDeferredCompactionExecution, and bootstrap is not currently eligible for the deferred background worker. I did not reproduce the full live LCM overflow; the remaining proof gap is live behavior, not source traceability.

Is this the best way to solve the issue?

Yes for the code shape: reusing the existing deferred background maintenance lane for engines that already opt into background maintenance is the narrowest core fix I found. The merge-readiness gap is proof of the real bootstrap/reconcile scenario, not an alternate code repair.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5ab430fa11ee.

Label changes

Label justifications:

P1: The linked bug affects session context maintenance in real agent runs and can leave deferred compaction debt stranded across turns.
merge-risk: 🚨 session-state: The PR intentionally changes when bootstrap maintenance side effects occur for background-mode context engines, which can affect transcript/session-state ordering.
rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The PR body provides copied Vitest output only; before merge it needs redacted live logs/terminal output or a linked artifact showing the real bootstrap/reconcile background-maintenance path, or an explicit proof override. Redact private details before posting. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source +4, Tests +61. Total +65 across 2 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	1	5	1	+4
Tests	1	61	0	+61
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	2	66	1	+65

What I checked:

Repository policy read: Root AGENTS.md and scoped agent runner guidance were read; agent/session-state review policy required checking callers, callees, sibling maintenance behavior, tests, and history before verdict. (AGENTS.md:1, 5ab430fa11ee)
Current main source behavior: Current main only defers context-engine maintenance when params.reason === "turn"; executeContextEngineMaintenance only sets allowDeferredCompactionExecution for background execution, so foreground bootstrap maintenance cannot consume deferred compaction debt. (src/agents/embedded-agent-runner/context-engine-maintenance.ts:713, 5ab430fa11ee)
Bootstrap caller path: The harness bootstrap path calls maintenance with reason: "bootstrap" after engine bootstrap, and the embedded attempt caller passes session manager, runtime context, config, and agent id into the same maintenance helper. (src/agents/harness/context-engine-lifecycle.ts:47, 5ab430fa11ee)
Existing background worker behavior: The reusable deferred worker dedups by session key, waits for the session lane to go idle, and invokes maintenance with executionMode: "background", which is the mode that grants deferred compaction execution. (src/agents/embedded-agent-runner/context-engine-maintenance.ts:481, 5ab430fa11ee)
Context-engine contract: The type contract says maintenance runs after bootstrap, turns, or compaction; turnMaintenanceMode is an opt-in background mode, and allowDeferredCompactionExecution is the host flag engines use to consume deferred compaction debt. (src/context-engine/types.ts:108, 5ab430fa11ee)
PR implementation: The PR adds bootstrap to the same defer gate used by turns and adds a regression asserting bootstrap maintenance queues a task and later calls maintain() with allowDeferredCompactionExecution: true. (src/agents/embedded-agent-runner/context-engine-maintenance.ts:718, b2dadefb5336)
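
The worker behavior checked above (dedup by session key, wait for the session lane to go idle, then run in background mode) can be pictured with a small synchronous sketch. The class and method names (`DeferredMaintenanceLane`, `schedule`, `onSessionIdle`) are hypothetical, and the idle wait is reduced to an explicit call; only `executionMode: "background"` and `allowDeferredCompactionExecution` are taken from the review notes.

```typescript
// Hypothetical stand-in for the deferred worker; not the repository's implementation.
interface MaintainParams {
  executionMode: "foreground" | "background";
  allowDeferredCompactionExecution: boolean;
}

type Maintain = (params: MaintainParams) => void;

class DeferredMaintenanceLane {
  private readonly pending = new Map<string, Maintain>();

  // Returns true when a new task is queued and false when a task for the
  // same session is already pending (the request is coalesced).
  schedule(sessionKey: string, maintain: Maintain): boolean {
    if (this.pending.has(sessionKey)) return false;
    this.pending.set(sessionKey, maintain);
    return true;
  }

  // Stand-in for "the session lane went idle": run the pending task, if any,
  // in background mode so deferred compaction debt is allowed to execute.
  onSessionIdle(sessionKey: string): void {
    const maintain = this.pending.get(sessionKey);
    if (!maintain) return;
    this.pending.delete(sessionKey);
    maintain({ executionMode: "background", allowDeferredCompactionExecution: true });
  }
}
```

In this picture, a bootstrap call and a turn call for the same session before the lane goes idle result in a single background run, which is the dedup/coalesce property the risk checklist relies on.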

Likely related people:

EVA: Authored the squash commit that introduced idle-aware background context-engine turn maintenance and the underlying background-maintenance contract this PR extends. (role: introduced behavior; confidence: high; commits: c15b295a8564; files: src/context-engine/types.ts, src/agents/pi-embedded-runner/context-engine-maintenance.ts)
@jalehman: Reviewed the original background-maintenance PR and authored the later deferred-maintenance token-budget fix in the same subsystem. (role: reviewer and recent adjacent owner; confidence: high; commits: c15b295a8564, 75e7fc97f804; files: src/agents/pi-embedded-runner/run/attempt.ts, src/context-engine/types.ts)
Peter Steinberger: Current-main blame/log show recent broad maintenance of the renamed embedded context-engine maintenance and harness files that now own this code path. (role: recent area contributor; confidence: medium; commits: 045145c70082, 5ab430fa11ee; files: src/agents/embedded-agent-runner/context-engine-maintenance.ts, src/agents/harness/context-engine-lifecycle.ts, src/context-engine/types.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

dripsmvcp · 2026-06-04T11:52:23Z

@clawsweeper re-review

clawsweeper · 2026-06-04T11:52:26Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26950038343
Updated: 2026-06-04T12:00:48.633Z

openclaw-barnacle bot added labels on Jun 4, 2026: agents (Agent runtime and tooling); size: S; proof: supplied (External PR includes structured after-fix real behavior proof).

dripsmvcp wants to merge 1 commit into openclaw:main from dripsmvcp:fix/67716-bootstrap-deferred-compaction
