Defer proactive compaction debt and surface LCM maintenance state #408
Conversation
Companion host PR is now live at openclaw/openclaw#65233. Validation on this branch is green:
Review guidance: this PR is the immediate plugin-side stabilization. It removes proactive compaction from the foreground turn path by default, records coalesced maintenance debt instead, and surfaces maintenance state to the user/status surface. The companion OpenClaw PR provides the hard host-side guarantee by moving turn-triggered maintenance onto an idle-aware hidden background task.
Pull request overview
This PR shifts proactive threshold/leaf compaction from the foreground afterTurn() path to deferred “maintenance debt” by default, and exposes per-conversation maintenance state (pending/running/last success/failure) via command output and startup banners.
Changes:
- Add `proactiveThresholdCompactionMode: "deferred" | "inline"` (default: `deferred`) and thread it through config/manifest/docs.
- Record coalesced proactive compaction debt in `afterTurn()` and consume it in `maintain()` only when the host opts in (`allowDeferredCompactionExecution`).
- Persist and surface compaction maintenance state (DB table + store + `/lcm status` output + startup banner updates).
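The record-then-consume split above can be sketched as follows. This is an illustrative sketch only: the names and shapes (`CompactionDebt`, `coalesceDebt`) are assumptions, not the plugin's actual types. It shows how repeated turn triggers coalesce into a single pending debt record instead of firing compaction each time:

```typescript
// Hypothetical shape of a coalesced debt record; not the actual
// lossless-claw store type.
interface CompactionDebt {
  pending: boolean;
  requestedAt: Date;   // when debt was first recorded
  reason: string;      // latest trigger reason (threshold or leaf)
  tokenBudget: number; // budget to enforce when the debt is consumed
}

// Coalesce repeated afterTurn() triggers into one pending record:
// keep the earliest request time, the latest reason, and the
// stricter (smaller) token budget.
function coalesceDebt(
  existing: CompactionDebt | undefined,
  incoming: CompactionDebt,
): CompactionDebt {
  if (!existing || !existing.pending) return incoming;
  return {
    pending: true,
    requestedAt: existing.requestedAt,
    reason: incoming.reason,
    tokenBudget: Math.min(existing.tokenBudget, incoming.tokenBudget),
  };
}
```

Under a sketch like this, `maintain()` reads one coalesced record per conversation and runs at most one compaction, however many turns recorded debt in between.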
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/session-operation-queues.test.ts | Updates test config to include proactiveThresholdCompactionMode. |
| test/plugin-config-registration.test.ts | Verifies new config wiring, new startup banner text, and engine.info.turnMaintenanceMode. |
| test/lcm-tools.test.ts | Updates tool test config for new config key. |
| test/lcm-expand-tool.test.ts | Updates tool test config for new config key. |
| test/lcm-expand-query-tool.test.ts | Updates tool test config for new config key. |
| test/lcm-command.test.ts | Adds status-output coverage for persisted maintenance state. |
| test/expansion.test.ts | Updates base config to include new config key. |
| test/engine.test.ts | Adds coverage for deferred debt recording and opt-in consumption in maintain(). |
| test/config.test.ts | Adds coverage for config default/env/plugin-config precedence + manifest schema exposure. |
| test/circuit-breaker.test.ts | Updates test config for new config key. |
| test/bootstrap-flood-regression.test.ts | Updates test config for new config key. |
| src/store/index.ts | Exports the new maintenance store and its record type. |
| src/store/compaction-maintenance-store.ts | New store for persisting/querying per-conversation deferred-compaction maintenance state. |
| src/startup-banner-log.ts | Adds a startup banner key for proactive compaction mode. |
| src/plugin/lcm-command.ts | Extends status output to display maintenance state (pending/running/last failure/timestamps/budgets). |
| src/plugin/index.ts | Logs proactive compaction mode and includes it in the “Plugin loaded” banner. |
| src/engine.ts | Implements deferred debt recording/consumption and adds maintenance-store integration + new config behavior. |
| src/db/migration.ts | Adds conversation_compaction_maintenance table to persist maintenance state. |
| src/db/config.ts | Adds config type + parsing for LCM_PROACTIVE_THRESHOLD_COMPACTION_MODE and plugin config key. |
| skills/lossless-claw/references/config.md | Documents the new config option and its behavior. |
| README.md | Updates docs for /lcm status maintenance visibility + new config/env key. |
| openclaw.plugin.json | Adds UI metadata and schema enum for the new config option. |
| docs/configuration.md | Documents the config option and deferred proactive compaction behavior. |
| .changeset/deferred-compaction-maintenance.md | Declares a minor release with deferred proactive compaction + maintenance visibility. |
Latest follow-up review fixes are pushed and all review threads are now resolved; validation remains green.

What changed since the initial review:
This PR is open, non-draft, mergeable, and ready for patch release from our side. Companion host change is open at openclaw/openclaw#65233.
Nice work on the maintenance state tracking and coalescing — those are gaps in our simpler approach (#385). One concern: deferring compaction to a post-turn maintenance pass has two failure modes.

Cache destruction: prompt-mutating compaction rewrites the prompt prefix, busting a provider prompt cache that may still be hot.

Race with next turn: you can't predict when the next turn arrives. If deferred compaction is still running when it does, the turn races the mutation.

Our approach in #385 solves both by running compaction pre-assembly with a cache-TTL gate:
Pre-assembly is the only position where you can guarantee both non-blocking turns AND cache safety. After the turn, you're always racing against the next arrival. The two approaches could combine well: your deferred debt tracking + maintenance visibility, with the actual debt consumption happening at pre-assembly time, gated by cache TTL instead of in the post-turn path.
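A rough sketch of that pre-assembly cache-TTL gate. Names (`shouldCompactPreAssembly`, `cacheIsCold`) and the 300-second TTL are illustrative assumptions, not the #385 code:

```typescript
// Hypothetical gate: only run prompt-mutating compaction at pre-assembly
// time when the provider prompt cache is already cold, unless the prompt
// is about to overflow.
const CACHE_TTL_MS = 300_000; // assumed ~5 min Anthropic prompt-cache TTL

function cacheIsCold(lastCacheTouchAt: Date | undefined, now: Date = new Date()): boolean {
  if (!lastCacheTouchAt) return true; // never touched: nothing to preserve
  return now.getTime() - lastCacheTouchAt.getTime() > CACHE_TTL_MS;
}

function shouldCompactPreAssembly(opts: {
  hasPendingDebt: boolean;
  lastCacheTouchAt?: Date;
  nearOverflow: boolean;
}): boolean {
  if (!opts.hasPendingDebt) return false;
  // Overflow pressure trumps cache preservation; otherwise wait for cold cache.
  return opts.nearOverflow || cacheIsCold(opts.lastCacheTouchAt);
}
```

The design point is that the gate runs before prompt assembly, so a "compact now" decision can never invalidate a cache the very next request was about to reuse.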
I pushed the cache-aware follow-up onto this PR. Highlights:
Validation after the update:
This should keep the no-stall fix intact while preserving the value of Anthropic prompt caching for hot coding sessions.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
@codex review

Latest head includes the cache-touch fallback fix for Anthropic hot-cache deferral.
Reviewed the updated diff. The cache-safety additions look solid. In deferred mode, `afterTurn()` still runs only the lightweight debt-recording bookkeeping. This supersedes our #385 — closing it in favor of this PR. The per-agent cache TTL via runtime telemetry is cleaner than our approach of re-reading agent config.
@jalehman @liu51115 this will be a great addition. We're actually creating a hybrid mode right now that allows a 1 hr TTL for system prompts compacted by Anthropic's internal compaction (their compaction is better than LCM + native OP), plus a 5 hr TTL for short-lived information, all in one cache without busting it. Adding LCM on top of that for lossless memory, so basically the best of both worlds. Will post an update; @liu51115 let me know if you want to help test it.
@liu51115 these are all good enough to download and test now as local patches: openclaw/openclaw#65288
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
Very exciting development. Testing now and will report back.
…On Sun, Apr 12, 2026 at 23:08, EVA ***@***.***> wrote:
*100yenadmin* left a comment (Martian-Engineering/lossless-claw#408)
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
```ts
pending: true,
requestedAt: input.requestedAt ?? new Date(),
reason: input.reason,
running: false,
```
There was a problem hiding this comment.
requestProactiveCompactionDebt() always patches running: false, which will clear an in-progress running state if new debt is recorded while maintenance is still executing (e.g., another worker/process records debt mid-run). That can make status output incorrectly report idle and can also enable a second consumer to start work because the row no longer shows running=true. Consider preserving the existing running flag (omit running from the patch, or only set it to false when you explicitly finish/cancel a run) so pending debt can be recorded without losing the running marker.
Suggested change:

```diff
-running: false,
```
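A sketch of the reviewer's suggested fix, with an assumed row shape (the real store API may differ): preserve any in-flight `running` marker when new debt is recorded, and leave clearing it to explicit finish/cancel paths.

```typescript
// Assumed maintenance-row shape; field names mirror the snippet above
// but the actual store type is not shown in this thread.
interface MaintenanceRow {
  pending: boolean;
  running: boolean;
  requestedAt: Date;
  reason?: string;
}

function recordDebt(
  existing: MaintenanceRow | undefined,
  input: { requestedAt?: Date; reason?: string },
): MaintenanceRow {
  return {
    pending: true,
    requestedAt: input.requestedAt ?? new Date(),
    reason: input.reason,
    // Preserve an in-progress run instead of unconditionally resetting it,
    // so concurrent debt recording cannot make status report idle or let a
    // second consumer start while a run is still executing.
    running: existing?.running ?? false,
  };
}
```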
```ts
await this.consumeDeferredCompactionDebt({
  conversationId: conversation.conversationId,
  sessionId: params.sessionId,
  sessionKey: params.sessionKey,
  tokenBudget,
  currentTokenCount: liveContextTokens,
  legacyParams: deferredLegacyParams,
});
```
There was a problem hiding this comment.
assemble() can call consumeDeferredCompactionDebt(), which runs executeCompactionCore()/executeLeafCompactionCore() explicitly without taking the per-session queue. Since assemble() itself is not serialized with withSessionQueue(), deferred compaction can run concurrently with other operations (including another assemble(), afterTurn() ingestion, or background maintain() from a different engine/process), risking duplicate compaction executions and inconsistent reads/writes of the conversation state. Consider wrapping the deferred-debt consumption path in withSessionQueue(this.resolveSessionQueueKey(sessionId, sessionKey), ...) (or routing through the existing queued compact() / compactLeafAsync() APIs) so prompt-mutating work remains single-file/serialized per session.
Suggested change:

```diff
-await this.consumeDeferredCompactionDebt({
-  conversationId: conversation.conversationId,
-  sessionId: params.sessionId,
-  sessionKey: params.sessionKey,
-  tokenBudget,
-  currentTokenCount: liveContextTokens,
-  legacyParams: deferredLegacyParams,
-});
+await this.withSessionQueue(
+  this.resolveSessionQueueKey(params.sessionId, params.sessionKey),
+  async () =>
+    this.consumeDeferredCompactionDebt({
+      conversationId: conversation.conversationId,
+      sessionId: params.sessionId,
+      sessionKey: params.sessionKey,
+      tokenBudget,
+      currentTokenCount: liveContextTokens,
+      legacyParams: deferredLegacyParams,
+    }),
+);
```
Restore queue safety for assemble-triggered deferred compaction, re-evaluate deferred debt with the stricter live token budget, and preserve deferred leaf reasons end to end. Add focused regressions for queue serialization, budget shrink, and deferred leaf debt recording.

Regeneration-Prompt: |
  Fix the deferred proactive compaction path in lossless-claw so the plugin remains internally correct regardless of host scheduling. The assemble() path must not execute deferred compaction writes outside the engine's per-session queue, because maintain() and the explicit compaction entry points already rely on that queue as the mutation boundary. Also fix deferred debt re-evaluation so consumption uses the current runtime budget, or at least the stricter of current and recorded budgets, instead of preferring a stale stored token budget. Finally, preserve the incremental compaction decision reason through the real production type so afterTurn() records leaf-trigger debt correctly instead of dropping the reason or reusing a stale threshold reason from an older maintenance row. Add regression coverage that exercises the real leaf-trigger recording path, proves assemble() waits for the session queue before consuming deferred debt, and proves maintain() re-evaluates using the stricter live token budget.
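The stricter-budget rule in this commit can be illustrated with a small assumed helper (not the engine's actual code): consumption honors whichever of the recorded and live budgets is tighter, so a stale stored budget can never loosen the current constraint.

```typescript
// Assumed helper: pick the stricter (smaller) token budget when consuming
// deferred compaction debt, so a stale, looser stored budget never wins
// over the current runtime budget.
function effectiveTokenBudget(recordedBudget: number, liveBudget: number): number {
  return Math.min(recordedBudget, liveBudget);
}
```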
fix: execute deferred Anthropic leaf debt after TTL expiry despite cache smoothing

Narrow fix for the interaction between #362 (cache-aware routing noise protection) and #408 (deferred proactive compaction).

The issue: when incremental evaluation returns 'no compaction needed' due to hot-cache budget-headroom or hot-cache-defer (routing-noise suppression), deferred Anthropic sessions were incorrectly terminating their maintenance debt instead of continuing to execute after the prompt-cache TTL had expired.

The fix: check whether deferred Anthropic sessions should override the cache-aware 'no compaction' decision once the TTL-safe hold has elapsed. If so, allow leaf compaction to execute despite the routing-noise hysteresis. Inline and non-deferred paths keep existing behavior unchanged.

- Adds shouldForceDeferredAnthropicLeafCompaction() to gate the override
- Routes deferred Anthropic leaf compaction through executeLeafCompactionCore when the TTL is stale
- Adds regression test: 'assemble() still executes deferred Anthropic leaf debt after TTL expiry when cache smoothing remains effectively hot'
- No changes to routing-noise protection or incremental evaluation logic

Fixes: #408
Related: #362
fix: execute deferred Anthropic leaf debt after TTL expiry despite cache smoothing (#434)

* fix: execute deferred Anthropic leaf debt after TTL expiry despite cache smoothing

* fix: use catch-up settings for stale TTL deferred Anthropic debt

When deferred Anthropic leaf debt overrides hot-cache smoothing after the prompt-cache TTL expires, run the leaf compaction with the cold-cache catch-up envelope instead of the original hot-cache single-pass settings. This preserves the intended deferred recovery behavior and prevents the maintenance record from being cleared after a single underpowered pass.

Add a regression that proves the stale-TTL override enables catch-up execution rather than reusing the hot-cache defer envelope, and update the existing stale-TTL test to assert the new execution parameters.

Regeneration-Prompt: |
  Address the review finding on PR #434 in lossless-claw. The stale-TTL deferred Anthropic compaction fix should not merely force one hot-cache-sized leaf compaction pass and then clear the maintenance debt. Keep the change narrow and additive inside the deferred-compaction path. When the Anthropic prompt-cache TTL has expired and deferred debt must override cache smoothing, execute with the same catch-up envelope that cold-cache recovery uses, especially the extra pass allowance and condensed-pass setting. Add regression coverage that would fail if the forced path still used maxPasses=1 or allowCondensedPasses=false, and update any stale expectations in the existing TTL-expiry test.
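The stale-TTL override plus catch-up envelope can be sketched like this. Only `maxPasses`/`allowCondensedPasses` and the TTL-expiry condition come from the commit messages above; the function name and the catch-up values are assumptions:

```typescript
// Sketch of the decision in the deferred-compaction path: while the
// Anthropic prompt-cache TTL is still live, keep deferring; once it has
// expired, override cache smoothing and run with the cold-cache catch-up
// envelope rather than a single hot-cache-sized pass.
interface LeafRunSettings {
  maxPasses: number;
  allowCondensedPasses: boolean;
}

// Assumed catch-up values; the real envelope lives in the engine config.
const COLD_CACHE_CATCH_UP: LeafRunSettings = { maxPasses: 3, allowCondensedPasses: true };

function forcedLeafRunSettings(opts: {
  hasDeferredAnthropicDebt: boolean;
  ttlExpired: boolean;
}): LeafRunSettings | null {
  if (!opts.hasDeferredAnthropicDebt) return null; // nothing to force
  if (!opts.ttlExpired) return null;               // cache still hot: keep deferring
  return COLD_CACHE_CATCH_UP;                      // override routing-noise hysteresis
}
```

A regression of the kind the commit describes would fail if this function ever returned `maxPasses: 1` or `allowCondensedPasses: false` on the forced path.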
Summary
Closes #407.
This changes proactive compaction from inline turn work into deferred maintenance debt by default, then makes that deferred path cache-safe for Anthropic. The result is a hybrid model: proactive compaction no longer blocks the foreground `afterTurn()` path, and prompt-mutating deferred compaction no longer rewrites a still-hot Anthropic cache.

Why
What Changed
- `proactiveThresholdCompactionMode: "deferred" | "inline"`; default is `"deferred"`.
- `afterTurn()` now records coalesced proactive compaction debt per conversation/session instead of running proactive threshold or leaf compaction inline.
- `maintain()` now consumes that debt only when runtime context explicitly allows deferred execution, and it skips prompt-mutating deferred compaction while the Anthropic cache is still hot.
- `assemble()` now consumes deferred prompt-mutating compaction pre-assembly when the cache is cold or the prompt is approaching overflow.
- Adds `cacheAwareCompaction.cacheTTLSeconds` (default `300`) and telemetry for `lastApiCallAt`, `lastCacheTouchAt`, `provider`, and `model`.
- In `inline` mode, leaf and threshold compaction do not both fire on the same turn.
- Exposes `turnMaintenanceMode: "background"` via engine info for the companion OpenClaw host change.

Safety
- `inline` mode remains available as a rollback/debug escape hatch.

Validation
- `npm test` -> 40 files / 709 tests passed
- `npm run build` passed

Companion Host Work