
Defer proactive compaction debt and surface LCM maintenance state#408

Merged
jalehman merged 6 commits into Martian-Engineering:main from electricsheephq:codex/deferred-compaction-maintenance
Apr 13, 2026

Conversation

Contributor

@100yenadmin 100yenadmin commented Apr 12, 2026

Summary

Closes #407.

This changes proactive compaction from inline turn work into deferred maintenance debt by default, then makes that deferred path cache-safe for Anthropic. The result is a hybrid model: proactive compaction no longer blocks the foreground afterTurn() path, and prompt-mutating deferred compaction no longer rewrites a still-hot Anthropic cache.

Why

  • Inline proactive compaction can keep the main session lane busy after the assistant reply is already visible.
  • That starves immediate follow-up turns and can push subagent completion announce traffic into timeouts.
  • Anthropic prompt caching is exact-prefix, so rewriting the active prompt too soon can burn a freshly written cache and drive up cost.
  • Orchestrator-heavy sessions make the pain worse: one main agent plus up to 4 subagents may all need fast LCM reads while one hot session is still writing.

What Changed

  • Added proactiveThresholdCompactionMode: "deferred" | "inline"; default is "deferred".
  • afterTurn() now records coalesced proactive compaction debt per conversation/session instead of running proactive threshold or leaf compaction inline.
  • maintain() now consumes that debt only when runtime context explicitly allows deferred execution, and it skips prompt-mutating deferred compaction while Anthropic cache is still hot.
  • assemble() now consumes deferred prompt-mutating compaction pre-assembly when the cache is cold or the prompt is approaching overflow.
  • Added persistent maintenance state for pending/running/last success/last failure metadata.
  • Added cacheAwareCompaction.cacheTTLSeconds (default 300) and telemetry for lastApiCallAt, lastCacheTouchAt, provider, and model.
  • Extended LCM status/command output and startup banners to surface maintenance state and cache-aware compaction context.
  • Preserved the local same-turn safety guard in legacy inline mode so leaf and threshold compaction do not both fire on the same turn.
  • Kept manual, overflow, and timeout compaction synchronous.
  • Advertised turnMaintenanceMode: "background" for the companion OpenClaw host change.
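The deferred-debt recording and consumption described above can be sketched roughly as follows. This is an illustrative in-memory model, not the plugin's actual code: the names DebtRecord, recordDebt, and consumeDebt are assumptions, and the real implementation persists state to the database rather than holding it in a Map.

```typescript
// Hypothetical sketch of coalesced per-conversation compaction debt.
type DebtRecord = {
  conversationId: string;
  reason: string;        // most recent trigger reason
  requestedAt: number;
  coalescedCount: number;
};

const debtByConversation = new Map<string, DebtRecord>();

// afterTurn() would call something like this instead of compacting inline:
// repeated triggers for one conversation collapse into a single pending record.
function recordDebt(conversationId: string, reason: string): DebtRecord {
  const existing = debtByConversation.get(conversationId);
  if (existing) {
    existing.coalescedCount += 1;
    existing.reason = reason;
    return existing;
  }
  const record: DebtRecord = {
    conversationId,
    reason,
    requestedAt: Date.now(),
    coalescedCount: 1,
  };
  debtByConversation.set(conversationId, record);
  return record;
}

// maintain() drains the debt only when the host opts in to deferred execution.
function consumeDebt(
  conversationId: string,
  allowDeferred: boolean,
): DebtRecord | undefined {
  if (!allowDeferred) return undefined;
  const record = debtByConversation.get(conversationId);
  if (record) debtByConversation.delete(conversationId);
  return record;
}
```

The key property is that N proactive triggers between maintenance passes cost one compaction, not N, and nothing runs on the foreground turn path.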

Safety

  • Legacy inline mode remains available as a rollback/debug escape hatch.
  • Read-only LCM tools continue to work while compaction debt is pending.
  • Public tool inputs stay stable apart from the new config option.
  • Anthropic-active sessions keep their hot cache unless compaction is needed for correctness or the cache has already gone cold.

Validation

  • npm test -> 40 files / 709 tests passed
  • npm run build passed

Companion Host Work

@100yenadmin
Contributor Author

Companion host PR is now live at openclaw/openclaw#65233.

Validation on this branch is green:

  • npm test -> 40 files / 706 tests passed
  • npm run build passed

Review guidance: this PR is the immediate plugin-side stabilization. It removes proactive compaction from the foreground turn path by default, records coalesced maintenance debt instead, and surfaces maintenance state to the user/status surface. The companion OpenClaw PR provides the hard host-side guarantee by moving turn-triggered maintenance onto an idle-aware hidden background task.


Copilot AI left a comment


Pull request overview

This PR shifts proactive threshold/leaf compaction from the foreground afterTurn() path to deferred “maintenance debt” by default, and exposes per-conversation maintenance state (pending/running/last success/failure) via command output and startup banners.

Changes:

  • Add proactiveThresholdCompactionMode: "deferred" | "inline" (default: deferred) and thread it through config/manifest/docs.
  • Record coalesced proactive compaction debt in afterTurn() and consume it in maintain() only when the host opts in (allowDeferredCompactionExecution).
  • Persist and surface compaction maintenance state (DB table + store + /lcm status output + startup banner updates).
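The persisted maintenance state might look roughly like the sketch below. The field names are assumptions inferred from the pending/running/last success/failure metadata described above, not the real conversation_compaction_maintenance schema, and the status strings are illustrative.

```typescript
// Illustrative shape of per-conversation maintenance state.
type MaintenanceState = {
  pending: boolean;
  running: boolean;
  lastSuccessAt?: number;       // epoch ms
  lastFailureAt?: number;       // epoch ms
  lastFailureReason?: string;
};

// One possible mapping from state to a /lcm status line.
function describeMaintenance(state: MaintenanceState): string {
  if (state.running) return "maintenance: running";
  if (state.pending) return "maintenance: pending";
  const failedMoreRecently =
    state.lastFailureAt !== undefined &&
    (state.lastSuccessAt === undefined || state.lastFailureAt > state.lastSuccessAt);
  if (failedMoreRecently) {
    return `maintenance: last run failed (${state.lastFailureReason ?? "unknown"})`;
  }
  return "maintenance: idle";
}
```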

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

Summary per file:
test/session-operation-queues.test.ts Updates test config to include proactiveThresholdCompactionMode.
test/plugin-config-registration.test.ts Verifies new config wiring, new startup banner text, and engine.info.turnMaintenanceMode.
test/lcm-tools.test.ts Updates tool test config for new config key.
test/lcm-expand-tool.test.ts Updates tool test config for new config key.
test/lcm-expand-query-tool.test.ts Updates tool test config for new config key.
test/lcm-command.test.ts Adds status-output coverage for persisted maintenance state.
test/expansion.test.ts Updates base config to include new config key.
test/engine.test.ts Adds coverage for deferred debt recording and opt-in consumption in maintain().
test/config.test.ts Adds coverage for config default/env/plugin-config precedence + manifest schema exposure.
test/circuit-breaker.test.ts Updates test config for new config key.
test/bootstrap-flood-regression.test.ts Updates test config for new config key.
src/store/index.ts Exports the new maintenance store and its record type.
src/store/compaction-maintenance-store.ts New store for persisting/querying per-conversation deferred-compaction maintenance state.
src/startup-banner-log.ts Adds a startup banner key for proactive compaction mode.
src/plugin/lcm-command.ts Extends status output to display maintenance state (pending/running/last failure/timestamps/budgets).
src/plugin/index.ts Logs proactive compaction mode and includes it in the “Plugin loaded” banner.
src/engine.ts Implements deferred debt recording/consumption and adds maintenance-store integration + new config behavior.
src/db/migration.ts Adds conversation_compaction_maintenance table to persist maintenance state.
src/db/config.ts Adds config type + parsing for LCM_PROACTIVE_THRESHOLD_COMPACTION_MODE and plugin config key.
skills/lossless-claw/references/config.md Documents the new config option and its behavior.
README.md Updates docs for /lcm status maintenance visibility + new config/env key.
openclaw.plugin.json Adds UI metadata and schema enum for the new config option.
docs/configuration.md Documents the config option and deferred proactive compaction behavior.
.changeset/deferred-compaction-maintenance.md Declares a minor release with deferred proactive compaction + maintenance visibility.


Comment thread src/engine.ts
Comment thread src/engine.ts
Comment thread src/plugin/lcm-command.ts Outdated
@100yenadmin
Contributor Author

Latest follow-up review fixes are pushed and all review threads are now resolved.

Current validation:

  • npm test -> 40 files / 707 tests passed
  • npm run build passed

What changed since the initial review:

  • deferred leaf auth failures now keep maintenance debt pending and report failure correctly
  • the unused maintenance-record import was removed
  • command status now reuses the maintenance store mapping instead of duplicating it

This PR is open, non-draft, mergeable, and ready for patch release from our side. Companion host change is open at openclaw/openclaw#65233.

@liu51115
Contributor

Nice work on the maintenance state tracking and coalescing — those are gaps in our simpler approach (#385).

One concern: deferring compaction to maintain() solves foreground blocking but doesn't address the cache destruction problem or the race with the next turn.

Cache destruction: maintain() fires in the post-turn lifecycle, potentially seconds after the API call when the provider cache is still hot. Compacting at that point rewrites the conversation prefix, invalidating a cache the provider just wrote. The next turn pays full cache-write cost again.

Race with next turn: You can't predict when the next turn arrives. If maintain() is consuming deferred debt and a new turn comes in, it either blocks waiting for compaction to finish, or the turn proceeds with stale assembly. Deferring moves the race condition from afterTurn to maintain, but doesn't eliminate it.

Our approach in #385 solves both by running compaction pre-assembly with a cache-TTL gate:

  1. Compaction runs at the start of assemble(), before the LLM call — so the turn is never interrupted mid-flight by concurrent compaction.
  2. It only fires when Date.now() - lastApiCallAt > cacheTTLSeconds (default 300s) — so we only compact when the provider cache has already expired.

Pre-assembly is the only position where you can guarantee both non-blocking turns AND cache safety. After the turn, you're always racing against the next arrival.

The two approaches could combine well: your deferred debt tracking + maintenance visibility, with the actual debt consumption happening at pre-assembly time gated by cache TTL instead of in maintain().
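The TTL gate in step 2 above can be sketched as follows. Function names are illustrative, not the actual #385 code; only the Date.now() - lastApiCallAt > cacheTTLSeconds condition is taken from the comment.

```typescript
// True once the provider cache has plausibly expired.
function cacheExpired(
  lastApiCallAt: number,
  cacheTTLSeconds: number,
  now: number = Date.now(),
): boolean {
  return now - lastApiCallAt > cacheTTLSeconds * 1000;
}

// Pre-assembly check: compact before the LLM call only when it is cache-safe.
function shouldCompactPreAssembly(opts: {
  hasPendingDebt: boolean;
  lastApiCallAt: number;
  cacheTTLSeconds: number; // default 300 per the comment above
  now?: number;
}): boolean {
  if (!opts.hasPendingDebt) return false;
  return cacheExpired(opts.lastApiCallAt, opts.cacheTTLSeconds, opts.now ?? Date.now());
}
```

Because the check runs at the start of assembly, a turn either compacts before its own LLM call or skips compaction entirely; there is no window for a concurrent rewrite mid-turn.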

@100yenadmin
Contributor Author

I pushed the cache-aware follow-up onto this PR.

Highlights:

  • deferred compaction is now Anthropic-cache-aware instead of just background-safe
  • afterTurn() records cache/provider telemetry but never mutates the active prompt in deferred mode
  • maintain() skips prompt-mutating deferred compaction while Anthropic cache is still hot
  • assemble() consumes deferred prompt-mutating debt only when cache is cold or overflow pressure makes it necessary
  • legacy inline mode keeps the local same-turn guard so leaf + threshold compaction do not stack on one turn
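A rough sketch of the consumption decision these bullets describe, with illustrative names and an assumed 90% overflow threshold (the real code's threshold and signature may differ):

```typescript
// Decide whether assemble()/maintain() may consume prompt-mutating
// deferred debt. Correctness (approaching overflow) overrides cache
// preservation; otherwise wait for the Anthropic cache to go cold.
function shouldConsumeDeferredDebt(opts: {
  cacheHot: boolean;
  currentTokens: number;
  tokenBudget: number;
  overflowRatio?: number; // fraction of budget treated as "approaching overflow"
}): boolean {
  const ratio = opts.overflowRatio ?? 0.9;
  const nearOverflow = opts.currentTokens >= opts.tokenBudget * ratio;
  if (nearOverflow) return true; // compact even at the cost of the hot cache
  return !opts.cacheHot;         // otherwise only once the cache is cold
}
```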

Validation after the update:

  • npm test -> 40 files / 709 tests passed
  • npm run build passed

This should keep the no-stall fix intact while preserving the value of Anthropic prompt caching for hot coding sessions.


Copilot AI left a comment


Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.



Comment thread src/engine.ts
@100yenadmin
Contributor Author

@codex review

Latest head includes the cache-touch fallback fix for Anthropic hot-cache deferral.

@liu51115
Contributor

Reviewed the updated diff. The cache-safety additions look solid — resolvePromptCacheTtlMs correctly maps retention classes, shouldDelayPromptMutatingDeferredCompaction gates actual execution on cache state, and debt consumption at pre-assembly/maintain ensures compaction only fires when the cache is cold.

In deferred mode, afterTurn still runs evaluateIncrementalCompaction to record debt, but since no compaction executes there and consumption is cache-gated, the time-inversion concern is moot in practice.

This supersedes our #385 — closing it in favor of this PR. The per-agent cache TTL via runtime telemetry retention is cleaner than our approach of re-reading agent config.
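The retention-class-to-TTL mapping that resolvePromptCacheTtlMs is described as performing might look roughly like this. The class names mirror Anthropic's 5-minute and 1-hour prompt-cache options, but the actual enum values and fallback behavior in the plugin are assumptions.

```typescript
// Illustrative mapping from a provider cache-retention class to a TTL in ms.
type CacheRetention = "5m" | "1h" | "unknown";

function resolveCacheTtlMs(retention: CacheRetention): number {
  switch (retention) {
    case "5m":
      return 5 * 60 * 1000;
    case "1h":
      return 60 * 60 * 1000;
    default:
      // Conservative fallback: assume the shorter TTL, so deferred
      // compaction waits no longer than necessary when retention is unknown.
      return 5 * 60 * 1000;
  }
}
```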

@100yenadmin
Contributor Author

@jalehman @liu51115 this will be a great addition. We're actually building a hybrid mode right now: a 1-hour TTL for system prompts, compacted by Anthropic's internal compaction (their compaction is better than LCM + native OP), plus a 5-hour TTL for short-lived information, all in one cache without busting it.

Adding LCM on top of that for lossless memory gives us basically the best of both worlds. Will update you, @liu51115, if you want to help test it.

@100yenadmin
Contributor Author

@liu51115 these are all good enough to download and test now as local patches: openclaw/openclaw#65288


Copilot AI left a comment


Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.



Comment thread src/store/compaction-maintenance-store.ts Outdated
@liu51115
Contributor

liu51115 commented Apr 12, 2026 via email


Copilot AI left a comment


Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.



Copilot AI, Apr 12, 2026, on:

    pending: true,
    requestedAt: input.requestedAt ?? new Date(),
    reason: input.reason,
    running: false,

requestProactiveCompactionDebt() always patches running: false, which will clear an in-progress running state if new debt is recorded while maintenance is still executing (e.g., another worker/process records debt mid-run). That can make status output incorrectly report idle and can also enable a second consumer to start work because the row no longer shows running=true. Consider preserving the existing running flag (omit running from the patch, or only set it to false when you explicitly finish/cancel a run) so pending debt can be recorded without losing the running marker.

Suggested change: remove the line running: false, from the patch.
Comment thread src/engine.ts (Outdated)

Copilot AI, Apr 12, 2026, on lines +4483 to +4490:

    await this.consumeDeferredCompactionDebt({
      conversationId: conversation.conversationId,
      sessionId: params.sessionId,
      sessionKey: params.sessionKey,
      tokenBudget,
      currentTokenCount: liveContextTokens,
      legacyParams: deferredLegacyParams,
    });

assemble() can call consumeDeferredCompactionDebt(), which runs executeCompactionCore()/executeLeafCompactionCore() explicitly without taking the per-session queue. Since assemble() itself is not serialized with withSessionQueue(), deferred compaction can run concurrently with other operations (including another assemble(), afterTurn() ingestion, or background maintain() from a different engine/process), risking duplicate compaction executions and inconsistent reads/writes of the conversation state. Consider wrapping the deferred-debt consumption path in withSessionQueue(this.resolveSessionQueueKey(sessionId, sessionKey), ...) (or routing through the existing queued compact() / compactLeafAsync() APIs) so prompt-mutating work remains single-file/serialized per session.

Suggested change:

    await this.withSessionQueue(
      this.resolveSessionQueueKey(params.sessionId, params.sessionKey),
      async () =>
        this.consumeDeferredCompactionDebt({
          conversationId: conversation.conversationId,
          sessionId: params.sessionId,
          sessionKey: params.sessionKey,
          tokenBudget,
          currentTokenCount: liveContextTokens,
          legacyParams: deferredLegacyParams,
        }),
    );
@jalehman jalehman self-assigned this Apr 12, 2026
Restore queue safety for assemble-triggered deferred compaction, re-evaluate deferred debt with the stricter live token budget, and preserve deferred leaf reasons end to end. Add focused regressions for queue serialization, budget shrink, and deferred leaf debt recording.

Regeneration-Prompt: |
  Fix the deferred proactive compaction path in lossless-claw so the plugin remains internally correct regardless of host scheduling. The assemble() path must not execute deferred compaction writes outside the engine's per-session queue, because maintain() and the explicit compaction entry points already rely on that queue as the mutation boundary.

  Also fix deferred debt re-evaluation so consumption uses the current runtime budget, or at least the stricter of current and recorded budgets, instead of preferring a stale stored token budget. Finally, preserve the incremental compaction decision reason through the real production type so afterTurn() records leaf-trigger debt correctly instead of dropping the reason or reusing a stale threshold reason from an older maintenance row.

  Add regression coverage that exercises the real leaf-trigger recording path, proves assemble() waits for the session queue before consuming deferred debt, and proves maintain() re-evaluates using the stricter live token budget.
@jalehman jalehman merged commit abf31da into Martian-Engineering:main Apr 13, 2026
1 check passed
@github-actions github-actions Bot mentioned this pull request Apr 11, 2026
jalehman added a commit that referenced this pull request Apr 14, 2026
…che smoothing

Narrow fix for the interaction between #362 (cache-aware routing noise
protection) and #408 (deferred proactive compaction). The issue: when
incremental evaluation returns 'no compaction needed' due to hot-cache
budget-headroom or hot-cache-defer (routing-noise suppression), deferred
Anthropic sessions were incorrectly terminating their maintenance debt
instead of continuing to execute after the prompt-cache TTL had expired.

The fix: check whether deferred Anthropic sessions should override the
cache-aware 'no compaction' decision once the TTL-safe hold has elapsed.
If so, allow leaf compaction to execute despite the routing-noise
hysteresis. Inline and non-deferred paths keep existing behavior unchanged.

- Adds shouldForceDeferredAnthropicLeafCompaction() to gate override
- Routes deferred Anthropic leaf compaction through executeLeafCompactionCore
  when TTL is stale
- Adds regression test: 'assemble() still executes deferred Anthropic
  leaf debt after TTL expiry when cache smoothing remains effectively hot'
- No changes to routing-noise protection or incremental evaluation logic

Fixes: #408
Related: #362
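The override gate this commit describes can be sketched as follows. The function name mirrors shouldForceDeferredAnthropicLeafCompaction from the commit message, but the signature, field names, and reason strings are assumptions for illustration.

```typescript
// Decide whether a deferred Anthropic session should override a cache-aware
// "no compaction needed" decision once the prompt-cache TTL has expired.
function shouldForceDeferredAnthropicLeafCompaction(opts: {
  provider: string;
  mode: "deferred" | "inline";
  evaluationReason: string; // hypothetical, e.g. "hot-cache-defer"
  lastCacheTouchAt: number; // epoch ms
  cacheTtlMs: number;
  now?: number;
}): boolean {
  // Only the deferred Anthropic path may override; inline keeps old behavior.
  if (opts.provider !== "anthropic" || opts.mode !== "deferred") return false;

  // Only override decisions produced by routing-noise/hot-cache suppression.
  const suppressedByCacheSmoothing =
    opts.evaluationReason === "hot-cache-defer" ||
    opts.evaluationReason === "hot-cache-budget-headroom";
  if (!suppressedByCacheSmoothing) return false;

  // Execute only once the TTL-safe hold has elapsed.
  const now = opts.now ?? Date.now();
  return now - opts.lastCacheTouchAt > opts.cacheTtlMs;
}
```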
jalehman added a commit that referenced this pull request Apr 14, 2026
…che smoothing (#434)

* fix: execute deferred Anthropic leaf debt after TTL expiry despite cache smoothing

Narrow fix for the interaction between #362 (cache-aware routing noise
protection) and #408 (deferred proactive compaction). The issue: when
incremental evaluation returns 'no compaction needed' due to hot-cache
budget-headroom or hot-cache-defer (routing-noise suppression), deferred
Anthropic sessions were incorrectly terminating their maintenance debt
instead of continuing to execute after the prompt-cache TTL had expired.

The fix: check whether deferred Anthropic sessions should override the
cache-aware 'no compaction' decision once the TTL-safe hold has elapsed.
If so, allow leaf compaction to execute despite the routing-noise
hysteresis. Inline and non-deferred paths keep existing behavior unchanged.

- Adds shouldForceDeferredAnthropicLeafCompaction() to gate override
- Routes deferred Anthropic leaf compaction through executeLeafCompactionCore
  when TTL is stale
- Adds regression test: 'assemble() still executes deferred Anthropic
  leaf debt after TTL expiry when cache smoothing remains effectively hot'
- No changes to routing-noise protection or incremental evaluation logic

Fixes: #408
Related: #362

* fix: use catch-up settings for stale TTL deferred Anthropic debt

When deferred Anthropic leaf debt overrides hot-cache smoothing after the prompt-cache TTL expires, run the leaf compaction with the cold-cache catch-up envelope instead of the original hot-cache single-pass settings. This preserves the intended deferred recovery behavior and prevents the maintenance record from being cleared after a single underpowered pass.

Add a regression that proves the stale-TTL override enables catch-up execution rather than reusing the hot-cache defer envelope, and update the existing stale-TTL test to assert the new execution parameters.

Regeneration-Prompt: |
  Address the review finding on PR #434 in lossless-claw. The stale-TTL deferred Anthropic compaction fix should not merely force one hot-cache-sized leaf compaction pass and then clear the maintenance debt. Keep the change narrow and additive inside the deferred-compaction path. When the Anthropic prompt-cache TTL has expired and deferred debt must override cache smoothing, execute with the same catch-up envelope that cold-cache recovery uses, especially the extra pass allowance and condensed-pass setting. Add regression coverage that would fail if the forced path still used maxPasses=1 or allowCondensedPasses=false, and update any stale expectations in the existing TTL-expiry test.


Development

Successfully merging this pull request may close these issues.

Proactive threshold compaction can stall foreground turns and hide maintenance state

4 participants