feat: add cache-aware incremental compaction and dynamic leaf chunk sizing #318
Merged
Persist prompt-cache telemetry after turns and use it to gate incremental leaf compaction. Hot-cache sessions now defer best-effort incremental passes unless raw history pressure is clearly above target, while cold-cache sessions can run bounded catch-up passes in a single maintenance cycle. Full threshold sweeps keep their existing behavior. Also add cacheAwareCompaction config/schema support, docs, a changeset, and regression coverage for hot/cold/unknown prompt-cache behavior.

Regeneration-Prompt: |
  Implement the cache-aware incremental compaction spec for lossless-claw using the new prompt-cache telemetry exposed by the dependent OpenClaw branch tied to openclaw/openclaw#62179. Persist lightweight per-conversation cache telemetry after each turn, classify sessions as hot, cold, or unknown, and use that state to decide whether afterTurn() should run incremental leaf compaction. Preserve the existing full-sweep compaction behavior, but let cold-cache sessions do a bounded number of extra leaf passes to catch up while hot-cache sessions defer passes unless raw history pressure is clearly above target. Add the minimal config surface for enabling the feature and setting the cold-cache pass cap, keep the plugin manifest and docs in sync, and cover the behavior with focused engine and config tests.
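The hot/cold/unknown gating described above can be sketched roughly as follows. All identifiers, thresholds, and the hit-ratio heuristic here are illustrative assumptions, not the actual lossless-claw API:

```typescript
// Hypothetical sketch of the hot/cold/unknown cache gating for afterTurn().
// Names (CacheTelemetry, decideIncrementalPasses, ...) are illustrative only.
type CacheState = "hot" | "cold" | "unknown";

interface CacheTelemetry {
  cachedTokens: number; // prompt tokens served from the provider cache last turn
  promptTokens: number; // total prompt tokens last turn
  updatedAt: number;    // epoch ms of the last telemetry write
}

// Assumed thresholds; real values would come from config/schema.
const HOT_HIT_RATIO = 0.5;
const TELEMETRY_TTL_MS = 5 * 60 * 1000;

function classifyCache(t: CacheTelemetry | undefined, now: number): CacheState {
  if (!t || now - t.updatedAt > TELEMETRY_TTL_MS || t.promptTokens === 0) {
    return "unknown"; // stale or missing telemetry: make no assumption
  }
  return t.cachedTokens / t.promptTokens >= HOT_HIT_RATIO ? "hot" : "cold";
}

// Decide how many incremental leaf passes afterTurn() may run:
// hot   -> defer unless raw history pressure is clearly above target
// cold  -> bounded catch-up, capped by the cold-cache pass cap
// unknown -> fall back to a single best-effort pass
function decideIncrementalPasses(
  state: CacheState,
  rawTokens: number,
  targetTokens: number,
  coldCachePassCap: number
): number {
  switch (state) {
    case "hot":
      return rawTokens > targetTokens * 1.25 ? 1 : 0; // 1.25 = assumed "clearly above" margin
    case "cold":
      return Math.max(1, coldCachePassCap);
    case "unknown":
      return 1;
  }
}
```

Full threshold sweeps would bypass this decision entirely, matching the "existing behavior preserved" constraint in the commit.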
Add the next compaction spec on top of cache-aware incremental compaction. Incremental maintenance can now grow its working leaf chunk target in busy sessions using internal low/medium/high activity bands, keep the configured static leafChunkTokens value as the floor, and cap growth at a bounded max. When cache-aware compaction is enabled and the prompt cache is cold, incremental compaction now jumps to the max working chunk. This also persists minimal refill telemetry alongside the existing compaction telemetry, threads optional leaf chunk overrides through the incremental compaction path, and retries with smaller chunk targets when a provider rejects an oversized compaction request on token/context-window limits. Full sweeps remain unchanged.

Regeneration-Prompt: |
  Implement the dynamic leafChunkTokens spec from the 2026-04-07 Pagedrop page on top of the existing cache-aware incremental compaction branch. Keep the feature default-off in v1. Reuse the static leafChunkTokens as the floor, add only a minimal dynamicLeafChunkTokens config object with enabled and max, and store lightweight per-conversation refill telemetry needed to derive a simple low/medium/high activity band with internal hysteresis. Use that band to choose a working incremental leaf chunk target, but keep full sweeps unchanged. If cache-aware compaction is enabled and the cache is cold, force incremental compaction to use the max working chunk. Clamp the working chunk against budget-derived limits, and if a provider still rejects an oversized chunk due to token/context-window limits, retry with the next smaller chunk target instead of failing immediately. Update the plugin manifest, docs, migration/store schema, and regression tests for config parsing, trigger overrides, cold-cache max bumping, and retry fallback behavior.
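The band-driven chunk selection and the retry-with-smaller-chunk fallback can be sketched as below. The band multipliers, function names, and error predicate are assumptions for illustration; only leafChunkTokens, the enabled/max config fields, the floor/cap behavior, and the cold-cache jump come from the commit text:

```typescript
// Illustrative sketch of working leaf-chunk selection; identifiers are
// assumptions, not the actual lossless-claw implementation.
type ActivityBand = "low" | "medium" | "high";

interface DynamicChunkConfig {
  enabled: boolean; // default-off in v1
  max: number;      // upper bound for the working chunk, in tokens
}

// Static leafChunkTokens is the floor, cfg.max is the cap, and a cold
// prompt cache (with cache-aware compaction enabled) forces the max.
function workingLeafChunkTokens(
  staticLeafChunkTokens: number,
  band: ActivityBand,
  cfg: DynamicChunkConfig,
  cacheIsCold: boolean
): number {
  if (!cfg.enabled) return staticLeafChunkTokens;
  if (cacheIsCold) return cfg.max;
  const scale = band === "high" ? 2.0 : band === "medium" ? 1.5 : 1.0; // assumed multipliers
  return Math.min(cfg.max, Math.max(staticLeafChunkTokens, Math.round(staticLeafChunkTokens * scale)));
}

// Retry with progressively smaller chunk targets when the provider rejects
// an oversized compaction request on token/context-window limits.
async function compactWithFallback(
  chunkTargets: number[], // descending, e.g. [workingChunk, staticFloor]
  compact: (chunkTokens: number) => Promise<void>,
  isContextWindowError: (err: unknown) => boolean
): Promise<number> {
  let lastErr: unknown;
  for (const target of chunkTargets) {
    try {
      await compact(target);
      return target; // chunk size that succeeded
    } catch (err) {
      if (!isContextWindowError(err)) throw err; // only fall back on limit errors
      lastErr = err;
    }
  }
  throw lastErr; // even the smallest target was rejected
}
```

A budget-derived clamp (mentioned in the regeneration prompt) would sit between `workingLeafChunkTokens` and `compactWithFallback`, shrinking the target before the first attempt.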
Add debug-level tracing around the new cache-aware incremental compaction and dynamic leaf chunk sizing paths so live runs can be diagnosed without querying SQLite directly. This logs telemetry updates, incremental decision inputs and reasons, and leaf compaction start/result state, with focused engine coverage for the new messages.

Regeneration-Prompt: |
  User asked for better observability for the two new incremental compaction features added on this branch: cache-aware prompt-cache handling and dynamic leaf chunk sizing. The requirement was to add debug logs, not info logs, and to explain how to enable those logs in a live OpenClaw instance. Inspect the new policy code in the LCM engine and add low-noise debug traces at the decision points that matter operationally: telemetry persistence after afterTurn, the incremental compaction decision with cache state / activity band / chosen chunk / reason / max passes, reset after a successful leaf compaction pass, and compactLeafAsync start/result. Preserve existing behavior and keep the logs structured enough to grep in production. Add focused tests that prove the debug logger is called for the hot-cache telemetry path, the hot-cache defer path, and the dynamic high-band chunk selection path.
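A minimal sketch of what such a structured, greppable decision trace could look like. The logger interface, message key, and field names are assumptions, not the actual LCM engine API:

```typescript
// Hypothetical structured debug trace for the incremental compaction decision.
// The field set mirrors the decision inputs listed in the commit message:
// cache state, activity band, chosen chunk, reason, and max passes.
interface DebugLogger {
  debug(msg: string, fields: Record<string, unknown>): void;
}

interface IncrementalDecision {
  conversationId: string;
  cacheState: "hot" | "cold" | "unknown";
  activityBand: "low" | "medium" | "high";
  chosenChunkTokens: number;
  maxPasses: number;
  reason: string;
}

function logIncrementalDecision(log: DebugLogger, input: IncrementalDecision): void {
  // One structured line per decision keeps production grep simple,
  // e.g. `grep "lcm.incremental.decision" app.log`.
  log.debug("lcm.incremental.decision", { ...input });
}
```

A test in the style the commit describes would stub `DebugLogger`, drive a hot-cache defer through the engine, and assert the stub received the expected message key and reason.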
What

This PR implements cache-aware incremental compaction together with dynamic leaf chunk sizing for lossless-claw. It persists prompt-cache and leaf-refill telemetry per conversation, defers best-effort incremental leaf compaction while the prompt cache is hot, allows bounded catch-up passes once the cache is cold, and scales the working leafChunkTokens target for busy sessions instead of always using a static threshold. It also adds debug-level diagnostics around the new policy decisions so live runs can be inspected without querying SQLite directly. This depends on openclaw/openclaw#62179, which provides the runtime prompt-cache signals.

Why
The original incremental compaction path had no visibility into real prompt-cache state and always used a static leaf chunk size. That meant it could compact at the wrong time, churn hot cache prefixes, and compact too frequently for busy sessions that refill the compactable region quickly. The goal here is to make incremental compaction more context-aware while keeping the behavior bounded and diagnosable.
Changes
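One plausible shape for the combined config surface. Only leafChunkTokens, cacheAwareCompaction, dynamicLeafChunkTokens, the enabled/max fields, and the existence of a cold-cache pass cap appear in the PR text; the field name coldCachePassCap and the example values are assumptions:

```typescript
// Hypothetical combined config shape for the two features on this branch.
interface LcmCompactionConfig {
  leafChunkTokens: number; // existing static chunk size, also the dynamic floor
  cacheAwareCompaction?: {
    enabled: boolean;
    coldCachePassCap: number; // assumed name: bounded catch-up passes when cold
  };
  dynamicLeafChunkTokens?: {
    enabled: boolean; // default-off in v1
    max: number;      // hard cap for the working chunk
  };
}

// Illustrative values only.
const exampleConfig: LcmCompactionConfig = {
  leafChunkTokens: 4096,
  cacheAwareCompaction: { enabled: true, coldCachePassCap: 3 },
  dynamicLeafChunkTokens: { enabled: false, max: 16384 },
};
```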
Testing
npx vitest run test/config.test.ts test/engine.test.ts test/lcm-integration.test.ts test/session-operation-queues.test.ts test/expansion.test.ts test/circuit-breaker.test.ts --exclude='.worktrees/**' --exclude='**/.worktrees/**'