feat: add cache-aware incremental compaction and dynamic leaf chunk sizing #318
Merged
Persist prompt-cache telemetry after turns and use it to gate incremental leaf compaction. Hot-cache sessions now defer best-effort incremental passes unless raw history pressure is clearly above target, while cold-cache sessions can run bounded catch-up passes in a single maintenance cycle. Full threshold sweeps keep their existing behavior. Also add cacheAwareCompaction config/schema support, docs, a changeset, and regression coverage for hot/cold/unknown prompt-cache behavior.

Regeneration-Prompt: |
  Implement the cache-aware incremental compaction spec for lossless-claw using the new prompt-cache telemetry exposed by the dependent OpenClaw branch tied to openclaw/openclaw#62179. Persist lightweight per-conversation cache telemetry after each turn, classify sessions as hot, cold, or unknown, and use that state to decide whether afterTurn() should run incremental leaf compaction. Preserve the existing full-sweep compaction behavior, but let cold-cache sessions do a bounded number of extra leaf passes to catch up while hot-cache sessions defer passes unless raw history pressure is clearly above target. Add the minimal config surface for enabling the feature and setting the cold-cache pass cap, keep the plugin manifest and docs in sync, and cover the behavior with focused engine and config tests.
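The hot/cold/unknown gating described above can be sketched roughly as follows. All identifiers, thresholds, and the hit-ratio heuristic here are illustrative assumptions, not the actual lossless-claw API:

```typescript
// Hypothetical sketch of the hot/cold/unknown cache gating for afterTurn().
// Names (CacheTelemetry, decideIncrementalPasses, ...) are illustrative only.
type CacheState = "hot" | "cold" | "unknown";

interface CacheTelemetry {
  cachedTokens: number; // prompt tokens served from the provider cache last turn
  promptTokens: number; // total prompt tokens last turn
  updatedAt: number;    // epoch ms of the last telemetry write
}

// Assumed thresholds; real values would come from config/schema.
const HOT_HIT_RATIO = 0.5;
const TELEMETRY_TTL_MS = 5 * 60 * 1000;

function classifyCache(t: CacheTelemetry | undefined, now: number): CacheState {
  if (!t || now - t.updatedAt > TELEMETRY_TTL_MS || t.promptTokens === 0) {
    return "unknown"; // stale or missing telemetry: make no assumption
  }
  return t.cachedTokens / t.promptTokens >= HOT_HIT_RATIO ? "hot" : "cold";
}

// Decide how many incremental leaf passes afterTurn() may run:
// hot   -> defer unless raw history pressure is clearly above target
// cold  -> bounded catch-up, capped by the cold-cache pass cap
// unknown -> fall back to a single best-effort pass
function decideIncrementalPasses(
  state: CacheState,
  rawTokens: number,
  targetTokens: number,
  coldCachePassCap: number
): number {
  switch (state) {
    case "hot":
      return rawTokens > targetTokens * 1.25 ? 1 : 0; // 1.25 = assumed "clearly above" margin
    case "cold":
      return Math.max(1, coldCachePassCap);
    case "unknown":
      return 1;
  }
}
```

Full threshold sweeps would bypass this decision entirely, matching the "existing behavior preserved" constraint in the commit.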
Add the next compaction spec on top of cache-aware incremental compaction. Incremental maintenance can now grow its working leaf chunk target in busy sessions using internal low/medium/high activity bands, keep the configured static leafChunkTokens value as the floor, and cap growth at a bounded max. When cache-aware compaction is enabled and the prompt cache is cold, incremental compaction now jumps to the max working chunk. This also persists minimal refill telemetry alongside the existing compaction telemetry, threads optional leaf chunk overrides through the incremental compaction path, and retries with smaller chunk targets when a provider rejects an oversized compaction request on token/context-window limits. Full sweeps remain unchanged.

Regeneration-Prompt: |
  Implement the dynamic leafChunkTokens spec from the 2026-04-07 Pagedrop page on top of the existing cache-aware incremental compaction branch. Keep the feature default-off in v1. Reuse the static leafChunkTokens as the floor, add only a minimal dynamicLeafChunkTokens config object with enabled and max, and store lightweight per-conversation refill telemetry needed to derive a simple low/medium/high activity band with internal hysteresis. Use that band to choose a working incremental leaf chunk target, but keep full sweeps unchanged. If cache-aware compaction is enabled and the cache is cold, force incremental compaction to use the max working chunk. Clamp the working chunk against budget-derived limits, and if a provider still rejects an oversized chunk due to token/context-window limits, retry with the next smaller chunk target instead of failing immediately. Update the plugin manifest, docs, migration/store schema, and regression tests for config parsing, trigger overrides, cold-cache max bumping, and retry fallback behavior.
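The band-driven chunk selection and the retry-with-smaller-chunk fallback can be sketched as below. The band multipliers, function names, and error predicate are assumptions for illustration; only leafChunkTokens, the enabled/max config fields, the floor/cap behavior, and the cold-cache jump come from the commit text:

```typescript
// Illustrative sketch of working leaf-chunk selection; identifiers are
// assumptions, not the actual lossless-claw implementation.
type ActivityBand = "low" | "medium" | "high";

interface DynamicChunkConfig {
  enabled: boolean; // default-off in v1
  max: number;      // upper bound for the working chunk, in tokens
}

// Static leafChunkTokens is the floor, cfg.max is the cap, and a cold
// prompt cache (with cache-aware compaction enabled) forces the max.
function workingLeafChunkTokens(
  staticLeafChunkTokens: number,
  band: ActivityBand,
  cfg: DynamicChunkConfig,
  cacheIsCold: boolean
): number {
  if (!cfg.enabled) return staticLeafChunkTokens;
  if (cacheIsCold) return cfg.max;
  const scale = band === "high" ? 2.0 : band === "medium" ? 1.5 : 1.0; // assumed multipliers
  return Math.min(cfg.max, Math.max(staticLeafChunkTokens, Math.round(staticLeafChunkTokens * scale)));
}

// Retry with progressively smaller chunk targets when the provider rejects
// an oversized compaction request on token/context-window limits.
async function compactWithFallback(
  chunkTargets: number[], // descending, e.g. [workingChunk, staticFloor]
  compact: (chunkTokens: number) => Promise<void>,
  isContextWindowError: (err: unknown) => boolean
): Promise<number> {
  let lastErr: unknown;
  for (const target of chunkTargets) {
    try {
      await compact(target);
      return target; // chunk size that succeeded
    } catch (err) {
      if (!isContextWindowError(err)) throw err; // only fall back on limit errors
      lastErr = err;
    }
  }
  throw lastErr; // even the smallest target was rejected
}
```

A budget-derived clamp (mentioned in the regeneration prompt) would sit between `workingLeafChunkTokens` and `compactWithFallback`, shrinking the target before the first attempt.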
Add debug-level tracing around the new cache-aware incremental compaction and dynamic leaf chunk sizing paths so live runs can be diagnosed without querying SQLite directly. This logs telemetry updates, incremental decision inputs and reasons, and leaf compaction start/result state, with focused engine coverage for the new messages.

Regeneration-Prompt: |
  User asked for better observability for the two new incremental compaction features added on this branch: cache-aware prompt-cache handling and dynamic leaf chunk sizing. The requirement was to add debug logs, not info logs, and to explain how to enable those logs in a live OpenClaw instance. Inspect the new policy code in the LCM engine and add low-noise debug traces at the decision points that matter operationally: telemetry persistence after afterTurn, the incremental compaction decision with cache state / activity band / chosen chunk / reason / max passes, reset after a successful leaf compaction pass, and compactLeafAsync start/result. Preserve existing behavior and keep the logs structured enough to grep in production. Add focused tests that prove the debug logger is called for the hot-cache telemetry path, the hot-cache defer path, and the dynamic high-band chunk selection path.
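A minimal sketch of what such a structured, greppable decision trace could look like. The logger interface, message key, and field names are assumptions, not the actual LCM engine API:

```typescript
// Hypothetical structured debug trace for the incremental compaction decision.
// The field set mirrors the decision inputs listed in the commit message:
// cache state, activity band, chosen chunk, reason, and max passes.
interface DebugLogger {
  debug(msg: string, fields: Record<string, unknown>): void;
}

interface IncrementalDecision {
  conversationId: string;
  cacheState: "hot" | "cold" | "unknown";
  activityBand: "low" | "medium" | "high";
  chosenChunkTokens: number;
  maxPasses: number;
  reason: string;
}

function logIncrementalDecision(log: DebugLogger, input: IncrementalDecision): void {
  // One structured line per decision keeps production grep simple,
  // e.g. `grep "lcm.incremental.decision" app.log`.
  log.debug("lcm.incremental.decision", { ...input });
}
```

A test in the style the commit describes would stub `DebugLogger`, drive a hot-cache defer through the engine, and assert the stub received the expected message key and reason.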
What

This PR implements cache-aware incremental compaction together with dynamic leaf chunk sizing for lossless-claw. It persists prompt-cache and leaf-refill telemetry per conversation, defers best-effort incremental leaf compaction while the prompt cache is hot, allows bounded catch-up passes once the cache is cold, and scales the working leafChunkTokens target for busy sessions instead of always using a static threshold. It also adds debug-level diagnostics around the new policy decisions so live runs can be inspected without querying SQLite directly. This depends on openclaw/openclaw#62179, which provides the runtime prompt-cache signals.

Why
The original incremental compaction path had no visibility into real prompt-cache state and always used a static leaf chunk size. That meant it could compact at the wrong time, churn hot cache prefixes, and compact too frequently for busy sessions that refill the compactable region quickly. The goal here is to make incremental compaction more context-aware while keeping the behavior bounded and diagnosable.
Changes
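One plausible shape for the combined config surface. Only leafChunkTokens, cacheAwareCompaction, dynamicLeafChunkTokens, the enabled/max fields, and the existence of a cold-cache pass cap appear in the PR text; the field name coldCachePassCap and the example values are assumptions:

```typescript
// Hypothetical combined config shape for the two features on this branch.
interface LcmCompactionConfig {
  leafChunkTokens: number; // existing static chunk size, also the dynamic floor
  cacheAwareCompaction?: {
    enabled: boolean;
    coldCachePassCap: number; // assumed name: bounded catch-up passes when cold
  };
  dynamicLeafChunkTokens?: {
    enabled: boolean; // default-off in v1
    max: number;      // hard cap for the working chunk
  };
}

// Illustrative values only.
const exampleConfig: LcmCompactionConfig = {
  leafChunkTokens: 4096,
  cacheAwareCompaction: { enabled: true, coldCachePassCap: 3 },
  dynamicLeafChunkTokens: { enabled: false, max: 16384 },
};
```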
Testing
npx vitest run test/config.test.ts test/engine.test.ts test/lcm-integration.test.ts test/session-operation-queues.test.ts test/expansion.test.ts test/circuit-breaker.test.ts --exclude='.worktrees/**' --exclude='**/.worktrees/**'