feat: add cache-aware incremental compaction and dynamic leaf chunk sizing#318

Merged
jalehman merged 3 commits into main from codex/dynamic-leaf-chunk-tokens on Apr 7, 2026
Conversation


@jalehman jalehman commented Apr 7, 2026

What

This PR implements cache-aware incremental compaction together with dynamic leaf chunk sizing for lossless-claw. It persists prompt-cache and leaf-refill telemetry per conversation, defers best-effort incremental leaf compaction while the prompt cache is hot, allows bounded catch-up passes once the cache is cold, and scales the working leafChunkTokens target for busy sessions instead of always using a static threshold. It also adds debug-level diagnostics around the new policy decisions so live runs can be inspected without querying SQLite directly. This depends on openclaw/openclaw#62179, which provides the runtime prompt-cache signals.

Why

The original incremental compaction path had no visibility into real prompt-cache state and always used a static leaf chunk size. That meant it could compact at the wrong time, churn hot cache prefixes, and compact too frequently for busy sessions that refill the compactable region quickly. The goal here is to make incremental compaction more context-aware while keeping the behavior bounded and diagnosable.

Changes

  • Persist prompt-cache and leaf-refill telemetry
  • Add cache-aware incremental compaction policy
  • Defer hot-cache compaction when pressure is modest
  • Allow bounded cold-cache catch-up passes
  • Add dynamic working leaf chunk sizing
  • Retry with smaller chunks after token-limit errors
  • Add config, schema, docs, and debug diagnostics

Testing

  • npx vitest run test/config.test.ts test/engine.test.ts test/lcm-integration.test.ts test/session-operation-queues.test.ts test/expansion.test.ts test/circuit-breaker.test.ts --exclude='.worktrees/**' --exclude='**/.worktrees/**'
  • Expected: all targeted tests pass

jalehman added 2 commits April 7, 2026 09:29
Persist prompt-cache telemetry after turns and use it to gate incremental
leaf compaction. Hot cache sessions now defer best-effort incremental
passes unless raw history pressure is clearly above target, while cold
cache sessions can run bounded catch-up passes in a single maintenance
cycle. Full threshold sweeps keep their existing behavior.

Also add cacheAwareCompaction config/schema support, docs, a changeset,
and regression coverage for hot/cold/unknown prompt-cache behavior.

Regeneration-Prompt: |
  Implement the cache-aware incremental compaction spec for lossless-claw
  using the new prompt-cache telemetry exposed by the dependent OpenClaw
  branch tied to openclaw/openclaw#62179. Persist lightweight per-
  conversation cache telemetry after each turn, classify sessions as hot,
  cold, or unknown, and use that state to decide whether afterTurn()
  should run incremental leaf compaction. Preserve the existing full-sweep
  compaction behavior, but let cold-cache sessions do a bounded number of
  extra leaf passes to catch up while hot-cache sessions defer passes
  unless raw history pressure is clearly above target. Add the minimal
  config surface for enabling the feature and setting the cold-cache pass
  cap, keep the plugin manifest and docs in sync, and cover the behavior
  with focused engine and config tests.
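The hot/cold/unknown classification described above might look something like this. The field name `lastCacheHitMs` and the 5-minute TTL are assumptions for illustration, not the real telemetry schema.

```typescript
// Illustrative sketch of classifying prompt-cache state from persisted telemetry.
interface CacheTelemetry {
  lastCacheHitMs: number | null; // epoch ms of the last observed prompt-cache hit
}

function classifyCacheState(
  t: CacheTelemetry,
  nowMs: number,
  cacheTtlMs = 5 * 60 * 1000, // assumed provider prompt-cache TTL
): "hot" | "cold" | "unknown" {
  if (t.lastCacheHitMs === null) return "unknown"; // no telemetry persisted yet
  return nowMs - t.lastCacheHitMs < cacheTtlMs ? "hot" : "cold";
}
```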
Add the next compaction spec on top of cache-aware incremental compaction.
Incremental maintenance can now grow its working leaf chunk target in
busy sessions using internal low/medium/high activity bands, keep the
configured static leafChunkTokens value as the floor, and cap growth at a
bounded max. When cache-aware compaction is enabled and the prompt cache
is cold, incremental compaction now jumps to the max working chunk.

This also persists minimal refill telemetry alongside the existing
compaction telemetry, threads optional leaf chunk overrides through the
incremental compaction path, and retries with smaller chunk targets when a
provider rejects an oversized compaction request on token/context-window
limits. Full sweeps remain unchanged.

Regeneration-Prompt: |
  Implement the dynamic leafChunkTokens spec from the 2026-04-07 Pagedrop
  page on top of the existing cache-aware incremental compaction branch.
  Keep the feature default-off in v1. Reuse the static leafChunkTokens as
  the floor, add only a minimal dynamicLeafChunkTokens config object with
  enabled and max, and store lightweight per-conversation refill telemetry
  needed to derive a simple low/medium/high activity band with internal
  hysteresis. Use that band to choose a working incremental leaf chunk
  target, but keep full sweeps unchanged. If cache-aware compaction is
  enabled and the cache is cold, force incremental compaction to use the
  max working chunk. Clamp the working chunk against budget-derived limits,
  and if a provider still rejects an oversized chunk due to token/context
  window limits, retry with the next smaller chunk target instead of
  failing immediately. Update the plugin manifest, docs, migration/store
  schema, and regression tests for config parsing, trigger overrides,
  cold-cache max bumping, and retry fallback behavior.
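The working-chunk selection and retry fallback described above can be sketched as below. The band-to-scale mapping and function names are illustrative assumptions; only the floor/max/cold-cache-bump behavior comes from the spec.

```typescript
// Hypothetical sketch of dynamic working-chunk selection with the static
// leafChunkTokens as the floor and a bounded max cap.
type ActivityBand = "low" | "medium" | "high";

function workingLeafChunkTokens(
  staticFloor: number, // configured leafChunkTokens (the floor)
  maxTokens: number,   // dynamicLeafChunkTokens.max (the cap)
  band: ActivityBand,
  cacheCold: boolean,
): number {
  if (cacheCold) return maxTokens; // cold cache: jump straight to the max working chunk
  const scale = band === "high" ? 1.0 : band === "medium" ? 0.5 : 0.0;
  return Math.min(maxTokens, Math.round(staticFloor + (maxTokens - staticFloor) * scale));
}

// Retry with progressively smaller chunk targets when the provider rejects an
// oversized compaction request on token/context-window limits.
async function compactWithFallback(
  targets: number[], // descending chunk targets to try
  compact: (chunkTokens: number) => Promise<void>,
  isTokenLimitError: (e: unknown) => boolean,
): Promise<number> {
  for (const chunk of targets) {
    try {
      await compact(chunk);
      return chunk; // the chunk target that succeeded
    } catch (e) {
      if (!isTokenLimitError(e)) throw e; // only fall back on token-limit errors
    }
  }
  throw new Error("all chunk targets rejected");
}
```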
Add debug-level tracing around the new cache-aware incremental compaction and
dynamic leaf chunk sizing paths so live runs can be diagnosed without querying
SQLite directly. This logs telemetry updates, incremental decision inputs and
reasons, and leaf compaction start/result state, with focused engine coverage
for the new messages.

Regeneration-Prompt: |
  User asked for better observability for the two new incremental compaction
  features added on this branch: cache-aware prompt-cache handling and dynamic
  leaf chunk sizing. The requirement was to add debug logs, not info logs, and
  to explain how to enable those logs in a live OpenClaw instance.

  Inspect the new policy code in the LCM engine and add low-noise debug traces
  at the decision points that matter operationally: telemetry persistence after
  afterTurn, the incremental compaction decision with cache state / activity
  band / chosen chunk / reason / max passes, reset after a successful leaf
  compaction pass, and compactLeafAsync start/result. Preserve existing behavior
  and keep the logs structured enough to grep in production. Add focused tests
  that prove the debug logger is called for the hot-cache telemetry path, the
  hot-cache defer path, and the dynamic high-band chunk selection path.
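A grep-friendly structured trace for the decision point above might take the following shape. The logger interface, message key, and field names are assumptions for illustration, not the plugin's actual logging API.

```typescript
// Illustrative shape of a structured debug trace for the incremental decision.
interface DebugLogger {
  debug: (msg: string, fields: Record<string, unknown>) => void;
}

function traceIncrementalDecision(
  log: DebugLogger,
  fields: {
    cacheState: "hot" | "cold" | "unknown";
    activityBand: "low" | "medium" | "high";
    chosenChunkTokens: number;
    reason: string;
    maxPasses: number;
  },
): void {
  // One structured line per decision keeps production logs easy to grep.
  log.debug("lcm.incremental.decision", fields);
}
```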
@jalehman jalehman changed the title feat: add dynamic leaf chunk sizing feat: add cache-aware incremental compaction and dynamic leaf chunk sizing Apr 7, 2026
@jalehman jalehman changed the base branch from codex/cache-aware-incremental-compaction to main April 7, 2026 22:07
@jalehman jalehman merged commit b7078df into main Apr 7, 2026
2 checks passed