
fix: move compaction from afterTurn to pre-assembly with cache-TTL check #385

Closed
liu51115 wants to merge 2 commits into Martian-Engineering:main from liu51115:feat/upstream-cache-ttl-367

Conversation

Contributor

@liu51115 liu51115 commented Apr 11, 2026

Summary

Moves all compaction decisions from afterTurn() to assemble() pre-assembly, using a persisted lastApiCallAt timestamp to determine whether the prompt cache has expired.

Closes #367. Supersedes #362, #363. Related: #102.

Problem

afterTurn() evaluates compaction using the cache status of the call that just completed. This is a fundamental timing inversion: after any API call, the provider has just written (or refreshed) the prompt cache — the cache is always hot in afterTurn, regardless of what the cache status reading says. Compacting at this point always destroys a live cache.

The three removed branches all suffer from this:

  • hot-cache-budget-headroom: defers when hot and under budget — but "hot" at afterTurn time is meaningless since the cache is always hot after a call
  • hot-cache-defer: defers when pressure is low and cache is hot — same issue
  • cold-cache-catchup: triggers aggressive multi-pass on cold readings — on any provider, a cold reading at afterTurn means the provider just wrote the cache, so it's now hot

On providers that load-balance across backend instances with separate prompt caches (e.g. OpenRouter), a secondary problem compounds this: random cold readings from routing misses create a cascade of aggressive compaction → prefix rewrite → guaranteed cold → repeat. But even on direct Anthropic, the core inversion applies — afterTurn is simply the wrong time to evaluate cache state.

Fix

afterTurn() becomes telemetry-only: records lastApiCallAt = Date.now(), nothing else.

assemble() checks at pre-assembly time:

  1. Budget overflow — must compact to fit, fires unconditionally
  2. Cache-TTL — if cacheAwareCompaction.enabled and now - lastApiCallAt > cacheTTLSeconds and there's material to compact → compact before assembling

This is the only window where compaction is both safe and beneficial: the cache has genuinely expired, no live cache to destroy, and compacting reduces input cost on the upcoming cold call.
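The two pre-assembly triggers can be sketched as a single decision function. This is a minimal illustration of the logic described above, not the actual `assemble()` code; the function name, parameter shapes, and `CompactionState`/`CompactionConfig` types are all assumptions for the sketch.

```typescript
// Illustrative sketch of the pre-assembly compaction decision.
// Names and shapes are hypothetical, not the real engine.ts API.
interface CompactionState {
  lastApiCallAt: number | null; // epoch ms, persisted by afterTurn()
}

interface CompactionConfig {
  enabled: boolean;        // cacheAwareCompaction.enabled
  cacheTTLSeconds: number; // default 300
}

function shouldCompactPreAssembly(
  state: CompactionState,
  config: CompactionConfig,
  estimatedTokens: number,
  tokenBudget: number,
  hasCompactableMaterial: boolean,
  now: number = Date.now(),
): "budget-overflow" | "cache-ttl" | null {
  // 1. Budget overflow: must compact to fit, fires unconditionally.
  if (estimatedTokens > tokenBudget) return "budget-overflow";

  // 2. Cache-TTL: only compact once the provider's prompt cache has
  //    plausibly expired, so there is no live cache to destroy.
  if (
    config.enabled &&
    state.lastApiCallAt !== null &&
    now - state.lastApiCallAt > config.cacheTTLSeconds * 1000 &&
    hasCompactableMaterial
  ) {
    return "cache-ttl";
  }
  return null;
}
```

Note the ordering: the budget check runs first because it is a hard constraint, while the TTL check is purely an optimization that only makes sense when the upcoming call would be cold anyway.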

This approach supersedes the routing-noise guard in #362 (sticky cold counter is no longer needed when afterTurn doesn't make compaction decisions) and the deferred threshold compaction in #363 (all compaction is now off the reply path by design). It also addresses the same use case as the idle compaction trigger in #102 but with a simpler timestamp-based mechanism integrated into the assembly path.

Changes

  • src/engine.ts: remove two try/catch compaction blocks from afterTurn(), add pre-assembly block to assemble(), delete dead methods (evaluateIncrementalCompaction, logIncrementalCompactionDecision, shouldApplyHotCacheHysteresis, resolveCacheAwareState, isComfortablyUnderTokenBudget)
  • src/db/config.ts: add cacheTTLSeconds to CacheAwareCompactionConfig (default 300, env LCM_CACHE_TTL_SECONDS)
  • src/db/migration.ts: add last_api_call_at INTEGER column
  • src/store/compaction-telemetry-store.ts: add lastApiCallAt field throughout
  • openclaw.plugin.json: add cacheTTLSeconds to schema
  • test/engine.test.ts: remove 11 tests for deleted afterTurn compaction behavior
  • test/config.test.ts: update assertions

Testing

All tests pass. Soak-tested in production: zero compactions during active use (turns 30–200s apart), compaction fires correctly after genuine cache expiry (idle > 300s), per-conversation TTL tracking confirmed working across multiple concurrent sessions.

@liu51115 liu51115 force-pushed the feat/upstream-cache-ttl-367 branch 3 times, most recently from f5565e9 to 56eecc9 Compare April 11, 2026 04:42
@liu51115 liu51115 changed the title feat: move compaction from afterTurn to pre-assembly with cache-TTL check fix: move compaction from afterTurn to pre-assembly with cache-TTL check Apr 11, 2026
@liu51115 liu51115 force-pushed the feat/upstream-cache-ttl-367 branch from 56eecc9 to 7fb97bd Compare April 12, 2026 07:40
@100yenadmin
Contributor

There is genuine value here, but I think it is solving a different axis than #407 rather than replacing it.

I checked our live installed mitigation and the post-patch logs before commenting. The tactical local patch that stops leaf compaction and threshold sweep from both firing on the same turn is still active in our installed extension. In /tmp/openclaw/openclaw-2026-04-12.log, main-session afterTurn before that patch (11 samples before 2026-04-12 02:18:07 +07) averaged about 11.8s with an 83.0s max; after the patch (58 samples after that timestamp), it dropped to about 456ms average, 42.5ms median, and 10.6s max. So the hot-path split absolutely helped.

But it did not close the stall problem by itself. The same post-patch window still shows repeated lane wait exceeded events on agent:main:main and repeated gateway timeout after 90000ms subagent-announce retries. That is why I still think the deferred/background direction in #408 + the host companion PR is the right fit for #407: it addresses coalescing, visibility, and keeping proactive work off the foreground lane entirely.

What I do think is genuinely valuable in this PR:

  • the cache-TTL / pre-assembly idea is a real cache-safety argument and worth considering on its own merits
  • pinned-file injection is useful, but it feels orthogonal to the compaction-lane problem

So my read is: useful ideas, but not a drop-in replacement for #407/#408. If this lands, I’d strongly prefer it as a narrower follow-up or split-out PR instead of a competing end-to-end compaction strategy.

…heck (closes Martian-Engineering#367)

Move all proactive compaction out of afterTurn() into pre-assembly in assemble().
afterTurn becomes telemetry-only (records cache state + lastApiCallAt).

Pre-assembly triggers:
1. Budget overflow — must compact to fit
2. Cache-TTL — compact if idle > cacheTTLSeconds (default 300s) AND leaf trigger fires

Removes: evaluateIncrementalCompaction, logIncrementalCompactionDecision,
shouldApplyHotCacheHysteresis, resolveCacheAwareState, isComfortablyUnderTokenBudget,
IncrementalCompactionDecision type, HOT_CACHE_HYSTERESIS_TURNS constant.

Tests: 476/476 pass.
@liu51115 liu51115 force-pushed the feat/upstream-cache-ttl-367 branch from 7fb97bd to debbbf8 Compare April 12, 2026 08:31
@liu51115
Contributor Author

You're right about the pinned-file creep — that was a mistake on our end: we accidentally force-pushed a branch that bundled pinnedFiles (which has its own PR at #214) into this one. Just fixed it; the PR now contains only the cache-TTL pre-assembly change.

Agreed this is a different axis from #407/#408. The deferred debt + coalescing + visibility work in #408 addresses foreground blocking and starvation. This PR addresses cache-safety timing — making sure compaction doesn't destroy a hot provider cache by firing at the wrong moment.

They're complementary, not competing. Happy to keep this scoped as a narrower follow-up.

Reads agent's params.cacheRetention from OC config and maps to TTL:
- "long" → 3600s (1 hour)
- "short" → 300s (5 min)
- unset → falls back to cacheTTLSeconds config (default 300s)

Prevents compacting during a still-valid long cache for agents
configured with cacheRetention: "long".
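The retention-to-TTL mapping in that commit message is a small lookup. A minimal sketch, assuming a string-literal retention type and a hypothetical function name (the real code reads `params.cacheRetention` from OC config):

```typescript
// Sketch of the cacheRetention → TTL mapping described above.
// Type and function names are illustrative, not the actual API.
type CacheRetention = "long" | "short" | undefined;

function ttlForRetention(
  retention: CacheRetention,
  fallbackTTLSeconds: number = 300, // cacheTTLSeconds config default
): number {
  switch (retention) {
    case "long":
      return 3600; // 1 hour provider cache
    case "short":
      return 300; // 5 minute provider cache
    default:
      return fallbackTTLSeconds; // retention unset in agent config
  }
}
```

This keeps the TTL check honest for agents on long-retention caches: with a 3600s TTL, pre-assembly compaction will not fire while a one-hour cache is still plausibly valid.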
@liu51115
Contributor Author

Closing in favor of #408 which now covers cache-TTL safety + deferred compaction + maintenance visibility. The per-agent cache TTL approach there (via runtime telemetry retention) supersedes our config-based approach.



Development

Successfully merging this pull request may close these issues.

cache-aware compaction: timing inversion — decisions based on last-call status instead of cache expiry
