feat: cache-aware compaction guards with budget-pressure priority and per-tier tuning #289
Conversation
On high-traffic conversations, `evaluateLeafTrigger()` fires every turn because raw tokens outside the fresh tail constantly exceed `leafChunkTokens`. Each leaf pass creates a depth-0 summary that resequences all ordinals, invalidating the Anthropic prompt cache prefix. Cache hit dropped from 90%+ to 22% on large conversations.

Add two skip guards to `evaluateLeafTrigger()`, evaluated only when the basic threshold IS exceeded:

1. Cache-aware skip: if estimated reduction is <5% of total assembled tokens, the cache invalidation cost exceeds the compression gain.
2. Budget headroom skip: if assembled tokens are below 80% of `contextThreshold` × `tokenBudget`, there is no budget pressure.

Both are configurable: `leafSkipReductionThreshold` (default 0.05) and `leafBudgetHeadroomFactor` (default 0.8).

Fixes Martian-Engineering#282

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
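The two skip guards described above can be sketched roughly as follows. This is a minimal illustration, not the actual source: every identifier except the config knob names (`leafSkipReductionThreshold`, `leafBudgetHeadroomFactor`, `leafChunkTokens`, `contextThreshold`, `tokenBudget`) is an assumption.

```typescript
// Hypothetical sketch of the two skip guards, evaluated only after the
// basic leaf threshold is exceeded.
interface LeafGuardInputs {
  rawTokensOutsideTail: number;
  estimatedReduction: number;       // tokens a leaf pass is expected to remove
  totalAssembledTokens: number;
  leafChunkTokens: number;          // basic per-chunk threshold
  contextThreshold: number;         // fraction of the budget that triggers compaction
  tokenBudget: number;
  leafSkipReductionThreshold: number; // default 0.05
  leafBudgetHeadroomFactor: number;   // default 0.8
}

// Returns a skip reason, or null if compaction should fire.
function evaluateLeafGuards(i: LeafGuardInputs): string | null {
  // Guards apply only when the basic threshold IS exceeded.
  if (i.rawTokensOutsideTail <= i.leafChunkTokens) return "below-threshold";

  // 1. Cache-aware skip: a tiny reduction does not justify invalidating
  //    the prompt-cache prefix.
  if (i.estimatedReduction < i.leafSkipReductionThreshold * i.totalAssembledTokens) {
    return "cache-aware-skip";
  }

  // 2. Budget headroom skip: well under the ceiling, there is no pressure.
  const ceiling = i.leafBudgetHeadroomFactor * i.contextThreshold * i.tokenBudget;
  if (i.totalAssembledTokens < ceiling) return "budget-headroom-skip";

  return null; // compact
}
```

With `contextThreshold = 0.7` and `tokenBudget = 100000` (both assumed), the headroom ceiling is 56000 tokens, so a 60000-token context with a 4000-token expected reduction would compact, while a 40000-token context would be skipped for headroom.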
Pull request overview
Updates leaf compaction triggering to avoid unnecessary incremental leaf passes that destabilize prompt-cache prefixes and waste work on conversations with ample token budget headroom.
Changes:
- Adds cache-aware and budget-headroom skip guards to `CompactionEngine.evaluateLeafTrigger()`.
- Wires `tokenBudget` through the engine's leaf-trigger evaluation and adds debug logging for skip reasons.
- Introduces two new config knobs (`leafSkipReductionThreshold`, `leafBudgetHeadroomFactor`) and updates the impacted engine test.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/db/config.ts` | Adds two new config values and resolves them from env/plugin config with defaults. |
| `src/compaction.ts` | Extends compaction config and rewrites `evaluateLeafTrigger()` to include skip guards and skip-reason reporting. |
| `src/engine.ts` | Passes `tokenBudget` into leaf-trigger evaluation and logs skip reasons; wires new config fields into `CompactionConfig`. |
| `test/engine.test.ts` | Updates spy assertion for the new `evaluateLeafTrigger(..., tokenBudget)` signature. |
…er cache skip

Adversarial review found that the cache-aware skip could permanently suppress leaf compaction in large contexts (e.g., 700K of 750K ceiling) because the 5% relative threshold scales with total assembled tokens.

Fix: evaluate budget headroom FIRST. When over the headroom ceiling (budget pressure), bypass the cache-aware skip entirely — compaction fires regardless of cache impact. The cache-aware skip only applies when there is genuine headroom (no budget pressure).

Also clamp `leafBudgetHeadroomFactor` to max 1.0 to prevent misconfiguration from silently disabling compaction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
**Adversarial Verification Report (5 agents)**

Ran 5 parallel adversarial review agents. Found and fixed one CRITICAL issue.

**CRITICAL — Compaction starvation at scale (FIXED in d56eab1)**

The cache-aware skip used a 5% relative threshold that scaled linearly with total assembled tokens.

Root cause: The cache skip short-circuited before the budget headroom check could override it.

Fix: Evaluate budget headroom FIRST. When over the headroom ceiling (budget pressure), bypass the cache-aware skip entirely. Also clamp `leafBudgetHeadroomFactor` to a maximum of 1.0.

Scenario verification after fix:
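The starvation scenario from the report can be checked numerically. The 700K/750K figures and the 5% threshold come from the report; the 0.8 headroom factor and the 4000-token per-pass reduction are assumed for illustration.

```typescript
// Numeric illustration of the starvation scenario and its fix.
const totalAssembled = 700_000;               // tokens in context (from the report)
const budgetCeiling = 750_000 * 0.8;          // headroom ceiling (factor 0.8 assumed)
const perPassReduction = 4_000;               // one leaf chunk's worth (assumed)

// Pre-fix: the cache-aware skip ran first. A single pass removes far less
// than 5% of 700K (35K), so the skip fires forever and compaction starves.
const starved = perPassReduction < 0.05 * totalAssembled;

// Post-fix: headroom is evaluated first. 700K >= 600K means budget pressure,
// which bypasses the cache-aware skip entirely, so compaction fires.
const pressure = totalAssembled >= budgetCeiling;
const compacts = pressure || !starved;
```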
Other findings (non-blocking):
Addresses Copilot review round 2:

1. `estimatedReduction` was using `rawTokensOutsideTail` (all raw tokens), but a leaf pass only compacts one chunk capped at `leafChunkTokens`. Now uses `Math.min(rawTokensOutsideTail, threshold)` so the estimate reflects actual per-pass reduction.
2. Added `leafSkipReductionThreshold` and `leafBudgetHeadroomFactor` to `openclaw.plugin.json` `configSchema` (which has `additionalProperties: false`) so users can set them via plugin config, not just env vars.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
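The corrected per-pass estimate from point 1 amounts to a one-line cap (function name assumed for illustration):

```typescript
// A single leaf pass compacts at most one chunk of `leafChunkTokens`,
// so the expected reduction is capped rather than counting every raw
// token outside the fresh tail.
function estimateLeafReduction(
  rawTokensOutsideTail: number,
  leafChunkTokens: number,
): number {
  return Math.min(rawTokensOutsideTail, leafChunkTokens);
}
```

Without the cap, a backlog of 50K raw tokens would make a single 4K-token pass look like a 50K-token reduction, passing the 5% gate it should fail.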
Adds 16 new unit tests covering all paths through the cache-aware and budget-headroom skip logic, plus 5 config resolution tests:

Skip logic tests (lcm-integration.test.ts):
- Basic threshold: below/above `leafChunkTokens`
- Budget headroom: skip when under ceiling, compact when over
- Budget headroom: bypassed when `tokenBudget` undefined
- Cache-aware: skip when reduction tiny relative to total context
- Cache-aware: compact when reduction large enough
- Budget pressure overrides cache skip (anti-starvation)
- Edge cases: empty conversation, negative reduction
- Config escape hatches: threshold=0 and factor=0 disable skips
- Factor clamped to 1.0 (misconfiguration protection)
- Orchestrator vs sub-agent: different budgets, different decisions
- Per-pass chunk size estimate uses min(raw, threshold)

Config tests (config.test.ts):
- Default values: 0.05 and 0.8
- Plugin config override
- Env var override
- Schema entries in manifest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.
…st fixes

Addresses 5 Copilot review comments:

1. `leafBudgetHeadroomFactor=0` now correctly disables the headroom check (`headroomEnabled=false`) instead of creating false budget pressure that bypassed the cache-aware skip.
2. Config values clamped to [0,1] in `resolveLcmConfig` via `clamp01()`.
3. Removed wasteful `"x".repeat(summaryTokens*4)` in test — mock store uses `tokenCount` directly, not content length.
4. Fixed `leafChunkTokens=0` test — `resolveLeafChunkTokens()` normalizes non-positive to default. Use default threshold instead.
5. Updated factor=0 test comment to match corrected semantics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
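Point 2's `clamp01()` helper is likely just a bounded-range clamp; a minimal sketch (the surrounding `resolveLcmConfig` wiring is assumed, not shown):

```typescript
// Clamp a config value into [0, 1] so misconfigured fractions
// (e.g. 1.5 or -0.2) cannot distort the guard math.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}
```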
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
…nfo logging

Three hardening improvements from adversarial review:

1. Token accuracy: `evaluateLeafTrigger` now accepts an optional `liveContextTokens` param and uses max(stored, live) for headroom decisions. Stored token counts can lag after rapid ingestion; the live estimate from `afterTurn` provides a more accurate floor.
2. Structured telemetry: `LeafTriggerResult` now includes a `context` field with all decision inputs (`totalAssembledTokens`, `budgetCeiling`, `budgetPressure`, `estimatedReduction`, `reductionThreshold`, `headroomFactor`). Enables machine-parseable diagnostics and config tuning.
3. Observability: Skip and fire decisions logged at info level (not debug). Compaction fires include assembled/pressure context. Volume is at most 1 log per turn — negligible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
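Point 1's max(stored, live) rule can be sketched as a small helper (function and parameter names other than `liveContextTokens` are assumptions):

```typescript
// Use the larger of the persisted count and the live estimate, so a
// lagging stored count can never understate the true context size.
// When no live estimate is available, fall back to the stored count.
function effectiveContextTokens(
  storedTokens: number,
  liveContextTokens?: number,
): number {
  return liveContextTokens === undefined
    ? storedTokens
    : Math.max(storedTokens, liveContextTokens);
}
```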
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
The comment described when the cache-aware skip is applied but did not precisely reflect the budgetPressure gate semantics after the headroom refactor. Updated to accurately describe: budget pressure is only true when headroom is enabled AND ceiling is breached; otherwise cache-aware skip can fire. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
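The corrected semantics can be stated as a single predicate (identifier names taken from the telemetry fields mentioned earlier in the thread; the standalone function is an illustrative assumption):

```typescript
// Budget pressure is true only when the headroom check is enabled AND the
// assembled total reaches or exceeds the ceiling. Otherwise pressure is
// false and the cache-aware skip may still fire.
function isBudgetPressure(
  totalAssembledTokens: number,
  budgetCeiling: number | undefined,
  headroomEnabled: boolean,
): boolean {
  return (
    headroomEnabled &&
    budgetCeiling !== undefined &&
    totalAssembledTokens >= budgetCeiling
  );
}
```

Note the `>=`: headroom uses strict less-than, so pressure fires exactly at the ceiling, not one token past it.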
Skip decisions fire every turn in high-traffic sessions — too noisy for info level. Compaction triggers are infrequent (~every 7-10 turns) and worth info level as meaningful state changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
Adds documentation for the cache-aware compaction feature across all doc layers:

New: `docs/compaction-tuning.md` — standalone deep-dive covering:
- TLDR quick-setup with copy-paste configs per model tier
- Compaction model selection guide (why fast models matter)
- Full lifecycle diagrams (Mermaid)
- Cache-aware decision flowchart
- Economics tables (cache miss penalty, break-even formula)
- Gateway stall timing per model
- Debugging guide for common issues

Updated: `docs/architecture.md`
- Cache-aware skip guards section with Mermaid diagram
- Budget pressure priority explanation
- Prompt cache impact description

Updated: `docs/configuration.md`
- `leafSkipReductionThreshold` and `leafBudgetHeadroomFactor` reference
- Compaction model selection table
- Per-tier preset summary with link to tuning guide

Updated: `skills/lossless-claw/references/config.md`
- Added both new config fields to skill reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
The code uses totalAssembledTokens < budgetCeiling for headroom (strict less-than), so budget pressure fires at >= budgetCeiling. Docs said 'exceed' which implies strict greater-than. Fixed to 'reach or exceed' across all 5 files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 11 out of 12 changed files in this pull request and generated no new comments.
Thread observed token counts through leaf and threshold compaction workers so stale persisted counts do not suppress needed compaction after the outer trigger has already detected budget pressure. Add regression coverage for both the engine plumbing and the compaction engine stale-count path, correct the Sonnet 4.6 tuning guide to reflect its 1M context tier, and add the missing patch changeset.

Regeneration-Prompt: |
  Address review findings on PR 289 in the PR worktree without changing unrelated behavior. The bug is that afterTurn/evaluation can use a live current token estimate, but the actual compactLeaf and compactFullSweep worker paths re-check leaf trigger conditions using only stored DB token counts, which can lag ingestion and incorrectly skip compaction under real budget pressure. Thread the observed token count through those worker calls and add tests that prove stale stored counts no longer suppress leaf or threshold sweeps. Also fix the compaction tuning docs so Sonnet 4.6 is described consistently with the documented 1M context window, and add a patch changeset because the PR changes user-facing runtime behavior and docs.
Given the size here, splitting it up into multiple PRs @jalehman
**Splitting this PR for easier review**

This PR is +1047/-18 across 13 files — significantly larger than the others that merged quickly. To make review tractable, we're proposing to split it into 3 focused PRs:

**PR A: Cache-aware leaf compaction guards (~400 lines, 21 tests)**

The core feature. Rewrites `evaluateLeafTrigger()`.

**PR B: Live token awareness (~60 lines, 5 tests)**

Depends on PR A. Passes `tokenBudget` and `liveContextTokens` through the engine's leaf-trigger and compaction paths.

**PR C: Documentation & tuning guide (~430 lines, 0 code)**

Independent — can land anytime. New `docs/compaction-tuning.md` plus updates to existing docs.

Note: The branch is 8 commits behind main (main now has #288, #294, #295, #296, #298, #302). Rebase has extensive conflicts. We'll create fresh branches from current main and reapply the changes for each split PR.

Should we proceed with this split, or would you prefer a different grouping? Happy to adjust the boundaries.
On models with prompt caching (Claude, GPT-4), compaction that removes 3% of tokens costs more in cache-miss penalties than it saves. The current trigger fires whenever `assembledTokens > threshold × budget`, regardless of how much compaction would actually remove.

Add three guard checks to `evaluateLeafTrigger()`:

1. Budget headroom gate — skip when assembled < 80% of budget ceiling (`leafBudgetHeadroomFactor`, default 0.8, set 0 to disable)
2. Cache-aware reduction gate — skip when estimated reduction < 5% of total assembled tokens (`leafSkipReductionThreshold`, default 0.05)
3. Budget pressure override — force compaction when context reaches or exceeds the ceiling, preventing starvation in large contexts

Also passes `currentTokenCount` through `compactLeaf`/`compactFullSweep` so headroom decisions use live observed counts when stored counts are stale.

Split from Martian-Engineering#289 for reviewability.
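The three gates above compose into one decision. The sketch below is an assumed illustration of that ordering — pressure bypasses the cache-aware skip, headroom defers compaction, and the reduction gate only applies when there is no budget signal; none of these names come from the actual source.

```typescript
type LeafDecision =
  | { compact: true; reason: "budget-pressure" | "reduction-worthwhile" }
  | { compact: false; reason: "headroom" | "cache-aware" };

function decideLeafCompaction(params: {
  totalAssembledTokens: number;
  estimatedReduction: number;
  budgetCeiling?: number;      // headroomFactor * contextThreshold * tokenBudget
  reductionThreshold: number;  // e.g. 0.05
  headroomEnabled: boolean;    // false when the factor is configured to 0
}): LeafDecision {
  const { totalAssembledTokens, estimatedReduction, budgetCeiling,
          reductionThreshold, headroomEnabled } = params;

  // Gate 3 takes effect first: at or over the ceiling, compaction fires
  // regardless of cache impact, so large contexts can never starve.
  const pressure =
    headroomEnabled &&
    budgetCeiling !== undefined &&
    totalAssembledTokens >= budgetCeiling;
  if (pressure) return { compact: true, reason: "budget-pressure" };

  // Gate 1: headroom enabled and under the ceiling — no pressure, wait.
  if (headroomEnabled && budgetCeiling !== undefined) {
    return { compact: false, reason: "headroom" };
  }

  // Gate 2: no budget signal — compact only if the reduction is worthwhile.
  if (estimatedReduction < reductionThreshold * totalAssembledTokens) {
    return { compact: false, reason: "cache-aware" };
  }
  return { compact: true, reason: "reduction-worthwhile" };
}
```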
Pass `tokenBudget` and `liveContextTokens` from the engine's `afterTurn` and compact paths into `evaluateLeafTrigger` and `compactLeaf`/`compactFullSweep` so cache-aware headroom decisions use fresh observed counts instead of potentially stale stored values.

- `evaluateLeafTrigger` now receives `tokenBudget` + `liveContextTokens` from engine call sites
- `compactLeaf`/`compactFullSweep` receive `currentTokenCount` (`observedTokens`)
- `afterTurn` logs trigger context (assembled, pressure) on compaction
- `afterTurn` logs skip reason when guards prevent compaction
- `CompactionConfig` passes `leafSkipReductionThreshold` and `leafBudgetHeadroomFactor` from `LcmConfig`

Split from Martian-Engineering#289 (Part 2 of 3). Depends on Martian-Engineering#306.
New comprehensive guide for operators tuning LCM compaction behavior:

- `docs/compaction-tuning.md` (356 lines): TLDR, per-tier model presets (Opus, Sonnet, Haiku, GPT-4o-mini, Gemini Flash), cache economics break-even formula, debugging checklist, orchestration scenarios
- `docs/architecture.md`: cache-aware guards section with Mermaid flowchart
- `docs/configuration.md`: new settings reference, model comparison table
- skills references: config field updates

Split from Martian-Engineering#289 (Part 3 of 3). Independent of Martian-Engineering#306 and Martian-Engineering#307.
Split complete. This PR is now covered by three focused PRs rebased on current main:
Closing this PR in favor of the split. All 1047 lines of additions are preserved across the three PRs.
Restores two load-bearing inline comments from the original PR Martian-Engineering#289 that were lost during the split:

- 3-line `headroomEnabled` rationale: explains why the guard uses three conditions and that factor=0 disables without creating false pressure
- 8-line budget-pressure explanation: documents when pressure is true, when the cache-aware skip can fire, and the starvation prevention guarantee
Summary
Adds cache-aware compaction guards to the leaf compaction trigger, comprehensive documentation, per-model-tier tuning recommendations, and 21 new tests. Prevents unnecessary prompt-cache invalidation on high-traffic conversations while ensuring compaction fires under genuine budget pressure.
Fixes #282
Problem
On high-traffic conversations (18K+ messages), `evaluateLeafTrigger()` fires every turn because raw tokens constantly exceed `leafChunkTokens`. Each leaf pass resequences all ordinals, invalidating the prompt cache prefix. Cache hit dropped from 90%+ to 22%.

Solution
Three-tier decision logic
Budget pressure always overrides cache concerns — prevents starvation.
Configurable thresholds
- `leafSkipReductionThreshold` (default 0.05)
- `leafBudgetHeadroomFactor` (default 0.8)

Per-model-tier recommendations
Documentation
New comprehensive Compaction Tuning Guide covering:
Updated existing docs:
- `docs/architecture.md` — Cache-aware guards section with Mermaid diagram
- `docs/configuration.md` — New settings reference, model selection table
- `skills/lossless-claw/references/config.md` — Skill reference for new fields

Test Coverage (21 new tests)
Skip logic (16 tests): Basic threshold, budget headroom (skip/compact/bypass), cache-aware (skip/compact), budget pressure override, edge cases (empty conv, negative reduction, per-pass capping), config escape hatches (0=disable), factor clamping, orchestrator vs sub-agent scenario
Config resolution (5 tests): Defaults, plugin config override, env var override, schema entries
Files Changed (8)
- `src/compaction.ts` — `LeafTriggerResult` type, rewritten `evaluateLeafTrigger()`, `liveContextTokens` param
- `src/db/config.ts` — `clamp01` validation
- `src/engine.ts` — `tokenBudget` + `liveContextTokens`, structured telemetry, logging
- `openclaw.plugin.json`
- `docs/compaction-tuning.md`
- `docs/architecture.md`
- `docs/configuration.md`
- `skills/lossless-claw/references/config.md`
- `test/lcm-integration.test.ts`
- `test/config.test.ts`
- `test/engine.test.ts`

Adversarial Review Summary
5 agents reviewed across 4 rounds. All CRITICAL/HIGH findings resolved:
- Compaction starvation at scale — budget pressure now evaluated first; factor clamped via `min(factor, 1.0)`
- Per-pass reduction estimate capped via `min(raw, threshold)`
- New config fields added to the `openclaw.plugin.json` schema
- factor=0 false pressure — `headroomEnabled` gate prevents it