Problem
During compaction, lossless-claw makes 120-160 redundant DB operations per full sweep on a single SQLite database shared by 5+ concurrent agent sessions. On a 1.9GB database with 219K messages, this creates significant event loop blocking (the real bottleneck — not SQLite locks) that can stall the gateway during bursts.
Root Cause Analysis (Red Team / Blue Team)
A 4-agent adversarial investigation revealed:
The repeated reads are NOT accidental — after each compaction pass, `replaceContextRangeWithSummary` resequences ALL ordinals. Any pre-write cache is instantly stale. The code deliberately re-reads to ensure correctness.
However, many reads are redundant WITHIN the same phase (before any write occurs):
- `getContextItems()` called 3x before the first write in `compactLeaf`
- `getMessageById()` called individually 50-200 times instead of batch
- `getSummary()` called individually 40-60 times per sweep instead of batch
The real bottleneck is `DatabaseSync` (Node.js synchronous SQLite). Every query blocks the event loop. At burst rates (30 `afterTurn`/min with 5 agents), event loop utilization hits 7-22%. At DB sizes >4-5GB, cold queries take 50-200ms each, causing HTTP timeouts and LLM streaming stalls.
Proposed Optimizations (ranked by impact × safety)
1. Per-phase context cache (EASY, 8-12 reads saved/sweep)
Cache `getContextItems()` once at the start of each phase. Invalidate after every write. The red team confirmed: within a phase (before any write), the data is immutable.
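A minimal sketch of the pattern, assuming `node:sqlite` and illustrative table/column names; the actual lossless-claw schema and function signatures may differ:

```ts
import { DatabaseSync } from 'node:sqlite';

// Illustrative shape only — not the real lossless-claw types.
interface ContextItem { id: number; ordinal: number; messageId: string | null; summaryId: string | null }

class PhaseContextCache {
  private items: ContextItem[] | null = null;

  constructor(private db: DatabaseSync, private conversationId: string) {}

  // Read-through: the first call hits SQLite; later calls in the same phase
  // reuse the cached array, which is safe because nothing has been written yet.
  getContextItems(): ContextItem[] {
    if (this.items === null) {
      this.items = this.db
        .prepare(
          'SELECT id, ordinal, message_id AS messageId, summary_id AS summaryId ' +
          'FROM context_items WHERE conversation_id = ? ORDER BY ordinal'
        )
        .all(this.conversationId) as unknown as ContextItem[];
    }
    return this.items;
  }

  // Must be called after every write (e.g. replaceContextRangeWithSummary),
  // because ordinals are resequenced and the cached rows go stale.
  invalidate(): void {
    this.items = null;
  }
}
```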
2. Token count delta tracking (EASY, 3-11 reads saved/sweep)
After each `leafPass`/`condensedPass`, compute: `tokensAfter = tokensBefore - chunkSourceTokens + summaryTokenCount` instead of re-querying. The arithmetic is exact — uses the same values the DB would sum. Eliminates the most expensive query (`getContextTokenCount` = double-JOIN aggregate).
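The delta is a one-line computation; the helper below just makes the bookkeeping explicit (variable names are illustrative):

```ts
// Delta update instead of re-running the getContextTokenCount aggregate after a pass.
function tokensAfterPass(
  tokensBefore: number,      // aggregate count read once at sweep start
  chunkSourceTokens: number, // sum of token counts of the items just compacted away
  summaryTokenCount: number, // token count of the summary that replaced them
): number {
  // Exact by construction: these are the same per-row values the
  // double-JOIN aggregate would sum on a re-query.
  return tokensBefore - chunkSourceTokens + summaryTokenCount;
}
```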
3. Summary lookup cache (EASY, 40-60 reads saved/sweep)
At the start of condensed phase, batch-fetch all summaries referenced in context items into a Map. Summaries are immutable after creation — zero staleness risk.
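A sketch of the batch fetch, with assumed table and column names:

```ts
import { DatabaseSync } from 'node:sqlite';

// Batch-fetch every summary referenced by the current context items into a Map,
// replacing 40-60 individual getSummary() calls. Schema names are assumptions.
function loadSummaryMap(
  db: DatabaseSync,
  summaryIds: string[],
): Map<string, { id: string; content: string; tokenCount: number }> {
  if (summaryIds.length === 0) return new Map();
  const placeholders = summaryIds.map(() => '?').join(', ');
  const rows = db
    .prepare(`SELECT id, content, token_count AS tokenCount FROM summaries WHERE id IN (${placeholders})`)
    .all(...summaryIds) as unknown as { id: string; content: string; tokenCount: number }[];
  // Summaries are immutable after creation, so this Map never goes stale within the sweep.
  return new Map(rows.map((r) => [r.id, r]));
}
```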
4. Batch message lookups (MODERATE, 60-70 → 2-3 per leaf iteration)
Replace N individual `getMessageById()` calls with `SELECT ... WHERE message_id IN (...)`. Messages are immutable — zero staleness risk. Biggest event loop savings (~200-400ms/sweep).
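A sketch of the `IN (...)` batch, chunked to stay under SQLite's bound-parameter limit (schema names are assumptions):

```ts
import { DatabaseSync } from 'node:sqlite';

// Replace N getMessageById() round-trips with one IN (...) query per chunk.
// Messages are immutable, so the returned rows can be reused for the rest of the sweep.
const CHUNK = 500; // safely under SQLite's default bound-parameter limit

function getMessagesByIds(
  db: DatabaseSync,
  messageIds: string[],
): Map<string, { messageId: string; role: string; content: string }> {
  const result = new Map<string, { messageId: string; role: string; content: string }>();
  for (let i = 0; i < messageIds.length; i += CHUNK) {
    const chunk = messageIds.slice(i, i + CHUNK);
    const placeholders = chunk.map(() => '?').join(', ');
    const rows = db
      .prepare(`SELECT message_id AS messageId, role, content FROM messages WHERE message_id IN (${placeholders})`)
      .all(...chunk) as unknown as { messageId: string; role: string; content: string }[];
    for (const row of rows) result.set(row.messageId, row);
  }
  return result;
}
```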
5. DB size management — archival/VACUUM (MODERATE)
At 10K messages/day, the DB grows ~25-30MB/day (messages are never deleted — lossless design). Projected 7.5GB at 6 months, 13GB at 1 year. Performance cliffs appear at 4-5GB. Need: periodic VACUUM, old conversation archival, or tiered storage.
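A maintenance sketch along those lines: archive old rows into a second database, then VACUUM the hot one. Table, column, and path names are assumptions, and with `DatabaseSync` this should run off-peak because VACUUM rewrites the whole file synchronously and blocks the event loop for the duration.

```ts
import { DatabaseSync } from 'node:sqlite';

// Tiered-storage sketch: nothing is deleted (lossless), rows just move to a cold archive DB.
// archivePath must be a trusted string — ATTACH cannot take a bound parameter via exec().
function archiveAndVacuum(db: DatabaseSync, archivePath: string, cutoffIso: string): void {
  db.exec(`ATTACH DATABASE '${archivePath}' AS archive`);
  // Create the archive table with the same columns if it does not exist yet.
  db.exec('CREATE TABLE IF NOT EXISTS archive.messages AS SELECT * FROM main.messages WHERE 0');
  db.exec('BEGIN');
  db.prepare('INSERT INTO archive.messages SELECT * FROM main.messages WHERE created_at < ?').run(cutoffIso);
  db.prepare('DELETE FROM main.messages WHERE created_at < ?').run(cutoffIso);
  db.exec('COMMIT');
  db.exec('DETACH DATABASE archive');
  db.exec('VACUUM'); // reclaims the freed pages; cannot run inside a transaction
}
```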
What NOT to do
- Don't change ordinal resequencing — the 2×N UPDATE with negative temp ordinals is the only correct approach; SQLite doesn't support deferred UNIQUE constraints (see the sketch after this list).
- Don't add read-only connections — With single-threaded `DatabaseSync`, this adds complexity with no benefit.
- Don't cache across writes — Ordinals change after every write. Per-phase caching with invalidation is the correct pattern.
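For reference, the resequencing pattern the first bullet describes looks roughly like this (a sketch with assumed schema names). Because SQLite enforces UNIQUE constraints per statement rather than at COMMIT, renumbering in place can collide with ordinals that have not moved yet, so the pass parks every row at a negative temporary ordinal first:

```ts
import { DatabaseSync } from 'node:sqlite';

// Two-pass ordinal resequencing (2×N UPDATEs). Assumes UNIQUE(conversation_id, ordinal).
function resequenceOrdinals(db: DatabaseSync, conversationId: string): void {
  const rows = db
    .prepare('SELECT id FROM context_items WHERE conversation_id = ? ORDER BY ordinal')
    .all(conversationId) as unknown as { id: number }[];

  const setOrdinal = db.prepare('UPDATE context_items SET ordinal = ? WHERE id = ?');
  db.exec('BEGIN');
  // Pass 1: park every row at a unique negative ordinal to avoid UNIQUE collisions.
  rows.forEach((row, i) => setOrdinal.run(-(i + 1), row.id));
  // Pass 2: assign the final dense ordinals 0..N-1.
  rows.forEach((row, i) => setOrdinal.run(i, row.id));
  db.exec('COMMIT');
}
```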
Concurrent Load Model
For the deployment (1 main Opus + 4 sub-agent Sonnet, 10K messages/day):
| Metric | Average | Burst (30/min) |
|---|---|---|
| DB ops/minute | 70-126 | 300-540 |
| Event loop blocked | 1.7-5.2% | 7-22% |
| Write lock duration | 25-60ms | Same (per-transaction) |
| Reader wait (WAL) | 0ms | 0ms |
| DB growth | 25-30MB/day | — |
| 6-month projected size | 7.5GB | — |
| First failure mode | Event loop blocking at 4-5GB+ | — |
Impact
Combined optimizations reduce per-sweep DB operations from ~120-160 to ~30-40 (75% reduction). Event loop blocking during compaction drops proportionally.
Related