Skip to content

Follow-up: tighten microcompaction fast-path eviction (token efficiency) #4259

@wenshao

Description

@wenshao

Follow-up: tighten microcompaction fast-path eviction (token efficiency)

Both items are non-correctness token-efficiency refinements deferred from PR #4243 review. Current behavior in both cases is fail-safe (over-clear → redundant re-read of unchanged file), but each path forces a re-emit of real bytes that could have been avoided by a more targeted invalidation.

The two share a common shape and likely a common fix (per-path resolution against keepRefs / disarmed-set bookkeeping). Tracking them together so they can be fixed in one pass.

Item 1 — microcompact.ts: same-file blanked+kept still disarms fast-path

When a single microcompaction window blanks an old read of file F and keeps a more recent read of the same F, the kept result's bytes are still quotable from history. But evictedReadPaths adds F to the disarm set based on the blanked entry only, so markReadEvictedFromHistory(F) disarms the fast-path anyway.

Cost: one redundant read_file of an unchanged F on the next edit attempt.

@tanzhenxin's defer rationale on PR #4243 ([link](#4243 (comment)... — replace once permanent ref exists)) was solid: a naive "subtract keptFilePaths" fix risks reintroducing the dangling-file_unchanged-placeholder hazard (issue #4239) if the "kept" judgement is even slightly wrong, and mimo's suggested snippet didn't compile (PartRef has no .id; keepRefs is a "ci:pi" string Set, not an id collection).

Approach for fix: map kept PartRef entries back to call ids via the part traversal that already runs in the blanking loop, then build a keptFilePaths Set; only paths whose every result in the microcompaction window got blanked end up in evictedReadPaths. Add a test for same-file blanked+kept that asserts the fast-path stays armed and a subsequent edit does not trigger a re-read.

Item 2 — client.ts:1249-1252: blanket fileReadCache.clear() on any unresolved path

When microcompaction blanks N file-tool results, the caller (client.ts) attempts surgical disarm: concurrent fs.stat + markReadEvictedFromHistory per evicted path. If any single stat fails or inode mismatches, the code falls back to fileReadCache.clear() — wiping entries already successfully disarmed in the same batch, and entries for files that were never evicted at all.

Cost: a single ghost/replaced file inside a 50-file microcompaction batch invalidates the entire session's cache, forcing re-reads on every subsequent edit until each file is re-touched.

// Current (microcompact.ts caller in client.ts:1249-1252):
if (!fullyDisarmed) {
  fileReadCache.clear();
}

// Suggested:
const unresolvedPaths = m.evictedReadPaths.filter(
  (_, i) => !statResults[i] || !fileReadCache.markReadEvictedFromHistory(statResults[i]!),
);
for (const p of unresolvedPaths) fileReadCache.invalidateByPath(p);

(Requires adding invalidateByPath to FileReadCache if not already present.)

Approach for fix: track per-path disarm outcome in the same loop, then targeted-clear only the unresolved subset. Add a test for the mixed batch where one path is a ghost and the rest are disk-fresh — the fresh ones' cache entries should survive.

Out of scope

Neither item is a correctness fix. Both PR #4239 (the original bug) and PR #4243 (the fix) remain valid as-is.

Validation requirements for follow-up PR

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions