Problem
AnyTrigramIndex.removeFile is a no-op ({}) in pure-mmap mode (src/index.zig:2518), and after promotion to mmap_overlay the index keeps querying the base for paths the overlay has since superseded:
- a file deleted while the index is zero-copy keeps containsFile == true forever, its trigrams keep producing candidates, and fileCount stays inflated
- a file edited after an mmap load feeds candidates from both its old content (mmap base) and its new content (overlay): containsFile ORs the two (src/index.zig:2494), and candidates/candidatesRegex merge both sides with no masking
Ghost candidates are re-verified against real content downstream, so most of the damage is wasted I/O per search plus wrong containment/count answers. However, containsFile also gates the tier-3 supplemental scan (src/explore.zig:3531, :5294), so a stale true can silently drop a re-indexed file out of the scan tiers that were supposed to cover it.
Failing Test
test_index.zig — fails on current release tip (first assert: containsFile stays true after removeFile):
test "issue-590: mmap trigram index — removeFile takes effect and re-index masks stale base entries" {
    var arena = std.heap.ArenaAllocator.init(testing.allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    var explorer = Explorer.init(testing.allocator, Explorer.DEFAULT_CONTENT_CACHE_CAPACITY);
    defer explorer.deinit();
    try explorer.indexFile("src/auth.zig", "pub fn handleAuth(req: *Request) !void { validate(req); }");
    try explorer.indexFile("src/gate.zig", "pub fn checkGate(ctx: *Context) !bool { return ctx.authenticated; }");
    try explorer.indexFile("src/util.zig", "pub fn formatStr(buf: []u8, args: anytype) !void {}");

    var tmp_dir = testing.tmpDir(.{});
    defer tmp_dir.cleanup();
    var path_buf: [std.fs.max_path_bytes]u8 = undefined;
    const tmp_path_len = try tmp_dir.dir.realPathFile(io, ".", &path_buf);
    const tmp_path = path_buf[0..tmp_path_len];
    try explorer.trigram_index.writeToDisk(io, tmp_path, null);

    const mmap_idx = MmapTrigramIndex.initFromDisk(io, tmp_path, testing.allocator) orelse
        return error.MmapInitFailed;
    var any_idx = AnyTrigramIndex{ .mmap = mmap_idx };
    defer any_idx.deinit();

    // A delete while zero-copy must take effect, not silently no-op.
    any_idx.removeFile("src/gate.zig");
    try testing.expect(!any_idx.containsFile("src/gate.zig"));
    if (any_idx.candidates("checkGate", allocator)) |cands| {
        for (cands) |p| try testing.expect(!std.mem.eql(u8, p, "src/gate.zig"));
    }

    // Re-indexing must mask the base's stale trigrams for that path.
    try any_idx.indexFile("src/auth.zig", "pub fn renamedAuth() void {}");
    if (any_idx.candidates("handleAuth", allocator)) |cands| {
        for (cands) |p| try testing.expect(!std.mem.eql(u8, p, "src/auth.zig"));
    }
    const fresh = any_idx.candidates("renamedAuth", allocator) orelse return error.NoCandidates;
    var found = false;
    for (fresh) |p| {
        if (std.mem.eql(u8, p, "src/auth.zig")) found = true;
    }
    try testing.expect(found);

    // File accounting follows: 3 on disk, one removed.
    try testing.expectEqual(@as(u32, 2), any_idx.fileCount());
}
Expected
Removal and re-index behave identically across heap, mmap, and overlay modes: base entries for superseded/removed paths stop answering.
Fix
Add a masked path set (owned keys) to MmapOverlay: indexFile/removeFile mask the path (removeFile on .mmap promotes to an overlay first — a remove is a write), containsFile and the candidates/candidatesRegex merges filter base hits through it, and fileCount subtracts a maintained masked-in-base counter.
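A rough sketch of the masking idea, to make the proposal concrete. This is a toy model rather than the real MmapOverlay: MaskedBase, base_paths, mask, baseContains, filterBaseHits and baseFileCount are all hypothetical names, the base is modelled as a plain slice of paths instead of an mmap'd index, and std.StringHashMap is assumed to be available in the toolchain in use. It only covers the base side; a real fileCount would add the overlay's own file count on top.

```zig
const std = @import("std");

/// Hypothetical mask over a read-only base index (toy model, not the real MmapOverlay).
const MaskedBase = struct {
    allocator: std.mem.Allocator,
    /// Stands in for the mmap'd file table.
    base_paths: []const []const u8,
    /// Owned keys, so masked paths outlive the caller's slices.
    masked: std.StringHashMap(void),
    /// How many masked paths actually exist in the base.
    masked_in_base: u32 = 0,

    fn init(allocator: std.mem.Allocator, base_paths: []const []const u8) MaskedBase {
        return .{
            .allocator = allocator,
            .base_paths = base_paths,
            .masked = std.StringHashMap(void).init(allocator),
        };
    }

    fn deinit(self: *MaskedBase) void {
        var it = self.masked.keyIterator();
        while (it.next()) |key| self.allocator.free(key.*);
        self.masked.deinit();
    }

    fn baseHas(self: *const MaskedBase, path: []const u8) bool {
        for (self.base_paths) |p| {
            if (std.mem.eql(u8, p, path)) return true;
        }
        return false;
    }

    /// In this sketch both indexFile and removeFile would call mask(path).
    fn mask(self: *MaskedBase, path: []const u8) !void {
        if (self.masked.contains(path)) return;
        const owned = try self.allocator.dupe(u8, path);
        errdefer self.allocator.free(owned);
        try self.masked.put(owned, {});
        if (self.baseHas(path)) self.masked_in_base += 1;
    }

    /// Base-side containment: a masked path no longer answers.
    fn baseContains(self: *const MaskedBase, path: []const u8) bool {
        return self.baseHas(path) and !self.masked.contains(path);
    }

    /// Drops masked paths from a list of base candidates, compacting in place.
    fn filterBaseHits(self: *const MaskedBase, hits: [][]const u8) [][]const u8 {
        var kept: usize = 0;
        for (hits) |p| {
            if (!self.masked.contains(p)) {
                hits[kept] = p;
                kept += 1;
            }
        }
        return hits[0..kept];
    }

    /// Base-side file count after subtracting masked entries.
    fn baseFileCount(self: *const MaskedBase) u32 {
        return @as(u32, @intCast(self.base_paths.len)) - self.masked_in_base;
    }
};

test "masked base paths stop answering" {
    const base = [_][]const u8{ "src/auth.zig", "src/gate.zig", "src/util.zig" };
    var m = MaskedBase.init(std.testing.allocator, &base);
    defer m.deinit();

    try m.mask("src/gate.zig"); // models removeFile on a base path
    try std.testing.expect(!m.baseContains("src/gate.zig"));

    var hits = [_][]const u8{ "src/auth.zig", "src/gate.zig" };
    const live = m.filterBaseHits(&hits);
    try std.testing.expectEqual(@as(usize, 1), live.len);
    try std.testing.expectEqualStrings("src/auth.zig", live[0]);
    try std.testing.expectEqual(@as(u32, 2), m.baseFileCount());
}
```

Keeping the counter separate from the set size matters in this model because a path that was only ever indexed into the overlay can be masked without existing in the base, and must not be subtracted from the base count.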