Problem
handleCallers (src/mcp.zig:1339) finds call sites by running explorer.searchContentWithScope(name, ...), which is a substring full-text search. It then drops the canonical definition line of name from the results (the hit where path == d.path and line_num == d.symbol.line_start).
That filter only removes the one definition site of the searched name. It does not remove lines that mention a different identifier whose name contains the search term as a substring.
Concrete reproduction: codedb_callers(name="fooBar") returns lines that mention fooBarExtended — both its definition site and any references — as if they were call sites of fooBar.
The eval hit this in two cases: a search for searchInContent returned hits inside searchInContentWithScope, and a search for isIndexableRoot returned matches against itself in design docs.
Failing Test
test "issue-425: codedb_callers excludes substring matches in unrelated identifiers" {
    var arena = std.heap.ArenaAllocator.init(testing.allocator);
    defer arena.deinit();
    var explorer = Explorer.init(arena.allocator());
    var store = Store.init(testing.allocator);
    defer store.deinit();
    var agents = AgentRegistry.init(testing.allocator);
    defer agents.deinit();
    _ = try agents.register("__filesystem__");
    var bench_ctx = mcp_mod.BenchContext.init(testing.allocator, ".");
    defer bench_ctx.deinit();

    // One real definition, one unrelated identifier that contains the name, one real call site.
    try explorer.indexFile("def.zig", "pub fn fooBar() void {}\n");
    try explorer.indexFile("other.zig", "pub fn fooBarExtended() void {}\n");
    try explorer.indexFile("a.zig", "pub fn callerA() void {\n fooBar();\n}\n");

    const args_json =
        \\{"name":"fooBar"}
    ;
    const parsed = try std.json.parseFromSlice(std.json.Value, testing.allocator, args_json, .{});
    defer parsed.deinit();
    var out: std.ArrayList(u8) = .empty;
    defer out.deinit(testing.allocator);
    bench_ctx.runDispatch(io, testing.allocator, .codedb_callers, &parsed.value.object, &out, &store, &explorer, &agents);

    // Only the real call site in a.zig should be reported.
    try testing.expect(std.mem.indexOf(u8, out.items, "a.zig:2") != null);
    try testing.expect(std.mem.indexOf(u8, out.items, "other.zig") == null);
    try testing.expect(std.mem.indexOf(u8, out.items, "fooBarExtended") == null);
    try testing.expect(std.mem.indexOf(u8, out.items, "1 call sites for 'fooBar'") != null);
}
Failing test lives on branch issue-425-failing-test (commit 656d713).
$ zig build test 2>&1 | rg "issue-425"
error: 'tests.test.issue-425: codedb_callers excludes substring matches in unrelated identifiers' failed
/Users/.../src/tests.zig:10492: try testing.expect(std.mem.indexOf(u8, out.items, "other.zig") == null);
Expected
codedb_callers(name="fooBar") returns only lines where fooBar appears as a whole-word identifier — not as a substring of a longer identifier. The header count reflects the real number of call sites.
Fix
In handleCallers (src/mcp.zig:1352-1382), gate each emission on a whole-word check against r.line_text. For every occurrence of name in the line, look at the byte just before the match and the byte just after it; the occurrence counts as a whole word only if each of those bytes is either absent (start or end of line) or a non-identifier character (i.e. not [A-Za-z0-9_]). If no occurrence in the line passes, skip the line. Compute the header count from the lines that survive this check.
Effort: small. One helper (hasWholeWordMatch(line, name) bool) reused inside the existing for-loop.
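A minimal sketch of what such a helper could look like, assuming ASCII identifiers as described above. Only the name hasWholeWordMatch comes from this issue; isIdentByte is a hypothetical helper introduced for illustration, and the real implementation in src/mcp.zig may differ.

```zig
const std = @import("std");

/// Hypothetical helper: true for bytes that can appear in an identifier,
/// i.e. [A-Za-z0-9_].
fn isIdentByte(c: u8) bool {
    return std.ascii.isAlphanumeric(c) or c == '_';
}

/// Returns true if `name` occurs in `line` at least once with no identifier
/// byte immediately before or after the occurrence.
fn hasWholeWordMatch(line: []const u8, name: []const u8) bool {
    if (name.len == 0) return false;
    var start: usize = 0;
    // Check every occurrence, not just the first: a line may contain
    // `fooBarExtended` before a genuine `fooBar` reference.
    while (std.mem.indexOfPos(u8, line, start, name)) |idx| {
        const end = idx + name.len;
        const left_ok = idx == 0 or !isIdentByte(line[idx - 1]);
        const right_ok = end == line.len or !isIdentByte(line[end]);
        if (left_ok and right_ok) return true;
        start = idx + 1;
    }
    return false;
}
```

Inside the existing for-loop, the gate would then be a single early skip when hasWholeWordMatch(r.line_text, name) is false, placed before the result is counted or written so the header count stays in sync with the emitted lines.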
Eval context
Found by an automated codedb evaluation against codedb 0.2.5805. Filed alongside #426 (non-code files leaking into callers) and #427 (Tier 1 sort starves the canonical definition file).