search: per-file cap + histogram fallback for search_content #495
Merged
A single popular symbol could match 200+ lines in a big file (App.tsx,
loop.ts) and eat the entire byte budget before the walk reached any
other file. Truncation marker fired, every other matching file got
zero coverage, and the caller saw a wall of one file's hits with no
hint that distribution was wider than that.
Three knobs:
1. MAX_HITS_PER_FILE = 30. Beyond that, the per-file output ends with
"[rel: N more matches in this file — re-grep with a tighter
pattern or use read_file to see them]". One bullhorn file can no
longer dominate the byte budget.
2. After each file completes, if printed bytes have crossed 80% of
the byte budget, remaining files switch to histogram form
("rel: N matches"). A one-line notice marks the flip so the
caller knows distribution from this point on is summary-only.
3. New `summary_only:true` arg skips line content entirely and
returns just the histogram. Useful for "where does this exist at
all" before drilling in with a targeted read_file.
Closes #489
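As a rough illustration of how the three knobs could fit together, here is a minimal sketch. It is not the code from this PR: `FileHits`, `FormatOptions`, `formatHits`, `FLIP_RATIO`, and the wording of the flip notice are all hypothetical, and the per-file footer is shortened. Only `MAX_HITS_PER_FILE = 30`, the 80% threshold, the `rel: N matches` form, and the `summary_only` flag come from the description above.

```typescript
// Hypothetical sketch of the per-file cap and histogram fallback.
// Only MAX_HITS_PER_FILE and the 80% threshold are taken from the PR text.
const MAX_HITS_PER_FILE = 30;
const FLIP_RATIO = 0.8; // flip to histogram form once 80% of the budget is printed

interface FileHits {
  rel: string;     // path relative to the search root
  lines: string[]; // matching lines, already formatted for output
}

interface FormatOptions {
  maxBytes: number;      // overall byte budget (assumed to come from the caller)
  summaryOnly?: boolean; // mirrors the summary_only arg
}

function formatHits(files: FileHits[], opts: FormatOptions): string {
  const out: string[] = [];
  const encoder = new TextEncoder();
  let printed = 0;
  // Knob 3: summary_only starts in histogram form and never prints line content.
  let histogram = opts.summaryOnly === true;

  const emit = (line: string): void => {
    out.push(line);
    printed += encoder.encode(line).length + 1; // +1 for the newline
  };

  for (let i = 0; i < files.length; i++) {
    const file = files[i];
    if (file.lines.length === 0) continue;

    if (histogram) {
      emit(`${file.rel}: ${file.lines.length} matches`);
      continue;
    }

    // Knob 1: print at most MAX_HITS_PER_FILE lines, then a footer with the remainder.
    for (const line of file.lines.slice(0, MAX_HITS_PER_FILE)) emit(line);
    const extra = file.lines.length - MAX_HITS_PER_FILE;
    if (extra > 0) {
      // Shortened footer; the PR's footer also suggests a tighter pattern or read_file.
      emit(`[${file.rel}: ${extra} more matches in this file]`);
    }

    // Knob 2: checked only after a file completes, so a file is never split mid-way.
    if (printed >= opts.maxBytes * FLIP_RATIO) {
      histogram = true;
      if (i < files.length - 1) {
        // Notice wording is an assumption; the PR only says a one-line notice marks the flip.
        emit("[byte budget 80% used: remaining files shown as match counts only]");
      }
    }
  }
  return out.join("\n");
}
```

Checking the threshold between files rather than between lines keeps each file's output whole. The trade-off is that one file can still overshoot the 80% mark by up to 30 lines plus its footer, which is presumably why the existing byte-budget cap remains in place behind it.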
ChasLui pushed a commit to ChasLui/DeepSeek-Reasonix that referenced this pull request on May 23, 2026
Summary
`search_content` had one truncation knob: a byte budget on `ctx.maxListBytes`. A single popular symbol could match 200+ lines in one big file and consume the entire budget before the walk reached any other file. Callers saw a wall of `App.tsx:NNN: ...` lines and no hint that distribution was wider.

This PR adds three layers of distribution-preserving behavior:

1. `MAX_HITS_PER_FILE = 30`; overflow is summarized with `[rel: N more matches in this file — re-grep with a tighter pattern or use read_file to see them]`.
2. After each file completes, if printed bytes have crossed 80% of the byte budget, remaining files switch to `rel: N matches` form. A one-line notice marks the flip.
3. `summary_only: true` arg. Skip line content entirely; return just the histogram. Cheap one-shot for "where does this exist at all" before drilling in with a targeted `read_file`.

The existing byte-budget cap stays as the safety net.
Closes #489
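On the caller side, the histogram form is easy to turn into a ranked list before deciding which file to open. The sketch below assumes each histogram line has exactly the `rel: N matches` shape quoted above and silently ignores everything else (hit lines, footers, the flip notice). `FileCount` and `parseHistogram` are hypothetical names, not part of this PR.

```typescript
// Hypothetical caller-side helper: parse "rel: N matches" lines into counts.
interface FileCount {
  rel: string;
  matches: number;
}

function parseHistogram(output: string): FileCount[] {
  const counts: FileCount[] = [];
  // Accepts "match" as well as "matches" in case a singular form is ever emitted.
  const pattern = /^(.+): (\d+) match(?:es)?$/;
  for (const raw of output.split("\n")) {
    const m = pattern.exec(raw.trim());
    if (m !== null) {
      counts.push({ rel: m[1], matches: parseInt(m[2], 10) });
    }
  }
  // Busiest files first, so the caller can decide where to drill in.
  counts.sort((a, b) => b.matches - a.matches);
  return counts;
}
```

A caller could run `search_content` with `summary_only: true`, pass the result through a helper like this, and then issue a targeted `read_file` or a tighter pattern against the top few entries.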
Test plan
- `tests/filesystem-tools.test.ts`: four new cases under a `per-file cap and histogram fallback` describe block: hits-cap + footer, no-footer when under cap, `summary_only` shape, and the 80% flip-to-summary trigger.
- `npm run verify` (full suite, 2295 tests).