Skip to content

feat: configurable search sort (recency/relevance/hybrid) + phrase matching fix#296

Merged
jalehman merged 7 commits intoMartian-Engineering:mainfrom
electricsheephq:feat/search-quality-improvements
Apr 6, 2026
Merged

feat: configurable search sort (recency/relevance/hybrid) + phrase matching fix#296
jalehman merged 7 commits intoMartian-Engineering:mainfrom
electricsheephq:feat/search-quality-improvements

Conversation

@100yenadmin
Copy link
Copy Markdown
Contributor

@100yenadmin 100yenadmin commented Apr 6, 2026

Summary

Two search quality improvements for `lcm_grep`:

  1. Configurable sort order — new `sort` parameter: `"recency"` (default), `"relevance"` (FTS5 BM25), or `"hybrid"`
  2. Phrase matching fix — quoted phrases like `"error handling"` now match as a phrase instead of being split into two separate terms

Fixes #293

Why This Matters

FTS5 already computes BM25 relevance scores on every search, but they were being discarded at sort time. On conversations with 200K+ messages, recency-only sorting buries relevant older matches while surfacing recent-but-tangential hits.

Design Rationale (from LCM architecture analysis)

The `"recency"` default is preserved because LCM's design intentionally relies on:

  • The DAG structure itself as a relevance filter (terms surviving multi-level compression are high-signal)
  • `lcm_expand_query`'s sub-agent LLM reasoning for smart retrieval
  • `lcm_grep` as a routing tool (find where info lives, then expand)

The `"relevance"` option is for users with very long histories where the routing function benefits from surfacing the best keyword match, not just the most recent one. The agent chooses per-query — use recency for "what just happened?" and relevance for "find the discussion about X from last week."

Changes

File Change
`src/retrieval.ts` Added `sort` to `GrepInput`, implemented 3 sort modes
`src/tools/lcm-grep-tool.ts` Added `sort` parameter to schema, pass through to retrieval
`src/store/fts5-sanitize.ts` Fixed phrase detection — preserve user-quoted `"..."` groups

Test plan

  • 202/202 tests pass locally
  • CI (may need fork workflow approval)

Copilot AI review requested due to automatic review settings April 6, 2026 16:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves lcm_grep search quality by adding a configurable sort order (recency/relevance/hybrid) and updating FTS5 query sanitization to preserve user-quoted phrases as phrase queries.

Changes:

  • Added sort parameter to lcm_grep tool schema and forwarded it into retrieval.
  • Implemented recency / relevance / hybrid result sorting in RetrievalEngine.grep.
  • Updated FTS5 query sanitization to keep "..." groups intact as phrases rather than splitting on whitespace.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/tools/lcm-grep-tool.ts Adds sort param to tool schema and forwards it to retrieval.
src/retrieval.ts Introduces sort modes for grep results (recency/relevance/hybrid).
src/store/fts5-sanitize.ts Preserves user-quoted phrases during FTS5 query sanitization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/retrieval.ts Outdated
Comment thread src/retrieval.ts Outdated
Comment thread src/retrieval.ts Outdated
Comment thread src/store/fts5-sanitize.ts
@100yenadmin
Copy link
Copy Markdown
Contributor Author

Part of the LCM Performance & Cache Optimization Sprint — see #297 for the full tracking issue linking all 5 PRs.

100yenadmin pushed a commit to electricsheephq/lossless-claw-test that referenced this pull request Apr 6, 2026
Addresses PR Martian-Engineering#296 review comments:

1. Hybrid sort math: FTS5 rank is negative (more negative = better).
   Old formula multiplied negative rank by age factor, making older
   results rank BETTER (inverted). Fixed: negate rank to get positive
   relevance score, divide by age penalty, negate result for ascending sort.

2. Sort scope: documented that sort is applied post-LIMIT — the store
   queries ORDER BY created_at DESC LIMIT ?, so relevance/hybrid modes
   only re-rank within the top-N-by-recency window. Known limitation.

3. Added 4 phrase matching tests to fts5-sanitize.test.ts:
   - Multi-word quoted phrase preserved
   - Multiple quoted phrases preserved
   - Mixed quoted and unquoted terms
   - Empty quoted phrase

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@100yenadmin 100yenadmin requested a review from Copilot April 6, 2026 17:02
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/retrieval.ts Outdated
Comment thread src/tools/lcm-grep-tool.ts
Comment thread src/store/fts5-sanitize.ts
100yenadmin pushed a commit to electricsheephq/lossless-claw-test that referenced this pull request Apr 6, 2026
Addresses PR Martian-Engineering#296 review round 3:

1. Relevance sort: added recency tiebreaker when FTS5 ranks are equal,
   preventing implementation-dependent ordering.

2. Sort mode validation: relevance/hybrid silently fall back to recency
   when mode is regex (FTS5 rank not available). Previously accepted
   without effect.

3. Removed unused 'remaining' variable in fts5-sanitize.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@100yenadmin 100yenadmin requested a review from Copilot April 6, 2026 17:15
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/retrieval.ts Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/tools/lcm-grep-tool.ts Outdated
Comment thread src/retrieval.ts Outdated
@100yenadmin
Copy link
Copy Markdown
Contributor Author

Merge order: 5th — Merge last. No conflicts — independent files (retrieval.ts, fts5-sanitize.ts).

See #297 for the full sprint tracking issue with all 5 PRs.

Recommended merge sequence: #288#294#289#295#296

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/retrieval.ts Outdated
Comment on lines +265 to +266
// AGE_DECAY_RATE: ~1% relevance penalty per 10 hours of age.
// At 100h: 10% penalty. At 1000h (~6 weeks): 50% penalty.
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AGE_DECAY_RATE comment math is slightly off: with agePenalty = 1 + ageHours * 0.001, at 100h the relevance becomes 1/1.1 (~9.1% reduction), not 10%. Either adjust the wording (e.g., “~9%”) or tweak the rate/formula to match the stated percentages so future tuning is based on correct intuition.

Suggested change
// AGE_DECAY_RATE: ~1% relevance penalty per 10 hours of age.
// At 100h: 10% penalty. At 1000h (~6 weeks): 50% penalty.
// AGE_DECAY_RATE: ~1% increase in the age penalty per 10 hours of age.
// At 100h: ~9% relevance reduction. At 1000h (~6 weeks): 50% reduction.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid — the exact math at 100h is 1/(1+0.1) = 9.1% reduction, not 10%. Will fix the comment to say '~9%' instead of '~10%'. The rate itself is correct — just the human-readable approximation was rounded.

Eva and others added 5 commits April 6, 2026 15:56
…e matching

Two search quality improvements:

1. Add sort parameter to lcm_grep: "recency" (default, preserves
   current behavior), "relevance" (FTS5 BM25 rank — best match first),
   or "hybrid" (blends relevance with recency). FTS5 already computes
   BM25 scores on every search but they were being ignored.

   The "recency" default preserves the original design intent — the
   DAG structure already filters for importance through compression,
   and lcm_expand_query handles smart retrieval via sub-agent LLM
   reasoning. But for users with 200K+ message histories, relevance
   sorting surfaces important older matches that recency-only misses.

2. Fix phrase matching in fts5-sanitize.ts: user-quoted phrases like
   "error handling" were being split into "error" AND "handling"
   (two separate terms). Now preserved as a single phrase match.

Fixes Martian-Engineering#293

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses PR Martian-Engineering#296 review comments:

1. Hybrid sort math: FTS5 rank is negative (more negative = better).
   Old formula multiplied negative rank by age factor, making older
   results rank BETTER (inverted). Fixed: negate rank to get positive
   relevance score, divide by age penalty, negate result for ascending sort.

2. Sort scope: documented that sort is applied post-LIMIT — the store
   queries ORDER BY created_at DESC LIMIT ?, so relevance/hybrid modes
   only re-rank within the top-N-by-recency window. Known limitation.

3. Added 4 phrase matching tests to fts5-sanitize.test.ts:
   - Multi-word quoted phrase preserved
   - Multiple quoted phrases preserved
   - Mixed quoted and unquoted terms
   - Empty quoted phrase

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses PR Martian-Engineering#296 review round 3:

1. Relevance sort: added recency tiebreaker when FTS5 ranks are equal,
   preventing implementation-dependent ordering.

2. Sort mode validation: relevance/hybrid silently fall back to recency
   when mode is regex (FTS5 rank not available). Previously accepted
   without effect.

3. Removed unused 'remaining' variable in fts5-sanitize.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ription

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eva and others added 2 commits April 6, 2026 15:56
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move lcm_grep sort handling into the FTS query layer so relevance and hybrid ordering are applied before LIMIT truncates candidates. Add a changeset, update agent-facing grep guidance and delegated retrieval prompts, and cover the new behavior with focused tests.

Regeneration-Prompt: |\n  Address the PR review findings on the search-quality work for lcm_grep. Preserve the new sort parameter and quoted-phrase behavior, but fix the implementation so full-text relevance and hybrid ordering happen before LIMIT is applied instead of reranking a recency-truncated window in memory. Keep regex and fallback behavior intact. Also add the missing changeset because this is user-facing plugin behavior. Finally inspect the prompt and tool-instruction surfaces that teach the model how to use LCM recall tools, and update them so agents know when to prefer full_text, when to quote exact phrases, and when relevance or hybrid sorting is appropriate. Add regression tests for the ranking path and prompt guidance.
@jalehman jalehman force-pushed the feat/search-quality-improvements branch from fff8b3a to d582a4c Compare April 6, 2026 22:57
@jalehman jalehman merged commit 4906c62 into Martian-Engineering:main Apr 6, 2026
1 check passed
@jalehman
Copy link
Copy Markdown
Contributor

jalehman commented Apr 6, 2026

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: use FTS5 BM25 ranking for search results + fix phrase matching

3 participants