fix(search): default FTS5 multi-keyword queries to OR instead of AND #9651

Open
memphislee09-source wants to merge 1 commit into NousResearch:main from memphislee09-source:fix/fts5-or-default-for-multi-keyword-search


@memphislee09-source

Problem

FTS5 treats space-separated terms as AND by default. This makes multi-keyword session search (e.g. 特洛伊 海伦, i.e. "Troy Helen", or Trojan war Helen) return almost nothing, since few individual messages contain ALL search terms simultaneously. Users expect recall-style search to match ANY term, not ALL.

Root Cause

_sanitize_fts5_query in hermes_state.py passes queries straight through to FTS5 MATCH, which interprets spaces as implicit AND. A query like 特洛伊 海伦 therefore requires both terms in the same message and returns only 0-1 results instead of the expected 5+.
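The implicit-AND behavior can be reproduced outside the project with a minimal sketch like the one below. It assumes the local Python `sqlite3` build ships with FTS5 enabled; the `messages` table, its `content` column, and the sample rows are hypothetical and are not the schema used by hermes_state.py.

```python
import sqlite3

# Hypothetical in-memory table; the real schema in hermes_state.py may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(content)")
conn.executemany(
    "INSERT INTO messages (content) VALUES (?)",
    [
        ("the Trojan war lasted ten years",),
        ("Helen was queen of Sparta",),
        ("Helen and the Trojan horse",),
    ],
)


def count_matches(query: str) -> int:
    """Return how many rows match the given FTS5 query string."""
    row = conn.execute(
        "SELECT count(*) FROM messages WHERE messages MATCH ?", (query,)
    ).fetchone()
    return row[0]


# Space-separated terms: every term must appear in the same row.
implicit_and = count_matches("Trojan Helen")
# Explicit OR: a row matches if it contains either term.
explicit_or = count_matches("Trojan OR Helen")
```

With these three rows, only the last one contains both terms, so the space-separated query matches far fewer rows than the OR query.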

Fix

Added Step 7 to _sanitize_fts5_query: when no explicit boolean operators (AND/OR/NOT) are present, automatically join space-separated terms with OR. Quoted phrases are preserved as single tokens.

| Before | After |
| --- | --- |
| 特洛伊 海伦 → implicit AND → 0 results | 特洛伊 OR 海伦 → 5+ results |
| "exact phrase" → preserved ✅ | "exact phrase" → preserved ✅ |
| python NOT java → preserved ✅ | python NOT java → preserved ✅ |
| docker OR kubernetes → preserved ✅ | docker OR kubernetes → preserved ✅ |
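The conversion described above could look roughly like the following sketch. It is not the actual Step 7 code from hermes_state.py: the helper name `join_terms_with_or` and the tokenizing regex are assumptions made for illustration, and the real function presumably runs this after its earlier sanitizing steps.

```python
import re

# Hypothetical helper; not the actual Step 7 code from hermes_state.py.
# A token is either a double-quoted phrase or a run of non-whitespace.
_TOKEN_RE = re.compile(r'"[^"]*"|\S+')
# FTS5 only recognizes these operators in upper case.
_BOOLEAN_OPS = {"AND", "OR", "NOT"}


def join_terms_with_or(query: str) -> str:
    """Join plain terms with OR unless the query already uses AND/OR/NOT."""
    tokens = _TOKEN_RE.findall(query)
    # Single terms and queries with explicit operators pass through unchanged.
    if len(tokens) < 2 or any(t in _BOOLEAN_OPS for t in tokens):
        return query
    return " OR ".join(tokens)
```

Because a quoted phrase is matched as one token, `"exact phrase" python` would become `"exact phrase" OR python` rather than being split inside the quotes.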

Changes

  • hermes_state.py — Added Step 7 OR conversion in _sanitize_fts5_query
  • tests/test_hermes_state.py — Updated test expectation for hello world → hello OR world

Testing

  • All 39 related tests pass (sanitize_fts5, session_search, search_messages)
  • End-to-end: session_search('Trojan war Helen 特洛伊 海伦') now returns 2 sessions instead of 0

FTS5 treats space-separated terms as AND by default, which makes
multi-keyword session search (e.g. '特洛伊 海伦') return almost nothing
since few messages contain ALL search terms simultaneously.

Add Step 7 to _sanitize_fts5_query that automatically joins plain terms
with OR when no explicit boolean operators (AND/OR/NOT) are present.
Quoted phrases are preserved as single tokens during conversion.

Before: '特洛伊 海伦' → implicit AND → 0 results
After:  '特洛伊 OR 海伦' → 5+ results from matching sessions
@alt-glitch added labels on Apr 26, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), tool/memory (Memory tool and memory providers)