fix(memory/holographic): sanitize FTS5 queries for natural-language recall by cyb3rwr3n · Pull Request #11333 · NousResearch/hermes-agent

cyb3rwr3n · 2026-04-17T02:40:59Z

Summary

The holographic memory provider's FactRetriever._fts_candidates passes the raw user query directly to FTS5's MATCH operator. FTS5 defaults to AND-between-tokens, which means any multi-word prose query requires every token to co-occur in a fact. For the prefetch() path where the query comes straight from the user message, this reduces recall to near-zero on natural-language prompts.

Example, before this fix:

query: "what happened with the deployment rollback"
FTS5 MATCH: "what AND happened AND with AND the AND deployment AND rollback"
results: 0  (nothing has all six tokens)

query: "deployment OR rollback"
results: 5  (normal recall)

The prefetch hook in run_agent.py that injects memory context on every turn was therefore silently missing relevant facts whenever the user phrased their message in prose.
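
The implicit-AND behavior described above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: the `facts` table, its `content` column, and the sample rows are illustrative and are not the plugin's actual schema. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# Throwaway in-memory FTS5 table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE facts USING fts5(content)")
conn.executemany(
    "INSERT INTO facts(content) VALUES (?)",
    [
        ("deployment rollback completed after failed health checks",),
        ("rollback script lives in the ops repository",),
        ("deployment window moved to friday",),
    ],
)


def match_count(fts_query: str) -> int:
    """Return how many rows match the given FTS5 query string."""
    row = conn.execute(
        "SELECT count(*) FROM facts WHERE facts MATCH ?", (fts_query,)
    ).fetchone()
    return row[0]


# Bare words separated by whitespace are combined with an implicit AND,
# so every word must appear in the same row.
prose_hits = match_count("what happened with the deployment rollback")

# An explicit OR matches rows containing either term.
or_hits = match_count("deployment OR rollback")

print(prose_hits, or_hits)
```

No sample row contains "what", so the prose query matches nothing, while the OR query matches every row that mentions either keyword.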

Fix

Add _sanitize_fts_query() that:

- tokenizes and lowercases the query
- drops standard English stopwords and <2-char tokens
- strips FTS5 operator characters from each remaining token
- OR-joins the survivors as phrase literals: "tok1" OR "tok2" OR ...
- falls back to the raw query if nothing survives sanitization (pathological inputs)

No changes to the HRR + Jaccard + trust reranking — those keep precision high once the candidate pool isn't empty.
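
The sanitization steps listed above can be sketched roughly as follows. This is not the PR's code: the function name `sanitize_fts_query`, the abbreviated `STOPWORDS` set, and the choice to strip special characters before the stopword check (so that a token like `the?` is still dropped) are all assumptions made for illustration.

```python
import re

# Assumed, abbreviated stopword set; the PR's baked-in list is not shown here.
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "from",
    "has", "have", "how", "in", "is", "it", "of", "on", "or", "that", "the",
    "this", "to", "was", "were", "what", "when", "where", "which", "who",
    "why", "with",
})


def sanitize_fts_query(query: str) -> str:
    """Turn a prose query into an OR-joined FTS5 query of quoted tokens."""
    survivors = []
    for raw in query.lower().split():
        # Keep only word characters; this removes FTS5 operator characters
        # such as quotes, *, ^, :, parentheses, and hyphens.
        token = re.sub(r"[^\w]", "", raw)
        # Drop stopwords and tokens shorter than two characters.
        if len(token) < 2 or token in STOPWORDS:
            continue
        survivors.append(token)
    if not survivors:
        # Nothing usable survived: fall back to the raw query unchanged.
        return query
    # Quote each token so FTS5 treats it as a literal, then OR-join.
    return " OR ".join(f'"{t}"' for t in survivors)


print(sanitize_fts_query("what happened with the deployment rollback"))
print(sanitize_fts_query("the and of"))
```

Quoting each survivor keeps FTS5 from interpreting a token such as `near` or `or` as an operator, and the OR-join leaves precision to the later reranking stages.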

Test plan

Ships with 10 new tests in tests/plugins/memory/test_holographic_retrieval.py:

- parametrized sanitizer unit tests (stopword drop, single content word, pure-stopword fallback, FTS5-special stripping, empty input)
- FTS5 crash-safety test against problematic inputs (quotes, stars, parens, carets, colons, hyphens, long strings)
- integration tests against an in-memory MemoryStore:
  - natural-language prose query recovers the relevant fact (the exact regression this fix targets)
  - single-keyword query still works
  - pure-stopword query returns [] without crashing
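
A crash-safety check of the kind described above might look like the sketch below. It does not use the PR's test file or MemoryStore: the `facts` table, `to_safe_query` (a simplified stand-in for the sanitizer with no stopword handling), `run_match`, and the input list are hypothetical, and the sketch assumes an FTS5-enabled SQLite build.

```python
import re
import sqlite3

# Illustrative in-memory FTS5 table standing in for the real fact store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE facts USING fts5(content)")
conn.execute("INSERT INTO facts(content) VALUES ('deployment rollback completed')")


def to_safe_query(query: str) -> str:
    """Hypothetical stand-in for the sanitizer: quoted word tokens joined with OR."""
    tokens = re.findall(r"\w+", query.lower())
    return " OR ".join(f'"{t}"' for t in tokens)


def run_match(fts_query: str):
    """Return matching rows, or None if SQLite rejects the query."""
    try:
        return conn.execute(
            "SELECT content FROM facts WHERE facts MATCH ?", (fts_query,)
        ).fetchall()
    except sqlite3.OperationalError:
        return None


# Inputs containing characters that FTS5 treats as query syntax.
PROBLEM_INPUTS = ['roll"back', "deploy*ment (", "^rollback:", "roll-back", "x" * 500]

raw_results = [run_match(q) for q in PROBLEM_INPUTS]
safe_results = [run_match(to_safe_query(q)) for q in PROBLEM_INPUTS]

# Every sanitized query should execute without an error, even if it matches nothing.
print(all(r is not None for r in safe_results))
```

The raw input with an unbalanced double quote is rejected by the FTS5 parser, while every sanitized form runs and returns a (possibly empty) list.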

Existing memory/fact-store test suite (329 tests) still passes.

pytest tests/plugins/memory/test_holographic_retrieval.py  # 10 passed
pytest tests/ -k "memory or fact_store or retriev or holographic"  # 329 passed, 4 skipped

Notes

- Pure bug fix, no API surface change, no config change.
- Stopword list is the standard English set (baked in). Could be made configurable later if multi-language is desired, but that's out of scope here.
- The sanitizer is a @classmethod so tests can call it directly without instantiating a retriever + store.

…ecall

The FactRetriever's _fts_candidates passed the raw query string directly to FTS5's MATCH operator. FTS5 defaults to AND-between-tokens, which means any multi-word prose query like 'what happened with the deployment rollback' required every single token to co-occur in a fact, dropping recall to zero on the kind of queries agents actually issue via prefetch().

Fix: add _sanitize_fts_query() that:
- tokenizes the query and drops English stopwords
- strips FTS5 operator characters per token
- OR-joins the remaining content tokens as phrase literals

For pathological inputs (all stopwords, empty), falls back to the raw query so the caller sees zero results instead of a SQL error.

This is a pure-retrieval-quality fix: the HRR + Jaccard reranking stages still keep precision high.

Ships with 10 tests covering the sanitizer and retrieval integration.

alt-glitch · 2026-04-25T02:56:42Z

Related to #14033, #14262, #14794 — all address holographic FTS5 query sanitization. This PR is the most comprehensive (stopwords + OR-join + crash safety), but check for conflicts with those PRs.

cyb3rwr3n force-pushed the fix/holographic-fts-sanitize branch from 9a65f4c to 2c930a3 on April 19, 2026 00:10

alt-glitch mentioned this pull request Apr 22, 2026

fix(memory): sanitize holographic fact_store FTS queries #14033

Open

alt-glitch added labels Apr 25, 2026: type/bug (Something isn't working), P1 (High: major feature broken, no workaround), tool/memory (Memory tool and memory providers), comp/plugins (Plugin system and bundled plugins)

alt-glitch mentioned this pull request Apr 29, 2026

fix(holographic): sanitize FTS5 queries, fix entity wildcard matching, close DB on shutdown #6667

Open

5 tasks

alt-glitch mentioned this pull request Jun 11, 2026

fix(holographic): robust FTS5 retrieval for operator characters and partial-match queries #44040

Open

cyb3rwr3n wants to merge 1 commit into NousResearch:main from cyb3rwr3n:fix/holographic-fts-sanitize
