Skip to content

fix(state): index tool_calls and tool_name in FTS5 for session_search#16866

Closed
yes999zc wants to merge 1 commit into
NousResearch:mainfrom
yes999zc:fix/fts5-index-tool-calls
Closed

fix(state): index tool_calls and tool_name in FTS5 for session_search#16866
yes999zc wants to merge 1 commit into
NousResearch:mainfrom
yes999zc:fix/fts5-index-tool-calls

Conversation

@yes999zc

Copy link
Copy Markdown
Contributor

Problem

FTS5 virtual tables only indexed the content column via external content mode. Tool calls (tool_calls JSON) and tool_name were invisible to FTS5 search, making session_search miss most tool-call-mediated conversation context.

Fixes #16751

Changes

  1. Switch FTS5 tables from external content to inline mode
  2. Update all 6 triggers to concatenate content || tool_name || tool_calls
  3. Extend short-CJK LIKE fallback to cover tool_name and tool_calls

Verification

Before: web_search=0, vision_analyze=0, config.yaml=0
After:  web_search=81, vision_analyze=6, config.yaml=157, qwen3-vl=47

All triggers active, snippet() working, DELETE supported, zero data loss on DB rebuild.

The FTS5 virtual tables (messages_fts, messages_fts_trigram) previously
only indexed the content column via external content mode. Tool calls
and tool names stored in the tool_calls (JSON) and tool_name columns
were invisible to FTS5 search.

Root cause: FTS5 triggers only INSERTed new.content into the index.

Changes:
- Switch FTS5 tables from external content (content=messages) to inline
  mode so that trigger-inserted content is both indexed and stored
- Update all 6 FTS5 triggers to concatenate content, tool_name, and
  tool_calls when indexing new messages
- Extend the short-CJK LIKE fallback to also search tool_name and
  tool_calls columns

Closes: #16751
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 28, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #16770 — both index tool_calls and tool_name in FTS5 for session_search. Both fix #16751.

@Bartok9 Bartok9 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good improvement — indexing tool_calls and tool_name is a clear win for session_search.

One potential issue: the DELETE triggers and the delete-half of the UPDATE triggers still reference only old.content:

INSERT INTO messages_fts(messages_fts, rowid, content) VALUES('delete', old.id, old.content);

Since the FTS table is now inline (no content=messages), the delete operation needs the exact same concatenated value that was originally inserted. If the delete content does not match what FTS has stored, FTS5 may silently fail to remove the entry or raise a "database disk image is malformed" error on integrity-check.

The delete lines should use the same COALESCE pattern:

COALESCE(old.content, '') || ' ' || COALESCE(old.tool_name, '') || ' ' || COALESCE(old.tool_calls, '')

This applies to all 4 delete operations (2 in FTS_SQL, 2 in FTS_TRIGRAM_SQL).

@teknium1

Copy link
Copy Markdown
Contributor

Merged via #16914. Your commit cfcad80ee is on main with your authorship preserved (rebase-merge).

Thanks for the diagnosis and direction — indexing tool_name + tool_calls in FTS5 was the right fix. We added two follow-ups on top:

  1. The delete triggers still passed old.content to FTS5's delete-command, but after the switch to inline mode the stored content is content || tool_name || tool_calls, so every DELETE FROM messages raised SQL logic error. Replaced the four delete paths with plain DELETE FROM messages_fts WHERE rowid = old.id.
  2. Bumped SCHEMA_VERSION to 11 and added a migration that drops the old FTS tables + triggers and backfills — existing users' DBs now get re-indexed on next open (with CREATE ... IF NOT EXISTS alone they would have kept the broken behavior forever).

6 regression tests covering INSERT / UPDATE / DELETE + the v10→v11 upgrade path are now in tests/test_hermes_state.py.

Closes #16751. Appreciate the contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

session_search does not index tool_calls or tool_name

4 participants