feat: add optional message embedding for semantic search (config-driven)#18059

Open
gzsiang wants to merge 3 commits into NousResearch:main from gzsiang:feat/message-embedding
Conversation

gzsiang commented Apr 30, 2026

feat: optional message embedding for semantic search (config-driven)

Adds optional message embedding support to Hermes Agent's memory system, enabling hybrid search (FTS5 full-text + vector cosine similarity) for more relevant memory retrieval.

What it does

  • Schema v12: adds message_embedding BLOB column to messages table (auto-migrated on startup)
  • Config-driven: embedding is disabled by default; enable by adding to ~/.hermes/config.yaml:
    embedding:
      base_url: http://your-embedding-server:8081   # Ollama-compatible /v1/embeddings
      model: Qwen3-Embedding-0.6B
      dimension: 1024
  • Graceful fallback: if no config, pure FTS5 is used (zero overhead)
  • Hybrid search: FTS5 results are re-ranked by cosine similarity using stored embeddings
  • Vector-only fallback: if FTS5 returns nothing, falls back to pure vector search
  • Backfill script: scripts/backfill_embeddings.py computes embeddings for existing messages
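The schema step can be illustrated with a minimal sketch. It is not the code from hermes_state.py: the helper name `ensure_embedding_column` and the reduced `messages` table are assumptions made for illustration, and only the column name `message_embedding` comes from the description above.

```python
import sqlite3


def ensure_embedding_column(conn: sqlite3.Connection) -> bool:
    """Add messages.message_embedding if it is missing; return True if added.

    Hypothetical helper, not the PR's actual migration code.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "message_embedding" in columns:
        return False
    # ADD COLUMN is non-destructive: existing rows get NULL in the new column.
    conn.execute("ALTER TABLE messages ADD COLUMN message_embedding BLOB")
    conn.commit()
    return True


# Demonstration on an in-memory database with a simplified table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO messages (content) VALUES ('hello')")

added_first = ensure_embedding_column(conn)   # column is created
added_again = ensure_embedding_column(conn)   # second call is a no-op
existing = conn.execute("SELECT message_embedding FROM messages").fetchone()[0]
```

Running the helper twice shows why this kind of migration is safe to execute on every startup: the second call finds the column and does nothing, and the pre-existing row keeps a NULL embedding until it is filled in later.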

Files changed

  • hermes_state.py — core embedding logic + schema migration
  • scripts/backfill_embeddings.py — utility to backfill embeddings for existing DB rows

Design notes

  • Embeddings are computed via Ollama-compatible /v1/embeddings API (tested with Qwen3-Embedding-0.6B on a 2080ti)
  • Vectors stored as packed float32 (struct.pack('f' * dim, ...)) in SQLite BLOB
  • Schema migration is non-destructive: existing rows get NULL embedding, filled lazily or via backfill script
  • No external dependencies added — uses urllib.request (standard library) for the embedding API call
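A rough sketch of the storage format and the API call described in these notes, using only the standard library. The function names are hypothetical, and the request and response shapes (`{"model", "input"}` in, `data[0].embedding` out) are an assumption based on the common OpenAI-style /v1/embeddings convention, not something confirmed by this PR.

```python
import json
import struct
import urllib.request


def pack_vector(vec):
    """Pack a list of floats as consecutive float32 values for a SQLite BLOB."""
    return struct.pack("f" * len(vec), *vec)


def unpack_vector(blob):
    """Inverse of pack_vector; each float32 occupies 4 bytes."""
    dim = len(blob) // 4
    return list(struct.unpack("f" * dim, blob))


def build_embedding_request(base_url, model, text):
    """Build (but do not send) a POST request to <base_url>/v1/embeddings.

    The payload shape is assumed, not taken from the PR.
    """
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding_response(body):
    """Extract the first vector, assuming a {"data": [{"embedding": [...]}]} body."""
    return json.loads(body)["data"][0]["embedding"]


blob = pack_vector([0.5, -1.0, 2.0])
roundtrip = unpack_vector(blob)
req = build_embedding_request(
    "http://your-embedding-server:8081", "Qwen3-Embedding-0.6B", "hi"
)
parsed = parse_embedding_response(b'{"data": [{"embedding": [0.25, 0.75]}]}')
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen` with a timeout. With `dimension: 1024`, each stored vector takes 4096 bytes in this format.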

Testing

  • Schema migration (v11→v12) tested manually
  • Hybrid search tested with a local embedding server
  • Graceful fallback (no config) verified — pure FTS5, no errors
  • Backfill script tested with --dry-run and --apply modes

alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), tool/memory (Memory tool and memory providers), and area/config (Config system, migrations, profiles) labels on Apr 30, 2026

trevorgordon981 left a comment

LGTM. This is a clean, opt-in addition for semantic search over conversation history. The implementation is thoughtful:

  • Schema v12 adds message_embedding BLOB column (backward-compatible via existing reconciliation)
  • Config-driven: embedding.base_url/endpoint in ~/.hermes/config.yaml — disabled by default (zero overhead)
  • Hybrid search: FTS5 BM25 + cosine similarity re-ranking (0.3/0.7 weights), with pure vector fallback
  • Graceful: _compute_embedding short-circuits if base_url is unset — no DNS calls
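The weighted re-ranking mentioned above might look roughly like the following. This is a sketch under assumptions: it takes 0.3 as the keyword weight and 0.7 as the vector weight (the comment does not say which is which), it expects the FTS5 score to be already normalized to [0, 1] with higher meaning better, and the function names and tuple layout are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query_vec, candidates, fts_weight=0.3, vec_weight=0.7):
    """Order candidates by a weighted sum of keyword and vector scores.

    candidates: list of (message_id, fts_score in [0, 1], embedding or None).
    Rows without a stored embedding contribute a similarity of 0.0.
    """
    scored = []
    for msg_id, fts_score, emb in candidates:
        sim = cosine_similarity(query_vec, emb) if emb is not None else 0.0
        scored.append((fts_weight * fts_score + vec_weight * sim, msg_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [msg_id for _, msg_id in scored]


candidates = [
    ("a", 1.0, [0.0, 1.0]),  # best keyword match, orthogonal to the query
    ("b", 0.5, [1.0, 0.0]),  # weaker keyword match, same direction as the query
    ("c", 0.8, None),        # embedding not computed yet (NULL column)
]
ranked = rerank([1.0, 0.0], candidates)
```

Under these weights, message "b" overtakes the stronger keyword match "a" because its embedding points in the same direction as the query, while "c" is ranked on its keyword score alone.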

Verified:

  • Schema migration works (v12 columns present)
  • Append with embedding disabled: PASS
  • 195/198 state tests pass — 3 failures are expected (schema version assertions need updating from 11→12)

The PR is ready once the 3 test assertions are bumped to expect v12. Nice work.

gzsiang commented May 1, 2026

Thanks for the review! Updated the 3 schema version test assertions to expect v12.

gzsiang force-pushed the feat/message-embedding branch 2 times, most recently from 8d5031c to a4cfd7f, on May 4, 2026 16:53

gzsiang commented May 5, 2026

Friendly ping — this PR has been approved and the requested changes addressed. Happy to rebase or make any additional tweaks if needed. Would love to get this merged when you have a chance! 🙏

gzsiang force-pushed the feat/message-embedding branch from d47002b to 01a0c9c on May 10, 2026 17:35

gzsiang commented May 10, 2026

Rebased onto latest upstream/main — force-pushed clean branch with only the embedding commits. Ready for review.

Changes:

  • hermes_state.py: schema v12, embedding + hybrid search
  • scripts/backfill_embeddings.py: utility script

CI should trigger on this push. Please let me know if manual CI trigger is needed.

gzsiang commented May 10, 2026

Hi @NousResearch/maintainers, this PR has been rebased onto latest upstream/main with a clean commit history (2 commits, only hermes_state.py + scripts/backfill_embeddings.py).

CI hasn't triggered due to fork PR restrictions. Could someone approve the CI run? Thanks!

Summary: Adds optional message embedding for semantic search (config-driven, disabled by default). Details in the PR description.

gzsiang force-pushed the feat/message-embedding branch from a6c0f86 to e2a53c5 on May 17, 2026 12:52

gzsiang commented May 17, 2026

PR cleaned up — no unrelated files

Rebased onto latest upstream/main and removed the unrelated i18n files from the branch.

Current state:

  • ✅ Only 2 embedding-related files (hermes_state.py + scripts/backfill_embeddings.py)
  • ✅ +399/−8 lines, zero external deps added
  • ✅ Mergeable
  • ⏳ CI pending (needs maintainer approval for fork PR workflow runs)

Maintainers: please approve the workflow run. Thanks!

gzsiang force-pushed the feat/message-embedding branch 4 times, most recently from c106cf9 to d4369f8, on May 24, 2026 15:15
gzsiang force-pushed the feat/message-embedding branch 2 times, most recently from b7c0639 to 1ac8d67, on June 2, 2026 12:02
gzsiang force-pushed the feat/message-embedding branch 2 times, most recently from 0ffc7d0 to d2a52d2, on June 4, 2026 17:04
gzsiang force-pushed the feat/message-embedding branch from d2a52d2 to 06c84cb on June 6, 2026 02:51
gzsiang added a commit to gzsiang/hermes-agent that referenced this pull request Jun 6, 2026
Added Chinese description of fork features:
- Circuit breaker (NousResearch#16749)
- CLI Chinese localization (NousResearch#15282)
- Message embedding (NousResearch#18059)
- Emergency compression (NousResearch#18607)
gzsiang force-pushed the feat/message-embedding branch 2 times, most recently from 73a95cb to c026e81, on June 6, 2026 05:58
Add support for computing and storing embedding vectors (packed float32)
for assistant messages, enabling cosine-similarity-based hybrid search.

- Schema v12: add message_embedding BLOB column (auto-reconciled)
- _try_load_embedding_config: load endpoint from ~/.hermes/config.yaml
- _compute_embedding: call Ollama-compatible /v1/embeddings API
- _cosine_similarity: re-rank FTS5 results using vector similarity
- Graceful fallback: no embedding configured = pure FTS5, no overhead
- Vector-only search when FTS5 returns no results

Usage: set embedding.base_url in ~/.hermes/config.yaml:
  embedding:
    base_url: http://your-server:8081
    model: Qwen3-Embedding-0.6B
    dimension: 1024
gzsiang added 2 commits June 8, 2026 12:35
Utility script to compute and store embedding vectors for existing
messages in state.db. Handles both normal content and reasoning-only
assistant messages.

Usage:
  python3 scripts/backfill_embeddings.py              # dry run
  python3 scripts/backfill_embeddings.py --apply      # execute
  python3 scripts/backfill_embeddings.py --apply --batch-size 100
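The dry-run and apply modes can be sketched as follows. This is not the contents of scripts/backfill_embeddings.py: the `backfill` function, the `fake_embed` placeholder, and the simplified table are hypothetical, and the sketch skips the reasoning-only message handling the commit message mentions.

```python
import sqlite3
import struct


def fake_embed(text):
    """Placeholder for the real embedding call; returns a tiny deterministic vector."""
    return [float(len(text)), 1.0]


def backfill(conn, embed, apply=False, batch_size=100):
    """Return how many rows were updated (or, in a dry run, would be updated)."""
    rows = conn.execute(
        "SELECT id, content FROM messages "
        "WHERE message_embedding IS NULL AND content IS NOT NULL"
    ).fetchall()
    if not apply:
        return len(rows)  # dry run: report only, write nothing
    for start in range(0, len(rows), batch_size):
        updates = []
        for row_id, content in rows[start:start + batch_size]:
            vec = embed(content)
            updates.append((struct.pack("f" * len(vec), *vec), row_id))
        conn.executemany(
            "UPDATE messages SET message_embedding = ? WHERE id = ?", updates
        )
        conn.commit()  # commit per batch so an interrupted run keeps its progress
    return len(rows)


COUNT_MISSING = "SELECT COUNT(*) FROM messages WHERE message_embedding IS NULL"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT, message_embedding BLOB)"
)
conn.executemany(
    "INSERT INTO messages (content) VALUES (?)", [("one",), ("three",), ("five!",)]
)

dry = backfill(conn, fake_embed)
missing_after_dry = conn.execute(COUNT_MISSING).fetchone()[0]
applied = backfill(conn, fake_embed, apply=True, batch_size=2)
missing_after_apply = conn.execute(COUNT_MISSING).fetchone()[0]
```

Selecting only rows whose embedding is NULL makes the operation safe to re-run: a second pass after a successful apply finds nothing left to do.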
gzsiang force-pushed the feat/message-embedding branch from c026e81 to c59f30a on June 8, 2026 04:35
Labels

area/config (Config system, migrations, profiles); P3 (Low: cosmetic, nice to have); tool/memory (Memory tool and memory providers); type/feature (New feature or request)

3 participants