feat: add optional message embedding for semantic search (config-driven) #18059
gzsiang wants to merge 3 commits into
trevorgordon981
left a comment
LGTM. This is a clean, opt-in addition for semantic search over conversation history. The implementation is thoughtful:
- Schema v12 adds a `message_embedding BLOB` column (backward-compatible via existing reconciliation)
- Config-driven: `embedding.base_url`/`endpoint` in `~/.hermes/config.yaml`, disabled by default (zero overhead)
- Hybrid search: FTS5 BM25 + cosine similarity re-ranking (0.3/0.7 weights), with pure vector fallback
- Graceful: `_compute_embedding` short-circuits if `base_url` is unset, so no DNS calls are made
Verified:
- Schema migration works (v12 columns present)
- Append with embedding disabled: PASS
- 195/198 state tests pass — 3 failures are expected (schema version assertions need updating from 11→12)
The PR is ready once the 3 test assertions are bumped to expect v12. Nice work.
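For readers who want to picture the "backward-compatible via existing reconciliation" step, here is a minimal sketch of an idempotent column add in SQLite. It is not the PR's code: the helper name `ensure_embedding_column` and the two-column `messages` table are assumptions made for illustration; only the `message_embedding BLOB` column name comes from the review above.

```python
import sqlite3


def ensure_embedding_column(conn: sqlite3.Connection) -> bool:
    """Add messages.message_embedding if it is missing.

    Hypothetical helper: returns True when the column was added,
    False when it was already present (so it is safe to call on every startup).
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "message_embedding" in columns:
        return False
    conn.execute("ALTER TABLE messages ADD COLUMN message_embedding BLOB")
    conn.commit()
    return True


# Demo on a throwaway in-memory database with an assumed pre-v12 layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO messages (content) VALUES ('old row')")
added_first = ensure_embedding_column(conn)   # column is created
added_second = ensure_embedding_column(conn)  # no-op on the second call
old_row_embedding = conn.execute(
    "SELECT message_embedding FROM messages"
).fetchone()[0]
```

Rows written before the migration simply read back `NULL` for the new column, which is what lets older databases keep working without a rewrite.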
Thanks for the review! Updated the 3 schema version test assertions to expect v12.
8d5031c to a4cfd7f
Friendly ping: this PR has been approved and the requested changes addressed. Happy to rebase or make any additional tweaks if needed. Would love to get this merged when you have a chance! 🙏
d47002b to 01a0c9c
Rebased onto latest upstream/main and force-pushed a clean branch with only the embedding commits. Ready for review. Changes:
CI should trigger on this push. Please let me know if a manual CI trigger is needed.
Hi @NousResearch/maintainers, this PR has been rebased onto latest, but CI hasn't triggered due to fork PR restrictions. Could someone approve the CI run? Thanks!
Summary: Adds optional message embedding for semantic search (config-driven, disabled by default). Details are in the PR description.
a6c0f86 to e2a53c5
PR cleaned up: no unrelated files
Rebased onto latest upstream/main and removed the unrelated i18n files from the branch. Current state:
Maintainers: please approve the workflow run. Thanks!
c106cf9 to d4369f8
b7c0639 to 1ac8d67
0ffc7d0 to d2a52d2
d2a52d2 to 06c84cb
Added Chinese description of fork features:
- Circuit breaker (NousResearch#16749)
- CLI Chinese localization (NousResearch#15282)
- Message embedding (NousResearch#18059)
- Emergency compression (NousResearch#18607)
73a95cb to c026e81
Add support for computing and storing embedding vectors (packed float32)
for assistant messages, enabling cosine-similarity-based hybrid search.
- Schema v12: add message_embedding BLOB column (auto-reconciled)
- _try_load_embedding_config: load endpoint from ~/.hermes/config.yaml
- _compute_embedding: call Ollama-compatible /v1/embeddings API
- _cosine_similarity: re-rank FTS5 results using vector similarity
- Graceful fallback: no embedding configured = pure FTS5, no overhead
- Vector-only search when FTS5 returns no results
Usage: set embedding.base_url in ~/.hermes/config.yaml:
    embedding:
      base_url: http://your-server:8081
      model: Qwen3-Embedding-0.6B
      dimension: 1024
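The storage format and similarity measure named in this commit message (packed float32 in a BLOB, cosine similarity) can be sketched as follows. The function names are illustrative and not taken from `hermes_state.py`; returning 0.0 for a zero-length vector is an assumption about how a degenerate embedding might be handled.

```python
import math
import struct


def pack_embedding(vec):
    """Pack a list of floats as native-order float32, 4 bytes per value."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack_embedding(blob):
    """Inverse of pack_embedding: recover the float list from a BLOB."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a degenerate vector as "no similarity"
    return dot / (norm_a * norm_b)


# Round trip with values that float32 represents exactly.
blob = pack_embedding([1.0, 0.5, -2.0])
restored = unpack_embedding(blob)
```

With `dimension: 1024` as in the config above, each stored embedding would occupy 4096 bytes under this layout.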
Utility script to compute and store embedding vectors for existing messages in state.db. Handles both normal content and reasoning-only assistant messages.
Usage:
    python3 scripts/backfill_embeddings.py                          # dry run
    python3 scripts/backfill_embeddings.py --apply                  # execute
    python3 scripts/backfill_embeddings.py --apply --batch-size 100
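The dry-run versus `--apply` behavior described above might look roughly like the sketch below. This is not the actual `scripts/backfill_embeddings.py`: the `backfill` signature, the three-column table, and the `fake_embed` stand-in are all assumptions, and the reasoning-only message handling mentioned in the commit message is left out for brevity.

```python
import sqlite3
import struct


def backfill(conn, embed, apply=False, batch_size=100):
    """Embed rows whose message_embedding is NULL.

    Hypothetical sketch: with apply=False nothing is written and the
    number of pending rows is returned; with apply=True rows are updated
    and committed one batch at a time, and the number written is returned.
    """
    rows = conn.execute(
        "SELECT id, content FROM messages "
        "WHERE message_embedding IS NULL AND content IS NOT NULL"
    ).fetchall()
    if not apply:
        return len(rows)  # dry run: report only
    written = 0
    for start in range(0, len(rows), batch_size):
        for row_id, content in rows[start:start + batch_size]:
            vec = embed(content)
            if vec is None:
                continue  # embedding unavailable; leave the row as NULL
            blob = struct.pack(f"{len(vec)}f", *vec)
            conn.execute(
                "UPDATE messages SET message_embedding = ? WHERE id = ?",
                (blob, row_id),
            )
            written += 1
        conn.commit()  # one commit per batch
    return written


def fake_embed(text):
    """Stand-in for a real embedding call, used only for this demo."""
    return [float(len(text)), 1.0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages "
    "(id INTEGER PRIMARY KEY, content TEXT, message_embedding BLOB)"
)
conn.executemany(
    "INSERT INTO messages (content) VALUES (?)",
    [("hello",), ("world",), ("again",)],
)
pending = backfill(conn, fake_embed)                            # dry run
written = backfill(conn, fake_embed, apply=True, batch_size=2)  # execute
remaining = backfill(conn, fake_embed)                          # nothing left
```

Committing per batch means an interrupted run keeps the batches it finished, and a rerun only picks up rows that are still `NULL`.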
c026e81 to c59f30a
feat: optional message embedding for semantic search (config-driven)
Adds optional message embedding support to Hermes Agent's memory system, enabling hybrid search (FTS5 full-text + vector cosine similarity) for more relevant memory retrieval.
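As a rough illustration of the hybrid ranking, the sketch below blends a text score and a vector score using the 0.3/0.7 weights mentioned in the review. Which weight belongs to which signal is an assumption here (0.3 for BM25, 0.7 for cosine), as is the min-max normalization of BM25; the PR's real scoring may differ. The one external fact relied on is that SQLite FTS5's `bm25()` returns smaller (more negative) values for better matches, so the sign is flipped first.

```python
import math


def cosine(a, b):
    """Cosine similarity, 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rerank(query_vec, candidates, w_text=0.3, w_vec=0.7):
    """Re-rank FTS5 hits by a weighted blend of BM25 and cosine similarity.

    candidates: list of (msg_id, bm25_score, embedding_or_None).
    Returns a list of (msg_id, blended_score), best first.
    """
    if not candidates:
        return []
    # Flip the sign so that larger means a better text match.
    raw = [-bm25 for _, bm25, _ in candidates]
    lo, hi = min(raw), max(raw)
    span = hi - lo
    scored = []
    for (msg_id, _, emb), r in zip(candidates, raw):
        # Min-max normalize to [0, 1]; a single hit (span == 0) scores 1.0.
        text_score = (r - lo) / span if span else 1.0
        # Rows without an embedding contribute no vector score.
        vec_score = cosine(query_vec, emb) if emb is not None else 0.0
        scored.append((msg_id, w_text * text_score + w_vec * vec_score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# "a" is the better text match, "b" is the better semantic match.
ranked = hybrid_rerank(
    [1.0, 0.0],
    [("a", -5.0, [0.0, 1.0]), ("b", -1.0, [1.0, 0.0])],
)
```

In this toy case the semantic match "b" outranks the stronger keyword match "a", which is the effect a 0.7 vector weight is meant to produce.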
What it does
- Adds a `message_embedding BLOB` column to the `messages` table (auto-migrated on startup)
- Configured via `~/.hermes/config.yaml`
- `scripts/backfill_embeddings.py` computes embeddings for existing messages
Files changed
- `hermes_state.py`: core embedding logic + schema migration
- `scripts/backfill_embeddings.py`: utility to backfill embeddings for existing DB rows
Design notes
- Calls an Ollama-compatible `/v1/embeddings` API (tested with Qwen3-Embedding-0.6B on a 2080ti)
- Embeddings stored as packed float32 (`struct.pack('f' * dim, ...)`) in SQLite BLOB
- Existing rows keep a `NULL` embedding, filled lazily or via backfill script
- Uses `urllib.request` (standard library) for the embedding API call
Testing
- Backfill script `--dry-run` and `--apply` modes
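To make the design notes concrete, here is a hedged sketch of an embedding call built only on `urllib.request`, including the short-circuit when `base_url` is unset. The function names are hypothetical (the PR's own helper is `_compute_embedding`, whose body is not shown on this page), and the request and response shapes assume the common OpenAI-style `/v1/embeddings` contract: `{"model", "input"}` in, `{"data": [{"embedding": [...]}]}` out.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, text):
    """Build a POST request for an OpenAI-style /v1/embeddings endpoint."""
    url = base_url.rstrip("/") + "/v1/embeddings"
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compute_embedding_sketch(config, text, timeout=10.0):
    """Return the embedding as a list of floats, or None.

    Returns None without touching the network when no base_url is
    configured, and also when the request or response parsing fails.
    """
    base_url = (config or {}).get("base_url")
    if not base_url:
        return None  # feature disabled: no DNS lookup, no HTTP call
    req = build_embedding_request(base_url, config.get("model", ""), text)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            payload = json.load(resp)
        # Assumed response shape: {"data": [{"embedding": [...]}]}
        return [float(x) for x in payload["data"][0]["embedding"]]
    except (OSError, ValueError, KeyError, IndexError, TypeError):
        # URLError and timeouts are OSError; bad JSON is ValueError.
        return None


disabled = compute_embedding_sketch({}, "hello")
req = build_embedding_request(
    "http://your-server:8081/", "Qwen3-Embedding-0.6B", "hello"
)
```

Swallowing errors and returning `None` keeps message writes from failing when the embedding server is down; such rows would stay `NULL` until a later backfill.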