feat(tool_search): optional embedding reranker for progressive tool disclosure #35457
Open
davidgut1982 wants to merge 1 commit into
Contributor
Author
Per-Scope Cache Improvement Added
Cherry-picked commit 09d86e6 (fix(tool_search): per-scope reranker cache) onto this PR. This adds critical multi-agent support:
New test coverage:
This is essential for orchestrator patterns where multiple concurrent agents with different MCP toolsets need to avoid thrashing the embedding endpoint.
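For readers unfamiliar with the idea, a per-scope cache might look roughly like the sketch below. This is not the code from commit 09d86e6: the class name `ScopedEmbeddingCache`, the `scope` key, and the `embed_fn` callable are all hypothetical, and the sketch only assumes that each agent's toolset gets its own bucket so that one agent's invalidation does not force another agent to re-embed.

```python
import hashlib
import threading


class ScopedEmbeddingCache:
    """Hypothetical per-scope cache: scope -> {md5(text) -> embedding}."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_scope = {}

    def get_or_embed(self, scope, text, embed_fn):
        # Key entries by the MD5 of the text so edited descriptions miss the cache.
        key = hashlib.md5(text.encode("utf-8")).hexdigest()
        with self._lock:
            bucket = self._by_scope.setdefault(scope, {})
            if key in bucket:
                return bucket[key]
        # Call the (possibly slow) embedding function outside the lock.
        vector = embed_fn(text)
        with self._lock:
            self._by_scope.setdefault(scope, {})[key] = vector
        return vector

    def invalidate(self, scope):
        # Drop one scope's entries without touching any other scope.
        with self._lock:
            self._by_scope.pop(scope, None)
```

With this shape, two agents that share a tool description each pay for one embedding call, and invalidating one agent's scope leaves the other's entries warm.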
This was referenced May 31, 2026
3ee5b7a to
23c8dcd
Compare
…isclosure

Adds an optional, opt-in embedding reranker to the tool_search BM25 bridge (PR NousResearch#34493). Default OFF: when disabled, the BM25 path is byte-for-byte identical to upstream.

urllib-only (no new deps), task-prefixed, md5-cached tool embeddings, full-catalog retrieve, rerank/RRF(k=10) modes, graceful BM25 fallback on any endpoint failure. Backend is any OpenAI-compatible /v1/embeddings endpoint (cloud, local CPU, or GPU).

Live-validated (194 tools / 98 labeled queries, nomic-embed-text-v2-moe): overall Recall@5 0.617 -> 0.810, SEMANTIC 0.500 -> 0.849, LEXICAL preserved at 1.000; warm per-query ~146ms, dead-endpoint fallback ~8ms.

Fulfills NousResearch#13332.
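As an illustration of the RRF(k=10) mode mentioned in the commit message, here is a minimal sketch of standard Reciprocal Rank Fusion over two ranked lists. The function name `rrf_fuse` and its parameters are hypothetical and not taken from tools/tool_search.py; the sketch assumes 1-based ranks and an alphabetical tie-break, which the PR may handle differently.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=10, limit=5):
    """Fuse several ranked lists of tool names with Reciprocal Rank Fusion.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is its 1-based position in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, name in enumerate(ranking, start=1):
            scores[name] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return ordered[:limit]


bm25_ranking = ["a", "b", "c"]        # lexical order
embedding_ranking = ["c", "a", "d"]   # semantic order
fused = rrf_fuse([bm25_ranking, embedding_ranking], k=10, limit=4)
```

A small k such as 10 weights the top few positions of each list more heavily than the commonly used k=60, which suits short candidate lists.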
23c8dcd to
1a33591
Compare
19 tasks
What
Adds an optional embedding-based reranker for semantic tool discovery on top of BM25 lexical search. When enabled, all tool descriptions are embedded once per process using nomic-embed-text-v2-moe (MD5-cached); per-query tool candidates are then reranked by cosine similarity. Implements progressive tool disclosure: when a profile exceeds the activation threshold, the full catalog (~54k tokens) is deferred behind tool_search stubs (~2.3k tokens) and tools are fetched on demand. Two reranking modes: pure cosine or Reciprocal Rank Fusion (RRF, k=10).

Files modified: tools/tool_search.py (reranker + progressive disclosure), tests/tools/test_tool_search.py (new tests), website/docs/user-guide/features/tool-search.md (updated).

Why
BM25 lexical matching fails on semantic queries that share no terms with the matching tool ("remind me tonight" vs. create_calendar_event). The embedding reranker recovers those cases.

Large tool catalogs consume 34-67% of a 131k context window. Progressive disclosure defers the catalog and reduces visible tools from 226 to 4, freeing 95.8% of tool-definition tokens.
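The pure-cosine rerank with BM25 fallback can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `rerank`, `cosine`, and the `embed_fn` callable are hypothetical names, caching is omitted, and the `search_query: ` / `search_document: ` prefixes are assumed to be the task prefixes the model expects.

```python
import math


def cosine(a, b):
    """Cosine similarity; raises ValueError on dimension mismatch."""
    if len(a) != len(b):
        raise ValueError("embedding dimension mismatch")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query, candidates, embed_fn, limit=5):
    """Reorder BM25 candidates by cosine similarity to the query.

    candidates: list of (tool_name, description) in BM25 order.
    embed_fn:   hypothetical callable mapping a string to a list of floats.
    """
    bm25_order = [name for name, _ in candidates]
    try:
        q_vec = embed_fn("search_query: " + query)  # assumed task prefix
        scored = [
            (cosine(q_vec, embed_fn("search_document: " + desc)), name)
            for name, desc in candidates
        ]
    except (OSError, ValueError):
        # urllib.error.URLError subclasses OSError, so a dead endpoint
        # (or a dimension mismatch) falls back to the plain BM25 order.
        return bm25_order[:limit]
    # Stable sort: equal scores keep their BM25 order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored][:limit]
```

Passing the embedding call in as a function keeps the fallback path testable without a live endpoint.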
Tests
53 tests pass, covering BM25 fallback, RRF exact-score, limit contract, dimension-mismatch, prefix payload, and cache invalidation. The offline eval suite shows R@5 improving from 0.634 (BM25) to 0.810 (with reranker).
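The PR does not spell out how R@5 is computed. A common definition, assumed here, is the mean over labeled queries of the fraction of relevant tools that appear in the top 5 results. The function and argument names below are hypothetical.

```python
def recall_at_k(results, relevant, k=5):
    """Mean Recall@k over labeled queries.

    results:  {query: ranked list of tool names}
    relevant: {query: non-empty set of correct tool names}
    """
    if not relevant:
        raise ValueError("no labeled queries")
    total = 0.0
    for query, gold in relevant.items():
        top_k = set(results.get(query, [])[:k])
        total += len(top_k & gold) / len(gold)
    return total / len(relevant)


labels = {"remind me tonight": {"create_calendar_event"},
          "mail the report": {"send_email"}}
ranked = {"remind me tonight": ["create_calendar_event", "send_email"],
          "mail the report": ["a", "b", "c", "d", "e", "send_email"]}
score = recall_at_k(ranked, labels, k=5)
```

In this toy example the second query's correct tool sits at rank 6, so it counts as a miss at k=5 and the mean is 0.5.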
Platforms tested
Linux (CT/LXC environment, Python 3.13)