Bug description
The Honcho memory tool schemas advertise focused retrieval controls that are not fully honored by the current implementation:
- honcho_context exposes an optional query parameter described as focusing/filtering context, but the handler currently calls get_session_context(..., peer=peer) without passing the query through.
- honcho_search accepts max_tokens, but HonchoSessionManager.search_context() returns the assembled representation/card blob without deterministic budget trimming.
This makes the Hermes tool surface noisier than the underlying focused Honcho/pgvector retrieval path: broad peer representation/card content can dominate even when a caller asks for a focused search or smaller token budget.
Expected behavior
- honcho_context(query=...) should pass the focused query into Honcho peer-context retrieval rather than silently returning broad session context.
- honcho_search(max_tokens=N) should bound its returned raw context to the requested approximate token budget.
- The existing JSON shape ({"result": ...}) should remain backward-compatible for a narrow bugfix.
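As a rough illustration of the max_tokens expectation, here is a minimal sketch of deterministic post-fetch trimming. Everything in it is an assumption for discussion: trim_to_token_budget and format_search_result are hypothetical helper names, and the 4-characters-per-token ratio is a crude heuristic, not the tokenizer Honcho or Hermes actually uses. The only detail taken from this issue is the {"result": ...} JSON shape.

```python
import json
from typing import Optional

# Assumption: a crude characters-per-token estimate, not a real tokenizer.
CHARS_PER_TOKEN = 4


def trim_to_token_budget(text: str, max_tokens: Optional[int]) -> str:
    """Cut text to roughly max_tokens tokens; same input always gives same output."""
    if max_tokens is None or max_tokens <= 0:
        return text  # no usable budget requested: leave the text untouched
    limit = max_tokens * CHARS_PER_TOKEN
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the cut landed mid-word, back up to the previous space when there is one.
    if not text[limit].isspace():
        boundary = cut.rfind(" ")
        if boundary > 0:
            cut = cut[:boundary]
    return cut.rstrip()


def format_search_result(raw_context: str, max_tokens: Optional[int] = None) -> str:
    """Wrap the trimmed context in the existing {"result": ...} shape."""
    return json.dumps({"result": trim_to_token_budget(raw_context, max_tokens)})
```

Trimming after the fetch keeps the change local to search_context() and leaves the response shape alone, at the cost of cutting purely by position rather than by relevance.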
Proposed first slice
Keep the first PR intentionally narrow:
- Pass query from the honcho_context tool handler into HonchoSessionManager.get_session_context().
- When a query is present, fetch focused peer context via peer.context(search_query=..., target=...) instead of broad session context.
- Apply deterministic post-fetch trimming in search_context() based on max_tokens.
- Add regression tests for the drift above.
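The query routing in the first two bullets, plus the kind of regression check the last bullet asks for, might look roughly like the sketch below. It is not the actual Hermes code: fetch_context, FakePeer, and FakeSession are hypothetical stand-ins, and session.context() is an assumed placeholder for whatever the broad session-context call really is. Only the peer.context(search_query=..., target=...) call shape comes from this issue.

```python
from typing import Any, Optional


def fetch_context(session: Any, peer: Any, query: Optional[str] = None,
                  target: Optional[str] = None) -> Any:
    """Use focused peer context when a query is given, else broad session context."""
    if query and query.strip():
        # Focused path: forward the query instead of silently dropping it.
        return peer.context(search_query=query.strip(), target=target)
    # Broad path (assumed call name): unchanged behavior when no query is passed.
    return session.context()


class FakePeer:
    """Hypothetical test double that records how context() was called."""

    def __init__(self) -> None:
        self.calls = []

    def context(self, search_query=None, target=None):
        self.calls.append({"search_query": search_query, "target": target})
        return f"focused:{search_query}"


class FakeSession:
    """Hypothetical test double for the broad session-context path."""

    def context(self):
        return "broad"


def test_query_is_forwarded_to_peer_context() -> None:
    peer = FakePeer()
    result = fetch_context(FakeSession(), peer, query="coffee", target="user")
    assert result == "focused:coffee"
    assert peer.calls == [{"search_query": "coffee", "target": "user"}]


def test_missing_query_keeps_broad_context() -> None:
    peer = FakePeer()
    assert fetch_context(FakeSession(), peer) == "broad"
    assert peer.calls == []
```

Treating an empty or whitespace-only query the same as a missing one keeps existing callers on the broad path, which matches the backward-compatibility goal of a narrow bugfix.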
Out of scope for the first PR
- Switching honcho_search to a new conclusions/query API.
- Adding top_k, min_score, MMR, or structured ranked result arrays.
- Changing peer routing semantics.
Those would be good follow-up work after the schema/implementation drift is fixed.