fix(honcho): timeout, cost safety, peer targeting, session isolation, and 5-tool surface#6719
Closed
erosika wants to merge 10 commits into
ba98027 to
5bbf220
Compare
The default was 1, which fires a Honcho dialectic query on every user message. With a cadence of 3, the user model refreshes every 3 turns instead of every turn, cutting Honcho LLM API cost by ~66% for multi-turn conversations. The user model (preferences, identity, project context) does not change meaningfully between individual messages, so refreshing every 3 turns keeps it reasonably fresh without excessive API spend.
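The ~66% figure follows from simple arithmetic: with a cadence of N, one turn in N triggers a dialectic query, so the call count drops by 1 - 1/N. A minimal sketch of that gating, assuming a hypothetical `should_run_dialectic` helper (not the plugin's actual function) and assuming the first turn always fires:

```python
def should_run_dialectic(turn_index: int, cadence: int) -> bool:
    """Return True when a dialectic query should fire on this turn.

    Hypothetical helper: turn_index is 0-based and the first turn fires.
    """
    if cadence < 1:
        raise ValueError("cadence must be >= 1")
    return turn_index % cadence == 0


def dialectic_calls(turns: int, cadence: int) -> int:
    """Count dialectic queries over a conversation of `turns` user messages."""
    return sum(should_run_dialectic(i, cadence) for i in range(turns))


calls_every_turn = dialectic_calls(30, cadence=1)
calls_cadence_3 = dialectic_calls(30, cadence=3)
# Fraction of dialectic calls avoided by moving from cadence 1 to cadence 3.
savings = 1 - calls_cadence_3 / calls_every_turn
```

Over a 30-turn conversation this gives 30 calls at cadence 1 and 10 at cadence 3, which is a two-thirds reduction.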
Three code paths constructed temporary AIAgent instances without skip_memory=True, causing the Honcho memory provider to initialise and create empty sessions that were never written to:
1. gateway/run.py: hygiene auto-compress and /compress temp agents lacked skip_memory=True (flush agent at line 717 already had it)
2. plugins/memory/honcho/__init__.py: per-session strategy ran migrate_memory_files on every new session, uploading MEMORY.md / USER.md / SOUL.md into short-lived sessions that were immediately abandoned after session rotation
3. run_agent.py: _spawn_background_review created a full AIAgent to review conversation history for local memory extraction, but did not need Honcho (it uses the shared _memory_store directly)
All three now pass skip_memory=True or short-circuit appropriately, consistent with the existing convention across 20+ call sites.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…isolation
The plugin architecture refactor (924bc67, 2026-04-02) removed the honcho_session_key parameter from AIAgent, breaking the gateway → Honcho session key plumbing. This caused all Honcho sessions to use bare timestamp IDs (e.g. 20260412_171002_xxx) instead of the stable per-chat key (e.g. agent-main-telegram-dm-8439114563), fragmenting conversation history across disposable sessions.
Fix: thread gateway_session_key through the full path:
1. gateway/run.py: pass gateway_session_key=session_key to AIAgent
2. run_agent.py: accept, store, and forward via _init_kwargs
3. plugins/memory/honcho/__init__.py: extract and pass to resolver
4. plugins/memory/honcho/client.py: add gateway_session_key as a new priority level in resolve_session_name (after /title override, before per-session fallback)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
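The priority order described in step 4 can be sketched as below. This is an illustration only: the signature and parameter names are assumptions, and the real resolve_session_name in plugins/memory/honcho/client.py may take different arguments and have more priority levels than the three named in the commit message.

```python
from typing import Optional


def resolve_session_name(
    title_override: Optional[str],
    gateway_session_key: Optional[str],
    fallback_session_id: str,
) -> str:
    """Pick a Honcho session name by priority (illustrative signature).

    1. An explicit /title override wins.
    2. Otherwise use the stable per-chat key supplied by the gateway.
    3. Otherwise fall back to the per-session (timestamp-style) ID.
    """
    if title_override:
        return title_override
    if gateway_session_key:
        return gateway_session_key
    return fallback_session_id


# With the key threaded through, a gateway chat keeps one stable session name.
name = resolve_session_name(
    title_override=None,
    gateway_session_key="agent-main-telegram-dm-8439114563",
    fallback_session_id="20260412_171002_xxx",
)
```

Before the fix, the gateway key never reached the resolver, which corresponds to always passing None for the second argument and landing on the timestamp fallback.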
eb6ef70 to
b901c42
Compare
…lation
Honcho correctness fixes, cost-safety defaults, bidirectional tool surface expansion, and session isolation. Addresses community-reported issues including Cygnus's context budget experience and several gateway/CLI bugs.
Tool surface (5 bidirectional tools):
- honcho_profile: read or update peer card
- honcho_search: semantic search over context
- honcho_context: session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer, reasoning_level param (agentic)
- honcho_conclude: create or delete conclusions (PII removal)
Cost safety:
- dialectic_cadence defaults to 3 (~66% fewer LLM calls)
- context_tokens defaults to uncapped (cap opt-in via config/wizard)
- on_turn_start hook wired up (fixes broken cadence/injection gating)
- dialecticDynamic repurposed as agentic reasoning level gate
Correctness:
- explicit target= on peer context/card fetches
- honcho_search perspective fix under directional observation
- honest status on session setup failure
- timeout config plumbing
- peerName precedence over gateway user_id
- orphan session prevention (temp agents)
- gateway_session_key for stable per-chat continuity
- initOnSessionStart for eager tools-mode init
Backward compatible: no breaking changes. Session strategy stays per-directory in code. Setup wizard guides new users to per-session.
Co-Authored-By: Kathie yu <yuhsh3@alumni.sysu.edu.cn>
Co-Authored-By: Tranquil-Flow <tranquil_flow@protonmail.com>
d2fff69 to
e5b2bcc
Compare
7b13855 to
8a86f84
Compare
…, dialecticDynamic note
- Add "required": [] to honcho_conclude schema (explicit empty, not omitted)
- Fix get_session_context fallback to respect peer param (was always user)
- dialecticDynamic semantic shift noted in PR description:
  Old: true = auto-bump reasoning by query length heuristic
  New: true = model can override reasoning_level per-call (agentic)
  Existing users with true get fixed "low" unless model passes param.
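The new dialecticDynamic semantics amount to a small gate on the per-call reasoning_level. A minimal sketch, assuming a hypothetical `resolve_reasoning_level` helper; the level names come from the PR description, while raising on an unknown level is an assumption of this sketch rather than confirmed plugin behaviour:

```python
from typing import Optional

# Level names as listed in the PR description.
VALID_LEVELS = ("minimal", "low", "medium", "high", "max")


def resolve_reasoning_level(
    configured: str,
    dialectic_dynamic: bool,
    requested: Optional[str] = None,
) -> str:
    """Return the reasoning level to use for one dialectic call.

    Hypothetical helper: when dialectic_dynamic is true, the model may
    override the configured level per call; otherwise the configured
    level always applies.
    """
    if configured not in VALID_LEVELS:
        raise ValueError(f"unknown configured level: {configured!r}")
    if dialectic_dynamic and requested is not None:
        if requested not in VALID_LEVELS:
            # Assumption: reject unknown levels instead of silently falling back.
            raise ValueError(f"unknown requested level: {requested!r}")
        return requested
    return configured
```

Under this reading, an existing user with dialecticDynamic set to true and a configured level of "low" keeps getting "low" until the model passes reasoning_level explicitly.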
8a86f84 to
ab19987
Compare
…o_context description README had 4 tools with honcho_context described as LLM-synthesized. Update to 5 tools with correct descriptions matching the code.
…supplement, trivial skip
Inspired by claude-honcho's architecture. Fixes the parroting problem
where dialectic output was the sole injection source, causing stale
observations and roleplaying responses in auto-injected context.
New injection architecture:
- Base layer: peer.context() result (representation + card), cached
with 5-minute TTL. Stable, cheap, always available.
- Supplement: dialectic result fires every N turns (cadence-gated),
cached until next refresh. Adds synthesis when fresh.
- Trivial prompt skip: "ok", "yes", "continue", slash commands, and
empty prompts skip injection entirely. Zero token waste.
Session start pre-warm now:
- Fetches peer.context() synchronously and caches it
- Fires dialectic with targeted prompt ("Summarize what you know about
this user. Focus on preferences, current projects, and working style.")
instead of generic "What should I know about this user?"
The dialectic is no longer the sole injection source. It supplements
the stable representation. honcho_reasoning tool remains the only way
to get direct dialectic output on demand.
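The three pieces of the injection architecture above (TTL-cached base layer, cadence-gated supplement, trivial-prompt skip) can be sketched together as follows. Everything here is illustrative: `ContextInjector`, `is_trivial`, and the callback parameters are hypothetical names, and the choice not to count trivial prompts as turns is an assumption of this sketch.

```python
import time
from typing import Callable, Optional

TRIVIAL_PROMPTS = {"ok", "yes", "continue"}
BASE_TTL_SECONDS = 300  # 5-minute TTL from the commit message


def is_trivial(prompt: str) -> bool:
    """True for empty prompts, slash commands, and short acknowledgements."""
    text = prompt.strip().lower()
    return not text or text.startswith("/") or text in TRIVIAL_PROMPTS


class ContextInjector:
    """Hypothetical two-layer injector: cached base plus cadence-gated supplement."""

    def __init__(
        self,
        fetch_base: Callable[[], str],
        run_dialectic: Callable[[], str],
        cadence: int = 3,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_base = fetch_base        # stands in for peer.context()
        self._run_dialectic = run_dialectic  # stands in for the dialectic query
        self._cadence = cadence
        self._clock = clock
        self._base: Optional[str] = None
        self._base_at = 0.0
        self._supplement = ""
        self._turn = 0

    def build(self, prompt: str) -> str:
        """Return the text to inject for this prompt ("" means skip)."""
        if is_trivial(prompt):
            return ""  # trivial prompts skip injection and are not counted as turns
        now = self._clock()
        if self._base is None or now - self._base_at >= BASE_TTL_SECONDS:
            self._base, self._base_at = self._fetch_base(), now
        if self._turn % self._cadence == 0:
            self._supplement = self._run_dialectic()  # cached until next refresh
        self._turn += 1
        return "\n\n".join(part for part in (self._base, self._supplement) if part)
```

The clock is injectable so the TTL behaviour can be exercised in tests without sleeping; production code would leave it at the time.monotonic default.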
- Skip dialectic at session start when peer has no context (empty representation + card = new peer). Avoids generating slop from nothing.
- Change warm prompt from "Focus on preferences, current projects, and working style" to "Focus on preferences, goals, and style"; Hermes isn't only a work assistant.
Superseded by a clean rebase: the branch contained unrelated ABC refactors from other contributors mixed in via squash. Reopening from a surgical branch.
Summary
Honcho correctness fixes, cost-safety defaults, bidirectional 5-tool surface, context injection overhaul, and session isolation. Addresses community-reported issues including Cygnus's context budget experience, dialectic parroting, and gateway/CLI correctness bugs.
Context Injection Overhaul (inspired by claude-honcho)
peer.context() (representation + card) cached with 5-minute TTL. Stable, cheap, always available.
This fixes the parroting problem where dialectic output was the sole injection source, causing stale observations and roleplaying responses.
Tool Surface (5 bidirectional tools)
- honcho_profile: card to update, omit to read
- honcho_search
- honcho_context
- honcho_reasoning: reasoning_level param (minimal/low/medium/high/max)
- honcho_conclude: delete_id for PII removal
All tools accept peer: 'user' (default), 'ai', or any workspace peer ID.
Cost Safety
- dialectic_cadence defaults to 3 (was 1): ~66% fewer Honcho LLM calls. Configurable.
- context_tokens defaults to uncapped: the cap is opt-in via config/wizard.
- on_turn_start hook wired in run_agent.py: fixes broken cadence/injection gating.
Agentic Reasoning
- dialecticDynamic repurposed: true = model can override reasoning_level per-call, false = always configured level.
- Users with dialecticDynamic: true previously got automatic depth scaling by query length heuristic. Now they get the configured level unless the model explicitly passes reasoning_level. Intentional: model judgment replaces the heuristic.
Correctness
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- hermes honcho status honest failure reporting
- peerName precedence over gateway user_id
- Orphan session prevention (skip_memory on temp agents)
- gateway_session_key for stable per-chat continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- "mid" → "medium" in reasoning level validation
Session Strategy (backward compat)
Stays per-directory in code. Setup wizard guides new users to per-session. No breaking changes.
Documentation
Updated: honcho.md, memory-providers.md, tools-reference.md, cli-commands.md, plugin README.
Can close on merge
Issues: #5667 | PRs: #5658, #4608
Related (cherry-picked, PR remains open): #8424
Validation