
fix(honcho): timeout, cost safety, peer targeting, session isolation, and 5-tool surface #6719

Closed
erosika wants to merge 10 commits into NousResearch:main from erosika:eri/honcho-peer-perspective-wrapper-fixes

@erosika erosika commented Apr 9, 2026

Contributor

Summary

Honcho correctness fixes, cost-safety defaults, bidirectional 5-tool surface, context injection overhaul, and session isolation. Addresses community-reported issues including Cygnus's context budget experience, dialectic parroting, and gateway/CLI correctness bugs.

Context Injection Overhaul (inspired by claude-honcho)

  • Base layer: peer.context() (representation + card) cached with 5-minute TTL. Stable, cheap, always available.
  • Dialectic supplement: fires every N turns (cadence-gated), result cached until next refresh. Supplements the base layer when fresh.
  • Trivial prompt skip: "ok", "yes", "continue", slash commands skip injection entirely.
  • New peer guard: dialectic skipped at session start when peer has no context — avoids generating slop from nothing.
  • Targeted warm prompt: "Focus on preferences, goals, and style" instead of generic "What should I know about this user?"

This fixes the parroting problem where dialectic output was the sole injection source, causing stale observations and roleplaying responses.
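
A minimal sketch of the two-layer idea, for readers who want the shape without opening the diff. Everything here is hypothetical (`LayeredInjector`, `fetch_base`, `run_dialectic`, the injectable `clock`); it only models a TTL-cached base layer plus a cadence-gated supplement, and it assumes the first turn counts as a refresh turn, which the description does not state.

```python
import time
from typing import Callable, Optional


class LayeredInjector:
    """Toy model of base layer + dialectic supplement (all names hypothetical)."""

    def __init__(
        self,
        fetch_base: Callable[[], str],
        run_dialectic: Callable[[], str],
        base_ttl_s: float = 300.0,  # 5-minute TTL from the description
        cadence: int = 3,           # dialectic refresh every N turns
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_base = fetch_base
        self._run_dialectic = run_dialectic
        self._base_ttl_s = base_ttl_s
        self._cadence = cadence
        self._clock = clock
        self._base: Optional[str] = None
        self._base_at = 0.0
        self._supplement: Optional[str] = None
        self._turn = 0

    def on_turn_start(self) -> str:
        """Return the context text to inject for this turn."""
        now = self._clock()
        # Base layer: refetch only when missing or older than the TTL.
        if self._base is None or now - self._base_at >= self._base_ttl_s:
            self._base = self._fetch_base()
            self._base_at = now
        # Supplement: refresh on turn indices 0, cadence, 2 * cadence, ...
        # (assumption: the first turn is a refresh turn).
        if self._turn % self._cadence == 0:
            self._supplement = self._run_dialectic()
        self._turn += 1
        parts = [self._base, self._supplement or ""]
        return "\n\n".join(p for p in parts if p)
```

The point of the split is that the cheap base layer is always present, so a stale or missing dialectic result never leaves the prompt with nothing but synthesized text.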

Tool Surface (5 bidirectional tools)

| Tool | LLM? | Purpose |
| --- | --- | --- |
| honcho_profile | No | Read or update peer card: pass card to update, omit to read |
| honcho_search | No | Semantic search over context |
| honcho_context | No | Full session context: summary, representation, card, recent messages |
| honcho_reasoning | Yes | Synthesized answer; reasoning_level param (minimal/low/medium/high/max) |
| honcho_conclude | No | Create or delete conclusions; delete_id for PII removal |

All tools accept peer: 'user' (default), 'ai', or any workspace peer ID.
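
As an illustration of the `peer` argument, here is a hedged sketch of how the three cases could map to a concrete peer ID. `resolve_peer`, `user_peer_id`, and `ai_peer_id` are hypothetical names, not the plugin's actual helpers.

```python
from typing import Optional


def resolve_peer(peer: Optional[str], user_peer_id: str, ai_peer_id: str) -> str:
    """Map a tool's `peer` argument to a concrete peer ID (illustrative only)."""
    # Omitted or 'user' means the default: the human peer.
    if peer is None or peer == "user":
        return user_peer_id
    # 'ai' targets the assistant's own peer.
    if peer == "ai":
        return ai_peer_id
    # Anything else is treated as an explicit workspace peer ID.
    return peer
```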

Cost Safety

  • dialectic_cadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls. Configurable.
  • context_tokens defaults to uncapped — cap is opt-in via config/wizard.
  • on_turn_start hook wired in run_agent.py — fixes broken cadence/injection gating.
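
The ~66% figure follows from simple counting. A quick sketch, assuming the dialectic fires on the first turn and then once every `cadence` turns (the exact phase is an assumption, and `dialectic_calls` / `savings` are hypothetical helpers):

```python
def dialectic_calls(turns: int, cadence: int) -> int:
    """Dialectic calls over `turns` turns when firing once every `cadence` turns."""
    if turns < 0 or cadence < 1:
        raise ValueError("turns must be >= 0 and cadence >= 1")
    # Ceiling division: turns 0, cadence, 2 * cadence, ... each trigger a call.
    return -(-turns // cadence)


def savings(turns: int, cadence: int) -> float:
    """Fraction of calls avoided relative to cadence 1 (one call per turn)."""
    baseline = dialectic_calls(turns, 1)
    if baseline == 0:
        return 0.0
    return 1.0 - dialectic_calls(turns, cadence) / baseline
```

Over 30 turns this gives 30 calls at cadence 1 and 10 at cadence 3, a saving of about two thirds. Short conversations save less under this model: a 4-turn chat still makes 2 calls, which is 50%.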

Agentic Reasoning

  • dialecticDynamic repurposed: true = model can override reasoning_level per-call, false = always configured level.
  • Behavioral note: existing users with dialecticDynamic: true previously got automatic depth scaling by query length heuristic. Now they get configured level unless model explicitly passes reasoning_level. Intentional — model judgment replaces heuristic.
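
The new semantics can be sketched as a small resolver. This is illustrative only: `resolve_reasoning_level` is a hypothetical name, and rejecting unknown levels with `ValueError` is an assumption rather than the plugin's confirmed behavior.

```python
from typing import Optional

REASONING_LEVELS = ("minimal", "low", "medium", "high", "max")


def resolve_reasoning_level(
    configured: str,
    dialectic_dynamic: bool,
    requested: Optional[str] = None,
) -> str:
    """Pick the level for one honcho_reasoning call (illustrative only)."""
    if configured not in REASONING_LEVELS:
        raise ValueError(f"unknown configured level: {configured!r}")
    # dialecticDynamic: true lets the model override per call.
    if dialectic_dynamic and requested is not None:
        if requested not in REASONING_LEVELS:
            raise ValueError(f"unknown requested level: {requested!r}")
        return requested
    # Otherwise (flag off, or the model passed nothing) use the configured level.
    return configured
```

Note the second case in the behavioral note: with the flag on but no `reasoning_level` passed, the configured level wins, and query length no longer matters.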

Correctness

  • Explicit target= on peer context/card fetches (fixes identity blur)
  • honcho_search perspective fix under directional observation
  • hermes honcho status honest failure reporting
  • Timeout config plumbing
  • peerName precedence over gateway user_id
  • Orphan session prevention (skip_memory on temp agents)
  • gateway_session_key for stable per-chat continuity
  • initOnSessionStart for eager tools-mode init
  • get_session_context fallback respects peer param
  • "mid" → "medium" in reasoning level validation

Session Strategy (backward compat)

Stays per-directory in code. Setup wizard guides new users to per-session. No breaking changes.

Documentation

Updated: honcho.md, memory-providers.md, tools-reference.md, cli-commands.md, plugin README.

Can close on merge

Issues: #5667 | PRs: #5658, #4608

Related (cherry-picked, PR remains open): #8424

Validation

source .venv/bin/activate && TERM=dumb python -m pytest tests/honcho_plugin/ -q

@erosika erosika force-pushed the eri/honcho-peer-perspective-wrapper-fixes branch from ba98027 to 5bbf220 Compare April 9, 2026 18:13
@erosika erosika changed the title fix(honcho): isolate peer perspective and strip leaked memory wrappers fix(honcho): add timeout config and isolate peer perspective Apr 10, 2026
@erosika erosika changed the title fix(honcho): add timeout config and isolate peer perspective fix(honcho): add timeout config, named peer targeting, and honest status checks Apr 11, 2026
@erosika erosika marked this pull request as ready for review April 11, 2026 03:02
erosika and others added 3 commits April 13, 2026 16:07
Default was 1 which fires a Honcho dialectic query on every user message.
With cadence 3, the user model refreshes every 3 turns instead of every
turn, cutting Honcho LLM API cost by ~66% for multi-turn conversations.

The user model (preferences, identity, project context) does not change
meaningfully between individual messages. Every 3 turns keeps the model
reasonably fresh without excessive API spend.
Three code paths constructed temporary AIAgent instances without
skip_memory=True, causing the Honcho memory provider to initialise
and create empty sessions that were never written to:

1. gateway/run.py — hygiene auto-compress and /compress temp agents
   lacked skip_memory=True (flush agent at line 717 already had it)
2. plugins/memory/honcho/__init__.py — per-session strategy ran
   migrate_memory_files on every new session, uploading MEMORY.md /
   USER.md / SOUL.md into short-lived sessions that were immediately
   abandoned after session rotation
3. run_agent.py — _spawn_background_review created a full AIAgent to
   review conversation history for local memory extraction, but did
   not need Honcho (it uses the shared _memory_store directly)

All three now pass skip_memory=True or short-circuit appropriately,
consistent with the existing convention across 20+ call sites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
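
To make the orphan-session mechanism in the commit above concrete, here is a toy reproduction. `CountingProvider` and `TempAgent` are stand-ins invented for this sketch; they are not the real `AIAgent` or Honcho provider, and only the `skip_memory` flag name comes from the commit message.

```python
class CountingProvider:
    """Stand-in memory provider that counts how many sessions it opens."""

    def __init__(self) -> None:
        self.sessions_created = 0

    def initialize(self) -> None:
        # In the real bug, this is where an empty remote session would appear.
        self.sessions_created += 1


class TempAgent:
    """Stand-in for a short-lived agent (e.g. a compress or review helper)."""

    def __init__(self, provider: CountingProvider, skip_memory: bool = False) -> None:
        self.memory = None
        if not skip_memory:
            provider.initialize()
            self.memory = provider


provider = CountingProvider()
TempAgent(provider, skip_memory=True)   # no session is opened
TempAgent(provider)                     # opens a session that nothing writes to
```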
…isolation

The plugin architecture refactor (924bc67, 2026-04-02) removed the
honcho_session_key parameter from AIAgent, breaking the gateway →
Honcho session key plumbing. This caused all Honcho sessions to use
bare timestamp IDs (e.g. 20260412_171002_xxx) instead of the stable
per-chat key (e.g. agent-main-telegram-dm-8439114563), fragmenting
conversation history across disposable sessions.

Fix: thread gateway_session_key through the full path:
1. gateway/run.py — pass gateway_session_key=session_key to AIAgent
2. run_agent.py — accept, store, and forward via _init_kwargs
3. plugins/memory/honcho/__init__.py — extract and pass to resolver
4. plugins/memory/honcho/client.py — add gateway_session_key as a
   new priority level in resolve_session_name (after /title override,
   before per-session fallback)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
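
The priority order from step 4 can be sketched as follows. This is a simplified stand-in, not the real `resolve_session_name` in `client.py`: the function name, parameter names, and the reduction to three levels are assumptions, and the real resolver may consult other strategies as well.

```python
from typing import Optional


def resolve_session_name_sketch(
    title_override: Optional[str],
    gateway_session_key: Optional[str],
    per_session_id: str,
) -> str:
    """Return the first available name in priority order (illustrative only)."""
    # 1. An explicit /title override wins.
    # 2. Then the stable per-chat key threaded through from the gateway.
    for candidate in (title_override, gateway_session_key):
        if candidate:
            return candidate
    # 3. Otherwise fall back to the disposable per-session ID.
    return per_session_id
```

With the key plumbed through, a Telegram DM resolves to `agent-main-telegram-dm-8439114563` on every run instead of a fresh `20260412_171002_xxx`-style ID.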
@erosika erosika changed the title fix(honcho): add timeout config, named peer targeting, and honest status checks fix(honcho): timeout, peer targeting, cost safety defaults, and session isolation Apr 13, 2026
@erosika erosika changed the title fix(honcho): timeout, peer targeting, cost safety defaults, and session isolation fix(honcho): cost safety, agentic reasoning, 5-tool surface, session isolation Apr 13, 2026
@erosika erosika changed the title fix(honcho): cost safety, agentic reasoning, 5-tool surface, session isolation fix(honcho): timeout, cost safety, peer targeting, session isolation, and 5-tool surface Apr 13, 2026
@erosika erosika force-pushed the eri/honcho-peer-perspective-wrapper-fixes branch from eb6ef70 to b901c42 Compare April 14, 2026 16:17
…lation

Honcho correctness fixes, cost-safety defaults, bidirectional tool surface
expansion, and session isolation. Addresses community-reported issues
including Cygnus's context budget experience and several gateway/CLI bugs.

Tool surface (5 bidirectional tools):
  honcho_profile    — read or update peer card
  honcho_search     — semantic search over context
  honcho_context    — session context (summary, representation, card, messages)
  honcho_reasoning  — synthesized answer, reasoning_level param (agentic)
  honcho_conclude   — create or delete conclusions (PII removal)

Cost safety:
  - dialectic_cadence defaults to 3 (~66% fewer LLM calls)
  - context_tokens defaults to uncapped (cap opt-in via config/wizard)
  - on_turn_start hook wired up (fixes broken cadence/injection gating)
  - dialecticDynamic repurposed as agentic reasoning level gate

Correctness:
  - explicit target= on peer context/card fetches
  - honcho_search perspective fix under directional observation
  - honest status on session setup failure
  - timeout config plumbing
  - peerName precedence over gateway user_id
  - orphan session prevention (temp agents)
  - gateway_session_key for stable per-chat continuity
  - initOnSessionStart for eager tools-mode init

Backward compatible: no breaking changes. Session strategy stays
per-directory in code. Setup wizard guides new users to per-session.

Co-Authored-By: Kathie yu <yuhsh3@alumni.sysu.edu.cn>
Co-Authored-By: Tranquil-Flow <tranquil_flow@protonmail.com>
@erosika erosika force-pushed the eri/honcho-peer-perspective-wrapper-fixes branch from d2fff69 to e5b2bcc Compare April 14, 2026 19:36
@erosika erosika force-pushed the eri/honcho-peer-perspective-wrapper-fixes branch from 7b13855 to 8a86f84 Compare April 14, 2026 20:03
…, dialecticDynamic note

- Add "required": [] to honcho_conclude schema (explicit empty, not omitted)
- Fix get_session_context fallback to respect peer param (was always user)
- dialecticDynamic semantic shift noted in PR description:
  Old: true = auto-bump reasoning by query length heuristic
  New: true = model can override reasoning_level per-call (agentic)
  Existing users with true get fixed "low" unless model passes param.
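
For reference, a schema with an explicit empty `required` list might look like the sketch below. Only `honcho_conclude`, `delete_id`, `peer`, and `"required": []` come from this PR; the `conclusion` property name, the descriptions, and the surrounding layout are assumptions made for illustration.

```python
import json

# Hypothetical tool schema; the real one lives in the plugin source.
HONCHO_CONCLUDE_SCHEMA = {
    "name": "honcho_conclude",
    "description": "Create or delete conclusions.",
    "parameters": {
        "type": "object",
        "properties": {
            # Assumed name for the text of a new conclusion.
            "conclusion": {"type": "string", "description": "Conclusion to store."},
            "delete_id": {"type": "string", "description": "ID of a conclusion to delete."},
            "peer": {"type": "string", "description": "'user', 'ai', or a workspace peer ID."},
        },
        # Explicitly empty rather than omitted: no single field is mandatory,
        # because a call either creates or deletes.
        "required": [],
    },
}

serialized = json.dumps(HONCHO_CONCLUDE_SCHEMA, indent=2)
```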
@erosika erosika force-pushed the eri/honcho-peer-perspective-wrapper-fixes branch from 8a86f84 to ab19987 Compare April 14, 2026 20:31
erosika added 4 commits April 14, 2026 16:53
…o_context description

README had 4 tools with honcho_context described as LLM-synthesized.
Update to 5 tools with correct descriptions matching the code.
…supplement, trivial skip

Inspired by claude-honcho's architecture. Fixes the parroting problem
where dialectic output was the sole injection source, causing stale
observations and roleplaying responses in auto-injected context.

New injection architecture:
  - Base layer: peer.context() result (representation + card), cached
    with 5-minute TTL. Stable, cheap, always available.
  - Supplement: dialectic result fires every N turns (cadence-gated),
    cached until next refresh. Adds synthesis when fresh.
  - Trivial prompt skip: "ok", "yes", "continue", slash commands, and
    empty prompts skip injection entirely. Zero token waste.

Session start pre-warm now:
  - Fetches peer.context() synchronously and caches it
  - Fires dialectic with targeted prompt ("Summarize what you know about
    this user. Focus on preferences, current projects, and working style.")
    instead of generic "What should I know about this user?"

The dialectic is no longer the sole injection source. It supplements
the stable representation. honcho_reasoning tool remains the only way
to get direct dialectic output on demand.
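
The trivial prompt skip from this commit could be approximated like this. `should_skip_injection` and `TRIVIAL_PROMPTS` are hypothetical names; the case folding and trailing-punctuation stripping are assumptions, and the real list of trivial prompts may be longer.

```python
TRIVIAL_PROMPTS = frozenset({"ok", "yes", "continue"})


def should_skip_injection(prompt: str) -> bool:
    """True when injecting context would be wasted tokens (illustrative only)."""
    text = prompt.strip()
    # Empty prompts carry nothing to personalize.
    if not text:
        return True
    # Slash commands are handled by the CLI/gateway, not the model.
    if text.startswith("/"):
        return True
    # Short acknowledgements, ignoring case and trailing '.' or '!'.
    return text.lower().rstrip(".!") in TRIVIAL_PROMPTS
```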
- Skip dialectic at session start when peer has no context (empty
  representation + card = new peer). Avoids generating slop from
  nothing.
- Change warm prompt from "Focus on preferences, current projects,
  and working style" to "Focus on preferences, goals, and style" —
  Hermes isn't only a work assistant.
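
The new-peer guard reduces to an emptiness check on the two context fields. A sketch under stated assumptions: `is_new_peer` is a hypothetical name, and the card is modeled as a sequence of strings, which may not match the actual type.

```python
from typing import Optional, Sequence


def is_new_peer(representation: Optional[str], card: Optional[Sequence[str]]) -> bool:
    """True when the peer has neither a representation nor any card entries."""
    has_representation = bool(representation and representation.strip())
    has_card = bool(card) and any(item.strip() for item in card)
    return not (has_representation or has_card)


def should_fire_warm_dialectic(
    representation: Optional[str], card: Optional[Sequence[str]]
) -> bool:
    """Skip the session-start dialectic for peers with nothing to summarize."""
    return not is_new_peer(representation, card)
```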

erosika commented Apr 14, 2026

Contributor Author

Superseded by a clean rebase — branch contained unrelated ABC refactors from other contributors mixed in via squash. Reopening from a surgical branch.
