feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation by erosika · Pull Request #9884 · NousResearch/hermes-agent

erosika · 2026-04-14T22:10:13Z

Summary

Honcho correctness fixes, cost-safety defaults, bidirectional 5-tool surface, context injection overhaul, and session isolation. Addresses community-reported issues including Cygnus's context budget experience, dialectic parroting, and gateway/CLI correctness bugs.

Replaces #6719 — clean rebase, plugin-only scope.

Context Injection Overhaul

- Base layer: peer.context() (session summary + representation + card) cached with TTL. Static mode header stays in system prompt for prompt-cache stability; live context injected into user message.
- Dialectic supplement: fires every N turns (dialecticCadence, default 3), result cached until next refresh. Supplements the base layer when fresh.
- Trivial prompt skip: "ok", "yes", "continue", slash commands skip injection entirely.
- New peer guard: dialectic skipped at session start when peer has no context, which avoids generating slop from nothing.
- Targeted warm prompt: "Focus on preferences, goals, and style" instead of generic "What should I know about this user?"
- First-turn sync dialectic: bounded timeout (8s default) with graceful async fallback on slow connections.

Fixes the parroting problem where dialectic output was the sole injection source, causing stale observations and roleplaying responses.
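The trivial prompt skip listed above can be sketched as a small predicate. This is an illustration only: the function name, the exact word set, the treatment of empty input, and the punctuation handling are assumptions, not the plugin's actual code.

```python
# Hypothetical sketch of the "trivial prompt skip" gate.
# TRIVIAL_PROMPTS and should_skip_injection are illustrative names.
TRIVIAL_PROMPTS = {"ok", "yes", "continue"}


def should_skip_injection(user_message: str) -> bool:
    """Return True when the prompt is too trivial to justify injecting context."""
    text = user_message.strip().lower()
    if not text:
        return True  # assumption: empty input is treated as trivial
    if text.startswith("/"):
        return True  # slash commands skip injection entirely
    # Assumption: trailing punctuation is ignored ("ok." counts as "ok").
    return text.rstrip(".!") in TRIVIAL_PROMPTS
```

A caller would check this predicate before assembling the base layer or the dialectic supplement, so trivial turns incur no Honcho calls at all.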

Tool Surface (5 bidirectional tools)

| Tool | LLM? | Purpose |
| --- | --- | --- |
| `honcho_profile` | No | Read or update peer card: pass `card` to update, omit to read |
| `honcho_search` | No | Semantic search over context |
| `honcho_context` | No | Full session context snapshot: summary, representation, card, recent messages |
| `honcho_reasoning` | Yes | Synthesized answer, with a `reasoning_level` param (minimal/low/medium/high/max) |
| `honcho_conclude` | No | Create or delete conclusions; `delete_id` for PII removal |

All tools accept peer: 'user' (default), 'ai', or any workspace peer ID.

Cost Safety

- dialecticCadence defaults to 3 (was 1): ~66% fewer Honcho LLM calls. Configurable.
- contextTokens defaults to uncapped; the cap is opt-in via config/wizard.
- on_turn_start hook wired in run_agent.py, which fixes broken cadence/injection gating.

Multi-Pass Dialectic (dialecticDepth)

dialecticDepth (1–3, clamped) controls how many .chat() calls fire per cycle:

| Depth | Passes | Behavior |
| --- | --- | --- |
| 1 | single `.chat()` | Base query only |
| 2 | audit + synthesis | Pass 0 self-audited; conditional bail-out if strong signal |
| 3 | audit + synthesis + reconciliation | Pass 2 reconciles contradictions |

When dialecticDepthLevels is not set, passes use proportional reasoning levels relative to the configured base (minimal/base for depth 2, minimal/base/low for depth 3).
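As a rough sketch of the proportional defaults described above, the per-pass levels could be derived like this. The function names are hypothetical, the depth-1 result (`[base]`) is an assumption inferred from "Base query only", and the `dialecticDepthLevels` override path is deliberately not modelled.

```python
from typing import List


def clamp_depth(value: int) -> int:
    """Clamp dialecticDepth into the supported 1-3 range."""
    return max(1, min(3, value))


def proportional_levels(depth: int, base: str) -> List[str]:
    """Per-pass reasoning levels when dialecticDepthLevels is not set.

    depth 2 -> [minimal, base]; depth 3 -> [minimal, base, low].
    depth 1 -> [base] is an assumption, not stated in the PR.
    """
    table = {
        1: [base],
        2: ["minimal", base],
        3: ["minimal", base, "low"],
    }
    return table[clamp_depth(depth)]


levels = proportional_levels(3, base="medium")
```

With `base="medium"`, depth 3 yields `["minimal", "medium", "low"]`, and out-of-range depths fall back to the nearest supported value.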

Correctness

- set_peer_card: None guard for unresolvable peer ID
- get_session_context: cache-miss fallback now respects peer param (was silently returning user context for AI peer queries)
- _resolve_peer_id: return type tightened str | None → str; dead None branch removed from _resolve_observer_target
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- hermes honcho status honest failure reporting
- Timeout config plumbing
- peerName precedence over gateway user_id
- Orphan session prevention (skip_memory on temp agents)
- gateway_session_key for stable per-chat continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- "mid" → "medium" in reasoning level validation
- honcho_conclude schema: anyOf required constraint (schema validators now correctly reject empty calls)
- _signal_sufficient: anchored regex prevents false matches on version strings
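To make the honcho_conclude schema fix concrete, here is a minimal sketch of an `anyOf` required constraint. Only the parameter names `conclusion`, `delete_id`, and `peer` come from this PR; the property types, the schema variable name, and the tiny checker are assumptions, and the checker covers only the `anyOf`/`required` part rather than full JSON Schema validation.

```python
# Hypothetical schema fragment: at least one of `conclusion` or `delete_id`
# must be present, so an empty call is rejected.
HONCHO_CONCLUDE_PARAMS = {
    "type": "object",
    "properties": {
        "conclusion": {"type": "string"},
        "delete_id": {"type": "string"},
        "peer": {"type": "string"},
    },
    "anyOf": [
        {"required": ["conclusion"]},
        {"required": ["delete_id"]},
    ],
}


def satisfies_any_of(args: dict, schema: dict) -> bool:
    """Check only the anyOf/required constraint (not a full validator)."""
    return any(
        all(key in args for key in branch.get("required", []))
        for branch in schema["anyOf"]
    )
```

Without the `anyOf` block, a call carrying neither parameter would pass schema validation and fail only inside the tool handler.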

ABC Footprint (minimal, honcho-only)

- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)

Session Strategy (backward compat)

Default stays per-directory. Setup wizard guides new users to per-session. No breaking changes. New hermes honcho strategy CLI command.

Documentation

Updated: honcho.md, memory-providers.md, tools-reference.md, cli-commands.md, plugin README, SKILL.md (full accuracy pass — tool descriptions, defaults, dialecticDynamic semantics, agent usage patterns).

Can close on merge

Issues: #5667 | PRs: #5658, #4608

Related (cherry-picked, PR remains open): #8424

Validation

source .venv/bin/activate && TERM=dumb python -m pytest tests/honcho_plugin/ tests/agent/test_memory_provider.py -q

247 passed.

…, session isolation

Context Injection Overhaul:
- Base layer: peer.context() (representation + card) cached with 5-minute TTL
- Dialectic supplement: cadence-gated, cached until next refresh
- Trivial prompt skip: short inputs/slash commands skip injection
- New peer guard: dialectic skipped at session start when peer has no context
- Targeted warm prompt for better dialectic quality

Tool Surface (5 bidirectional tools):
- honcho_profile: read or update peer card
- honcho_search: semantic search over context
- honcho_context: full session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer, reasoning_level param
- honcho_conclude: create or delete conclusions (PII removal)

Cost Safety:
- dialectic_cadence defaults to 3 (~66% fewer LLM calls)
- context_tokens defaults to uncapped (cap opt-in via config/wizard)
- on_turn_start hook wired up (fixes broken cadence/injection gating)

Correctness:
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- Timeout config plumbing
- peerName precedence over gateway user_id
- skip_memory on temp agents (orphan session prevention)
- gateway_session_key for stable per-chat session continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- mid -> medium in reasoning level validation

ABC changes (minimal, honcho-only):
- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)


…ix honcho_context crash

- system_prompt_block() now returns static header only (matching ABC contract). All other providers already did this; Honcho was the only one baking live user data into system prompt, freezing it on turn 1 forever
- prefetch() assembles two layers:
  - Layer 1: base context (representation + card) from peer.context(), cached and refreshed on context_cadence (not frozen)
  - Layer 2: dialectic supplement, refreshed on dialectic_cadence
- Context and dialectic cadence now checked independently in queue_prefetch(). Previously context refresh was gated behind dialectic cadence
- Fix honcho_context tool crash: Honcho SDK Message objects use .peer_id not .role; the tool was silently returning 'No context available yet' due to AttributeError caught by broad except

When Honcho's saveMessages persists a turn that included injected memory context, the <memory-context>...</memory-context> block can reappear in subsequent user messages via message history. This causes stale observations (months-old AstroMap bugs, natal chart logs) to leak into the visible conversation as user text. Strips memory-context blocks from both user_message and persist_user_message in run_conversation() preamble, right after the existing surrogate sanitization pass. Adds TestMemoryContextSanitization with source-inspection and end-to-end stripping tests.
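The stripping step described above might look roughly like the following. This is a sketch, not the repository's sanitizer: `MEMORY_CONTEXT_RE` and `strip_memory_context` are hypothetical names, and the case-insensitive, non-greedy pattern is an assumption about what a reasonable regex would be.

```python
import re

# Hypothetical pattern: matches a whole <memory-context>...</memory-context>
# block (including newlines inside it) plus any trailing whitespace.
MEMORY_CONTEXT_RE = re.compile(
    r"<memory-context>.*?</memory-context>\s*",
    re.DOTALL | re.IGNORECASE,
)


def strip_memory_context(text: str) -> str:
    """Remove injected memory-context blocks that leaked into a user message."""
    return MEMORY_CONTEXT_RE.sub("", text).strip()


leaked = "<memory-context>\nold observation\n</memory-context>\nFix the login bug"
cleaned = strip_memory_context(leaked)
```

The non-greedy `.*?` matters when a message contains more than one block: a greedy match would also swallow the genuine user text between them.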

Adds dialecticDepth config (1-3, clamped) controlling how many .chat() calls fire per dialectic cycle. Cadence gates when; depth gates how deep.

Architecture:
- depth 1 → single call (default, backward compatible)
- depth 2 → self-audit + targeted synthesis (conditional bail-out)
- depth 3 → audit + synthesis + reconciliation

- Cold start (no session context) → general user query: 'Who is this person? What are their preferences, goals, and working style?'
- Warm session (has context) → scoped: 'Given what's been discussed in this session, what context about this user is most relevant?'

Each pass after the first is conditional: it bails early if the prior pass returned structured, substantial output (_signal_sufficient heuristic).

Config keys:
- dialecticDepth: 1-3 (int, clamped)
- dialecticDepthLevels: ['minimal', 'high'] (optional per-pass override)

Proportional reasoning levels when dialecticDepthLevels is not set:
- depth 2: [minimal, base]
- depth 3: [minimal, base, low]

188 honcho tests passing (+27 new), 803 run_agent tests green.

The base layer (get_prefetch_context) now fetches the session summary from Honcho when available and includes it as the first section in the formatted context block. This gives the injected context session scope: the model knows what the current conversation is about, not just who the user is.

Base context injection order:
1. Session Summary (what we've been discussing)
2. User Representation + Card (who the user is)
3. AI Representation + Card (who the AI is)

Cold start (no summary yet) gracefully omits the section. The cold/warm detection for dialectic depth still works correctly, because _base_context_cache being populated means warm regardless of whether summary is present.

191 honcho tests passing (+3 new).
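The injection order above, with the summary omitted on cold start, can be illustrated with a small formatter. The function name and the section headings are hypothetical; only the ordering and the "omit when no summary" behavior come from this commit message.

```python
from typing import List, Optional


def format_base_context(
    summary: Optional[str],
    user_block: str,
    ai_block: str,
) -> str:
    """Assemble base context: summary first (if any), then user, then AI."""
    sections: List[str] = []
    if summary:  # cold start: no summary yet, so the section is omitted
        sections.append(f"## Session Summary\n{summary}")
    sections.append(f"## User Representation\n{user_block}")
    sections.append(f"## AI Representation\n{ai_block}")
    return "\n\n".join(sections)


cold = format_base_context(None, "likes Rust", "terse assistant")
warm = format_base_context("Debugging a parser", "likes Rust", "terse assistant")
```

A single truthiness check on the summary covers both the per-session cold start and the per-directory case where an accumulated summary already exists.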

Rewrites plugin README, skill, and website feature docs to reflect:
- Session summary in base context injection (placed first)
- dialecticDepth (1-3) multi-pass .chat() architecture
- dialecticDepthLevels per-pass reasoning override
- Cold start vs warm session automatic prompt selection
- Three orthogonal knobs: cadence (when), depth (how many), level (how hard)
- contextTokens budget enforcement
- <memory-context> sanitization
- 5 bidirectional tools with peer parameter
- Full config reference with all 15 keys and defaults

Files updated:
- plugins/memory/honcho/README.md
- optional-skills/autonomous-ai-agents/honcho/SKILL.md
- website/docs/user-guide/features/honcho.md
- website/docs/user-guide/features/memory-providers.md

Improves comment on summary fetch in get_prefetch_context to document the per-session vs per-directory behavior:
- per-session cold start → null summary, gracefully omitted
- per-directory returning → accumulated summary injects

The guard (ctx.summary check) handles both cases without needing strategy-specific branching. No behavior change.

Resets .gitignore and cli-commands.md to origin/main — both were accidentally included in the original 11b4c9e cherry-pick that founded this branch. Not related to the Honcho plugin scope.


The dialectic supplement was always one turn behind because queue_prefetch() fires at end-of-turn for the *next* turn. On the very first turn, no dialectic had ever been queued, so the cold-start synthesis was missing.

Now prefetch() detects _last_dialectic_turn == -999 (never fired) and runs _run_dialectic_depth() synchronously, mirroring how the base context already handles first-call. After this, the cadence gate prevents double-firing on the same turn.

Adds two tests:
- First-turn sync dialectic fires and produces output
- queue_prefetch() correctly skips after first-turn sync
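A later commit in this PR bounds that synchronous first-turn call with a daemon thread and an 8s timeout, leaving the sentinel at -999 on timeout so the async path can retry. A minimal sketch of that pattern follows; `run_with_timeout` and the stand-in callables are hypothetical, and a short timeout is used here purely so the example finishes quickly.

```python
import threading
import time
from typing import Callable, Optional


def run_with_timeout(fn: Callable[[], str], timeout_s: float) -> Optional[str]:
    """Run fn in a daemon thread; return its result, or None on timeout."""
    result = {}

    def worker() -> None:
        result["value"] = fn()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds
    if thread.is_alive():
        return None  # timed out: caller leaves its "never fired" sentinel alone
    return result.get("value")


def fast_dialectic() -> str:
    return "user prefers concise answers"


def slow_dialectic() -> str:
    time.sleep(0.5)  # stands in for a slow network call
    return "too late"


fast = run_with_timeout(fast_dialectic, timeout_s=1.0)
slow = run_with_timeout(slow_dialectic, timeout_s=0.05)
```

Because the thread is a daemon, a call that never returns cannot keep the process alive; the trade-off is that its eventual result is simply discarded.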

on_turn_start() was never called from run_conversation(), leaving _turn_count at 0 forever. This meant cadence checks like (turn_count - last_dialectic_turn) always evaluated to (0 - (-999)) = 999, satisfying any cadence threshold. Result: dialectic and context refresh fired every single turn regardless of dialecticCadence / contextCadence settings. Fix: call memory_manager.on_turn_start(self._user_turn_count, msg) right before prefetch_all() in run_conversation(). Also fix the injection_frequency='first-turn' guard from > 0 to > 1 since _user_turn_count is 1-indexed (first message = 1).
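The arithmetic behind this bug is easy to reproduce. The sketch below assumes the gate is a `>=` comparison against the cadence and uses hypothetical names (`cadence_allows`, `fired_turns`); only the -999 sentinel and the `turn_count - last_dialectic_turn` expression come from the commit message.

```python
from typing import List

NEVER_FIRED = -999  # sentinel for "dialectic has never run"


def cadence_allows(turn_count: int, last_fired_turn: int, cadence: int) -> bool:
    """Assumed gate: fire when enough turns have passed since the last run."""
    return (turn_count - last_fired_turn) >= cadence


def fired_turns(total_turns: int, cadence: int) -> List[int]:
    """Simulate a correctly advancing, 1-indexed turn counter."""
    last = NEVER_FIRED
    fired = []
    for turn in range(1, total_turns + 1):
        if cadence_allows(turn, last, cadence):
            fired.append(turn)
            last = turn
    return fired


# With the counter stuck at 0 and the sentinel in place, the difference is 999,
# which satisfies any realistic cadence threshold.
stuck = cadence_allows(0, NEVER_FIRED, cadence=3)
healthy = fired_turns(7, cadence=3)
```

Once the turn counter advances, a cadence of 3 fires on turns 1, 4, and 7 of a seven-turn session instead of on every turn.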

- Fix 'injected into system prompt' → 'injected into user message' (preserves prompt caching, was always the actual behavior)
- Fix injectionFrequency 'turn 0' → 'first user message, skip from turn 2 onward' to match 1-indexed turn count
- Add Session Name Resolution section with full priority chain (manual map → /title → gateway key → strategy fallback)
- Add 'What each strategy produces' with concrete examples
- Add Multi-Profile Pattern section showing host block inheritance
- Remove redundant workspace/peerName from example host blocks
- Clarify workspace as 'shared environment' not 'world'
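The session name priority chain (manual map → /title → gateway key → strategy fallback) amounts to "first non-empty candidate wins". The sketch below is illustrative only: the function signature is hypothetical, and keying the manual map by working directory is an assumption rather than something this PR states.

```python
from typing import Mapping, Optional


def resolve_session_name(
    manual_map: Mapping[str, str],
    cwd: str,
    title: Optional[str],
    gateway_session_key: Optional[str],
    strategy_fallback: str,
) -> str:
    """Return the first non-empty candidate in priority order."""
    candidates = (
        manual_map.get(cwd),   # 1. manual map (assumed keyed by directory)
        title,                 # 2. name set via /title
        gateway_session_key,   # 3. stable per-chat gateway key
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return strategy_fallback   # 4. whatever the session strategy produces


name = resolve_session_name({}, "/work/app", None, "chat-42", "work-app")
```

Here no manual mapping or title exists, so the gateway key is chosen ahead of the strategy fallback.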

The _INTERNAL_CONTEXT_RE regex matched supermemory-context and supermemory-containers tags — not this PR's responsibility. Narrowed to memory-context only. Removed supermemory mention from README Input Sanitization section.

- set_peer_card: add None guard for _resolve_peer_id result
- get_session_context: fallback now respects peer param on sessions-cache miss
- _resolve_peer_id: tighten return type str|None -> str, remove dead None branch in _resolve_observer_target
- logger.warning -> logger.debug in get_session_context failure path
- honcho_conclude schema: add anyOf required constraint so validators reject empty calls (neither conclusion nor delete_id)
- _signal_sufficient: tighten ordered-list heuristic with anchored regex (re.search r'^\s*\d+\. ') to avoid false matches on version strings
- First-turn sync dialectic: wrap in daemon thread with 8s timeout; on timeout _last_dialectic_turn stays -999 so async path retries at next cadence-allowed turn instead of blocking the response
- import re added to __init__.py
- client.py: rename l -> lvl in list comprehension, deduplicate _parse_dialectic_depth call, remove stray blank line
- SKILL.md:
  - fix dialecticCadence default 1->3 (two occurrences)
  - fix contextTokens default 4096->uncapped
  - fix dialecticDynamic description (model-driven override, not auto-bump by query length)
  - fix dialecticDepthLevels behavior (proportional levels table, not 'all rounds use global level')
  - rewrite Tools section (honcho_context and honcho_reasoning were swapped; honcho_observe removed because it does not exist; peer: param documented correctly instead of target:)
  - add Agent Usage Patterns section with decision guidance for Hermes
- tests: add TestSetPeerCardNoneGuard, TestGetSessionContextFallback, test_honcho_conclude_missing_both_params_returns_error
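To see why anchoring the ordered-list pattern matters, compare it with an unanchored version on text containing a version string. The pattern `^\s*\d+\. ` is taken from the commit message; the use of `re.MULTILINE`, the unanchored "before" pattern, and the sample strings are assumptions for illustration.

```python
import re

# Assumed "before" heuristic: any digits followed by ". " anywhere in the text.
UNANCHORED = re.compile(r"\d+\. ")
# Pattern from the commit message; MULTILINE is an assumption so that ^
# matches at the start of each line of a multi-line response.
ANCHORED = re.compile(r"^\s*\d+\. ", re.MULTILINE)

version_text = "Upgraded to Python 3.11. Everything still works."
list_text = "Key facts:\n1. Prefers terse answers\n2. Works mostly in Rust"

false_positive = UNANCHORED.search(version_text) is not None
anchored_on_version = ANCHORED.search(version_text) is not None
anchored_on_list = ANCHORED.search(list_text) is not None
```

The unanchored pattern latches onto `11. ` inside the sentence, while the anchored one only accepts numbering at the start of a line.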

…, session isolation

Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto current main with minimal core modifications.

Plugin changes (plugins/memory/honcho/):
- New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context)
- Two-layer context injection: base context (summary + representation + card) on contextCadence, dialectic supplement on dialecticCadence
- Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal
- Cold/warm prompt selection based on session state
- dialecticCadence defaults to 3 (was 1): ~66% fewer Honcho LLM calls
- Session summary injection for conversational continuity
- Bidirectional peer targeting on all 5 tools
- Correctness fixes: peer param fallback, None guard on set_peer_card, schema validation, signal_sufficient anchored regex, mid->medium level fix

Core changes (~20 lines across 3 files):
- agent/memory_manager.py: Enhanced sanitize_context() to strip full <memory-context> blocks and system notes (prevents leak from saveMessages)
- run_agent.py: gateway_session_key param for stable per-chat Honcho sessions, on_turn_start() call before prefetch_all() for cadence tracking, sanitize_context() on user messages to strip leaked memory blocks
- gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions), gateway_session_key threading to main agent

Tests: 509 passed (3 skipped; honcho SDK not installed locally)

Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md

Co-authored-by: erosika <erosika@users.noreply.github.com>


teknium1 · 2026-04-16T02:12:27Z

Merged via #10619. Your plugin changes were salvaged onto current main (your branch was 141 commits behind, so direct merge would have reverted Bedrock support, Nous rate limiting, and other recent features). All your plugin code, docs, SKILL.md, and tests are preserved. Core changes (~20 lines) were manually applied. Thanks for the thorough work, @erosika!


erosika added 17 commits April 14, 2026 18:07

chore: remove extraction-queue.txt (personal file, not for PR)

e5fa09a

chore: gitignore extraction-queue.txt

3387394

fix(honcho): use sanitize regexes, collapse session key hyphens, gate…

af5bbda

… cadence after tool call

chore: revert contaminant diffs from foundational cherry-pick

146a128


docs: add honcho_reasoning to tools reference, fix link to Memory Pro…

90f302f

…viders

erosika changed the title ~~feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation~~ fix(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation Apr 15, 2026

erosika changed the title ~~fix(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation~~ feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation Apr 15, 2026

teknium1 mentioned this pull request Apr 16, 2026

feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation #10619

Merged

teknium1 closed this Apr 16, 2026

erosika mentioned this pull request Apr 18, 2026

fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption #12160

Closed

2 tasks

erosika wants to merge 17 commits into NousResearch:main from erosika:eri/honcho-injection
