feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation#9884
Closed
erosika wants to merge 17 commits into
…, session isolation

Context Injection Overhaul:
- Base layer: peer.context() (representation + card) cached with 5-minute TTL
- Dialectic supplement: cadence-gated, cached until next refresh
- Trivial prompt skip: short inputs/slash commands skip injection
- New peer guard: dialectic skipped at session start when peer has no context
- Targeted warm prompt for better dialectic quality

Tool Surface (5 bidirectional tools):
- honcho_profile: read or update peer card
- honcho_search: semantic search over context
- honcho_context: full session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer, reasoning_level param
- honcho_conclude: create or delete conclusions (PII removal)

Cost Safety:
- dialectic_cadence defaults to 3 (~66% fewer LLM calls)
- context_tokens defaults to uncapped (cap opt-in via config/wizard)
- on_turn_start hook wired up (fixes broken cadence/injection gating)

Correctness:
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- Timeout config plumbing
- peerName precedence over gateway user_id
- skip_memory on temp agents (orphan session prevention)
- gateway_session_key for stable per-chat session continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- mid -> medium in reasoning level validation

ABC changes (minimal, honcho-only):
- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)
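The "cached with 5-minute TTL" base layer can be pictured with a small cache wrapper. This is only a sketch: `TTLCache`, its `fetch` callable, and the injectable `clock` are hypothetical names for illustration and are not taken from the plugin.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Cache one fetched value and refetch it once the TTL has elapsed.

    Hypothetical helper: the plugin's real cache may be structured differently.
    """

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_s: float = 300.0,  # 5 minutes, matching the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl_s = ttl_s
        self._clock = clock
        self._value: Optional[str] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> str:
        now = self._clock()
        expired = self._fetched_at is None or (now - self._fetched_at) >= self._ttl_s
        if expired:
            # Only hit the backend when there is no value or it has gone stale.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value  # type: ignore[return-value]
```

Passing the clock in makes the expiry behaviour easy to test without sleeping.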
… cadence after tool call
…ix honcho_context crash

- system_prompt_block() now returns static header only (matching ABC contract). All other providers already did this; Honcho was the only one baking live user data into the system prompt, freezing it on turn 1 forever
- prefetch() assembles two layers:
  Layer 1: base context (representation + card) from peer.context(), cached and refreshed on context_cadence (not frozen)
  Layer 2: dialectic supplement, refreshed on dialectic_cadence
- Context and dialectic cadence now checked independently in queue_prefetch(). Previously context refresh was gated behind dialectic cadence
- Fix honcho_context tool crash: Honcho SDK Message objects use .peer_id, not .role. The tool was silently returning 'No context available yet' because the AttributeError was caught by a broad except
When Honcho's saveMessages persists a turn that included injected memory context, the <memory-context>...</memory-context> block can reappear in subsequent user messages via message history. This causes stale observations (months-old AstroMap bugs, natal chart logs) to leak into the visible conversation as user text. Strips memory-context blocks from both user_message and persist_user_message in run_conversation() preamble, right after the existing surrogate sanitization pass. Adds TestMemoryContextSanitization with source-inspection and end-to-end stripping tests.
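A minimal sketch of the stripping step described above, assuming a simple non-greedy regex is enough. The pattern and the `strip_memory_context` name are hypothetical; the PR's actual expression in `agent/memory_manager.py` may differ.

```python
import re

# Hypothetical pattern: matches a whole <memory-context>...</memory-context>
# block, including newlines inside it and whitespace trailing it.
_MEMORY_CONTEXT_RE = re.compile(
    r"<memory-context>.*?</memory-context>\s*",
    re.DOTALL | re.IGNORECASE,
)


def strip_memory_context(text: str) -> str:
    """Remove every injected memory-context block from a user message."""
    return _MEMORY_CONTEXT_RE.sub("", text).strip()
```

The non-greedy `.*?` matters here: a greedy match would swallow any real user text sitting between two injected blocks.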
Adds dialecticDepth config (1-3, clamped) controlling how many .chat() calls fire per dialectic cycle. Cadence gates when; depth gates how deep.

Architecture:
- depth 1 → single call (default, backward compatible)
- depth 2 → self-audit + targeted synthesis (conditional bail-out)
- depth 3 → audit + synthesis + reconciliation

Cold start (no session context) → general user query: 'Who is this person? What are their preferences, goals, and working style?'
Warm session (has context) → scoped: 'Given what's been discussed in this session, what context about this user is most relevant?'

Each pass after the first is conditional: it bails early if the prior pass returned structured, substantial output (_signal_sufficient heuristic).

Config keys:
- dialecticDepth: 1-3 (int, clamped)
- dialecticDepthLevels: ['minimal', 'high'] (optional per-pass override)

Proportional reasoning levels when dialecticDepthLevels is not set:
- depth 2: [minimal, base]
- depth 3: [minimal, base, low]

188 honcho tests passing (+27 new), 803 run_agent tests green.
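The clamping and the proportional level table can be sketched as follows. `clamp_depth` and `resolve_levels` are hypothetical names. Two behaviours are assumptions not stated in the commit: depth 1 uses the configured base level, and a short override list is padded with the base level.

```python
from typing import List, Optional, Sequence


def clamp_depth(raw: object) -> int:
    """Coerce a config value to an int in the range 1..3 (1 on bad input)."""
    try:
        depth = int(raw)  # type: ignore[call-overload]
    except (TypeError, ValueError):
        return 1
    return max(1, min(3, depth))


def resolve_levels(
    depth: object,
    base: str,
    overrides: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return one reasoning level per dialectic pass."""
    n = clamp_depth(depth)
    if overrides:
        # Assumption: use the first n overrides, pad with the base level.
        levels = list(overrides[:n])
        levels += [base] * (n - len(levels))
        return levels
    proportional = {
        1: [base],  # assumption: a single pass runs at the base level
        2: ["minimal", base],
        3: ["minimal", base, "low"],
    }
    return proportional[n]
```

For example, `resolve_levels(3, "medium")` yields one level per pass in the order audit, synthesis, reconciliation.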
The base layer (get_prefetch_context) now fetches the session summary from Honcho when available and includes it as the first section in the formatted context block. This gives the injected context session scope: the model knows what the current conversation is about, not just who the user is.

Base context injection order:
1. Session Summary (what we've been discussing)
2. User Representation + Card (who the user is)
3. AI Representation + Card (who the AI is)

Cold start (no summary yet) gracefully omits the section. The cold/warm detection for dialectic depth still works correctly: _base_context_cache being populated means warm regardless of whether summary is present.

191 honcho tests passing (+3 new).
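The injection order and the cold-start omission could be expressed roughly like this. The function name, its parameters, and the `##` section headers are illustrative assumptions, not the plugin's actual formatting.

```python
from typing import Optional


def format_base_context(
    summary: Optional[str],
    user_block: Optional[str],
    ai_block: Optional[str],
) -> str:
    """Join the non-empty sections in the documented order."""
    sections = [
        ("Session Summary", summary),
        ("User Representation + Card", user_block),
        ("AI Representation + Card", ai_block),
    ]
    parts = [
        f"## {title}\n{body.strip()}"
        for title, body in sections
        if body and body.strip()  # empty or missing sections are skipped
    ]
    return "\n\n".join(parts)
```

Because empty sections are filtered rather than special-cased, a cold start with no summary needs no separate branch.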
Rewrites plugin README, skill, and website feature docs to reflect:
- Session summary in base context injection (placed first)
- dialecticDepth (1-3) multi-pass .chat() architecture
- dialecticDepthLevels per-pass reasoning override
- Cold start vs warm session automatic prompt selection
- Three orthogonal knobs: cadence (when), depth (how many), level (how hard)
- contextTokens budget enforcement
- <memory-context> sanitization
- 5 bidirectional tools with peer parameter
- Full config reference with all 15 keys and defaults

Files updated:
- plugins/memory/honcho/README.md
- optional-skills/autonomous-ai-agents/honcho/SKILL.md
- website/docs/user-guide/features/honcho.md
- website/docs/user-guide/features/memory-providers.md
Improves comment on summary fetch in get_prefetch_context to document the per-session vs per-directory behavior:
- per-session cold start → null summary, gracefully omitted
- per-directory returning → accumulated summary injects

The guard (ctx.summary check) handles both cases without needing strategy-specific branching. No behavior change.
Resets .gitignore and cli-commands.md to origin/main — both were accidentally included in the original 11b4c9e cherry-pick that founded this branch. Not related to the Honcho plugin scope.
The dialectic supplement was always one turn behind because queue_prefetch() fires at end-of-turn for the *next* turn. On the very first turn, no dialectic had ever been queued, so the cold-start synthesis was missing.

Now prefetch() detects _last_dialectic_turn == -999 (never fired) and runs _run_dialectic_depth() synchronously, mirroring how the base context already handles first-call. After this, the cadence gate prevents double-firing on the same turn.

Adds two tests:
- First-turn sync dialectic fires and produces output
- queue_prefetch() correctly skips after first-turn sync
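A rough sketch of the first-turn path, combined with the daemon-thread timeout that a later commit in this PR adds around it. `run_with_timeout` and `first_turn_dialectic` are hypothetical helpers; only the -999 sentinel and the 8s default come from the commit messages.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple

NEVER_FIRED = -999  # sentinel used for _last_dialectic_turn in the PR


def run_with_timeout(fn: Callable[[], str], timeout_s: float) -> Optional[str]:
    """Run fn in a daemon thread; return None if it fails or overruns."""
    box: Dict[str, str] = {}

    def worker() -> None:
        try:
            box["value"] = fn()
        except Exception:
            pass  # a failed dialectic is treated like a timeout

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    return box.get("value")


def first_turn_dialectic(
    last_turn: int,
    turn: int,
    fn: Callable[[], str],
    timeout_s: float = 8.0,
) -> Tuple[Optional[str], int]:
    """Return (result, new_last_turn) for the synchronous first-turn path."""
    if last_turn != NEVER_FIRED:
        return None, last_turn  # already fired once; the async path owns it
    result = run_with_timeout(fn, timeout_s)
    if result is None:
        return None, NEVER_FIRED  # leave the sentinel so a later turn retries
    return result, turn
```

Leaving the sentinel untouched on timeout is what lets the asynchronous path pick the work up later instead of blocking the response.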
on_turn_start() was never called from run_conversation(), leaving _turn_count at 0 forever. This meant cadence checks like (turn_count - last_dialectic_turn) always evaluated to (0 - (-999)) = 999, satisfying any cadence threshold. Result: dialectic and context refresh fired every single turn regardless of dialecticCadence / contextCadence settings. Fix: call memory_manager.on_turn_start(self._user_turn_count, msg) right before prefetch_all() in run_conversation(). Also fix the injection_frequency='first-turn' guard from > 0 to > 1 since _user_turn_count is 1-indexed (first message = 1).
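The arithmetic behind the bug is easy to reproduce in isolation. The sketch below assumes the gate is a `>=` comparison against the cadence. `cadence_allows` and `simulate` are hypothetical names, not functions from the plugin.

```python
from typing import List

NEVER_FIRED = -999  # initial value of the last-fired turn


def cadence_allows(turn_count: int, last_turn: int, cadence: int) -> bool:
    """True when at least `cadence` turns have passed since the last fire."""
    return (turn_count - last_turn) >= cadence


def simulate(turns: int, cadence: int) -> List[int]:
    """Return the 1-indexed turns on which a refresh fires."""
    last = NEVER_FIRED
    fired = []
    for turn in range(1, turns + 1):  # turn count advances, as after the fix
        if cadence_allows(turn, last, cadence):
            fired.append(turn)
            last = turn
    return fired


# The comparison from the commit message: a turn count stuck at 0 measured
# against the -999 sentinel gives a gap of 999, which passes any cadence.
stuck_gap = 0 - NEVER_FIRED
```

With the turn count advancing, `simulate(7, 3)` fires on a third of the turns that `simulate(7, 1)` does, which is where the roughly 66% reduction comes from.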
- Fix 'injected into system prompt' → 'injected into user message' (preserves prompt caching, was always the actual behavior)
- Fix injectionFrequency 'turn 0' → 'first user message, skip from turn 2 onward' to match 1-indexed turn count
- Add Session Name Resolution section with full priority chain (manual map → /title → gateway key → strategy fallback)
- Add 'What each strategy produces' with concrete examples
- Add Multi-Profile Pattern section showing host block inheritance
- Remove redundant workspace/peerName from example host blocks
- Clarify workspace as 'shared environment' not 'world'
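The priority chain for session names (manual map → /title → gateway key → strategy fallback) amounts to "first non-empty candidate wins". A sketch under stated assumptions: the function name and parameters are hypothetical, and keying the manual map by working directory is a guess made purely for illustration.

```python
from typing import Mapping, Optional


def resolve_session_name(
    manual_map: Mapping[str, str],
    directory: str,
    title: Optional[str],
    gateway_key: Optional[str],
    strategy_fallback: str,
) -> str:
    """Pick the session name from the highest-priority source that is set."""
    candidates = (
        manual_map.get(directory),  # 1. manual map (assumed keyed by directory)
        title,                      # 2. name set via /title
        gateway_key,                # 3. gateway session key
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return strategy_fallback        # 4. whatever the session strategy produces
```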
The _INTERNAL_CONTEXT_RE regex matched supermemory-context and supermemory-containers tags — not this PR's responsibility. Narrowed to memory-context only. Removed supermemory mention from README Input Sanitization section.
- set_peer_card: add None guard for _resolve_peer_id result
- get_session_context: fallback now respects peer param on sessions-cache miss
- _resolve_peer_id: tighten return type str|None -> str, remove dead None branch in _resolve_observer_target
- logger.warning -> logger.debug in get_session_context failure path
- honcho_conclude schema: add anyOf required constraint so validators reject empty calls (neither conclusion nor delete_id)
- _signal_sufficient: tighten ordered-list heuristic with anchored regex (re.search r'^\s*\d+\. ') to avoid false matches on version strings
- First-turn sync dialectic: wrap in daemon thread with 8s timeout; on timeout _last_dialectic_turn stays -999 so async path retries at next cadence-allowed turn instead of blocking the response
- import re added to __init__.py
- client.py: rename l -> lvl in list comprehension, deduplicate _parse_dialectic_depth call, remove stray blank line
- SKILL.md: fix dialecticCadence default 1->3 (two occurrences), fix contextTokens default 4096->uncapped, fix dialecticDynamic description (model-driven override, not auto-bump by query length), fix dialecticDepthLevels behavior (proportional levels table, not 'all rounds use global level'), rewrite Tools section (honcho_context and honcho_reasoning were swapped, honcho_observe removed -- does not exist, peer: param documented correctly instead of target:), add Agent Usage Patterns section with decision guidance for Hermes
- tests: add TestSetPeerCardNoneGuard, TestGetSessionContextFallback, test_honcho_conclude_missing_both_params_returns_error
This was referenced Apr 15, 2026
Closed
teknium1
added a commit
that referenced
this pull request
Apr 16, 2026
…, session isolation

Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto current main with minimal core modifications.

Plugin changes (plugins/memory/honcho/):
- New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context)
- Two-layer context injection: base context (summary + representation + card) on contextCadence, dialectic supplement on dialecticCadence
- Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal
- Cold/warm prompt selection based on session state
- dialecticCadence defaults to 3 (was 1): ~66% fewer Honcho LLM calls
- Session summary injection for conversational continuity
- Bidirectional peer targeting on all 5 tools
- Correctness fixes: peer param fallback, None guard on set_peer_card, schema validation, signal_sufficient anchored regex, mid->medium level fix

Core changes (~20 lines across 3 files):
- agent/memory_manager.py: Enhanced sanitize_context() to strip full <memory-context> blocks and system notes (prevents leak from saveMessages)
- run_agent.py: gateway_session_key param for stable per-chat Honcho sessions, on_turn_start() call before prefetch_all() for cadence tracking, sanitize_context() on user messages to strip leaked memory blocks
- gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions), gateway_session_key threading to main agent

Tests: 509 passed (3 skipped; honcho SDK not installed locally)
Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md

Co-authored-by: erosika <erosika@users.noreply.github.com>
teknium1
added a commit
that referenced
this pull request
Apr 16, 2026
Contributor
Merged via #10619. Your plugin changes were salvaged onto current main (your branch was 141 commits behind, so direct merge would have reverted Bedrock support, Nous rate limiting, and other recent features). All your plugin code, docs, SKILL.md, and tests are preserved. Core changes (~20 lines) were manually applied. Thanks for the thorough work, @erosika!
2 tasks
ulasbilgen
pushed a commit
to ulasbilgen/hermes-adhd-agent
that referenced
this pull request
May 1, 2026
aj-nt
pushed a commit
to aj-nt/hermes-agent
that referenced
this pull request
May 1, 2026
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
Summary
Honcho correctness fixes, cost-safety defaults, bidirectional 5-tool surface, context injection overhaul, and session isolation. Addresses community-reported issues including Cygnus's context budget experience, dialectic parroting, and gateway/CLI correctness bugs.
Replaces #6719 — clean rebase, plugin-only scope.
Context Injection Overhaul
- Base layer: peer.context() (session summary + representation + card) cached with TTL. Static mode header stays in system prompt for prompt-cache stability; live context injected into user message.
- Dialectic supplement: cadence-gated (dialecticCadence, default 3), result cached until next refresh. Supplements the base layer when fresh.

Fixes the parroting problem where dialectic output was the sole injection source, causing stale observations and roleplaying responses.
Tool Surface (5 bidirectional tools)
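- honcho_profile: read or update the peer card (pass card to update, omit to read)
- honcho_search: semantic search over context
- honcho_context: full session context (summary, representation, card, messages)
- honcho_reasoning: synthesized answer; reasoning_level param (minimal/low/medium/high/max)
- honcho_conclude: create or delete conclusions; delete_id for PII removal

All tools accept peer: 'user' (default), 'ai', or any workspace peer ID.

Cost Safety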
- dialecticCadence defaults to 3 (was 1): ~66% fewer Honcho LLM calls. Configurable.
- contextTokens defaults to uncapped; the cap is opt-in via config/wizard.
- on_turn_start hook wired in run_agent.py, which fixes broken cadence/injection gating.

Multi-Pass Dialectic (dialecticDepth)
dialecticDepth (1–3, clamped) controls how many .chat() calls fire per cycle:
- depth 1: single .chat() call (default, backward compatible)
- depth 2: self-audit + targeted synthesis (conditional bail-out)
- depth 3: audit + synthesis + reconciliation

When dialecticDepthLevels is not set, passes use proportional reasoning levels relative to the configured base (minimal/base for depth 2, minimal/base/low for depth 3).

Correctness
- set_peer_card: None guard for unresolvable peer ID
- get_session_context: cache-miss fallback now respects peer param (was silently returning user context for AI peer queries)
- _resolve_peer_id: return type tightened str | None → str; dead None branch removed from _resolve_observer_target
- Explicit target= on peer context/card fetches (fixes identity blur)
- honcho_search perspective fix under directional observation
- hermes honcho status honest failure reporting
- peerName precedence over gateway user_id
- Orphan session prevention (skip_memory on temp agents)
- gateway_session_key for stable per-chat continuity
- initOnSessionStart for eager tools-mode init
- get_session_context fallback respects peer param
- "mid" → "medium" in reasoning level validation
- honcho_conclude schema: anyOf required constraint (schema validators now correctly reject empty calls)
- _signal_sufficient: anchored regex prevents false matches on version strings

ABC Footprint (minimal, honcho-only)
- run_agent.py: gateway_session_key param + memory provider wiring (+5 lines)
- gateway/run.py: skip_memory on 2 temp agents, gateway_session_key on main agent (+3 lines)
- agent/memory_manager.py: sanitize regex for context tag variants (+9 lines)

Session Strategy (backward compat)
Default stays per-directory. Setup wizard guides new users to per-session. No breaking changes. New hermes honcho strategy CLI command.

Documentation
Updated: honcho.md, memory-providers.md, tools-reference.md, cli-commands.md, plugin README, SKILL.md (full accuracy pass: tool descriptions, defaults, dialecticDynamic semantics, agent usage patterns).

Can close on merge
Issues: #5667 | PRs: #5658, #4608
Related (cherry-picked, PR remains open): #8424
Validation
247 passed.