fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption by erosika · Pull Request #12160 · NousResearch/hermes-agent

erosika · 2026-04-18T13:36:31Z

Summary

A chain of correctness and reliability fixes on the Honcho dialectic path. Closes the gaps left open by #9884 and follows up on #10619. Also incorporates @LeonSGP43's #11434 fix for gateway per-user memory scoping.

Closes

Fixes [Bug]: Honcho memory scoping ignores gateway user_id when peer_name is configured #11199 (gateway user_id ignored when peer_name configured)
Supersedes fix(honcho): scope gateway sessions by runtime user id #11434 — @LeonSGP43's commit is cherry-picked into this branch preserving authorship

Cadence & defaults

dialecticCadence wizard default is 2. Setup wizard always writes it; status display and docs show 2. Code keeps a backwards-compat fallback of 1 when the key is unset so existing honcho.json configs that predate the setting keep firing every turn on upgrade.
Cadence gate uniform at all values. Previous if dialectic_cadence > 1: guard let cadence=1 fire twice on turn 0 (sync first-turn + same-turn queue).
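
The uniform gate can be sketched as below. This is a minimal illustration, not the plugin's code: `dialectic_due` and `fires_per_turn` are hypothetical names, and the two attempts per turn stand in for the first-turn sync call and the same-turn queue described above.

```python
from typing import List, Optional


def dialectic_due(turn: int, last_fire_turn: Optional[int], cadence: int) -> bool:
    """Gate applied at every cadence value, including cadence=1."""
    if last_fire_turn is None:
        return True  # nothing has fired yet in this session
    return (turn - last_fire_turn) >= cadence


def fires_per_turn(cadence: int, turns: int) -> List[int]:
    """Count fires per turn when both the sync path and the queue path ask."""
    last_fire_turn: Optional[int] = None
    counts = []
    for turn in range(turns):
        fired = 0
        for _path in ("sync", "queue"):  # two attempts within the same turn
            if dialectic_due(turn, last_fire_turn, cadence):
                fired += 1
                last_fire_turn = turn
        counts.append(fired)
    return counts
```

Because the second attempt in a turn sees `turn - last_fire_turn == 0`, it is rejected even at cadence=1, which is the double fire the old `> 1` guard allowed.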

Correctness

Session-start prewarm now consumed. prefetch_dialectic wrote to HonchoSessionManager._dialectic_cache but pop_dialectic_result had zero call sites. Turn 1 paid for a duplicate sync .chat(). Prewarm now writes directly to HonchoMemoryProvider._prefetch_result under lock; turn 1 consumes without another call.
Prewarm is dialecticDepth-aware. Single-pass prewarm on a cold peer often returns thin output; multi-pass audit/reconcile runs at init instead.
Silent failure doesn't advance cadence. _last_dialectic_turn advances only when the result is non-empty.
Thread pile-up guard. queue_prefetch skips when a prior dialectic thread is still in-flight.
First-turn sync timeout is recoverable. The background thread writes into _prefetch_result under lock on completion, so the next turn picks it up even when the sync wait timed out.
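
A rough sketch of how the items above fit together, assuming a single result slot guarded by one lock. `PrefetchSlot` and its methods are hypothetical stand-ins for the provider's `_prefetch_result` handling, and `fetch` stands in for the Honcho `.chat()` call.

```python
import threading
from typing import Callable, Optional


class PrefetchSlot:
    """Single-slot handoff between a background dialectic thread and the next turn."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._result: Optional[str] = None
        self._thread: Optional[threading.Thread] = None
        self.last_dialectic_turn: Optional[int] = None

    def queue(self, turn: int, fetch: Callable[[], str]) -> bool:
        """Start a background fetch unless one is already in flight."""
        with self._lock:
            if self._thread is not None and self._thread.is_alive():
                return False  # pile-up guard: do not stack threads
            self._thread = threading.Thread(
                target=self._run, args=(turn, fetch), daemon=True
            )
            self._thread.start()
            return True

    def _run(self, turn: int, fetch: Callable[[], str]) -> None:
        text = fetch()
        with self._lock:
            if text:  # an empty result does not advance the cadence marker
                self._result = text
                self.last_dialectic_turn = turn

    def consume(self) -> Optional[str]:
        """Take the pending result, if any, so it is injected at most once."""
        with self._lock:
            result, self._result = self._result, None
            return result

    def join(self, timeout: Optional[float] = None) -> None:
        """Wait for the in-flight thread (used here only to make the sketch testable)."""
        thread = self._thread
        if thread is not None:
            thread.join(timeout)
```

Since the thread always writes into the shared slot, a caller that gave up waiting on the first turn can still pick the result up with `consume()` on the next turn.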

Liveness + observability

Stale-thread watchdog. A prefetch thread older than timeout × 2 is treated as dead so a hung Honcho call can't block future fires.
Stale-result discard. Pending _prefetch_result is tagged with fire-turn; discarded on read if older than dialecticCadence × 2 turns.
Empty-streak backoff. Consecutive empty returns widen effective cadence (cadence + streak, capped at cadence × 8). Healthy fire resets streak.
liveness_snapshot() method. In-process diagnostic with 7 runtime fields.
hermes honcho status surfaces base reasoning level, cap, and heuristic toggle.
Tool nudge. system_prompt_block tells the model honcho_reasoning accepts reasoning_level minimal/low/medium/high/max per call.
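
The three liveness rules reduce to small arithmetic checks. The sketch below uses hypothetical function names and assumes the stale-result window is measured against the base cadence; the description does not say whether the base or the widened cadence is used.

```python
def effective_cadence(cadence: int, empty_streak: int) -> int:
    """Widen the cadence by one turn per consecutive empty return, capped at 8x."""
    return min(cadence + empty_streak, cadence * 8)


def result_is_stale(fired_turn: int, current_turn: int, cadence: int) -> bool:
    """A pending result is discarded once more than cadence * 2 turns have passed."""
    return (current_turn - fired_turn) > cadence * 2


def thread_is_live(started_at: float, now: float, timeout: float, alive: bool) -> bool:
    """A thread older than timeout * 2 counts as dead even if it is still running."""
    return alive and (now - started_at) <= timeout * 2.0
```

With cadence=2, three empty returns in a row give an effective cadence of 5, and the widening stops at 16 however long the streak grows.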

Restored behavior

Query-length reasoning-level scaling (dropped in 9a0ab34c). Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars, +2 at ≥400), clamped at reasoningLevelCap. New config: reasoningHeuristic: bool (default true) and reasoningLevelCap: string (default "high"; previously parsed but never enforced).
Short-prompt skip (dropped in ef7f3156). ok / y / thanks / slash commands short-circuit both injection and dialectic.
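
Both restored heuristics can be illustrated in a few lines. The thresholds and level names come from the description above; the function names, the trivial-prompt set (only the three examples given), and the choice to leave a base level that already exceeds the cap untouched are assumptions of this sketch.

```python
LEVELS = ["minimal", "low", "medium", "high", "max"]
TRIVIAL_PROMPTS = {"ok", "y", "thanks"}  # illustrative subset, not the real list


def scaled_reasoning_level(base: str, query: str, cap: str = "high",
                           heuristic: bool = True) -> str:
    """Bump the base level by query length, clamped at the cap."""
    if not heuristic:
        return base
    bump = 2 if len(query) >= 400 else 1 if len(query) >= 120 else 0
    base_idx = LEVELS.index(base)
    # Assumption: the cap limits bumps but never lowers an explicitly set base.
    cap_idx = max(LEVELS.index(cap), base_idx)
    return LEVELS[min(base_idx + bump, cap_idx)]


def is_trivial_prompt(text: str) -> bool:
    """True for one-word acknowledgements and slash commands."""
    stripped = text.strip().lower()
    return stripped.startswith("/") or stripped in TRIVIAL_PROMPTS
```

For example, a 400-character query at base `low` lands on `high`, while the same query at base `medium` is held at `high` by the default cap rather than reaching `max`.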

Setup wizard

Adds reasoning-level step surfacing all five options (minimal, low, medium, high, max). Always writes dialecticReasoningLevel.
Cadence prompt copy updated; recommended range 1–5.

Multi-peer + observation (@LeonSGP43, cherry-picked #11434)

Gateway user_id no longer mutates cfg.peer_name. Threaded through HonchoSessionManager as runtime_user_peer_name and preferred when resolving the user peer. Per-user memory scoping works even when a static peerName is configured.
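
The resolution order can be pictured as a simple preference chain. This is a sketch only: `resolve_user_peer` and the `"user"` fallback are hypothetical, and the real resolution inside HonchoSessionManager may involve more inputs.

```python
from typing import Optional


def resolve_user_peer(runtime_user_peer_name: Optional[str],
                      configured_peer_name: Optional[str],
                      fallback: str = "user") -> str:
    """Prefer the gateway's runtime user id; the static config value is left unmodified."""
    return runtime_user_peer_name or configured_peer_name or fallback
```

Two gateway users sharing one config with a static peerName then resolve to two different peers, while a CLI session with no runtime id still gets the configured name.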

Dead-code purge

HonchoSessionManager.prefetch_dialectic, _dialectic_cache, set_dialectic_result, pop_dialectic_result — all orphaned after the prewarm refactor.

Config surface additions

{
  "reasoningHeuristic": true,
  "reasoningLevelCap": "high"
}

Both optional; defaults preserve behavior.
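
A minimal sketch of how these defaults could be read from a parsed honcho.json, including the backwards-compat cadence fallback of 1 described under "Cadence & defaults". The function name and the returned dict shape are assumptions for illustration.

```python
from typing import Any, Dict


def load_dialectic_settings(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults to a parsed honcho.json dict without mutating it."""
    return {
        # Unset means "fire every turn" so configs that predate the key keep working;
        # the wizard writes 2 explicitly for new setups.
        "dialecticCadence": int(raw.get("dialecticCadence", 1)),
        "reasoningHeuristic": bool(raw.get("reasoningHeuristic", True)),
        "reasoningLevelCap": raw.get("reasoningLevelCap", "high"),
    }
```

An empty config therefore yields cadence 1 with the heuristic on and the cap at "high", while a wizard-generated config carries `"dialecticCadence": 2`.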

Test plan

pytest tests/honcho_plugin/ tests/agent/test_memory_provider.py tests/agent/test_memory_user_id.py tests/run_agent/test_run_agent.py — 561 passed
New coverage:
- TestTrivialPromptHeuristic — short-prompt skip
- TestDialecticCadenceAdvancesOnSuccess — empty-result retry, thread pile-up
- TestSessionStartDialecticPrewarm — prewarm consumed, sync fallback
- TestReasoningHeuristic — length bumps, cap clamp
- TestDialecticLiveness — stale-thread recovery, stale-result discard, empty-streak backoff, snapshot shape
- TestDialecticLifecycleSmoke — end-to-end 8-turn session walk
- TestHonchoUserIdScoping (@LeonSGP43) — runtime user peer scoping

Files

Plugin code, wizard, docs, 20+ new tests. See commits for per-commit diffs.

Several correctness and cost-safety fixes to the Honcho dialectic path after a multi-turn investigation surfaced a chain of silent failures:

- dialecticCadence default flipped 3 → 1. PR NousResearch#10619 changed this from 1 to 3 for cost, but existing installs with no explicit config silently went from per-turn dialectic to every-3-turns on upgrade. Restores pre-NousResearch#10619 behavior; 3+ remains available for cost-conscious setups. Docs + wizard + status output updated to match.
- Session-start prewarm now consumed. Previously fired a .chat() on init whose result landed in HonchoSessionManager._dialectic_cache and was never read; pop_dialectic_result had zero call sites. Turn 1 paid for a duplicate synchronous dialectic. Prewarm now writes directly to the plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with no extra call.
- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return weak output on cold peers; the multi-pass audit/reconcile cycle is exactly the case dialecticDepth was built for. Prewarm now runs the full configured depth in the background.
- Silent dialectic failure no longer burns the cadence window. _last_dialectic_turn now advances only when the result is non-empty. Empty result → next eligible turn retries immediately instead of waiting the full cadence gap.
- Thread pile-up guard. queue_prefetch skips when a prior dialectic thread is still in-flight, preventing stacked races on _prefetch_result.
- First-turn sync timeout is recoverable. Previously on timeout the background thread's result was stored in a dead local list. Now the thread writes into _prefetch_result under lock so the next turn picks it up.
- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1" guard let first-turn sync + same-turn queue_prefetch both fire. Gate now always applies.
- Restored query-length reasoning-level scaling, dropped in 9a0ab34c. Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars, +2 at ≥400), clamped at reasoningLevelCap. Two new config keys: `reasoningHeuristic` (bool, default true) and `reasoningLevelCap` (string, default "high"; previously parsed but never enforced). Respects dialecticDepthLevels and proportional lighter-early passes.
- Restored short-prompt skip, dropped in ef7f315. One-word acknowledgements ("ok", "y", "thanks") and slash commands bypass both injection and dialectic fire.
- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache, set_dialectic_result, pop_dialectic_result. All unused after prewarm refactor.

Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py, and run_agent/test_run_agent.py.

New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)

- Revert website/docs and SKILL.md changes; docs unification handled separately
- Scrub commit/PR refs and process narration from code comments and test docstrings (no behavior change)

… multi-peer

- cli: setup wizard pre-fills dialecticCadence=2 (code default stays 1 so unset → every turn)
- honcho.md: fix stale dialecticCadence default in tables, add Session-Start Prewarm subsection (depth runs at init), add Query-Adaptive Reasoning Level subsection, expand Observation section with directional vs unified semantics and per-peer patterns
- memory-providers.md: fix stale default, rename Multi-agent/Profiles to Multi-peer setup, add concrete walkthrough for new profiles and sync, document observation toggles + presets, link to honcho.md
- SKILL.md: fix stale defaults, add Depth at session start callout

…t discard, empty-streak backoff

Hardens the dialectic lifecycle against three failure modes that could leave the prefetch pipeline stuck or injecting stale content:

- Stale-thread watchdog: _thread_is_live() treats any prefetch thread older than timeout × 2.0 as dead. A hung Honcho call can no longer block subsequent fires indefinitely.
- Stale-result discard: pending _prefetch_result is tagged with its fire turn. prefetch() discards the result if more than cadence × 2 turns passed before a consumer read it (e.g. a run of trivial-prompt turns between fire and read).
- Empty-streak backoff: consecutive empty dialectic returns widen the effective cadence (dialectic_cadence + streak, capped at cadence × 8). A healthy fire resets the streak. Prevents the plugin from hammering the backend every turn when the peer graph is cold.
- liveness_snapshot() on the provider exposes current turn, last fire, pending fire-at, empty streak, effective cadence, and thread status for in-process diagnostics.
- system_prompt_block: nudge the model that honcho_reasoning accepts reasoning_level minimal/low/medium/high/max per call.
- hermes honcho status: surface base reasoning level, cap, and heuristic toggle so config drift is visible at a glance.

Tests: 550 passed.

- TestDialecticLiveness (8 tests): stale-thread recovery, stale-result discard, fresh-result retention, backoff widening, backoff ceiling, streak reset on success, streak increment on empty, snapshot shape.
- Existing TestDialecticCadenceAdvancesOnSuccess::test_in_flight_thread_is_not_stacked updated to set _prefetch_thread_started_at so it tests the fresh-thread-blocks branch (stale path covered separately).
- test_cli TestCmdStatus fake updated with the new config attrs surfaced in the status block.

…overage

- TestDialecticDepth::test_first_turn_runs_dialectic_synchronously: covered by TestSessionStartDialecticPrewarm::test_turn1_falls_back_to_sync_when_prewarm_missing (more realistic: exercises the empty-prewarm → sync-fallback path)
- TestDialecticDepth::test_first_turn_dialectic_does_not_double_fire: covered by TestDialecticLifecycleSmoke (turn 1 flow) and TestDialecticCadenceAdvancesOnSuccess::test_empty_dialectic_result_does_not_advance_cadence

Both predate the prewarm refactor and test paths that are now fallback behaviors already covered elsewhere.

…wards-compat fallback

Setup wizard now always writes dialecticCadence=2 on new configs and surfaces the reasoning level as an explicit step with all five options (minimal / low / medium / high / max), always writing dialecticReasoningLevel.

Code keeps a backwards-compat fallback of 1 when dialecticCadence is unset so existing honcho.json configs that predate the setting keep firing every turn on upgrade. New setups via the wizard get 2 explicitly; docs show 2 as the default.

Also scrubs editorial lines from code and docs ("max is reserved for explicit tool-path selection", "Unset → every turn; wizard pre-fills 2", and similar process-exposing phrasing) and adds an inline link to app.honcho.dev where the server-side observation sync is mentioned in honcho.md. Recommended cadence range updated to 1-5 across docs and wizard copy.

kshitijk4poor · 2026-04-19T06:05:36Z

Merged via PR #12419. Your commits were cherry-picked onto current main with authorship preserved in git log. @LeonSGP43's #11434 gateway user_id scoping fix was also included. Thanks for the thorough lifecycle fixes, @erosika!

erosika and others added 6 commits April 18, 2026 09:35

chore(honcho): drop docs from PR scope, scrub commentary

a7a3822


fix(honcho): scope gateway sessions by runtime user id

f07ed55

This was referenced Apr 18, 2026

fix(honcho): scope gateway sessions by runtime user id #11434

Closed

[Bug]: Honcho memory scoping ignores gateway user_id when peer_name is configured #11199

Closed

erosika force-pushed the eri/honcho-dialectic-lifecycle branch from ad78217 to 8b477ef on April 18, 2026 17:49

kshitijk4poor mentioned this pull request Apr 19, 2026

fix(honcho): dialectic lifecycle, gateway scoping, provider opt-in #12419

Merged

kshitijk4poor closed this Apr 19, 2026

erosika wants to merge 7 commits into NousResearch:main from erosika:eri/honcho-dialectic-lifecycle

3 participants
