feat(honcho): configurable prefetch cadence, injection toggles, and reasoning cap by erosika · Pull Request #3425 · NousResearch/hermes-agent

erosika · 2026-03-27T17:48:00Z

Summary

Default prefetch cadence changed from every-turn to first-turn-only for both context and dialectic -- eliminates 2 redundant Honcho API calls per turn after session start
Per-component injection toggles: users can disable representation, card, AI peer data, or dialectic independently in honcho.yaml
Reasoning level cap prevents _dynamic_reasoning_level from auto-bumping past a configured ceiling, stopping cost escalation from long messages

Config

All new fields in honcho.yaml, with cost-efficient defaults:

# Cadence: "first-turn" (default), "every-turn", or integer N
contextCadence: first-turn
dialecticCadence: first-turn

# Injection toggles
injectRepresentation: true       # user conclusions
injectCard: true                 # structured peer card
injectAiRepresentation: false    # AI peer model (default off)
injectAiCard: false              # AI peer card (default off)
injectDialectic: true            # continuity synthesis

# Reasoning cap (default: same as floor = no auto-bump)
dialecticReasoningLevel: low
dialecticReasoningCap: low

To restore legacy behavior, set contextCadence and dialecticCadence to every-turn and dialecticReasoningCap to high.
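The cadence values above can be read as a per-turn gate on the Honcho fetch. The following is a minimal sketch of that gate, not the PR's actual code: the function name `should_prefetch`, the 1-indexed `turn` argument, and the reading of an integer N as "turn 1, then every Nth turn after" are all assumptions made for illustration.

```python
from typing import Union

Cadence = Union[str, int]


def should_prefetch(cadence: Cadence, turn: int) -> bool:
    """Return True if a fetch should fire on this turn (turns start at 1).

    Hypothetical helper: the name and the integer semantics are assumptions.
    """
    if cadence == "first-turn":
        # Cost-efficient default: fetch once at session start.
        return turn == 1
    if cadence == "every-turn":
        # Legacy behavior: fetch unconditionally.
        return True
    if isinstance(cadence, int) and not isinstance(cadence, bool) and cadence > 0:
        # Assumption: integer N means turn 1, then every Nth turn after it.
        return (turn - 1) % cadence == 0
    raise ValueError(f"unsupported cadence: {cadence!r}")
```

Under these assumptions, `should_prefetch("first-turn", 2)` is false, which matches the "no API calls on turn 2+" expectation in the test plan.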

Motivation

Community-reported cost: ~$20/3 days (~$200/mo) on managed Honcho with normal chat usage. Root cause: unconditional per-turn context fetch + dialectic inference with auto-bumped reasoning levels. User rep doesn't change between consecutive turns; dialectic is only useful at session boundaries.

Closes #3422

Test plan

Verify default config produces first-turn-only prefetch (no API calls on turn 2+)
Verify dialecticCadence: every-turn restores legacy behavior
Verify dialecticReasoningCap: low pins reasoning level regardless of message length
Verify injectRepresentation: false suppresses user rep from system prompt
Verify existing honcho.yaml with no new fields works unchanged
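The last two checks in the test plan rely on defaults being filled in for missing fields and on toggles gating what reaches the system prompt. One way this could look is sketched below. The names `DEFAULTS`, `resolve_config`, and `enabled_components` are hypothetical, the default values are copied from the Config section, and treating `low` as the default floor is an assumption.

```python
from typing import Any, Dict, List, Mapping

# Defaults copied from the Config section of this PR description.
DEFAULTS: Dict[str, Any] = {
    "contextCadence": "first-turn",
    "dialecticCadence": "first-turn",
    "injectRepresentation": True,
    "injectCard": True,
    "injectAiRepresentation": False,
    "injectAiCard": False,
    "injectDialectic": True,
    "dialecticReasoningLevel": "low",  # assumption: "low" is the default floor
}


def resolve_config(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge a parsed honcho.yaml mapping over the defaults (hypothetical helper)."""
    cfg = {**DEFAULTS, **raw}  # keys already present in the file win
    # The cap defaults to the floor, which means no auto-bump.
    cfg.setdefault("dialecticReasoningCap", cfg["dialecticReasoningLevel"])
    return cfg


def enabled_components(cfg: Mapping[str, Any]) -> List[str]:
    """List the components whose injection toggle is on."""
    toggles = (
        ("representation", "injectRepresentation"),
        ("card", "injectCard"),
        ("ai_representation", "injectAiRepresentation"),
        ("ai_card", "injectAiCard"),
        ("dialectic", "injectDialectic"),
    )
    return [name for name, key in toggles if cfg[key]]
```

In this sketch a file with none of the new fields resolves to first-turn cadence with the cap pinned to the floor, and setting `injectRepresentation: false` drops only the user representation from the injected set.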

…easoning cap

Cost-awareness defaults for Honcho integration:

- Prefetch cadence: context and dialectic default to "first-turn" instead of unconditional per-turn fetches. Configurable via contextCadence / dialecticCadence in honcho.yaml ("first-turn", "every-turn", or int N).
- Injection toggles: per-component control over what gets injected into the system prompt. injectRepresentation, injectCard, injectDialectic default true; injectAiRepresentation, injectAiCard default false (rarely needed, saves tokens).
- Reasoning cap: dialecticReasoningCap (default: same as floor) prevents _dynamic_reasoning_level from auto-bumping past the configured ceiling. Stops cost escalation from long messages triggering high reasoning.

Existing honcho.yaml files with no new fields get cost-efficient defaults automatically. Users who want legacy per-turn behavior can set cadence to "every-turn" and raise the reasoning cap.

Closes NousResearch#3422
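The reasoning cap described in this commit amounts to clamping whatever level the length-based bump proposes. The sketch below illustrates that idea only: the level ordering in `LEVELS`, the character thresholds, and the function names `bump_for_length` and `capped_level` are assumptions, and `bump_for_length` is a stand-in for `_dynamic_reasoning_level`, whose real logic is not shown in this PR description.

```python
LEVELS = ("low", "medium", "high")  # assumed ordering; the real set may differ


def bump_for_length(floor: str, message: str) -> str:
    """Stand-in for _dynamic_reasoning_level: longer messages bump the level.

    The thresholds are arbitrary illustration values, not taken from the source.
    """
    idx = LEVELS.index(floor)
    if len(message) > 400:
        idx += 2
    elif len(message) > 120:
        idx += 1
    return LEVELS[min(idx, len(LEVELS) - 1)]


def capped_level(floor: str, cap: str, message: str) -> str:
    """Clamp the auto-bumped level so it never exceeds the configured cap."""
    proposed = LEVELS.index(bump_for_length(floor, message))
    # Treat a cap below the floor as equal to the floor, i.e. no auto-bump.
    ceiling = max(LEVELS.index(cap), LEVELS.index(floor))
    return LEVELS[min(proposed, ceiling)]
```

With floor and cap both set to `low`, a very long message still resolves to `low`; raising the cap to `high` lets the bump through, which corresponds to the legacy behavior.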

…s commands

hermes honcho status now shows: prefetch cadence, injection toggles, reasoning level with cap info.

hermes honcho tokens now shows: cadence per component, injection summary (enabled/suppressed), reasoning cap alongside level.

Users can see exactly what Honcho is doing per-turn and what's costing them money.

…xt stays in prompt

Cadence controls when Honcho API calls fire. injectionFrequency controls how many turns the cached result stays in the system prompt, which is where the main LLM input token cost comes from.

Options: "every-turn" (default, legacy), "first-turn" (inject once then drop), or integer N (inject for first N turns then suppress).

Surfaced in hermes honcho status and hermes honcho tokens.
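The three injectionFrequency options listed in this commit can be expressed as a small predicate over the turn number. This is a sketch under stated assumptions: the function name `should_inject` and the 1-indexed `turn` argument are hypothetical, while the option semantics follow the commit message.

```python
from typing import Union


def should_inject(frequency: Union[str, int], turn: int) -> bool:
    """Return True if the cached Honcho context stays in the prompt this turn.

    Hypothetical helper; turns are assumed to start at 1.
    """
    if frequency == "every-turn":
        # Default and legacy behavior: always inject.
        return True
    if frequency == "first-turn":
        # Inject once, then drop.
        return turn == 1
    if isinstance(frequency, int) and not isinstance(frequency, bool) and frequency > 0:
        # Inject for the first N turns, then suppress.
        return turn <= frequency
    raise ValueError(f"unsupported injectionFrequency: {frequency!r}")
```

This is separate from the prefetch cadence: cadence decides whether an API call fires, while this predicate only decides whether an already cached result is included in the system prompt.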

…easoning cap (NousResearch#3425)

Ported from erosika's PR NousResearch#3425 to plugin architecture.

- Add dialectic_reasoning_cap config field (ceiling for auto-bump)
- Add dialectic_cadence and context_cadence (first-turn/every-turn/N)
- Add injection_frequency and per-component injection toggles
- Add _bool_opt helper for boolean config resolution
- Add _should_prefetch() and increment_turn() to session manager
- Update CLI status and tokens display with cadence/injection info

erosika · 2026-04-15T20:42:11Z

Superseded by #9884 — cadence, injection toggles, and reasoning cap are fully reimplemented in the new plugin architecture.

erosika force-pushed the eri/honcho-cost-awareness branch from 19f37b9 to 3e2d60a on March 27, 2026 17:53

erosika mentioned this pull request Mar 27, 2026

Honcho PR map: open integration work across community #3276

Closed

erosika force-pushed the eri/honcho-cost-awareness branch from b29c37e to 0692ba2 on March 30, 2026 00:47

erosika added 3 commits March 30, 2026 13:45

erosika force-pushed the eri/honcho-cost-awareness branch from 0692ba2 to 0de4baa on March 30, 2026 17:45

erosika mentioned this pull request Apr 5, 2026

fix(honcho): plugin drift overhaul -- observation config, chunking, setup wizard, docs, dead code cleanup from migration #5045

Closed

teknium1 mentioned this pull request Apr 5, 2026

fix(honcho): plugin drift overhaul + feat(plugins): CLI registration system #5295

Merged

erosika closed this Apr 15, 2026

erosika wants to merge 3 commits into NousResearch:main from erosika:eri/honcho-cost-awareness
