fix: enforce context_tokens budget on Honcho peer representation #1878
AzothZephyr wants to merge 2 commits into
The Honcho SDK's context() tokens parameter only limits message history retrieval, not the peer_representation and peer_card fields which grow unbounded as Honcho accumulates observations about the user and AI peer. This meant setting contextTokens: 50 in ~/.honcho/config.json had no effect on the massive peer representation blocks (Explicit Observations, Deductive Observations, Inductive Observations, peer card) that were injected into every system prompt — often 4000-5000+ tokens. Fix: after assembling the Honcho context block in _honcho_prefetch(), truncate the total output to fit within the configured context_tokens budget (estimated at 4 chars per token). This ensures the user's configured budget is respected regardless of how much data Honcho's server returns. Ideally the Honcho API itself should respect the tokens param for all returned fields, but until then this client-side enforcement prevents runaway token usage.
The Honcho SDK's context() tokens parameter only limits message history retrieval, not the peer_representation and peer_card fields which grow unbounded as Honcho accumulates observations about the user and AI peer. This meant setting contextTokens in ~/.honcho/config.json had no effect on the massive peer representation blocks (Explicit Observations, Deductive Observations, Inductive Observations, peer card) injected into every system prompt — often 4000-5000+ tokens. Fix: after assembling the Honcho context block in _honcho_prefetch(), truncate the total output to fit within the configured context_tokens budget (estimated at 4 chars per token). Adds three unit tests covering truncation, within-budget passthrough, and no-budget (None) passthrough.
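The client-side enforcement described above can be sketched as a standalone helper. This is a minimal illustration, not the PR's actual code: the names `truncate_to_token_budget`, `CHARS_PER_TOKEN`, and `TRUNCATION_MARKER` are hypothetical, and the PR applies the same idea inline in _honcho_prefetch(). Treating only `None` as "no budget" is also an assumption.

```python
from typing import Optional

# Heuristic from the PR description: roughly 4 characters per token.
CHARS_PER_TOKEN = 4
TRUNCATION_MARKER = "\n\n[… truncated to fit token budget]"


def truncate_to_token_budget(assembled: str, token_budget: Optional[int]) -> str:
    """Hypothetical helper: cap `assembled` at about `token_budget` tokens."""
    # Assumption: None means no budget is configured, so pass through unchanged.
    if token_budget is None:
        return assembled

    char_budget = token_budget * CHARS_PER_TOKEN
    if len(assembled) <= char_budget:
        # Within budget: return the context block intact.
        return assembled

    # Over budget: keep the first char_budget characters and flag the cut.
    return assembled[:char_budget] + TRUNCATION_MARKER
```

Called with a 10,000-character block and a budget of 50, the helper keeps the first 200 characters and appends the marker; a short block, or a budget of `None`, passes through unchanged. These are the same three cases the unit tests cover. Note that the marker is appended after the cut, so the result is slightly longer than `char_budget`.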
Original work from PR NousResearch#1878.
Orb Code Review (powered by GLM 5.1 on Orb Cloud)

Summary
Enforces the

Architecture
Cleanly placed at the return point of

Issues
Warning: Character-based truncation can cut mid-UTF8 character or mid-sentence:

    assembled = assembled[:char_budget] + "\n\n[… truncated to fit token budget]"

    if len(assembled) > char_budget:
        cut = assembled[:char_budget].rfind('\n\n')
        if cut > char_budget // 2:
            assembled = assembled[:cut]
        else:
            assembled = assembled[:char_budget]
        assembled += '\n\n[… truncated to fit token budget]'

This is a suggestion, not a blocker. The current approach works; it just produces slightly rougher truncation boundaries.

Suggestion: The

Suggestion: The private attribute access

Cross-file impact
None

Assessment
approve ✅ Straightforward fix for an unbounded-growth issue. Good test coverage (truncation, no-truncation-within-budget, no-truncation-when-no-budget). The character heuristic is reasonable for a first pass.
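The reviewer's paragraph-boundary suggestion can be wrapped in a small function to make its behavior easy to check. This is a sketch only: the name `truncate_at_paragraph` is hypothetical, and the "more than half the budget" threshold is taken directly from the snippet in the review.

```python
def truncate_at_paragraph(assembled: str, char_budget: int) -> str:
    """Hypothetical helper: prefer cutting at a blank line near the budget."""
    marker = "\n\n[… truncated to fit token budget]"
    if len(assembled) <= char_budget:
        # Within budget: nothing to do.
        return assembled

    window = assembled[:char_budget]
    # Look for the last paragraph break inside the budgeted window.
    cut = window.rfind("\n\n")
    if cut > char_budget // 2:
        # The break is late enough that little content is lost: cut there.
        window = window[:cut]
    # Otherwise fall back to the hard character cut already in `window`.
    return window + marker
```

The half-budget threshold keeps an early paragraph break from discarding most of the allowed context. In Python, slicing a `str` operates on code points rather than bytes, so the slice itself does not split a UTF-8 byte sequence; the practical gain of this variant is avoiding cuts in the middle of a sentence or paragraph.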
Likely duplicate of #3265 (closed) which enforced contextTokens budget on Honcho prefetch. If that fix was merged, this may already be resolved.
Already fixed on main. The Honcho integration was extracted into a plugin (PR #5295) and the same |
What does this PR do?
Enforces the user's configured contextTokens budget on Honcho's peer_representation and peer_card fields, which were previously injected into the system prompt at full size regardless of the token limit.

The Honcho SDK's session.context(tokens=N) parameter only limits message history retrieval. The peer_representation and peer_card fields (containing Explicit Observations, Deductive Observations, Inductive Observations, and the structured peer card) are returned in full by the Honcho server regardless of the tokens value. This meant setting contextTokens: 50 in ~/.honcho/config.json had zero effect, and users would see 4000-5000+ tokens of Honcho context injected every turn.

The fix adds client-side truncation in _honcho_prefetch() after assembling the full context block, capping it to context_tokens * 4 characters. Ideally the Honcho API itself should respect the tokens param for all returned fields, but until that's addressed server-side this prevents runaway token usage.

Related Issue
Type of Change
Changes Made
- run_agent.py: After assembling the Honcho context block in _honcho_prefetch(), read self._honcho._context_tokens and truncate the assembled string to token_budget * 4 characters if it exceeds the budget. Adds a [… truncated to fit token budget] marker when truncation occurs.
- tests/test_run_agent.py: Three new unit tests:
  - test_honcho_prefetch_truncates_to_token_budget: verifies large context is truncated at budget
  - test_honcho_prefetch_no_truncation_within_budget: verifies small context passes through intact
  - test_honcho_prefetch_no_truncation_when_no_budget: verifies None budget means no truncation

How to Test
contextTokens: 500 in ~/.honcho/config.json (under hosts.hermes)
"enabled": true)
pytest tests/test_run_agent.py::TestHonchoPrefetchScheduling::test_honcho_prefetch_truncates_to_token_budget tests/test_run_agent.py::TestHonchoPrefetchScheduling::test_honcho_prefetch_no_truncation_within_budget tests/test_run_agent.py::TestHonchoPrefetchScheduling::test_honcho_prefetch_no_truncation_when_no_budget -v

Checklist
Code
fix(agent): enforce context_tokens budget on Honcho peer representation)

Documentation & Housekeeping
Screenshots / Logs
Before fix (contextTokens: 50, completely ignored):
After fix (contextTokens: 500, properly enforced):