fix(agent): honor model.context_length override below 64K floor by ismell0992-afk · Pull Request #8962 · NousResearch/hermes-agent

ismell0992-afk · 2026-04-13T10:52:16Z

What does this PR do?

Two call sites hard-coded MINIMUM_CONTEXT_LENGTH (64K) as an immovable floor, silently defeating the model.context_length config override that their own error message tells users to reach for:

"Choose a model with at least 64K context, or set model.context_length in config.yaml to override."

1. AIAgent.__init__ rejected any compressor whose context was below the floor, regardless of whether the user had set an override. The override is the only way to run a sub-64K model, so the check made the override unusable in the one case it exists for.
2. ContextCompressor.__init__ floored threshold_tokens at 64K even when total context was below 64K. On an opt-in 32K model this pushed the compression threshold to 32K, equal to total context, so compression would never fire.

Motivating case: hermes-brain:qwen3-14b-ctx32k, a Modelfile wrapping qwen3:14b with num_ctx 32768 for a ~15K-token baseline system prompt plus real conversation history. Without both fixes, startup fails at (1) and — once (1) is bypassed — compression never fires at (2).
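
The arithmetic behind (2) can be sketched as follows. This is an illustrative reconstruction, not the code in agent/context_compressor.py: the function names threshold_before and threshold_after are hypothetical, the floor value of 64,000 is taken from the error message text, and the cap at total context in the old path is an assumption made to match the 32768 figure quoted in this PR.

```python
MINIMUM_CONTEXT_LENGTH = 64_000  # assumed value, taken from the error message


def threshold_before(context_length: int, threshold_percent: float) -> int:
    """Old behavior (assumed shape): floor at 64K, then cap at total context."""
    raw = int(context_length * threshold_percent)
    floored = max(raw, MINIMUM_CONTEXT_LENGTH)
    return min(floored, context_length)


def threshold_after(context_length: int, threshold_percent: float) -> int:
    """New behavior: below the floor, use the raw percentage."""
    if context_length < MINIMUM_CONTEXT_LENGTH:
        return int(context_length * threshold_percent)
    # At or above the floor, nothing changes.
    return threshold_before(context_length, threshold_percent)


if __name__ == "__main__":
    # Opt-in 32K model with threshold_percent=0.50
    print(threshold_before(32_768, 0.50))  # 32768: equal to total context
    print(threshold_after(32_768, 0.50))   # 16384: half of total context
```

With a ~15K-token baseline system prompt, a 16384-token threshold leaves compression reachable early in a session, whereas a threshold equal to the full 32768-token window is only reached once the context is already exhausted.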

Type of Change

Bug fix (non-breaking change that fixes an issue)

Changes Made

run_agent.py — extract the inline 64K reject block from AIAgent.__init__ into _check_minimum_context_length(), which skips when self._config_context_length is not None (the user opt-in the error message promises).
agent/context_compressor.py — when self.context_length < MINIMUM_CONTEXT_LENGTH, use the raw percentage as threshold_tokens instead of clamping to the 64K floor. Above the floor, behavior is unchanged.
tests/run_agent/test_switch_model_context.py — 4 new _check_minimum_context_length_* cases (rejects without override, accepts with override, accepts above floor, no-ops on 0).
tests/agent/test_context_compressor.py — test_threshold_floor_skipped_for_opt_in_tiny_models proves an opt-in 32K model with threshold_percent=0.50 gets threshold_tokens=16384 (not 32768).
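
A minimal sketch of the guard described for run_agent.py, written as a standalone function so it can run on its own. The real helper is a method on AIAgent that reads self._config_context_length; the parameter names, the ValueError exception type, and the floor value of 64,000 here are assumptions, and only the quoted closing sentence of the message comes from this PR.

```python
from typing import Optional

MINIMUM_CONTEXT_LENGTH = 64_000  # assumed value, taken from the error message


def check_minimum_context_length(
    context_length: int,
    config_context_length: Optional[int],
) -> None:
    """Raise if the context is below the floor and the user has not opted in."""
    if config_context_length is not None:
        # model.context_length is set in config.yaml: explicit user opt-in.
        return
    if not context_length:
        # Context length of 0 (unknown): nothing to validate.
        return
    if context_length < MINIMUM_CONTEXT_LENGTH:
        # Exception type and the first sentence are hypothetical.
        raise ValueError(
            f"Context window of {context_length:,} tokens is below the "
            f"minimum {MINIMUM_CONTEXT_LENGTH:,} required. "
            "Choose a model with at least 64K context, or set "
            "model.context_length in config.yaml to override."
        )
```

The four branches correspond to the four test cases listed above: reject without an override, accept with an override, accept above the floor, and no-op on 0.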

How to Test

pytest tests/agent/test_context_compressor.py tests/run_agent/test_switch_model_context.py -q — passes (47 cases total).
With a sub-64K local model declared in config.yaml via model.context_length: 32768, hermes chat -q "hi" no longer raises "is below the minimum 64,000 required".
Watch the same session: compression fires when the conversation crosses ~50% of total context, not at 100%.

Checklist

Conventional Commits
Tests added
PR contains only changes related to this fix
Tested on Linux (Ubuntu)

Two call sites hard-coded MINIMUM_CONTEXT_LENGTH (64K) as an immovable floor, silently defeating the ``model.context_length`` config override that their own error message tells users to reach for:

"Choose a model with at least 64K context, or set model.context_length in config.yaml to override."

1. ``AIAgent.__init__`` rejected any compressor whose context was below the floor, regardless of whether the user had set an override. The override is the only way to run a sub-64K model, so the reject was unreachable for its intended case. Extracted the check into a new ``_check_minimum_context_length`` helper that skips when ``self._config_context_length`` is not None.

2. ``ContextCompressor.__init__`` floored ``threshold_tokens`` at 64K even when total context was below 64K. On an opt-in 32K model this pushed the compression threshold to 32K, equal to total context, so compression would literally never fire. For models whose total context is below the floor, use the raw percentage instead.

Motivating case: ``hermes-brain:qwen3-14b-ctx32k``, a Modelfile wrapping qwen3:14b with ``num_ctx 32768`` for a ~15K-token baseline system prompt plus real conversation history. Without both fixes, startup fails at (1) and, once (1) is fixed, compression never fires at (2).

Adds parametrized tests for both paths:
- _check_minimum_context_length: rejects low context, accepts override, accepts above-floor, no-ops on 0.
- ContextCompressor: opt-in 32K model gets 16384-token threshold, not 32768.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alt-glitch · 2026-04-27T23:11:58Z

Likely duplicate of #9142 (same root cause: the model.context_length config override doesn't bypass the 64K MINIMUM_CONTEXT_LENGTH floor in AIAgent.__init__ and ContextCompressor). Also duplicated by #11097.

This was referenced Apr 25, 2026

fix(run_agent): honor model.context_length override for sub-64K models #11097

Open

fix(config): respect model.context_length override for sub-64K models #9142

Open

alt-glitch added the type/bug, P2, and comp/agent labels Apr 27, 2026

alt-glitch mentioned this pull request May 12, 2026

All models rejected with "context window below minimum 64,000 tokens" — Telegram completely down #24140

Closed

This was referenced May 19, 2026

fix(context-length): context_window alias, config override bypasses 64K guard, extend 32K underreport guard to MiniMax #28634

Closed

fix(agent-init): honour model.context_length override below 64K floor (#8430) #28949

Closed

ismell0992-afk wants to merge 1 commit into NousResearch:main from ismell0992-afk:pr/agent-honor-ctx-override
