fix(config): propagate max_tokens from config.yaml to AI transport (#20741) by Beandon13 · Pull Request #20769 · NousResearch/hermes-agent

Beandon13 · 2026-05-06T14:26:42Z

Summary

Fixes #20741. The model.max_tokens key documented in cli-config.yaml.example was silently ignored — never read from config and never forwarded to AIAgent.__init__() or ChatCompletionsTransport.build_kwargs(). On providers without a hardcoded output cap (Ollama Cloud, zai, custom OpenAI-compatible endpoints) the parameter was omitted from the API call entirely, causing finish_reason="length" truncation.
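To illustrate the failure mode, here is a minimal sketch of a request builder that forwards `max_tokens` only when a value is supplied. The name `build_request_kwargs` and its signature are hypothetical; this is not the actual `ChatCompletionsTransport.build_kwargs()` code. It only shows why a value that never leaves the config file means the key never reaches the provider.

```python
from typing import Any, Optional


def build_request_kwargs(
    model: str,
    messages: list[dict[str, str]],
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical stand-in for a chat-completions kwargs builder."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    # Forward max_tokens only when the caller supplied one. If the config
    # value is never read, this branch is skipped and the provider's own
    # default output cap applies.
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    return kwargs
```

With `max_tokens` left as `None` the returned dict has no `max_tokens` key at all, which matches the omitted-parameter behaviour described above.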

Three layers were checked; two needed changes:

1. CLI path (cli.py): HermesCLI.__init__ now reads model.max_tokens from CLI_CONFIG (the HERMES_MAX_TOKENS env var overrides it when set) and stores it as self.max_tokens. Both AIAgent(...) construction sites, the interactive agent and the background agent, pass max_tokens=self.max_tokens.
2. Gateway path (gateway/run.py): a new _resolve_config_max_tokens() helper reads model.max_tokens from ~/.hermes/config.yaml (same env-var override). _resolve_runtime_agent_kwargs() includes the value, and _resolve_turn_agent_config() copies it into the runtime dict so all six AIAgent(...) sites receive it via **turn_route["runtime"] or **runtime_kwargs.
3. Transport layer (ChatCompletionsTransport.build_kwargs): already honours max_tokens when non-None, so no changes were needed.
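The resolution order described above (env var first, then `model.max_tokens`, otherwise `None`) can be sketched roughly as follows. The helper names `resolve_max_tokens` and `_coerce_int` are hypothetical and are not the PR's `_resolve_config_max_tokens()`. The sketch also assumes that an unparseable env value falls through to the config value; the real helper may instead return `None` in that case.

```python
import os
from collections.abc import Mapping
from typing import Any, Optional


def _coerce_int(value: Any) -> Optional[int]:
    """Return value as an int, or None if it is missing or unparseable."""
    if value is None or isinstance(value, bool):
        return None
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def resolve_max_tokens(
    config: Mapping[str, Any],
    environ: Mapping[str, str] = os.environ,
) -> Optional[int]:
    """Hypothetical sketch: env var overrides config, else None."""
    # HERMES_MAX_TOKENS wins when it is set and parses as an integer.
    env_value = _coerce_int(environ.get("HERMES_MAX_TOKENS"))
    if env_value is not None:
        return env_value
    # `model` may be a bare string (a model name) rather than a mapping,
    # in which case there is no max_tokens key to read.
    model_section = config.get("model")
    if not isinstance(model_section, Mapping):
        return None
    # YAML may deliver the value as an int or a string; coerce either.
    return _coerce_int(model_section.get("max_tokens"))
```

Returning `None` rather than a default keeps the "unset" case distinguishable, so a transport that only sends `max_tokens` when it is non-None leaves the provider's default in place.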

Test proof

$ python -m pytest tests/test_max_tokens_config_propagation.py -v --override-ini="addopts="
...
tests/test_max_tokens_config_propagation.py::TestResolveConfigMaxTokens::test_returns_none_when_unset PASSED
tests/test_max_tokens_config_propagation.py::TestResolveConfigMaxTokens::test_reads_max_tokens_from_model_section PASSED
tests/test_max_tokens_config_propagation.py::TestResolveConfigMaxTokens::test_env_var_overrides_config PASSED
tests/test_max_tokens_config_propagation.py::TestResolveConfigMaxTokens::test_env_var_works_without_config_file PASSED
tests/test_max_tokens_config_propagation.py::TestResolveConfigMaxTokens::test_invalid_env_var_returns_none PASSED
tests/test_max_tokens_config_propagation.py::TestResolveRuntimeAgentKwargsMaxTokens::test_max_tokens_included_when_configured PASSED
tests/test_max_tokens_config_propagation.py::TestResolveRuntimeAgentKwargsMaxTokens::test_max_tokens_none_when_not_configured PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_reads_from_model_section PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_env_var_wins_over_config PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_none_when_unset_everywhere PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_invalid_env_var_returns_none PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_integer_coercion_from_string_in_config PASSED
tests/test_max_tokens_config_propagation.py::TestCliMaxTokensLogic::test_string_model_section_returns_none PASSED

13 passed, 1 warning in 0.85s

Files changed

| File | Change |
| --- | --- |
| `cli.py` | Read `model.max_tokens` / `HERMES_MAX_TOKENS`; store as `self.max_tokens`; pass to both `AIAgent` sites |
| `gateway/run.py` | Add `_resolve_config_max_tokens()`; include `max_tokens` in `_resolve_runtime_agent_kwargs()` and `_resolve_turn_agent_config()` |
| `tests/test_max_tokens_config_propagation.py` | 13 new tests covering all three layers |
| `CHANGELOG.md` | Unreleased entry |

🤖 Generated with Claude Code

…ousResearch#20741)

model.max_tokens documented in cli-config.yaml.example was silently ignored: it was never read from config and never forwarded to AIAgent.__init__() or ChatCompletionsTransport.build_kwargs(). On providers without a hardcoded output cap (Ollama Cloud, zai, custom OpenAI-compatible endpoints) the parameter was omitted from the API call entirely, causing truncation.

Three gaps fixed:

1. cli.py: HermesCLI.__init__ now reads model.max_tokens from CLI_CONFIG (with HERMES_MAX_TOKENS env-var override) and stores it as self.max_tokens. Both the interactive and background AIAgent construction sites pass max_tokens=self.max_tokens.
2. gateway/run.py: new _resolve_config_max_tokens() helper reads model.max_tokens from ~/.hermes/config.yaml (with HERMES_MAX_TOKENS env-var override). _resolve_runtime_agent_kwargs() returns the value, and _resolve_turn_agent_config() copies it into the runtime dict so all six AIAgent(...) call sites receive it via **turn_route["runtime"] or **runtime_kwargs.
3. ChatCompletionsTransport.build_kwargs() already honours max_tokens when non-None, so no changes were needed there.

Also adds tests/test_max_tokens_config_propagation.py with 13 tests covering _resolve_config_max_tokens, _resolve_runtime_agent_kwargs, and the CLI max_tokens resolution logic (all 13 pass).

Fixes: NousResearch#20741
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

teknium1 · 2026-06-11T06:55:39Z

Automated hermes-sweeper review: this PR's fix is already implemented on main via the merged #39864 path.

Evidence:

cli.py:3262 on origin/main reads HERMES_MAX_TOKENS and model.max_tokens into self.max_tokens.
hermes_cli/cli_agent_setup_mixin.py:352 passes max_tokens=self.max_tokens into AIAgent, and the CLI background path passes turn_route["runtime"].get("max_tokens").
gateway/run.py:1320 resolves the gateway precedence chain and returns max_tokens; gateway/run.py also threads it through turn runtime.
tests/gateway/test_max_tokens_propagation.py covers #20741 and the precedence chain.
#20741 was closed by a maintainer as fixed in #39864, which is merged to main and included in v2026.6.5.

Thanks for the original fix and test work here. The main-branch implementation preserves the same core behavior and widens it to the per-provider cap path as well.

alt-glitch added the type/bug, P2, comp/cli, comp/gateway, and area/config labels May 6, 2026

alt-glitch mentioned this pull request May 6, 2026

fix(gateway): propagate max_tokens from config.yaml to AIAgent #20804

Closed

digitalbase mentioned this pull request May 21, 2026

feat(config): add per-model max_tokens overlay #29705

Open

alt-glitch mentioned this pull request Jun 5, 2026

fix(gateway): propagate max_tokens from config.yaml to AIAgent (#20741) #39864

Merged

teknium1 closed this Jun 11, 2026

teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 11, 2026

Beandon13 wants to merge 1 commit into NousResearch:main from Beandon13:fix/hermes-20741-max-tokens-config
