fix(gateway): honor custom_providers max_tokens when constructing AIAgent #20121

Closed
konsisumer wants to merge 1 commit into NousResearch:main from konsisumer:fix/issue-20004-custom-providers-max-tokens

Conversation

@konsisumer

Contributor

Per-provider max_tokens set under custom_providers (or the new-style providers dict) was dropped during config normalization and never reached AIAgent, so the gateway always used provider transport defaults regardless of the user's cap.

What changed and why

  • hermes_cli/config.py: add max_tokens to _KNOWN_KEYS in _normalize_custom_provider_entry and preserve positive int values in the normalized entry — without this, the key was dropped (and a spurious "unknown config keys" warning was logged).
  • hermes_cli/runtime_provider.py: propagate max_tokens from _get_named_custom_provider (legacy list, v12 dict, and credential-pool branches) and from _resolve_named_custom_runtime so the resolved runtime dict carries the cap.
  • gateway/run.py: include max_tokens in _resolve_runtime_agent_kwargs and the fallback-provider helper (with a fallback to top-level model.max_tokens), and forward it through _resolve_turn_agent_config so AIAgent(**turn_route["runtime"]) receives the value.
  • tests/hermes_cli/test_custom_provider_max_tokens.py: 10 new tests covering normalization (positive int, zero/negative rejection, non-int rejection, no spurious unknown-key warning), runtime propagation through the legacy list and v12 dict paths, omission semantics, and gateway precedence (runtime wins, falls back to model.max_tokens, returns None when neither is set).
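The normalization rule in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `KNOWN_KEYS` and `normalize_entry` are hypothetical stand-ins for `_KNOWN_KEYS` and `_normalize_custom_provider_entry`, the other whitelisted keys are assumed, and excluding `bool` values is also an assumption.

```python
from __future__ import annotations

from typing import Any

# Hypothetical stand-in for _KNOWN_KEYS; the real set is not shown in this PR.
KNOWN_KEYS = {"name", "base_url", "api_key", "max_tokens"}


def normalize_entry(entry: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return (normalized entry, unknown keys) for one custom provider entry."""
    # Keys outside the whitelist are what would trigger an
    # "unknown config keys" warning in the real normalizer.
    unknown = sorted(key for key in entry if key not in KNOWN_KEYS)

    normalized: dict[str, Any] = {}
    for key in ("name", "base_url", "api_key"):
        if key in entry:
            normalized[key] = entry[key]

    # Preserve max_tokens only when it is a positive int.
    # Assumption: bool is rejected even though it subclasses int.
    max_tokens = entry.get("max_tokens")
    if (
        isinstance(max_tokens, int)
        and not isinstance(max_tokens, bool)
        and max_tokens > 0
    ):
        normalized["max_tokens"] = max_tokens

    return normalized, unknown
```

With this shape, a zero, negative, or non-int `max_tokens` is silently omitted from the normalized entry, which matches the omission semantics the new tests are described as covering.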

Precedence is now: custom_providers[].max_tokens (carried on the runtime dict) → model.max_tokens (global) → None (provider transport default).
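As a rough sketch of that precedence chain (the function name `resolve_max_tokens` and the plain-dict shapes of `runtime` and `config` are assumptions for illustration, not the helpers touched in gateway/run.py):

```python
from __future__ import annotations

from typing import Any, Optional


def resolve_max_tokens(runtime: dict[str, Any], config: dict[str, Any]) -> Optional[int]:
    """Pick the max_tokens value to hand to the agent, or None for the default."""
    # 1. A per-provider cap carried on the resolved runtime dict wins.
    runtime_cap = runtime.get("max_tokens")
    if runtime_cap is not None:
        return runtime_cap

    # 2. Otherwise fall back to the global model.max_tokens, if configured.
    model_cfg = config.get("model")
    if isinstance(model_cfg, dict) and model_cfg.get("max_tokens") is not None:
        return model_cfg["max_tokens"]

    # 3. None means "let the provider transport use its own default".
    return None
```

Returning `None` rather than a hard-coded number keeps the provider transport default in charge whenever the user has configured no cap at either level.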

How to test

  • pytest tests/hermes_cli/test_custom_provider_max_tokens.py -q (10 passed locally)
  • pytest tests/hermes_cli/test_runtime_provider_resolution.py tests/hermes_cli/test_custom_provider_context_length.py tests/hermes_cli/test_config.py -q (173 passed)
  • Broader sweep: pytest tests/hermes_cli/ tests/gateway/ -q shows only pre-existing platform/flaky failures (systemd D-Bus on macOS, whatsapp/discord adapter tests, an SSE-keepalive timing test) that also fail on main.
  • Manual: set custom_providers: [{name: ark, base_url: ..., max_tokens: 131072}] and confirm via gateway logs that the agent's max_tokens is 131072 instead of the provider default.

What platforms tested on

  • macOS on darwin-arm64 (local)

Fixes #20004

…gent

The per-provider max_tokens cap set in custom_providers (or the new-style
providers dict) was dropped during config normalization and never reached
AIAgent. The gateway therefore fell back to the provider transport default
even when the user had explicitly raised the cap. Whitelist max_tokens in
the normalizer, propagate it through runtime provider resolution, and
forward it via the gateway runtime dict with a fallback to model.max_tokens
so a global cap is still honoured. Fixes NousResearch#20004.
@alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), comp/cli (CLI entry point, hermes_cli/, setup wizard), and area/config (Config system, migrations, profiles) labels on May 5, 2026
@alt-glitch

Collaborator

Related to #20004, #19782, #19452 — part of the ongoing max_tokens propagation fix cluster for custom providers.

@konsisumer

Contributor Author

Closing in favor of #20149 by @Sanjays2402, which addresses the same issue. Reopen if that PR stalls.

@konsisumer konsisumer closed this May 6, 2026

Development

Successfully merging this pull request may close these issues.

max_tokens config from custom_providers is not passed to AIAgent

2 participants