Skip to content

fix(agent): honor model.max_tokens in config.yaml#19515

Closed
CoreyNoDream wants to merge 1 commit into
NousResearch:mainfrom
CoreyNoDream:fix/config-max-tokens-19360
Closed

fix(agent): honor model.max_tokens in config.yaml#19515
CoreyNoDream wants to merge 1 commit into
NousResearch:mainfrom
CoreyNoDream:fix/config-max-tokens-19360

Conversation

@CoreyNoDream

Copy link
Copy Markdown
Contributor

Summary

Fixes #19360 — all tool-using agents fail with HTTP 400 when routed through an OpenAI-compatible proxy to Anthropic models, because Hermes sends chat completion requests without max_tokens, which Anthropic's Messages API requires when tools is present.

  • AIAgent.__init__ accepts max_tokens but no caller ever passes it — so self.max_tokens was always None and the OpenAI SDK omitted the field entirely.
  • This PR lets users set it via config.yamlmodel.max_tokens, mirroring the existing model.context_length block right above the new code in run_agent.py.
  • Validation: non-int / "4K" / non-positive values log a clear warning and fall back to the model default (same shape as the context_length warning) rather than crashing.

Config usage

model:
  default: claude-opus-4-6-thinking
  provider: custom
  base_url: http://192.168.1.13:8045/v1
  api_key: sk-xxx
  max_tokens: 16384   # new

Test plan

  • New regression suite tests/run_agent/test_max_tokens_config.py (5 cases: valid int, numeric string, "4K" warn, 0 warn, absent → default None).
  • Verified the new tests fail on main without the fix (4/5 fail — only the None-default case passes, which is the pre-existing behavior) and all pass with it.
  • Sibling suite test_invalid_context_length_warning.py still green — the new block only reuses _model_cfg, doesn't alter the existing _config_context_length flow.

AIAgent's constructor accepted max_tokens but no caller passed it, so
self.max_tokens was always None and the OpenAI SDK omitted the field
from chat completion requests.

When routing to an Anthropic model via an OpenAI-compatible proxy
(e.g. LiteLLM, Antigravity Manager) with tools enabled, Anthropic's
Messages API rejects the translated request with HTTP 400 because
max_tokens is required in that path. Users had no way to set it.

Mirror the existing model.context_length reading block: read
_model_cfg.max_tokens, validate as a positive integer, surface a
clear warning on bad values ("4K", 0, -1, etc.), and assign to
self.max_tokens when valid.

Fixes NousResearch#19360.
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder area/config Config system, migrations, profiles duplicate This issue or pull request already exists labels May 4, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #19452 — same fix for model.max_tokens config.yaml wiring for AIAgent. Both address #19360.

1 similar comment
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #19452 — same fix for model.max_tokens config.yaml wiring for AIAgent. Both address #19360.

@teknium1

teknium1 commented May 7, 2026

Copy link
Copy Markdown
Contributor

Thanks for the PR. Closing as duplicate of #19452 (by @LeonSGP43), which was merged via #21298. Both PRs solve the same model.max_tokens plumbing — @LeonSGP43's landed first and also covers the gateway agent-cache signature update. Feel free to resubmit any additional tests you think are missing.

@teknium1 teknium1 closed this May 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config Config system, migrations, profiles comp/agent Core agent loop, run_agent.py, prompt builder duplicate This issue or pull request already exists P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Missing max_tokens in API request body causes HTTP 400 when tools are used with custom OpenAI-compatible proxy

3 participants