fix(agent): honor model.max_tokens in config.yaml#19515
Closed
CoreyNoDream wants to merge 1 commit into
Closed
Conversation
AIAgent's constructor accepted max_tokens but no caller passed it, so
self.max_tokens was always None and the OpenAI SDK omitted the field
from chat completion requests.
When routing to an Anthropic model via an OpenAI-compatible proxy
(e.g. LiteLLM, Antigravity Manager) with tools enabled, Anthropic's
Messages API rejects the translated request with HTTP 400 because
max_tokens is required in that path. Users had no way to set it.
Mirror the existing model.context_length reading block: read
_model_cfg.max_tokens, validate as a positive integer, surface a
clear warning on bad values ("4K", 0, -1, etc.), and assign to
self.max_tokens when valid.
Fixes NousResearch#19360.
Collaborator
1 similar comment
Collaborator
Contributor
|
Thanks for the PR. Closing as duplicate of #19452 (by @LeonSGP43), which was merged via #21298. Both PRs solve the same |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #19360 — all tool-using agents fail with HTTP 400 when routed through an OpenAI-compatible proxy to Anthropic models, because Hermes sends chat completion requests without
max_tokens, which Anthropic's Messages API requires whentoolsis present.AIAgent.__init__acceptsmax_tokensbut no caller ever passes it — soself.max_tokenswas alwaysNoneand the OpenAI SDK omitted the field entirely.config.yaml→model.max_tokens, mirroring the existingmodel.context_lengthblock right above the new code inrun_agent.py."4K"/ non-positive values log a clear warning and fall back to the model default (same shape as thecontext_lengthwarning) rather than crashing.Config usage
Test plan
tests/run_agent/test_max_tokens_config.py(5 cases: valid int, numeric string,"4K"warn,0warn, absent → defaultNone).mainwithout the fix (4/5 fail — only theNone-default case passes, which is the pre-existing behavior) and all pass with it.test_invalid_context_length_warning.pystill green — the new block only reuses_model_cfg, doesn't alter the existing_config_context_lengthflow.