fix(agent): honor configured model max_tokens #19452

Closed
LeonSGP43 wants to merge 1 commit into NousResearch:main from LeonSGP43:codex/model-max-tokens-config

Conversation

@LeonSGP43

Contributor

Summary

  • read model.max_tokens from config.yaml when AIAgent is not constructed with an explicit max_tokens
  • keep constructor-provided max_tokens as the highest-priority value
  • include model.max_tokens in the gateway agent cache signature so config edits rebuild cached agents
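The precedence rule and the cache-busting behavior in the bullets above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `resolve_max_tokens` and `agent_config_signature` are hypothetical helper names, the config is assumed to be a dict already parsed from config.yaml with a `model` mapping, and the `name` key is an assumed placeholder for whatever other fields the real signature covers.

```python
import hashlib
import json
from typing import Any, Optional


def resolve_max_tokens(explicit: Optional[int], config: dict[str, Any]) -> Optional[int]:
    """Pick max_tokens: constructor argument first, then model.max_tokens from config."""
    if explicit is not None:
        # A constructor-provided value always wins over config.
        return explicit
    model_cfg = config.get("model")
    if not isinstance(model_cfg, dict):
        return None
    value = model_cfg.get("max_tokens")
    # Accept only positive integers; bool is excluded because it subclasses int.
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return None


def agent_config_signature(config: dict[str, Any]) -> str:
    """Hash the config fields that should force a cached agent to be rebuilt."""
    model_cfg = config.get("model")
    if not isinstance(model_cfg, dict):
        model_cfg = {}
    relevant = {
        "name": model_cfg.get("name"),  # assumed key, for illustration only
        "max_tokens": model_cfg.get("max_tokens"),
    }
    # sort_keys gives a stable serialization, so equal configs hash equally.
    payload = json.dumps(relevant, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because `max_tokens` is part of the hashed payload, editing it in config.yaml yields a different signature, and a cache keyed on that signature would build a fresh agent instead of reusing the old one.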

Why

Custom OpenAI-compatible proxies that translate tool-using chat-completions requests to Anthropic Messages can reject requests that omit max_tokens. Hermes already had an internal max_tokens path, but users could not set it from config.yaml, so custom-provider setups had no stable way to emit the field.
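As a rough illustration of the failure mode, the sketch below builds a chat-completions request body and includes `max_tokens` only when a value has been resolved. `build_chat_request` is a hypothetical helper, not a Hermes function; the `model`, `messages`, `tools`, and `max_tokens` field names follow the OpenAI chat-completions request format.

```python
from typing import Any, Optional


def build_chat_request(
    model: str,
    messages: list[dict[str, Any]],
    tools: Optional[list[dict[str, Any]]] = None,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a chat-completions request body as a plain dict."""
    body: dict[str, Any] = {"model": model, "messages": messages}
    if tools:
        body["tools"] = tools
    if max_tokens is not None:
        # With no value set, the field is omitted entirely, which is the
        # case a translating proxy may reject.
        body["max_tokens"] = max_tokens
    return body
```

With `max_tokens=None` the field is absent from the body; passing a value set in config.yaml makes it appear, which is what a proxy translating to Anthropic Messages expects.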

Verification

scripts/run_tests.sh \
  tests/run_agent/test_run_agent.py::TestInit::test_model_max_tokens_from_config \
  tests/run_agent/test_run_agent.py::TestInit::test_constructor_max_tokens_wins_over_config \
  tests/gateway/test_agent_cache.py::TestAgentConfigSignature::test_max_tokens_change_busts_cache \
  tests/gateway/test_agent_cache.py::TestExtractCacheBustingConfig::test_reads_model_context_length

Result: 4 passed, with the existing tests/conftest.py event-loop deprecation warning.

Fixes #19360

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • comp/gateway: Gateway runner, session dispatch, delivery
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug] Missing max_tokens in API request body causes HTTP 400 when tools are used with custom OpenAI-compatible proxy

2 participants