fix(transport): add default max_tokens for custom providers with tools#19686
fix(transport): add default max_tokens for custom providers with tools#19686Cyrene963 wants to merge 2 commits into
Conversation
When a custom OpenAI-compatible proxy forwards to Anthropic models, the request fails with HTTP 400 if tools are present but max_tokens is not set. Anthropic's Messages API requires max_tokens when tool schemas are included. Add a fallback: when provider is custom, tools are present, and no max_tokens is configured, default to 4096. This matches the Bedrock transport pattern (self.max_tokens or 4096). Fixes NousResearch#19360
|
Likely duplicate of #19452 |
|
Closing as duplicate of #19452 (LeonSGP43) — same root cause (custom provider missing max_tokens when tools present, HTTP 400 on Anthropic proxies). Local patch remains active in Cyrene963/hermes-patches until #19452 is merged. |
|
Re-evaluating closure status for #19686 I closed this as a duplicate earlier, but I rechecked the referenced upstream PR(s) and none of them are merged yet:
Because the underlying fix does not appear to have landed upstream, closing this solely as a duplicate may have been premature. I am reopening this PR so it can remain trackable unless maintainers prefer a different canonical PR. |
|
This looks implemented on current Evidence from this automated hermes-sweeper review:
The earlier duplicate PRs mentioned in the thread (#19452 and #19515) were not merged, but the underlying missing- |
Summary
Fixes #19360
When Hermes Agent is configured with a custom OpenAI-compatible proxy that forwards to Anthropic models, all API requests fail with HTTP 400 when tools are present but
max_tokensis not set.Root Cause
Anthropic's Messages API requires
max_tokenswhen tool schemas are included. The chat_completions transport already has provider-specific defaults for NVIDIA NIM (16384), Qwen (65536), and Kimi (32000), but custom providers had no fallback.Fix
Added a fallback in
agent/transports/chat_completions.py: when the provider is custom, tools are present, and nomax_tokensis configured, default to 4096. This matches the existing Bedrock transport pattern (self.max_tokens or 4096).Behavior
max_tokens→ sendsmax_tokens: 4096max_tokenssent (unchanged)max_tokens→ uses configured value (unchanged)Closes #19360