fix(transport): add default max_tokens for custom providers with tools #19686

Closed
Cyrene963 wants to merge 2 commits into NousResearch:main from Cyrene963:fix/max_tokens_custom_proxy

Conversation

@Cyrene963

Summary

Fixes #19360

When Hermes Agent is configured with a custom OpenAI-compatible proxy that forwards to Anthropic models, all API requests fail with HTTP 400 when tools are present but max_tokens is not set.

Root Cause

Anthropic's Messages API requires max_tokens when tool schemas are included. The chat_completions transport already has provider-specific defaults for NVIDIA NIM (16384), Qwen (65536), and Kimi (32000), but custom providers had no fallback.

Fix

Added a fallback in agent/transports/chat_completions.py: when the provider is custom, tools are present, and no max_tokens is configured, default to 4096. This matches the existing Bedrock transport pattern (self.max_tokens or 4096).

Behavior

  • Custom provider with tools and no configured max_tokens → sends max_tokens: 4096
  • Custom provider without tools → no max_tokens sent (unchanged)
  • Custom provider with configured max_tokens → uses configured value (unchanged)
  • Non-custom providers → unchanged (existing provider-specific defaults)
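
The four cases above can be sketched as a small decision function. This is a minimal illustration, not the code from the PR: the function name, its signature, and the constant name are hypothetical, and only the value 4096 and the three conditions (custom provider, tools present, no configured max_tokens) come from the description. It models only the new custom-provider fallback; the existing provider-specific defaults are left out.

```python
from typing import Any, Dict, List, Optional

# Hypothetical constant name; 4096 is the fallback value stated in this PR.
CUSTOM_TOOLS_DEFAULT_MAX_TOKENS = 4096


def custom_fallback_max_tokens(
    provider: str,
    tools: Optional[List[Dict[str, Any]]],
    configured_max_tokens: Optional[int],
) -> Optional[int]:
    """Return the max_tokens value to send, or None to send nothing.

    Illustrative only: existing per-provider defaults are not modeled here,
    so None for a non-custom provider means "this fallback adds nothing".
    """
    # A configured value always wins.
    if configured_max_tokens is not None:
        return configured_max_tokens
    # New fallback: custom provider, tools present, nothing configured.
    if provider == "custom" and tools:
        return CUSTOM_TOOLS_DEFAULT_MAX_TOKENS
    # Every other case is left unchanged.
    return None


if __name__ == "__main__":
    tools = [{"type": "function", "function": {"name": "lookup"}}]
    print(custom_fallback_max_tokens("custom", tools, None))
    print(custom_fallback_max_tokens("custom", None, None))
    print(custom_fallback_max_tokens("custom", tools, 8192))
```

Returning None rather than a sentinel number keeps the "no max_tokens sent" cases distinguishable from a real limit, so the caller can omit the field entirely.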

Closes #19360

When a custom OpenAI-compatible proxy forwards to Anthropic models,
the request fails with HTTP 400 if tools are present but max_tokens
is not set. Anthropic's Messages API requires max_tokens when tool
schemas are included.

Add a fallback: when provider is custom, tools are present, and no
max_tokens is configured, default to 4096. This matches the Bedrock
transport pattern (self.max_tokens or 4096).

Fixes NousResearch#19360
@alt-glitch added the type/bug, P2, comp/agent, and provider/anthropic labels on May 4, 2026
@alt-glitch

Collaborator

Likely duplicate of #19452 — same root cause (custom provider missing max_tokens when tools present, HTTP 400 on Anthropic proxies). Multiple competing PRs for #19360 already open: #19452, #19515.

@alt-glitch

Collaborator

Likely duplicate of #19452

@Cyrene963

Author

Closing as duplicate of #19452 (LeonSGP43) — same root cause (custom provider missing max_tokens when tools present, HTTP 400 on Anthropic proxies).

Local patch remains active in Cyrene963/hermes-patches until #19452 is merged.

@Cyrene963 closed this on May 4, 2026
@Cyrene963 reopened this on May 25, 2026
@Cyrene963

Author

Re-evaluating closure status for #19686

I closed this as a duplicate earlier, but I rechecked the referenced upstream PR(s) and none of them are merged yet.

Because the underlying fix does not appear to have landed upstream, closing this solely as a duplicate may have been premature. I am reopening this PR so it can remain trackable unless maintainers prefer a different canonical PR.

@teknium1

Contributor

This looks implemented on current main by the newer custom-provider profile path.

Evidence from this automated hermes-sweeper review:

  • plugins/model-providers/custom/__init__.py:66-70 now gives the custom provider a default_max_tokens=65536 fallback when no model-specific max_tokens is configured. This was added in 09ec26c66a130051412e747d49a7ea96f2862b57.
  • agent/transports/chat_completions.py:515-522 reads profile.get_max_tokens(model) and emits it through max_tokens_param_fn when neither an ephemeral nor user-configured max token value is present.
  • providers/base.py:148-160 confirms the default provider hook returns self.default_max_tokens.
  • A local read-only check against current main showed ChatCompletionsTransport with provider_profile=get_provider_profile("custom"), tools present, and no configured max tokens produces max_tokens=65536 while preserving tools.
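
The resolution order described in the evidence above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code on main: the `ProviderProfile` class, its `model_max_tokens` field, and the `max_tokens_kwargs` helper are hypothetical; the names `default_max_tokens`, `get_max_tokens`, and `max_tokens_param_fn` and the value 65536 come from the comment. The relative priority of the ephemeral and user-configured values is also an assumption, and `max_tokens_param_fn` is modeled as a callable that returns the request fields to emit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ProviderProfile:
    """Hypothetical stand-in for a provider profile."""

    name: str
    default_max_tokens: Optional[int] = None
    # Hypothetical per-model overrides; not described in the thread.
    model_max_tokens: Dict[str, int] = field(default_factory=dict)

    def get_max_tokens(self, model: str) -> Optional[int]:
        # Model-specific value if present, else the provider-wide default.
        return self.model_max_tokens.get(model, self.default_max_tokens)


def max_tokens_kwargs(
    profile: ProviderProfile,
    model: str,
    ephemeral: Optional[int],
    configured: Optional[int],
    max_tokens_param_fn: Callable[[int], Dict[str, int]],
) -> Dict[str, int]:
    """Pick a max token value and emit it through max_tokens_param_fn."""
    # Assumed order: ephemeral override, then user configuration.
    for candidate in (ephemeral, configured):
        if candidate is not None:
            return max_tokens_param_fn(candidate)
    # Neither is set, so fall back to the provider profile.
    fallback = profile.get_max_tokens(model)
    return max_tokens_param_fn(fallback) if fallback is not None else {}


if __name__ == "__main__":
    custom = ProviderProfile(name="custom", default_max_tokens=65536)
    emit = lambda n: {"max_tokens": n}
    print(max_tokens_kwargs(custom, "some-model", None, None, emit))
```

Putting the default on the profile rather than in the transport means the fallback applies whether or not tools are present, which differs from the tools-only condition proposed in this PR.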

The earlier duplicate PRs mentioned in the thread (#19452 and #19515) were not merged, but the underlying missing-max_tokens behavior is covered by the later mainline implementation.

@teknium1 closed this on Jun 11, 2026
@teknium1 added the sweeper:implemented-on-main label on Jun 11, 2026

Labels

comp/agent, P2, provider/anthropic, sweeper:implemented-on-main, type/bug

Development

Successfully merging this pull request may close these issues.

[Bug] Missing max_tokens in API request body causes HTTP 400 when tools are used with custom OpenAI-compatible proxy

3 participants