fix(transport): add default max_tokens for custom providers with tools by Cyrene963 · Pull Request #19686 · NousResearch/hermes-agent

Cyrene963 · 2026-05-04T10:58:30Z

Summary

Fixes #19360

When Hermes Agent is configured with a custom OpenAI-compatible proxy that forwards to Anthropic models, every API request that includes tools but does not set max_tokens fails with HTTP 400.

Root Cause

Anthropic's Messages API requires max_tokens when tool schemas are included. The chat_completions transport already has provider-specific defaults for NVIDIA NIM (16384), Qwen (65536), and Kimi (32000), but custom providers had no fallback.
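The gap can be pictured as a lookup table with no entry for custom providers. This is a minimal sketch, not the code in chat_completions.py: the table name, the provider keys, and the helper function are hypothetical, and only the three numeric defaults come from the description above.

```python
from typing import Dict, Optional

# Hypothetical table: only the numeric defaults are taken from the PR text.
# The key spellings are assumptions made for illustration.
PROVIDER_DEFAULT_MAX_TOKENS: Dict[str, int] = {
    "nvidia-nim": 16384,
    "qwen": 65536,
    "kimi": 32000,
}


def provider_default_max_tokens(provider: str) -> Optional[int]:
    """Return the provider-specific default, or None when none is defined."""
    return PROVIDER_DEFAULT_MAX_TOKENS.get(provider)


# A custom provider has no entry, so nothing supplies max_tokens for it.
missing = provider_default_max_tokens("custom")
```

With a table like this, `provider_default_max_tokens("custom")` yields `None`, which corresponds to the request going out without max_tokens.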

Fix

Added a fallback in agent/transports/chat_completions.py: when the provider is custom, tools are present, and no max_tokens is configured, default to 4096. This matches the existing Bedrock transport pattern (self.max_tokens or 4096).

Behavior

Custom provider with tools and no configured max_tokens → sends max_tokens: 4096
Custom provider without tools → no max_tokens sent (unchanged)
Custom provider with configured max_tokens → uses configured value (unchanged)
Non-custom providers → unchanged (existing provider-specific defaults)
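The four cases above can be sketched as a small pure function. This is an illustration under stated assumptions, not the patch itself: the function name, its parameters, and the payload shape are hypothetical, and the existing provider-specific defaults for non-custom providers are left out.

```python
from typing import Any, Dict, Optional

DEFAULT_CUSTOM_MAX_TOKENS = 4096  # fallback value named in the PR description


def apply_custom_max_tokens(
    payload: Dict[str, Any],
    *,
    is_custom_provider: bool,
    configured_max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a copy of the payload with max_tokens set per the cases above.

    Hypothetical helper: the real transport builds its request differently.
    """
    result = dict(payload)
    if configured_max_tokens is not None:
        # A configured value always wins.
        result["max_tokens"] = configured_max_tokens
    elif is_custom_provider and result.get("tools"):
        # Custom provider, tools present, nothing configured: use the fallback.
        result["max_tokens"] = DEFAULT_CUSTOM_MAX_TOKENS
    # Otherwise the payload is left as-is.
    return result


tools = [{"type": "function", "function": {"name": "lookup"}}]
with_tools = apply_custom_max_tokens({"tools": tools}, is_custom_provider=True)
without_tools = apply_custom_max_tokens({"messages": []}, is_custom_provider=True)
```

Here `with_tools` carries `max_tokens: 4096`, while `without_tools` has no max_tokens key at all.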

Closes #19360

When a custom OpenAI-compatible proxy forwards to Anthropic models, the request fails with HTTP 400 if tools are present but max_tokens is not set. Anthropic's Messages API requires max_tokens when tool schemas are included. Add a fallback: when provider is custom, tools are present, and no max_tokens is configured, default to 4096. This matches the Bedrock transport pattern (self.max_tokens or 4096). Fixes NousResearch#19360

alt-glitch · 2026-05-04T11:01:40Z

Likely duplicate of #19452 — same root cause (custom provider missing max_tokens when tools present, HTTP 400 on Anthropic proxies). Multiple competing PRs for #19360 already open: #19452, #19515.

alt-glitch · 2026-05-04T11:02:21Z

Likely duplicate of #19452

Cyrene963 · 2026-05-04T14:20:41Z

Closing as duplicate of #19452 (LeonSGP43) — same root cause (custom provider missing max_tokens when tools present, HTTP 400 on Anthropic proxies).

Local patch remains active in Cyrene963/hermes-patches until #19452 is merged.

Cyrene963 · 2026-05-25T08:52:31Z

Re-evaluating closure status for #19686

I closed this as a duplicate earlier, but I rechecked the referenced upstream PR(s) and none of them are merged yet:

#19452 (fix(agent): honor configured model max_tokens): closed, merged=False
#19515 (fix(agent): honor model.max_tokens in config.yaml): closed, merged=False

Because the underlying fix does not appear to have landed upstream, closing this solely as a duplicate may have been premature. I am reopening this PR so it can remain trackable unless maintainers prefer a different canonical PR.

teknium1 · 2026-06-11T04:15:36Z

This looks implemented on current main by the newer custom-provider profile path.

Evidence from this automated hermes-sweeper review:

plugins/model-providers/custom/__init__.py:66-70 now gives the custom provider a default_max_tokens=65536 fallback when no model-specific max_tokens is configured. This was added in 09ec26c66a130051412e747d49a7ea96f2862b57.
agent/transports/chat_completions.py:515-522 reads profile.get_max_tokens(model) and emits it through max_tokens_param_fn when neither an ephemeral nor user-configured max token value is present.
providers/base.py:148-160 confirms the default provider hook returns self.default_max_tokens.
A local read-only check against current main showed ChatCompletionsTransport with provider_profile=get_provider_profile("custom"), tools present, and no configured max tokens produces max_tokens=65536 while preserving tools.
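The resolution order described in these points can be sketched roughly as follows. This is not the code on main: the class and function below are hypothetical stand-ins, the per-model mapping is an assumed way to represent "model-specific max_tokens", and the relative order of the ephemeral and user-configured values is an assumption. Only default_max_tokens, get_max_tokens, and the 65536 value come from the review text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProviderProfile:
    """Hypothetical stand-in for a provider profile."""

    name: str
    default_max_tokens: Optional[int] = None
    # Assumed representation of model-specific limits.
    model_max_tokens: Dict[str, int] = field(default_factory=dict)

    def get_max_tokens(self, model: str) -> Optional[int]:
        # Model-specific value first, then the provider-wide default.
        return self.model_max_tokens.get(model, self.default_max_tokens)


def resolve_max_tokens(
    profile: ProviderProfile,
    model: str,
    ephemeral: Optional[int] = None,
    configured: Optional[int] = None,
) -> Optional[int]:
    """Pick the first value that is set; fall back to the profile default."""
    for candidate in (ephemeral, configured):  # order is an assumption
        if candidate is not None:
            return candidate
    return profile.get_max_tokens(model)


custom = ProviderProfile(name="custom", default_max_tokens=65536)
resolved = resolve_max_tokens(custom, "some-model")
```

With no ephemeral or configured value, `resolved` is the profile default of 65536, matching the result reported in the read-only check.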

The earlier duplicate PRs mentioned in the thread (#19452 and #19515) were not merged, but the underlying missing-max_tokens behavior is covered by the later mainline implementation.

alt-glitch added labels May 4, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/anthropic (Anthropic native Messages API)

fix: extract is_custom_provider from params before use

5b41994

Cyrene963 closed this May 4, 2026

Cyrene963 reopened this May 25, 2026

teknium1 closed this Jun 11, 2026

teknium1 added the sweeper:implemented-on-main label (Sweeper: behavior already present on current main) Jun 11, 2026

Cyrene963 wants to merge 2 commits into NousResearch:main from Cyrene963:fix/max_tokens_custom_proxy
