[Bug] Missing max_tokens in API request body causes HTTP 400 when tools are used with custom OpenAI-compatible proxy #19360

@autodeal

Bug Description

When Hermes Agent is configured with a custom provider pointing to an OpenAI-compatible reverse proxy (e.g., Antigravity Manager, LiteLLM) that forwards to Anthropic models, ALL API requests fail with HTTP 400 "Request contains an invalid argument" if the request body includes tools (function calling).
Root cause: Hermes sends chat completion requests without max_tokens in the JSON body. When the proxy translates OpenAI-format → Anthropic-format, Anthropic's Messages API requires max_tokens when tools are present. Without it, Anthropic rejects the call.
Impact: All tool-using agents fail when routed through OpenAI-compatible proxies to Anthropic models. The issue is easy to miss because:

  • Requests without tools work fine
  • Clients like AutoClaw/OpenClaw that don't send tools are unaffected
  • Direct Anthropic provider works fine (uses native format)
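To make the failure mode concrete, here is a minimal sketch of an OpenAI-to-Anthropic request translation. It is an assumption about how a proxy might behave, not code taken from Antigravity Manager or LiteLLM; `translate_to_anthropic` and the `get_time` tool are hypothetical names used only for illustration. A translator that merely copies fields has no max_tokens to forward when the client sent none.

```python
from typing import Any, Dict


def translate_to_anthropic(openai_body: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical, simplified OpenAI -> Anthropic request translation."""
    body: Dict[str, Any] = {
        "model": openai_body["model"],
        "messages": openai_body["messages"],
    }
    # max_tokens is forwarded only if the client supplied it.
    if openai_body.get("max_tokens") is not None:
        body["max_tokens"] = openai_body["max_tokens"]
    # OpenAI nests tool definitions under "function"; Anthropic uses a flat
    # shape with "input_schema" in place of "parameters".
    if openai_body.get("tools"):
        body["tools"] = [
            {
                "name": tool["function"]["name"],
                "description": tool["function"].get("description", ""),
                "input_schema": tool["function"].get(
                    "parameters", {"type": "object", "properties": {}}
                ),
            }
            for tool in openai_body["tools"]
        ]
    return body


# A request shaped like the one described in this report: tools, no max_tokens.
request = {
    "model": "claude-opus-4-6-thinking",
    "messages": [{"role": "user", "content": "hi"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_time",  # hypothetical tool
                "description": "Return the current time.",
                "parameters": {"type": "object", "properties": {}},
            },
        }
    ],
}
translated = translate_to_anthropic(request)
print(sorted(translated))
```

The translated body carries model, messages, and tools but no max_tokens, which matches the request shape that is rejected upstream.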

Steps to Reproduce

  1. Configure Hermes with provider: custom, base_url pointing to an OpenAI-compatible proxy
  2. Set model to any Anthropic model (e.g. claude-opus-4-6-thinking)
  3. Send any query — Hermes will include tools array in the request
  4. Observe HTTP 400: "Request contains an invalid argument."

Expected Behavior

Hermes should do one of the following:

  • (a) Expose model.max_tokens in config.yaml so users can set a value
  • (b) Automatically include a sensible max_tokens default (e.g., 4096) when tools are present with a custom provider

Actual Behavior

No max_tokens is sent. The AIAgent constructor accepts max_tokens (default None = use model default), but:

  1. There is no way to set it via config.yaml
  2. When None, the OpenAI SDK omits the field entirely
  3. The "model default" assumption breaks for Anthropic-through-proxy scenarios
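The omission described in the list above can be illustrated with a small sketch. This is not Hermes's actual request-building code or the OpenAI SDK's internals; `build_chat_body` is a hypothetical helper that mimics the reported behavior, where a None value results in the field being left out of the JSON body rather than sent as null.

```python
import json
from typing import Any, Dict, List, Optional


def build_chat_body(
    model: str,
    messages: List[Dict[str, Any]],
    tools: Optional[List[Dict[str, Any]]] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: optional fields are dropped when unset."""
    body: Dict[str, Any] = {"model": model, "messages": messages}
    if tools:
        body["tools"] = tools
    # None means "use the model default", so the key is omitted entirely.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body


messages = [{"role": "user", "content": "hi"}]
default_body = build_chat_body("claude-opus-4-6-thinking", messages)
explicit_body = build_chat_body("claude-opus-4-6-thinking", messages, max_tokens=4096)

print(json.dumps(default_body))
print(json.dumps(explicit_body))
```

Only the second body contains a max_tokens key, so any downstream component that needs the field must receive an explicit value from the caller.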

Affected Component

Other, Agent Core (conversation loop, context compression, memory)

Debug Report

Working (curl with max_tokens):

curl -s http://proxy:8045/v1/chat/completions \
  -H "Authorization: Bearer sk-xxx" \
  -d '{"model":"claude-opus-4-6-thinking","messages":[{"role":"user","content":"hi"}],"tools":[...],"max_tokens":4096}'
# → 200 OK
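
For reproducing the working request from Python instead of curl, a rough equivalent using only the standard library might look like the sketch below. The original command elides the tools array, so the `get_time` tool here is a hypothetical placeholder, and `build_request` is an illustrative helper name. Unlike the curl command, the sketch sets a Content-Type header explicitly. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholder: the tools array is elided in the original report.
PLACEHOLDER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current time.",
        "parameters": {"type": "object", "properties": {}},
    },
}


def build_request(base_url: str, api_key: str, max_tokens: int = 4096) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with max_tokens set."""
    payload = {
        "model": "claude-opus-4-6-thinking",
        "messages": [{"role": "user", "content": "hi"}],
        "tools": [PLACEHOLDER_TOOL],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("http://proxy:8045/v1", "sk-xxx")
print(req.full_url)
# To send it against a reachable proxy: urllib.request.urlopen(req)
```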

Operating System

macOS 15.2

Hermes Version

v0.12.0 (2026.4.30, commit 167b564)

Proposed Fix (optional)

Approach 1 — Expose max_tokens in config.yaml:
In run_agent.py around line 1864, add:
_model_cfg = _agent_cfg.get("model", {})
if isinstance(_model_cfg, dict):
    _config_context_length = _model_cfg.get("context_length")
    # NEW:
    _config_max_tokens = _model_cfg.get("max_tokens")
    if _config_max_tokens is not None:
        try:
            self.max_tokens = int(_config_max_tokens)
        except (TypeError, ValueError):
            pass

User config would then support:
model:
  default: claude-opus-4-6-thinking
  provider: custom
  base_url: http://192.168.1.13:8045/v1
  api_key: sk-xxx
  max_tokens: 16384
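
One way to keep the Approach 1 parsing testable is to isolate it in a small function. The sketch below assumes the model section of the config has already been loaded into a dict; `resolve_max_tokens` is a hypothetical name, and the rejection of non-positive values is an extra guard that is not part of the proposed patch.

```python
from typing import Any, Optional


def resolve_max_tokens(model_cfg: Any) -> Optional[int]:
    """Return a usable max_tokens from the model config, or None if unset/invalid."""
    if not isinstance(model_cfg, dict):
        return None
    raw = model_cfg.get("max_tokens")
    if raw is None:
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Unparseable values fall back to "not configured".
        return None
    # Extra guard (assumption): zero or negative limits are treated as unset.
    return value if value > 0 else None


model_cfg = {
    "default": "claude-opus-4-6-thinking",
    "provider": "custom",
    "max_tokens": "16384",  # YAML values may arrive as strings
}
print(resolve_max_tokens(model_cfg))
```

Returning None for invalid input preserves the existing "use model default" behavior, so a typo in config.yaml would not break agents that currently work.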

Approach 2 — Smarter default: when using a custom provider with tools, automatically set max_tokens to a reasonable value (e.g., 4096) rather than omitting it.
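
A minimal sketch of Approach 2 follows, assuming the provider name and tool list are available where the request is built. `effective_max_tokens` and `DEFAULT_TOOL_MAX_TOKENS` are hypothetical names; the 4096 value is the example figure from this report, not a tested recommendation.

```python
from typing import Any, Dict, List, Optional

# Example default from this report; the right value may differ per model.
DEFAULT_TOOL_MAX_TOKENS = 4096


def effective_max_tokens(
    provider: str,
    tools: Optional[List[Dict[str, Any]]],
    configured: Optional[int] = None,
) -> Optional[int]:
    """Pick the max_tokens to send, or None to omit the field."""
    # An explicit user setting always wins.
    if configured is not None:
        return configured
    # Fallback only for the failing combination: custom provider plus tools.
    if provider == "custom" and tools:
        return DEFAULT_TOOL_MAX_TOKENS
    # Everything else keeps today's behavior (field omitted).
    return None


tools = [{"type": "function", "function": {"name": "get_time"}}]  # hypothetical tool
print(effective_max_tokens("custom", tools))
print(effective_max_tokens("custom", tools, configured=16384))
print(effective_max_tokens("anthropic", tools))
```

Limiting the fallback to the custom provider keeps native providers unchanged, and the two approaches compose: a value set through config.yaml takes precedence over the default.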

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Labels

  • P2 (Medium; degraded but workaround exists)
  • area/config (Config system, migrations, profiles)
  • comp/agent (Core agent loop, run_agent.py, prompt builders)
  • sweeper:implemented-on-main (Sweeper: behavior already present on current main)
  • type/bug (Something isn't working)