[Bug]: On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400 #15080
On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native
Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400:
⚠ API call failed (attempt 1/3): BadRequestError [HTTP 400]
🔌 Provider: anthropic Model: claude-sonnet-4-6
🌐 Endpoint: https://api.anthropic.com
📝 Error: HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
📋 Details: {'type': 'error', 'error': {'type': 'invalid_request_error',
'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep going.",
'request_id': 'req_011CaNQLGvmMm1h3P6yREkBF'}}
⚠ Non-retryable error (HTTP 400) — trying fallback...
❌ Non-retryable client error (HTTP 400). Aborting.
This fires on every agent turn, starting with the first Initializing agent... call on an empty conversation. Retries and model
switches (claude-opus-4-7, claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5) all produce the identical 400.
The error message is misleading: the Max subscription is not exhausted. Response headers from a successful probe against the same
account and token show 5h-utilization: 0.03, 7d-utilization: 0.0, 5h-status: allowed. The account simply does not have
pay-as-you-go API credits (overage-status: rejected, overage-disabled-reason: org_level_disabled_until) — which should be
irrelevant, since requests ought to be served out of the Max lane, not the overage lane.
Isolated testing against the raw Anthropic API (same token, same headers Hermes sends) narrows the trigger down to a single
variable: the presence of the tools parameter in the request body. With tools omitted, /v1/messages returns 200 OK on every model.
With even a single dummy tool present, the same call returns the 400 above. Hermes cannot function without tools, so native
Anthropic + Claude Max OAuth is unusable in this state.
PR #10576's system-prompt sanitizer has been applied locally (verified via post-sanitize payload dump that it reaches the wire). It
does not change the outcome for this trigger path — the 400 is driven by the tools parameter, not by Hermes-specific phrases in
the system prompt. Detailed reproduction, header values, and what I tried are in the other fields.
Steps to Reproduce
Prerequisite
- Active Claude Max (or Pro) subscription with a valid OAuth login via claude /login or claude setup-token (i.e. ~/.claude/.credentials.json exists with a non-expired claudeAiOauth.accessToken).
- No pay-as-you-go API credits on the same Anthropic organization (overage lane disabled; this is the default for Max-only users).
Reproduce via Hermes
1. Ensure a clean Anthropic auth path (no competing tokens): make sure no env token shadows the credential file.
2. Set the active model/provider to native Anthropic:
    hermes config set model.provider anthropic
    hermes config set model.default claude-sonnet-4-6
   (Reproduces identically on claude-opus-4-7, claude-opus-4-6, claude-haiku-4-5.)
3. Launch Hermes in a fresh shell and send any message:
    hermes
    > test
Expected: agent responds.
Actual: BadRequestError [HTTP 400] … You're out of extra usage. on the very first turn, before any tool call is ever invoked. The error is non-retryable and Hermes aborts.
Reproduce in isolation (no Hermes involved)
A script using only the python3 stdlib (listed in full further down in this report) demonstrates that the trigger is the tools parameter in the request body, independent of Hermes. Observed output on an affected account:
tools=False → 200
tools=True → 400 {"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}}
Flipping with_tools is the only change; headers, system prompt, messages, and model stay identical.
What does not change the outcome
For the tools=True case, I verified that none of the following flip the 400 back to 200:
- different user-agent strings (claude-cli/2.1.74, /2.1.119, /2.1.200, without parens, without (external, cli), empty, or a realistic Node-style Claude Code UA)
- removing or changing x-app: cli
- dropping the claude-code-20250219 beta (yields 401 instead, not 200)
- adding or removing the interleaved-thinking / fine-grained-tool-streaming betas
- switching the model to any of opus-4-7, opus-4-6, sonnet-4-6, haiku-4-5
- using an empty system prompt vs. the full post-sanitize Hermes system prompt (also reproduced with PR #10576, "fix(anthropic): sanitize oauth system prompt for Claude Max proxy", applied locally plus additional red-team phrase rewrites; verified via post-sanitize dump that the replacements reach the wire)
Frequency
100% reproducible. Happens on every invocation of hermes against provider: anthropic with Claude Max OAuth once tools are attached.
Expected Behavior
A Claude Max subscriber whose OAuth token authenticates /v1/messages should be able to send tools-carrying requests and have them
served out of the Max lane, not re-classified into the overage lane — at least as long as the subscription budget is not exhausted
and the identity headers Hermes sends (claude-cli/* (external, cli), x-app: cli, Claude Code / OAuth beta headers) are the
supported way for an external OAuth client to present itself.
Concretely, the minimal reproduction script in the previous section should return 200 for both tools=False and tools=True — the
same way the claude CLI and Paperclip's claude-local adapter (which spawn the official CLI as a subprocess on the same account)
succeed with tool-use against this account today.
If Anthropic's infrastructure is not willing to route tools-carrying Bearer-OAuth traffic to the Max lane from third-party clients
at all, then Hermes should at minimum:
1. Classify this 400 distinctly and actionably. Right now the error message is literally "You're out of extra usage. Add more at claude.ai/settings/usage and keep going.", which is misleading when the Max subscription is 3% utilized. Hermes should detect the specific invalid_request_error + "out of extra usage" signature on an OAuth/Claude-Max request and surface a single, accurate hint to the user, e.g.:
▎ Anthropic rejected your tools-carrying OAuth request as overage, even though your Max budget is not exhausted. Your account
appears to route external OAuth tool-use to the overage lane, which is disabled. Options: (a) switch to OpenRouter or another
provider, (b) add API credits at claude.ai/settings/usage, (c) use a subprocess Anthropic adapter (not yet available).
2. Not retry it 3× as if it were transient. It is deterministic; retrying burns latency and noise.
3. Offer an automatic fallback when a fallback provider is configured (e.g. OpenRouter) instead of aborting, or at least recommend
the concrete hermes model command to switch.
4. Longer-term: provide a subprocess-style Anthropic adapter analogous to Paperclip's claude-local (shell out to the official
claude CLI), so Max-OAuth users whose direct-API tool-use requests are re-classified can still use Anthropic natively through
Hermes.
Actual Behavior
Every agent turn against provider: anthropic fails on the first outbound request with HTTP 400, before any tool is ever invoked by the model. Hermes starts its configured retry sequence (attempt 1/3), classifies the error as non-retryable, tries the fallback chain, and aborts the turn. No assistant response is produced.
Full terminal output (fresh hermes session, single test prompt)
Welcome to Hermes Agent! Type your message or /help for commands.
✦ Tip: hermes logs -f follows agent.log in real time. --level WARNING --since 1h filters output.
⚠ API call failed (attempt 1/3): BadRequestError [HTTP 400]
🔌 Provider: anthropic Model: claude-sonnet-4-6
🌐 Endpoint: https://api.anthropic.com
📝 Error: HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
📋 Details: {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}
⚠ Non-retryable error (HTTP 400) — trying fallback...
❌ Non-retryable error (HTTP 400): HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
❌ Non-retryable client error (HTTP 400). Aborting.
🔌 Provider: anthropic Model: claude-sonnet-4-6
🌐 Endpoint: https://api.anthropic.com
💡 This type of error won't be fixed by retrying.
─ ⚕ Hermes ──────────────────────────────────────────────────────────────────
Error: Error code: 400 - {'type': 'error', 'error': {'type':
'invalid_request_error', 'message': "You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}, 'request_id':
'req_011CaNPjgbLHY6m88qSvxnmT'}
Exit via /exit — no assistant turn ever lands.
Relevant ~/.hermes/logs/agent.log excerpt around the failure
2026-04-24 11:53:40,662 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,563 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,994 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,205 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,693 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:45,617 ERROR [20260424_115340_255733] root: Non-retryable client error: Error code: 400 - {'type': 'error',
'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep
going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}
No traceback — the Anthropic SDK raises anthropic.BadRequestError cleanly, Hermes catches it in the non-retryable branch and
aborts.
Request shape at the point of failure
Captured via a local debug dump in agent/anthropic_adapter.py::build_anthropic_kwargs (env-gated), fired on the actual failing
call:
model = claude-sonnet-4-6
is_oauth = True # token correctly classified as OAuth
base_url = https://api.anthropic.com
n_tools = 41 # includes mcp_engram_mem_*, terminal, read_file, write_file, etc.
n_messages = 1 # single user message: "test"
system = list of 2 text blocks, total ~21.7 KB
block 0: Claude Code identity prefix (57 chars)
block 1: Hermes-assembled system prompt (memory, profile, skill catalog)
Confirmed the OAuth sanitize branch is entered (identity prefix prepended, PR #10576 replacements applied, local red-team phrase
replacements applied — all verified via a post-sanitize dump written immediately before the SDK call). The 400 still fires.
Minimal reproduction outside Hermes
The same account, same token, same headers — but issued directly against https://api.anthropic.com/v1/messages with python3 stdlib
— returns the same 400 whenever a tools array is included in the body, and 200 OK when it is omitted. Full script and output are in
the Reproduction section.
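Observed rate-limit / billing headers on a parallel successful probe (same token, no tools): 5h-utilization: 0.03, 7d-utilization: 0.0, 5h-status: allowed, overage-status: rejected, overage-disabled-reason: org_level_disabled_until.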
The Max subscription itself is clearly healthy; the 400 on the failing call comes from Anthropic routing the tools-carrying request
into the overage lane (which is disabled for the org), not from any real exhaustion of entitlement.
Complete log snippet from ~/.hermes/logs/agent.log covering the most recent
failing turn — plugin discovery, MCP tool registration, vision/auxiliary
auto-detect resolving to the main Anthropic provider, then the non-retryable
400 on the first outbound call:
2026-04-24 11:53:39,227 INFO hermes_cli.plugins: Plugin 'openai' registered image_gen provider: openai
2026-04-24 11:53:39,227 INFO hermes_cli.plugins: Plugin 'openai-codex' registered image_gen provider: openai-codex
2026-04-24 11:53:39,249 INFO hermes_cli.plugins: Plugin 'xai' registered image_gen provider: xai
2026-04-24 11:53:39,250 INFO hermes_cli.plugins: Plugin discovery complete: 4 found, 3 enabled
2026-04-24 11:53:39,696 INFO run_agent: Loaded environment variables from /home/alrik/.hermes/.env
2026-04-24 11:53:40,354 INFO tools.mcp_tool: MCP server 'engram' (stdio): registered 11 tool(s)
2026-04-24 11:53:40,355 INFO tools.mcp_tool: MCP: registered 11 tool(s) from 1 server(s)
2026-04-24 11:53:40,662 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,563 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,994 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,205 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,693 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:45,617 ERROR [20260424_115340_255733] root: Non-retryable client error:
Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error',
'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep going."},
'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}
No Python traceback is surfaced — the Anthropic SDK raises `anthropic.BadRequestError`
cleanly and Hermes handles it in the non-retryable error branch via
`run_agent.py` → `error_classifier.py`, which correctly decides against
retrying. So the flow is well-behaved; the problem is upstream at Anthropic's request classifier.
Root Cause Analysis (optional)
This is not a logic bug in Hermes. Hermes's Anthropic path is correct;
the failure is a server-side classifier at Anthropic that routes tools-carrying OAuth/Bearer requests from third-party clients into
the overage billing lane, which is disabled on Max-only accounts
without pay-as-you-go credits.
What I verified inside Hermes
- Token resolution: resolve_anthropic_token() correctly returns the OAuth access token from ~/.claude/.credentials.json (priority 3; env vars 1, 2, 4 empty/commented out).
- _is_oauth_token(token) returns True.
- self._is_anthropic_oauth is True at the build_anthropic_kwargs call site in run_agent.py (line ~1963, ~6910).
- The OAuth branch in anthropic_adapter.py::build_anthropic_kwargs (line ~1510) is entered: Claude Code identity prefix prepended, _sanitize_oauth_system_text applied, tool names prefixed with mcp_.
PR #10576 sanitizes Hermes-specific phrases out of the system prompt.
Applied locally, verified via post-sanitize dump that rewrites reach
the wire. 400 persists unchanged. I also added local rewrites for
content-filter-adjacent terms from the skill catalogue (Jailbreak, godmode:, obliteratus:, Remove refusal behaviors, red-teaming).
Still 400. The classifier reacts to the presence of tools
independently of system-prompt content.
Why Paperclip works on the same account
paperclip-company-runtime/packages/adapters/claude-local/ spawns the
official claude CLI as a subprocess and exchanges messages over its
stdio JSON protocol. The actual /v1/messages request carrying tools: [...] is issued by Anthropic's own CLI, which the
infrastructure recognises as a first-class Claude Code session and
keeps in the Max lane. No header spoofing from external Python
reproduces that classification.
What Hermes can fix
1. Detect the specific 400 signature on an OAuth path and emit an accurate, actionable message (in agent/error_classifier.py + agent/anthropic_adapter.py).
2. Skip the retry loop for this deterministic failure.
3. Auto-fallback or recommend hermes model to switch provider.
4. (Larger) Ship a subprocess-style Anthropic adapter analogous to Paperclip's claude-local, so that Max-OAuth users can keep tool-use working on native Anthropic.
Proposed Fix (optional)
This is not a logic bug inside Hermes. Hermes's Anthropic path is doing
everything correctly — token resolution, OAuth classification, Claude Code
identity prepend, tool-name prefix, SDK construction. The failure is server-
side at Anthropic: /v1/messages routes tools-carrying OAuth/Bearer
requests from third-party clients into the overage billing lane, which is
disabled on Max-only accounts with no pay-as-you-go credits.
Code path verified end-to-end
1. hermes_cli/runtime_provider.py::resolve_runtime_provider
   Returns {provider: "anthropic", api_mode: "anthropic_messages", base_url: "https://api.anthropic.com", api_key: , source: "claude_code", credential_pool: }.
   Confirmed: source is claude_code, not a stale pool entry.
2. run_agent.py (around line 1963):
   self._is_anthropic_oauth = _is_oauth_token(effective_key) if _is_native_anthropic else False
   Confirmed True at the build-kwargs call site.
3. run_agent.py::_build_api_kwargs (around line 6895-6915):
   Delegates to agent/transports/anthropic.py with is_oauth=self._is_anthropic_oauth, which propagates True.
4. agent/anthropic_adapter.py::build_anthropic_kwargs (OAuth branch, around line 1510):
   - Prepends the _CLAUDE_CODE_SYSTEM_PREFIX block.
   - Runs _sanitize_oauth_system_text on each system text block.
   - Prefixes tool names with mcp_.
   Confirmed via an env-gated post-sanitize dump: the sanitized text and mcp_-prefixed tool names are what hits the SDK.
5. build_anthropic_client (around line 417) selects the OAuth branch: auth_token=api_key, headers {anthropic-beta: common + OAuth betas, user-agent: claude-cli/ (external, cli), x-app: cli}.
6. SDK call client.messages.create(**kwargs) issues a real request to https://api.anthropic.com/v1/messages. The server returns 400 invalid_request_error with message "You're out of extra usage." whenever tools is present in the body, independent of header or payload variations this client can realistically control.
Isolation confirms it is not Hermes
Running the minimal reproduction script against the raw API with
Python stdlib — same token from ~/.claude/.credentials.json, same
headers the adapter sets — yields:
tools=False → 200
tools=True → 400 "out of extra usage"
There is no code change inside the Hermes process tree that converts
the second line into a 200. The trigger is the tools parameter in the
request body, as observed by Anthropic's infrastructure.
PR #10576 sanitizes Hermes-specific phrases out of the system prompt to
bypass a different facet of the same classifier. It is correct and
should land. On accounts in this classification state it is, however,
insufficient — the classifier also reacts to the mere presence of tools, independent of any system-prompt content. I applied PR #10576
locally plus additional rewrites for clearly content-filter-adjacent
terms surfaced in the skill catalogue (Jailbreak, godmode:, obliteratus:, Remove refusal behaviors, red-teaming). Verified via
post-sanitize dump that all replacements reach the wire. 400 unchanged.
Why Paperclip's same-account path succeeds
paperclip-company-runtime/packages/adapters/claude-local/ does not call
/v1/messages directly. It spawns the official claude CLI as a subprocess
and exchanges messages over its stdio/JSON protocol. The underlying HTTPS
request that carries tools: [...] is issued by Anthropic's own CLI,
which the infrastructure recognises as a first-class Claude Code session
and keeps in the Max lane. There is no combination of headers in third-
party Python code that has so far reproduced that classification from
outside the official client.
Implication for fixes
The only changes Hermes itself can make are:
1. Detect this specific 400 shape (invalid_request_error + "out of extra usage" on an OAuth/Claude-Max path where 5h-utilization is low) and emit an accurate, actionable message instead of the misleading raw body.
2. Do not retry; classify as deterministic.
3. Offer automatic fallback if one is configured, or recommend the exact hermes model command to switch.
4. Optionally ship a subprocess-style Anthropic adapter analogous to paperclip-company-runtime/packages/adapters/claude-local/ so that Max-OAuth users can keep tool-use working on native Anthropic.
The first three can live in agent/error_classifier.py and agent/anthropic_adapter.py. The fourth is a new adapter module
alongside the existing bedrock_adapter.py / gemini_cloudcode_adapter.py.
Proposed Fix
Two-part proposal, smallest-first.
Part 1 — accurate error surface (small, high value)
In agent/error_classifier.py, recognise the specific signature
    status_code == 400
    AND body.error.type == "invalid_request_error"
    AND "out of extra usage" in body.error.message
    AND request was on OAuth auth (Bearer, sk-ant-oat01-*)
and classify it as a distinct, non-retryable, non-fallback-recoverable error kind (e.g. AnthropicOAuthToolsReclassified). In the user-facing message, replace the raw upstream string with something like:
    Anthropic rejected this tools-carrying OAuth request as overage even
    though the Max subscription is not exhausted. Your account currently
    routes external OAuth tool-use to the overage lane, which is disabled.
    Options:
    - Switch provider: hermes config set model.provider openrouter
    - Add API credits: https://claude.ai/settings/usage
    - Use Claude Code directly for this task.
In run_agent.py, skip the attempt-1/3 retry loop for this kind and abort immediately with the above message. This removes log noise and user confusion.
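A minimal sketch of how the Part 1 signature check could look, assuming the classifier receives the HTTP status, the parsed JSON error body, and the token that was used. The names classify_anthropic_error, Classification, and the hint text are hypothetical and not taken from the existing agent/error_classifier.py; only the four conditions and the AnthropicOAuthToolsReclassified label come from the proposal above.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

OAUTH_TOKEN_PREFIX = "sk-ant-oat01-"   # OAuth token prefix named in the signature
OVERAGE_MARKER = "out of extra usage"  # substring of the upstream message

# Hypothetical user-facing hint; wording follows the proposal above.
HINT = (
    "Anthropic rejected this tools-carrying OAuth request as overage even "
    "though the Max subscription is not exhausted. Options: switch provider "
    "(hermes config set model.provider openrouter), add API credits at "
    "https://claude.ai/settings/usage, or use Claude Code directly."
)


@dataclass(frozen=True)
class Classification:
    kind: str                  # hypothetical error-kind label
    retryable: bool
    hint: Optional[str] = None


def classify_anthropic_error(
    status_code: int, body: Mapping[str, Any], token: str
) -> Classification:
    """Match the 400 + invalid_request_error + overage-message + OAuth signature."""
    error = body.get("error")
    if not isinstance(error, Mapping):
        error = {}
    matches = (
        status_code == 400
        and error.get("type") == "invalid_request_error"
        and OVERAGE_MARKER in str(error.get("message", "")).lower()
        and token.startswith(OAUTH_TOKEN_PREFIX)
    )
    if matches:
        # Deterministic failure: never retry, surface the accurate hint.
        return Classification("AnthropicOAuthToolsReclassified", False, HINT)
    # Assumption for this sketch only: treat 5xx as retryable, the rest as not.
    return Classification("unclassified", retryable=status_code >= 500)
```

The caller in run_agent.py could then branch on kind before entering the retry loop; how that hooks into the existing retry and fallback code is left open here.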
Part 2 — subprocess adapter (larger, optional but durable)
New module agent/anthropic_claude_local_adapter.py, reachable via a config knob, which would:
- Spawn claude via subprocess in a persistent session.
- Marshal Hermes's OpenAI-style messages + tools into the CLI's JSON stdio protocol (reuse the format Paperclip uses; paperclip-company-runtime/packages/adapters/claude-local/ is a reference implementation).
- Stream CLI stdout back as Anthropic-shape deltas so existing consumers (context_compressor, prompt_caching, usage_pricing) keep working unchanged.
- Fall back to the existing direct adapter if claude is not installed.
Benefit: Max-OAuth users keep native Anthropic with tool-use. The direct
adapter stays for API-key users (who don't hit this classifier path).
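The "fall back to the direct adapter if claude is not installed" rule can be sketched in a few lines. This is only an illustration under assumptions: choose_anthropic_adapter, the prefer_subprocess flag, and the "claude_local" / "direct" labels are hypothetical names, and the sketch checks nothing beyond whether a claude executable is on PATH.

```python
import shutil
from typing import Callable, Optional


def choose_anthropic_adapter(
    prefer_subprocess: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick the adapter: subprocess path only if requested and claude is on PATH."""
    if prefer_subprocess and which("claude") is not None:
        return "claude_local"  # hypothetical label for the subprocess adapter
    return "direct"            # existing direct /v1/messages adapter
```

Passing which as a parameter keeps the PATH lookup replaceable in tests; a real adapter would presumably also need to verify that the CLI is logged in.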
I'm happy to draft Part 1 as a PR. Part 2 is larger and probably deserves
design discussion first.
Steps to Reproduce
Prerequisite: ~/.claude/.credentials.json exists with a non-expired claudeAiOauth.accessToken.
Reproduce via Hermes
    # Make sure no env token shadows the credential file
    grep -nE '^ANTHROPIC_(TOKEN|API_KEY)=' ~/.hermes/.env
    # Both should be empty or commented out
    hermes config set model.provider anthropic
    hermes config set model.default claude-sonnet-4-6
    hermes
    > test
Actual: BadRequestError [HTTP 400] … You're out of extra usage. on the very first turn, before any tool call is ever invoked. The error is non-retryable and Hermes aborts.
Reproduce in isolation (no Hermes involved)
The following script, using only python3 stdlib, demonstrates that the trigger is the tools parameter in the request body —
independent of Hermes:
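    import json, os, urllib.request, urllib.error

    # Read the OAuth access token written by `claude /login`.
    with open(os.path.expanduser("~/.claude/.credentials.json")) as f:
        token = json.load(f)["claudeAiOauth"]["accessToken"]

    # Same headers the Hermes adapter sets on the OAuth path.
    headers = {
        "authorization": f"Bearer {token}",
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,claude-code-20250219,oauth-2025-04-20",
        "user-agent": "claude-cli/2.1.119 (external, cli)",
        "x-app": "cli",
        "content-type": "application/json",
    }

    for with_tools in (False, True):
        payload = {
            "model": "claude-opus-4-7",
            "max_tokens": 10,
            "system": [{"type": "text", "text": "You are Claude Code, Anthropic's official CLI for Claude."}],
            "messages": [{"role": "user", "content": "hi"}],
        }
        if with_tools:
            # A single dummy tool is enough to flip the result.
            payload["tools"] = [{
                "name": "mcp_ping",
                "description": "x",
                "input_schema": {"type": "object", "properties": {}},
            }]
        # The snippet is cut off at this point in the report. The request/print
        # tail below is a reconstruction consistent with the imports and with
        # the output format shown underneath, not the author's verbatim code.
        req = urllib.request.Request(
            "https://api.anthropic.com/v1/messages",
            data=json.dumps(payload).encode(), headers=headers, method="POST",
        )
        try:
            with urllib.request.urlopen(req) as resp:
                print(f"tools={with_tools} → {resp.status}")
        except urllib.error.HTTPError as e:
            print(f"tools={with_tools} → {e.code} {e.read().decode()}")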
Observed output on an affected account:
tools=False → 200
tools=True → 400 {"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}}
Flipping with_tools is the only change; headers, system prompt, messages, and model stay identical.
Full terminal output (fresh hermes session, single test prompt)
Welcome to Hermes Agent! Type your message or /help for commands.
✦ Tip: hermes logs -f follows agent.log in real time. --level WARNING --since 1h filters output.
────────────────────────────────────────
● test
Initializing agent...
────────────────────────────────────────
⚠ API call failed (attempt 1/3): BadRequestError [HTTP 400]
🔌 Provider: anthropic Model: claude-sonnet-4-6
🌐 Endpoint: https://api.anthropic.com
📝 Error: HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
📋 Details: {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}
⚠ Non-retryable error (HTTP 400) — trying fallback...
❌ Non-retryable error (HTTP 400): HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
❌ Non-retryable client error (HTTP 400). Aborting.
🔌 Provider: anthropic Model: claude-sonnet-4-6
🌐 Endpoint: https://api.anthropic.com
💡 This type of error won't be fixed by retrying.
─ ⚕ Hermes ──────────────────────────────────────────────────────────────────
Error: Error code: 400 - {'type': 'error', 'error': {'type':
'invalid_request_error', 'message': "You're out of extra usage. Add more at
claude.ai/settings/usage and keep going."}, 'request_id':
'req_011CaNPjgbLHY6m88qSvxnmT'}
Exit via /exit — no assistant turn ever lands.
Relevant ~/.hermes/logs/agent.log excerpt around the failure
2026-04-24 11:53:40,662 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,563 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:41,994 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,205 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:44,693 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider anthropic (claude-sonnet-4-6)
2026-04-24 11:53:45,617 ERROR [20260424_115340_255733] root: Non-retryable client error: Error code: 400 - {'type': 'error',
'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep
going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}
No traceback: the Anthropic SDK raises anthropic.BadRequestError cleanly, and Hermes catches it in the non-retryable branch and
aborts.
Request shape at the point of failure
Captured via a local debug dump in agent/anthropic_adapter.py::build_anthropic_kwargs (env-gated), fired on the actual failing
call:
model = claude-sonnet-4-6
is_oauth = True # token correctly classified as OAuth
base_url = https://api.anthropic.com
n_tools = 41 # includes mcp_engram_mem_*, terminal, read_file, write_file, etc.
n_messages = 1 # single user message: "test"
system = list of 2 text blocks, total ~21.7 KB
block 0: Claude Code identity prefix (57 chars)
block 1: Hermes-assembled system prompt (memory, profile, skill catalog)
Confirmed the OAuth sanitize branch is entered (identity prefix prepended, PR #10576 replacements applied, local red-team phrase
replacements applied — all verified via a post-sanitize dump written immediately before the SDK call). The 400 still fires.
Minimal reproduction outside Hermes
The same account, same token, same headers — but issued directly against https://api.anthropic.com/v1/messages with python3 stdlib
— returns the same 400 whenever a tools array is included in the body, and 200 OK when it is omitted. Full script and output are in
the Reproduction section.
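As a rough illustration of the comparison described here (this is not the script from the Reproduction section), the two requests could be built with the stdlib roughly as follows. The `anthropic-beta` value, the placeholder `user-agent` version, and the dummy tool definition are assumptions made for the sketch; only the endpoint, Bearer auth, `x-app: cli`, and the `tools` on/off toggle come from the report.

```python
import json
import urllib.error
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"

# Assumption: a trivial tool; the report only says that a tools array is present.
DUMMY_TOOL = {
    "name": "mcp_echo",
    "description": "Echo the input back.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def build_request(token, with_tools, user_agent="claude-cli/0.0.0 (external, cli)"):
    """Build a POST to /v1/messages; the only variable is the tools array."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 16,
        "messages": [{"role": "user", "content": "test"}],
    }
    if with_tools:
        body["tools"] = [DUMMY_TOOL]
    headers = {
        "authorization": f"Bearer {token}",
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "oauth-2025-04-20",  # assumption: OAuth beta flag
        "content-type": "application/json",
        "user-agent": user_agent,  # assumption: version string is a placeholder
        "x-app": "cli",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def probe(token, with_tools):
    """Send one request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_request(token, with_tools), timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `probe(token, False)` and then `probe(token, True)` with the same token is the comparison the report describes: 200 without tools, 400 with them.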
Observed rate-limit / billing headers on a parallel successful probe (same token, no tools)
anthropic-ratelimit-unified-status: allowed
anthropic-ratelimit-unified-5h-status: allowed
anthropic-ratelimit-unified-5h-utilization: 0.03
anthropic-ratelimit-unified-7d-status: allowed
anthropic-ratelimit-unified-7d-utilization: 0.0
anthropic-ratelimit-unified-overage-status: rejected
anthropic-ratelimit-unified-overage-disabled-reason: org_level_disabled_until
The Max subscription itself is clearly healthy; the 400 on the failing call comes from Anthropic routing the tools-carrying request
into the overage lane (which is disabled for the org), not from any real exhaustion of entitlement.
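The reading of these headers can be made mechanical. The following is a minimal sketch, assuming the header names shown above and that `allowed` / `rejected` are the relevant status values; the function name and the returned keys are hypothetical.

```python
PREFIX = "anthropic-ratelimit-unified-"


def _as_float(value):
    """Parse a utilization header; return None if missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def summarize_unified_limits(headers):
    """Separate subscription health from overage availability."""
    lowered = {key.lower(): value for key, value in headers.items()}

    def get(name):
        return lowered.get(PREFIX + name)

    return {
        # True only if the overall, 5h, and 7d windows all report "allowed".
        "subscription_allowed": all(
            get(name) == "allowed" for name in ("status", "5h-status", "7d-status")
        ),
        "utilization_5h": _as_float(get("5h-utilization")),
        "utilization_7d": _as_float(get("7d-utilization")),
        "overage_rejected": get("overage-status") == "rejected",
        "overage_disabled_reason": get("overage-disabled-reason"),
    }
```

With the headers listed above, this reports a healthy subscription alongside a rejected overage lane, which is the combination the error message obscures.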
Affected Component
Other, Configuration (config.yaml, .env, hermes setup), Agent Core (conversation loop, context compression, memory), CLI (interactive chat)
Messaging Platform (if gateway-related)
N/A (CLI only)
Debug Report
Operating System
Fedora 43 (kernel 6.19.10-200.fc43.x86_64, x86_64)
Python Version
system: 3.11.9 / hermes venv: 3.11.15.
Hermes Version
0.11.0 (2026.4.23)
Additional Logs / Traceback (optional)
Root Cause Analysis (optional)
This is not a logic bug in Hermes. Hermes's Anthropic path is correct;
the failure is a server-side classifier at Anthropic that routes
tools-carrying OAuth/Bearer requests from third-party clients into the overage billing lane, which is disabled on Max-only accounts
without pay-as-you-go credits.
What I verified inside Hermes
- resolve_anthropic_token() correctly returns the OAuth access token from ~/.claude/.credentials.json (priority 3; env vars 1, 2, 4 empty/commented out).
- _is_oauth_token(token) returns True.
- self._is_anthropic_oauth is True at the build_anthropic_kwargs call site in run_agent.py (line ~1963, ~6910).
- anthropic_adapter.py::build_anthropic_kwargs (line ~1510) is entered: Claude Code identity prefix prepended, _sanitize_oauth_system_text applied, tool names prefixed with mcp_.
- build_anthropic_client (line ~417) selects the OAuth branch: Bearer auth, correct beta headers, user-agent: claude-cli/... (external, cli), x-app: cli.
Isolation proves it is not Hermes
Raw API call with stdlib urllib, same token, same headers Hermes sends:
Only variable: the tools parameter.
PR #10576 is insufficient here
PR #10576 sanitizes Hermes-specific phrases out of the system prompt.
Applied locally, verified via post-sanitize dump that rewrites reach
the wire. 400 persists unchanged. I also added local rewrites for
content-filter-adjacent terms from the skill catalogue (Jailbreak, godmode:, obliteratus:, Remove refusal behaviors, red-teaming).
Still 400. The classifier reacts to the presence of tools independently of system-prompt content.
Why Paperclip works on the same account
paperclip-company-runtime/packages/adapters/claude-local/ spawns the official claude CLI as a subprocess and exchanges messages over its
stdio JSON protocol. The actual /v1/messages request carrying tools: [...] is issued by Anthropic's own CLI, which the
infrastructure recognises as a first-class Claude Code session and
keeps in the Max lane. No header spoofing from external Python
reproduces that classification.
What Hermes can fix
- […] accurate, actionable message (in agent/error_classifier.py + agent/anthropic_adapter.py).
- […] hermes model to switch provider.
- […] Paperclip's claude-local, so […]
Proposed Fix (optional)
This is not a logic bug inside Hermes. Hermes's Anthropic path is doing
everything correctly — token resolution, OAuth classification, Claude Code
identity prepend, tool-name prefix, SDK construction. The failure is server-side
at Anthropic: /v1/messages routes tools-carrying OAuth/Bearer
requests from third-party clients into the overage billing lane, which is
disabled on Max-only accounts with no pay-as-you-go credits.
Code path verified end-to-end
1. hermes_cli/runtime_provider.py::resolve_runtime_provider
   Returns {provider: "anthropic", api_mode: "anthropic_messages",
   base_url: "https://api.anthropic.com", api_key: ,
   source: "claude_code", credential_pool: }.
   Confirmed: source is claude_code, not a stale pool entry.
2. run_agent.py (around line 1963):
   self._is_anthropic_oauth = _is_oauth_token(effective_key) if _is_native_anthropic else False
   Confirmed True at the build-kwargs call site.
3. run_agent.py::_build_api_kwargs (around line 6895-6915):
   Delegates to agent/transports/anthropic.py with is_oauth=self._is_anthropic_oauth, which propagates True.
4. agent/transports/anthropic.py::build_kwargs → calls agent/anthropic_adapter.py::build_anthropic_kwargs(is_oauth=True).
5. agent/anthropic_adapter.py::build_anthropic_kwargs (OAuth branch, around line 1510):
   - Prepends the _CLAUDE_CODE_SYSTEM_PREFIX block.
   - Applies _sanitize_oauth_system_text on each system text block.
   - Prefixes tool names with mcp_.
   Confirmed via an env-gated post-sanitize dump: the sanitized text and mcp_-prefixed tool names are what hits the SDK.
6. build_anthropic_client (around line 417) selects the OAuth branch: auth_token=api_key, headers {anthropic-beta: common + OAuth betas, user-agent: claude-cli/ (external, cli), x-app: cli}.
7. SDK call client.messages.create(**kwargs) issues a real request to https://api.anthropic.com/v1/messages. Server returns 400
   invalid_request_error with message "You're out of extra usage." whenever tools is present in the body, independent of header or
   payload variations this client can realistically control.
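To make the OAuth-branch shaping described in the code path concrete, here is a minimal sketch of the three transformations (identity prefix, per-block sanitize, mcp_ tool prefix). It is not Hermes's implementation: the function name, the prefix string, the pass-through default for `sanitize`, and the choice to leave already-prefixed tool names alone are all assumptions.

```python
# Assumption: stand-in for the identity prefix; Hermes's actual constant
# (_CLAUDE_CODE_SYSTEM_PREFIX) may differ.
CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."


def shape_for_oauth(system_blocks, tools, sanitize=lambda text: text):
    """Return (system, tools) shaped the way the OAuth branch is described.

    `sanitize` is a hypothetical hook standing in for
    _sanitize_oauth_system_text; by default it leaves text unchanged.
    """
    # 1. Identity prefix goes first, as its own text block.
    shaped_system = [{"type": "text", "text": CLAUDE_CODE_SYSTEM_PREFIX}]

    # 2. Sanitize every text block of the assembled system prompt.
    for block in system_blocks:
        if block.get("type") == "text":
            block = {**block, "text": sanitize(block["text"])}
        shaped_system.append(block)

    # 3. Prefix tool names with mcp_ (assumption: skip names that already have it).
    shaped_tools = []
    for tool in tools:
        name = tool["name"]
        if not name.startswith("mcp_"):
            name = "mcp_" + name
        shaped_tools.append({**tool, "name": name})

    return shaped_system, shaped_tools
```

The point of the sketch is that none of these steps removes the tools array itself, which is the one property the report identifies as the trigger.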
Isolation confirms it is not Hermes
Running the minimal reproduction script against the raw API with
Python stdlib — same token from ~/.claude/.credentials.json, same
headers the adapter sets — yields:
There is no code change inside the Hermes process tree that converts
the second line into a 200. The trigger is the tools parameter in the
request body, as observed by Anthropic's infrastructure.
Relationship to PR #10576
PR #10576 sanitizes Hermes-specific phrases out of the system prompt to
bypass a different facet of the same classifier. It is correct and
should land. On accounts in this classification state it is, however,
insufficient — the classifier also reacts to the mere presence of
tools, independent of any system-prompt content. I applied PR #10576 locally plus additional rewrites for clearly content-filter-adjacent
terms surfaced in the skill catalogue (Jailbreak, godmode:, obliteratus:, Remove refusal behaviors, red-teaming). Verified via
post-sanitize dump that all replacements reach the wire. 400 unchanged.
Why Paperclip's same-account path succeeds
paperclip-company-runtime/packages/adapters/claude-local/ does not call /v1/messages directly. It spawns the official claude CLI as a subprocess
and exchanges messages over its stdio/JSON protocol. The underlying HTTPS
request that carries tools: [...] is issued by Anthropic's own CLI,
which the infrastructure recognises as a first-class Claude Code session
and keeps in the Max lane. There is no combination of headers in third-party
Python code that has so far reproduced that classification from
outside the official client.
Implication for fixes
The only changes Hermes itself can make are:
- […] invalid_request_error + "out of extra usage" on an OAuth/Claude-Max path where 5h-utilization is low) and emit an accurate, actionable message instead of the
  misleading raw body.
- […] hermes model command to switch.
- […] paperclip-company-runtime/packages/adapters/claude-local/ so that Max-OAuth users can keep tool-use working on native Anthropic.
The first three can live in agent/error_classifier.py and agent/anthropic_adapter.py. The fourth is a new adapter module
alongside the existing bedrock_adapter.py / gemini_cloudcode_adapter.py.
Proposed Fix
Two-part proposal, smallest-first.
Part 1 — accurate error surface (small, high value)
In agent/error_classifier.py, recognise the specific signature
and classify it as a distinct, non-retryable, non-fallback-recoverable
error kind (e.g. AnthropicOAuthToolsReclassified). In the user-facing
message, replace the raw upstream string with something like:
In run_agent.py, skip the attempt-1/3 retry loop for this kind and
abort immediately with the above message. Removes log noise and user
confusion.
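A minimal sketch of what such a classifier check might look like, assuming the signature is the combination named earlier in this report (HTTP 400, invalid_request_error, "out of extra usage", OAuth path, low 5h-utilization). The enum, the function signature, the `has_tools` argument, and the 0.5 utilization threshold are all hypothetical and are not taken from agent/error_classifier.py.

```python
from enum import Enum


class ErrorKind(Enum):
    ANTHROPIC_OAUTH_TOOLS_RECLASSIFIED = "anthropic_oauth_tools_reclassified"
    OTHER = "other"


# Assumption: what counts as "low" 5h-utilization is not specified in the report.
LOW_UTILIZATION = 0.5


def classify(status_code, error_body, is_oauth, has_tools, utilization_5h=None):
    """Classify an Anthropic error response on the OAuth path.

    utilization_5h is optional because it may only be known from an earlier
    successful response's rate-limit headers.
    """
    error = (error_body or {}).get("error") or {}
    matches = (
        status_code == 400
        and is_oauth
        and has_tools
        and error.get("type") == "invalid_request_error"
        and "out of extra usage" in str(error.get("message", "")).lower()
    )
    if not matches:
        return ErrorKind.OTHER
    # A genuinely exhausted subscription should not be relabelled.
    if utilization_5h is not None and utilization_5h >= LOW_UTILIZATION:
        return ErrorKind.OTHER
    return ErrorKind.ANTHROPIC_OAUTH_TOOLS_RECLASSIFIED
```

Keeping the utilization check separate means a real quota exhaustion still surfaces with the upstream message rather than being relabelled.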
Part 2 — subprocess adapter (larger, optional but durable)
New module agent/anthropic_claude_local_adapter.py, reachable via a
config knob such as:
When transport: claude_local:
- […] claude via subprocess in a persistent session.
- […] stdio protocol (reuse the format Paperclip uses; paperclip-company-runtime/packages/adapters/claude-local/ is a
  reference implementation).
- […] (context_compressor, prompt_caching, usage_pricing) keep working
  unchanged.
- […] claude is not installed.
Benefit: Max-OAuth users keep native Anthropic with tool-use. The direct
adapter stays for API-key users (who don't hit this classifier path).
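The core mechanic of the claude_local transport, a persistent child process exchanging JSON over stdio, could be sketched as follows. This assumes newline-delimited JSON with one reply per request, which may not match the protocol Paperclip or the claude CLI actually use; the class name is hypothetical, and a small Python echo process stands in for the real child so the sketch runs without claude installed.

```python
import json
import subprocess
import sys


class StdioJsonSession:
    """Persistent child process speaking newline-delimited JSON over stdio."""

    def __init__(self, argv):
        self._proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side
        )

    def send(self, message):
        """Write one JSON message and block until one JSON reply line arrives."""
        self._proc.stdin.write(json.dumps(message) + "\n")
        self._proc.stdin.flush()
        line = self._proc.stdout.readline()
        if not line:
            raise RuntimeError("child process closed stdout")
        return json.loads(line)

    def close(self):
        """Signal EOF to the child and wait for it to exit."""
        self._proc.stdin.close()
        self._proc.wait(timeout=10)
        self._proc.stdout.close()


# Stand-in child: echoes each JSON message back, wrapped in {"echo": ...}.
ECHO_CHILD = [
    sys.executable,
    "-u",
    "-c",
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    print(json.dumps({'echo': json.loads(line)}), flush=True)\n",
]
```

A real adapter would replace ECHO_CHILD with the claude command line and map the CLI's message types onto Hermes's internal ones; both of those are left open here because they depend on the design discussion.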
I'm happy to draft Part 1 as a PR. Part 2 is larger and probably deserves
design discussion first.
Are you willing to submit a PR for this?