[Bug]: On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400

### Bug Description

                                                        
  On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native  
  Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400:
                                                                                                                                     
  ⚠  API call failed (attempt 1/3): BadRequestError [HTTP 400]
     🔌 Provider: anthropic  Model: claude-sonnet-4-6                                                                                
     🌐 Endpoint: https://api.anthropic.com                                                                                          
     📝 Error: HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.                             
     📋 Details: {'type': 'error', 'error': {'type': 'invalid_request_error',                                                        
                  'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep going.",                      
                  'request_id': 'req_011CaNQLGvmMm1h3P6yREkBF'}}                                                                     
  ⚠ Non-retryable error (HTTP 400) — trying fallback...                                                                              
  ❌ Non-retryable client error (HTTP 400). Aborting.                                                                                
                                                                                                                                     
  This fires on every agent turn, starting with the first Initializing agent... call on an empty conversation. Retries and model     
  switches (claude-opus-4-7, claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5) all produce the identical 400.                    
                                                                                                                                     
  The error message is misleading: the Max subscription is not exhausted. Response headers from a successful probe against the same  
  account and token show 5h-utilization: 0.03, 7d-utilization: 0.0, 5h-status: allowed. The account simply does not have
  pay-as-you-go API credits (overage-status: rejected, overage-disabled-reason: org_level_disabled_until) — which should be          
  irrelevant, since requests ought to be served out of the Max lane, not the overage lane.
                                         
  Isolated testing against the raw Anthropic API (same token, same headers Hermes sends) narrows the trigger down to a single        
  variable: the presence of the tools parameter in the request body. With tools omitted, /v1/messages returns 200 OK on every model.
  With even a single dummy tool present, the same call returns the 400 above. Hermes cannot function without tools, so native        
  Anthropic + Claude Max OAuth is unusable in this state.
                                         
  PR #10576's system-prompt sanitizer has been applied locally (verified via post-sanitize payload dump that it reaches the wire). It
   does not change the outcome for this trigger path — the 400 is driven by the tools parameter, not by Hermes-specific phrases in
  the system prompt. Detailed reproduction, header values, and what I tried are in the other fields. 

### Steps to Reproduce

                                                                                                                                     
  Prerequisite                                                                                                                       
   
  - Active Claude Max (or Pro) subscription with a valid OAuth login via claude /login or claude setup-token (i.e.                   
  ~/.claude/.credentials.json exists with a non-expired claudeAiOauth.accessToken).
  - No pay-as-you-go API credits on the same Anthropic organization (overage lane disabled — this is the default for Max-only users).
                                                                                                                                     
  Reproduce via Hermes                   
                                                                                                                                     
  1. Ensure a clean Anthropic auth path (no competing tokens):                                                                       
  # Make sure no env token shadows the credential file
  grep -nE '^ANTHROPIC_(TOKEN|API_KEY)=' ~/.hermes/.env                                                                              
  # Both should be empty or commented out                                                                                            
  2. Set the active model/provider to native Anthropic:                                                                              
  hermes config set model.provider anthropic                                                                                         
  hermes config set model.default claude-sonnet-4-6                                                                                  
     (Reproduces identically on claude-opus-4-7, claude-opus-4-6, claude-haiku-4-5.)
  3. Launch Hermes in a fresh shell and send any message:                                                                            
  hermes                                                                                                                             
  # > test                                                                                                                           
  4. Expected: agent responds.                                                                                                       
  Actual: BadRequestError [HTTP 400] … You're out of extra usage. on the very first turn, before any tool call is ever invoked.
  Non-retryable — Hermes aborts.                                                                                                     
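
The check in step 1 can also be scripted. This is a minimal sketch, assuming the same `ANTHROPIC_TOKEN` / `ANTHROPIC_API_KEY` pattern as the grep above; the function name `find_shadowing_tokens` is hypothetical and not part of Hermes. It treats commented-out lines and empty assignments as clean.

```python
import re

# Same pattern as the grep in step 1: uncommented assignments only.
_SHADOW_RE = re.compile(r"^(ANTHROPIC_(?:TOKEN|API_KEY))=(.*)$")


def find_shadowing_tokens(env_text: str) -> list:
    """Return the names of variables that are set to a non-empty value."""
    names = []
    for line in env_text.splitlines():
        match = _SHADOW_RE.match(line)
        if not match:
            continue  # commented-out or unrelated line
        # Treat VAR= and VAR="" as empty.
        value = match.group(2).strip().strip("'\"")
        if value:
            names.append(match.group(1))
    return names
```

To use it, pass the contents of `~/.hermes/.env`; an empty list means no env token shadows the credential file.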
                                                                   
  Reproduce in isolation (no Hermes involved)                                                                                        
                                                                   
  The following script, using only python3 stdlib, demonstrates that the trigger is the tools parameter in the request body —        
  independent of Hermes:
                                                                                                                                     
  import json, os, urllib.request, urllib.error                    
                                                                                                                                     
  token = json.load(open(os.path.expanduser('~/.claude/.credentials.json')))['claudeAiOauth']['accessToken']                         
                                                                                                                                     
  headers = {                                                                                                                        
      "authorization": f"Bearer {token}",                          
      "anthropic-version": "2023-06-01", 
      "anthropic-beta": (
          "interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,"
          "claude-code-20250219,oauth-2025-04-20"
      ),
      "user-agent": "claude-cli/2.1.119 (external, cli)",
      "x-app": "cli",                                                                                                                
      "content-type": "application/json",                                                                                            
  }                                                                                                                                  
                                                                                                                                     
  for with_tools in (False, True):                                                                                                   
      payload = {                        
          "model": "claude-opus-4-7",                                                                                                
          "max_tokens": 10,                                        
          "system": [{"type": "text", "text": "You are Claude Code, Anthropic's official CLI for Claude."}],
          "messages": [{"role": "user", "content": "hi"}],                                                                           
      }
      if with_tools:                                                                                                                 
          payload["tools"] = [{                                    
              "name": "mcp_ping",        
              "description": "x",                                                                                                    
              "input_schema": {"type": "object", "properties": {}},
          }]                                                                                                                         
                                                                   
      req = urllib.request.Request(                                                                                                  
          "https://api.anthropic.com/v1/messages",
          data=json.dumps(payload).encode(),                                                                                         
          headers=headers,                                         
      )                                  
      try:
          r = urllib.request.urlopen(req, timeout=30)
          print(f"tools={with_tools} → {r.status}")                                                                                  
      except urllib.error.HTTPError as e:                                                                                            
          print(f"tools={with_tools} → {e.code} {e.read().decode()[:160]}")                                                          
                                                                                                                                     
  Observed output on an affected account:                                                                                            
                                         
  tools=False → 200                                                                                                                  
  tools=True  → 400 {"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at        
  claude.ai/settings/usage and keep going."}}                                                                                        
                                                                                                                                     
  Flipping with_tools is the only change; headers, system prompt, messages, and model stay identical.                                
                                                                   
  What does not change the outcome                                                                                                   
                                                                   
  For the tools=True case, I verified that none of the following flip the 400 back to 200:                                           
  - different user-agent strings (claude-cli/2.1.74, /2.1.119, /2.1.200, without parens, without (external, cli), empty, or a
  realistic Node-style Claude Code UA)                                                                                               
  - removing or changing x-app: cli                                
  - dropping claude-code-20250219 beta (yields 401 instead, not 200)                                                                 
  - adding or removing the interleaved-thinking / fine-grained-tool-streaming betas                                                  
  - switching the model to any of opus-4-7, opus-4-6, sonnet-4-6, haiku-4-5                                                          
  - adding/removing thinking: {type: "enabled", budget_tokens: ...}                                                                  
  - using an empty system prompt vs. the full post-sanitize Hermes system prompt (also reproduced with PR #10576 applied locally plus
   additional red-team phrase rewrites — verified via post-sanitize dump that the replacements reach the wire)                       
                                                                                                                                     
  Frequency                                                                                                                          
                                                                                                                                     
  100% reproducible. Happens on every invocation of hermes against provider: anthropic with Claude Max OAuth once tools are attached.

### Expected Behavior

                                                                                                                                    
  A Claude Max subscriber whose OAuth token authenticates /v1/messages should be able to send tools-carrying requests and have them  
  served out of the Max lane, not re-classified into the overage lane — at least as long as the subscription budget is not exhausted
  and the identity headers Hermes sends (claude-cli/* (external, cli), x-app: cli, Claude Code / OAuth beta headers) are the         
  supported way for an external OAuth client to present itself.
                                         
  Concretely, the minimal reproduction script in the previous section should return 200 for both tools=False and tools=True — the    
  same way the claude CLI and Paperclip's claude-local adapter (which spawn the official CLI as a subprocess on the same account)
  succeed with tool-use against this account today.                                                                                  
                                                        
  If Anthropic's infrastructure is not willing to route tools-carrying Bearer-OAuth traffic to the Max lane from third-party clients 
  at all, then Hermes should at minimum:
                                                                                                                                     
  1. Classify this 400 distinctly and actionably. Right now the error message is literally You're out of extra usage. Add more at    
  claude.ai/settings/usage and keep going. — which is misleading when the Max subscription is 3% utilized. Hermes should detect the
  specific invalid_request_error + "out of extra usage" signature on an OAuth/Claude-Max request and surface a single, accurate hint 
  to the user, e.g.:                                    
                                         
  > Anthropic rejected your tools-carrying OAuth request as overage, even though your Max budget is not exhausted. Your account
  appears to route external OAuth tool-use to the overage lane, which is disabled. Options: (a) switch to OpenRouter or another 
  provider, (b) add API credits at claude.ai/settings/usage, (c) use a subprocess Anthropic adapter (not yet available).             
  2. Not retry it 3× as if it were transient. The failure is deterministic; retrying only adds latency and log noise.
  3. Offer an automatic fallback when a fallback provider is configured (e.g. OpenRouter) instead of aborting, or at least recommend 
  the concrete hermes model command to switch.                                                                                       
  4. Longer-term: provide a subprocess-style Anthropic adapter analogous to Paperclip's claude-local (shell out to the official      
  claude CLI), so Max-OAuth users whose direct-API tool-use requests are re-classified can still use Anthropic natively through      
  Hermes.     
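
Item 1 could look roughly like the following. This is a minimal sketch, not Hermes code: the function name `overage_hint`, its parameters, and the hint wording are all hypothetical. It assumes the error body has the shape shown in the terminal output above and matches on the `invalid_request_error` type plus the "out of extra usage" phrase, only for OAuth requests.

```python
from typing import Optional

# Substring of the server message quoted in this report.
OVERAGE_PHRASE = "out of extra usage"

# Hypothetical hint text, modelled on the suggestion in item 1.
HINT = (
    "Anthropic rejected this tools-carrying OAuth request as overage, even "
    "though the subscription budget may not be exhausted. Options: switch "
    "provider, or add API credits at claude.ai/settings/usage."
)


def overage_hint(status: int, body: dict, is_oauth: bool) -> Optional[str]:
    """Return an actionable hint for the overage-lane 400, else None."""
    error = body.get("error") or {}
    matches = (
        status == 400
        and is_oauth
        and error.get("type") == "invalid_request_error"
        and OVERAGE_PHRASE in str(error.get("message", "")).lower()
    )
    return HINT if matches else None
```

A caller could check this before the generic non-retryable branch and, when a hint is returned, skip further attempts and go straight to the fallback provider.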

### Actual Behavior

                                                                                                                                     
  Every agent turn against provider: anthropic fails on the first outbound request with HTTP 400, before the model ever invokes a
  tool. Hermes reports the failure as attempt 1/3, classifies the 400 as non-retryable, tries the fallback chain, and aborts the
  turn. No assistant response is produced.
                                                        
  Full terminal output (fresh hermes session, single test prompt)                                                                    
   
  Welcome to Hermes Agent! Type your message or /help for commands.                                                                  
  ✦ Tip: hermes logs -f follows agent.log in real time. --level WARNING --since 1h filters output.                                   
                                                                                                                                     
  ────────────────────────────────────────                                                                                           
  ● test                                                                                                                             
                                                                                                                                     
  Initializing agent...                                 
  ────────────────────────────────────────

  ⚠  API call failed (attempt 1/3): BadRequestError [HTTP 400]                                                                       
     🔌 Provider: anthropic  Model: claude-sonnet-4-6
     🌐 Endpoint: https://api.anthropic.com                                                                                          
     📝 Error: HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.                             
     📋 Details: {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at     
  claude.ai/settings/usage and keep going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}                                          
  ⚠ Non-retryable error (HTTP 400) — trying fallback...                                                                              
  ❌ Non-retryable error (HTTP 400): HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.       
  ❌ Non-retryable client error (HTTP 400). Aborting.                                                                                
     🔌 Provider: anthropic  Model: claude-sonnet-4-6                                                                                
     🌐 Endpoint: https://api.anthropic.com                                                                                          
     💡 This type of error won't be fixed by retrying.                                                                               
   ─  ⚕ Hermes  ──────────────────────────────────────────────────────────────────
      Error: Error code: 400 - {'type': 'error', 'error': {'type':                                                                   
      'invalid_request_error', 'message': "You're out of extra usage. Add more at                                                    
      claude.ai/settings/usage and keep going."}, 'request_id':                                                                      
      'req_011CaNPjgbLHY6m88qSvxnmT'}                                                                                                
                                                        
  Exit via /exit — no assistant turn ever lands.                                                                                     
                                                        
  Relevant ~/.hermes/logs/agent.log excerpt around the failure                                                                       
                                                        
  2026-04-24 11:53:40,662 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:41,563 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:41,994 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:44,205 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:44,693 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider anthropic (claude-sonnet-4-6)      
  2026-04-24 11:53:45,617 ERROR [20260424_115340_255733] root: Non-retryable client error: Error code: 400 - {'type': 'error',       
  'error': {'type': 'invalid_request_error', 'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep    
  going."}, 'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}                                                                            
                                                                                                                                     
  No traceback — the Anthropic SDK raises anthropic.BadRequestError cleanly, Hermes catches it in the non-retryable branch and       
  aborts.                                
                                                                                                                                     
  Request shape at the point of failure                                                                                              
                                         
  Captured via a local debug dump in agent/anthropic_adapter.py::build_anthropic_kwargs (env-gated), fired on the actual failing     
  call:                                                 
                                                                                                                                     
  model      = claude-sonnet-4-6                        
  is_oauth   = True                    # token correctly classified as OAuth
  base_url   = https://api.anthropic.com                                                                                             
  n_tools    = 41                      # includes mcp_engram_mem_*, terminal, read_file, write_file, etc.                            
  n_messages = 1                       # single user message: "test"                                                                 
  system     = list of 2 text blocks, total ~21.7 KB                                                                                 
               block 0: Claude Code identity prefix (57 chars)                                                                       
               block 1: Hermes-assembled system prompt (memory, profile, skill catalog)                                              
                                                                                                                                     
  Confirmed the OAuth sanitize branch is entered (identity prefix prepended, PR #10576 replacements applied, local red-team phrase   
  replacements applied — all verified via a post-sanitize dump written immediately before the SDK call). The 400 still fires.        
                                                                                                                                     
  Minimal reproduction outside Hermes                                                                                                
                                         
  The same account, token, and headers, issued directly against https://api.anthropic.com/v1/messages with the python3 stdlib,
  return the same 400 whenever a tools array is included in the body and 200 OK when it is omitted. The full script and output
  are in the Steps to Reproduce section.
                                                        
  Observed rate-limit / billing headers on a parallel successful probe (same token, no tools)                                        
   
  anthropic-ratelimit-unified-status:                    allowed                                                                     
  anthropic-ratelimit-unified-5h-status:                 allowed                                                                     
  anthropic-ratelimit-unified-5h-utilization:            0.03                                                                        
  anthropic-ratelimit-unified-7d-status:                 allowed                                                                     
  anthropic-ratelimit-unified-7d-utilization:            0.0                                                                         
  anthropic-ratelimit-unified-overage-status:            rejected
  anthropic-ratelimit-unified-overage-disabled-reason:   org_level_disabled_until                                                    
                                                                                                                                     
  The Max subscription itself is clearly healthy; the 400 on the failing call comes from Anthropic routing the tools-carrying request
   into the overage lane (which is disabled for the org), not from any real exhaustion of entitlement.    

### Affected Component

Other, Configuration (config.yaml, .env, hermes setup), Agent Core (conversation loop, context compression, memory), CLI (interactive chat)

### Messaging Platform (if gateway-related)

N/A (CLI only)

### Debug Report

```shell
`hermes debug share` would upload private data (memories, tokens,
  Telegram/Slack creds). Manually curated summary below.                                                                             
                                                                                                                                     
  ## Provider / model                                                                                                                
  - model.provider = anthropic, api_mode = anthropic_messages                                                                        
  - base_url = https://api.anthropic.com                                                                                             
  - default model = claude-sonnet-4-6 (same 400 on opus-4-7, opus-4-6, haiku-4-5)                                                    
                                                                                                                                     
  ## Auth — verified clean                                                                                                           
  - ANTHROPIC_TOKEN, CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_API_KEY envs: all empty                                                      
  - ~/.claude/.credentials.json: valid, subscriptionType=max,                                                                        
    rateLimitTier=default_claude_max_20x, accessToken ~7h remaining                                                                  
  - resolve_anthropic_token() returns the file-based token,                                                                          
    _is_oauth_token() returns True                                                                                                   
  - Stale credential_pool.anthropic entries removed from ~/.hermes/auth.json                                                         
    and all profiles (per hermes-anthropic-auth-debugging skill)                                                                     
  - Leftover model.base_url=https://chatgpt.com/backend-api/codex removed                                                            
    from Anthropic profiles (unrelated separate fix)                                                                                 
                                                                                                                                     
  ## Call shape at failure (from env-gated dump in build_anthropic_kwargs)                                                           
  - is_oauth=True, model=claude-sonnet-4-6, n_tools=41, n_messages=1                                                                 
  - system: 2 blocks, ~21.7 KB total (Claude Code prefix + Hermes system)                                                            
  - PR #10576 sanitizer applied and verified via post-sanitize dump                                                                  
  - Additional local rewrites for content-filter-adjacent terms                                                                      
    (Jailbreak, godmode:, obliteratus:, Remove refusal behaviors, red-teaming)                                                       
    — verified reach the wire. 400 persists.                                                                                         
                                                                                                                                     
  ## Subscription is NOT exhausted                                                                                                   
  Headers on a successful tools=False probe (same token):                                                                            
    5h-status: allowed, 5h-utilization: 0.03                                                                                         
    7d-status: allowed, 7d-utilization: 0.0                                                                                          
    overage-status: rejected, overage-disabled-reason: org_level_disabled_until                                                      
                                                                                                                                     
  The 400 comes from reclassification into the overage lane, not from                                                                
  real entitlement exhaustion.                                                                                                       
                                                                                                                                     
  ## Cross-client check on the same account/token                                                                                    
  - `claude` CLI (official): works with tool-use                                                                                     
  - Paperclip (spawns `claude` as subprocess via                                                                                     
    paperclip-company-runtime/packages/adapters/claude-local/): works    
  - Hermes (direct HTTPS to /v1/messages): 400 as soon as `tools` is present
```
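
The header values in the dump above are enough to tell real exhaustion apart from the overage reclassification. The following is a minimal sketch of that check, assuming the headers have already been collected into a dict under the abbreviated names used in the dump (the full on-the-wire header names are not shown in this report). `classify_quota_state` and the `0.9` threshold are hypothetical, chosen only for illustration.

```python
def classify_quota_state(headers: dict, near_limit: float = 0.9) -> str:
    """Return 'exhausted', 'overage_disabled', or 'ok'.

    Keys are the abbreviated names from the probe dump; mapping them from the
    real response header names is left to the caller (assumption).
    """
    for window in ("5h", "7d"):
        status = headers.get(f"{window}-status", "allowed")
        utilization = float(headers.get(f"{window}-utilization", "0"))
        if status != "allowed" or utilization >= near_limit:
            return "exhausted"
    # Subscription windows are fine; only the overage lane is closed.
    if headers.get("overage-status") == "rejected":
        return "overage_disabled"
    return "ok"


probe_headers = {
    "5h-status": "allowed", "5h-utilization": "0.03",
    "7d-status": "allowed", "7d-utilization": "0.0",
    "overage-status": "rejected",
    "overage-disabled-reason": "org_level_disabled_until",
}
print(classify_quota_state(probe_headers))  # overage_disabled
```

With the values from this account the result is `overage_disabled`, which is the state in which an "out of extra usage" 400 does not reflect the real subscription quota.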

### Operating System

Fedora 43 (kernel 6.19.10-200.fc43.x86_64, x86_64) 

### Python Version

system: 3.11.9 / hermes venv: 3.11.15.

### Hermes Version

0.11.0 (2026.4.23)

### Additional Logs / Traceback (optional)

```shell
Complete log snippet from ~/.hermes/logs/agent.log covering the most recent
  failing turn — plugin discovery, MCP tool registration, vision/auxiliary                                                           
  auto-detect resolving to the main Anthropic provider, then the non-retryable                                                       
  400 on the first outbound call:                                                                                                    
                                                                                                                                     
  2026-04-24 11:53:39,227 INFO hermes_cli.plugins: Plugin 'openai' registered image_gen provider: openai                             
  2026-04-24 11:53:39,227 INFO hermes_cli.plugins: Plugin 'openai-codex' registered image_gen provider: openai-codex                 
  2026-04-24 11:53:39,249 INFO hermes_cli.plugins: Plugin 'xai' registered image_gen provider: xai                                   
  2026-04-24 11:53:39,250 INFO hermes_cli.plugins: Plugin discovery complete: 4 found, 3 enabled                                     
  2026-04-24 11:53:39,696 INFO run_agent: Loaded environment variables from /home/alrik/.hermes/.env                                 
  2026-04-24 11:53:40,354 INFO tools.mcp_tool: MCP server 'engram' (stdio): registered 11 tool(s)                                    
  2026-04-24 11:53:40,355 INFO tools.mcp_tool: MCP: registered 11 tool(s) from 1 server(s)                                           
  2026-04-24 11:53:40,662 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:41,563 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:41,994 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:44,205 INFO agent.auxiliary_client: Vision auto-detect: using main provider anthropic (claude-sonnet-4-6)         
  2026-04-24 11:53:44,693 INFO agent.auxiliary_client: Auxiliary auto-detect: using main provider anthropic (claude-sonnet-4-6)      
  2026-04-24 11:53:45,617 ERROR [20260424_115340_255733] root: Non-retryable client error:                                           
    Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error',                                                   
    'message': "You're out of extra usage. Add more at claude.ai/settings/usage and keep going."},                                   
    'request_id': 'req_011CaNPjgbLHY6m88qSvxnmT'}                                                                                    
                                                                                                                                     
  No Python traceback is surfaced — the Anthropic SDK raises `anthropic.BadRequestError`                                             
  cleanly and Hermes handles it in the non-retryable error branch via                                                                
  `run_agent.py` → `error_classifier.py`, which correctly decides against                                                            
  retrying. So the flow is well-behaved; the problem is upstream at Anthropic's                                                      
  request classifier.
```

### Root Cause Analysis (optional)

  This is not a logic bug in Hermes. Hermes's Anthropic path is correct;
  the failure is a server-side classifier at Anthropic that routes                                                                   
  `tools`-carrying OAuth/Bearer requests from third-party clients into                                                               
  the overage billing lane, which is disabled on Max-only accounts                                                                   
  without pay-as-you-go credits.                                                                                                     
                                                                                                                                     
  ## What I verified inside Hermes                                                                                                   
                                                                         
  - Token resolution: `resolve_anthropic_token()` correctly returns the                                                              
    OAuth access token from `~/.claude/.credentials.json` (priority 3;   
    env vars 1, 2, 4 empty/commented out).                                                                                           
  - `_is_oauth_token(token)` returns True.                                                                                           
  - `self._is_anthropic_oauth` is True at the `build_anthropic_kwargs`
    call site in `run_agent.py` (set near line ~1963, read near line ~6910).
  - OAuth branch in `anthropic_adapter.py::build_anthropic_kwargs`                                                                   
    (line ~1510) is entered: Claude Code identity prefix prepended,                                                                  
    `_sanitize_oauth_system_text` applied, tool names prefixed with                                                                  
    `mcp_`.                                                                                                                          
  - `build_anthropic_client` (line ~417) selects the OAuth branch:                                                                   
    Bearer auth, correct beta headers, `user-agent: claude-cli/...                                                                   
    (external, cli)`, `x-app: cli`.                                                                                                  
  - Post-sanitize dump confirms the sanitized payload reaches the SDK.                                                               
  - SDK issues the request, Anthropic returns 400.                                                                                   
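
For readers unfamiliar with the two auth modes, the distinction checked in the list above comes down to which header carries the credential. This is a minimal sketch and not Hermes's actual `_is_oauth_token` or `build_anthropic_client`: `looks_like_oauth_token` and `auth_headers` are hypothetical names, and the `sk-ant-oat` prefix test is an assumption based on the token pattern seen on this account.

```python
def looks_like_oauth_token(token: str) -> bool:
    # Assumption: OAuth access tokens carry the "sk-ant-oat" prefix,
    # while regular API keys do not.
    return token.startswith("sk-ant-oat")


def auth_headers(token: str) -> dict:
    """Pick the auth header for a credential (illustrative only)."""
    if looks_like_oauth_token(token):
        # OAuth path: Bearer auth plus the `x-app: cli` header reported above.
        return {"authorization": f"Bearer {token}", "x-app": "cli"}
    # API-key path: the key goes in `x-api-key`.
    return {"x-api-key": token}


print(sorted(auth_headers("sk-ant-oat01-example")))  # ['authorization', 'x-app']
```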
                                                                                                                                     
  ## Isolation proves it is not Hermes                                                                                               
                                                                                                                                     
  Raw API call with stdlib urllib, same token, same headers Hermes sends:                                                            
                                                                         
      tools=False → 200                                                                                                              
      tools=True  → 400 "out of extra usage"                             
                                                                                                                                     
  Only variable: the `tools` parameter.                                                                                              
                                         
  ## PR #10576 is insufficient here                                                                                                  
                                                                         
  PR #10576 sanitizes Hermes-specific phrases out of the system prompt.
  I applied it locally and verified via a post-sanitize dump that the rewrites
  reach the wire; the 400 persists unchanged. I also added local rewrites for
  content-filter-adjacent terms from the skill catalogue (`Jailbreak`,
  `godmode:`, `obliteratus:`, `Remove refusal behaviors`, `red-teaming`) and
  still got the 400. The classifier appears to react to the presence of `tools`
  independently of system-prompt content.
                                                                                                                                     
  ## Why Paperclip works on the same account                                                                                         
                                                                         
  `paperclip-company-runtime/packages/adapters/claude-local/` spawns the                                                             
  official `claude` CLI as a subprocess and exchanges messages over its  
  stdio JSON protocol. The actual `/v1/messages` request carrying                                                                    
  `tools: [...]` is issued by Anthropic's own CLI, which the
  infrastructure recognises as a first-class Claude Code session and                                                                 
  keeps in the Max lane. No header combination sent from external Python
  has so far reproduced that classification.
                                                                         
  ## What Hermes can fix                                                                                                             
                                                                         
  1. Detect the specific 400 signature on an OAuth path and emit an                                                                  
     accurate, actionable message (in `agent/error_classifier.py` +      
     `agent/anthropic_adapter.py`).                                                                                                  
  2. Skip the retry loop for this deterministic failure.                                                                             
  3. Auto-fallback or recommend `hermes model` to switch provider.
  4. (Larger) Ship a subprocess-style Anthropic adapter analogous to
     Paperclip's `claude-local`, so that Max-OAuth users can keep tool-use
     working on native Anthropic.

### Proposed Fix (optional)

This is not a logic bug inside Hermes. Hermes's Anthropic path is doing
  everything correctly — token resolution, OAuth classification, Claude Code                                                         
  identity prepend, tool-name prefix, SDK construction. The failure is server-                                                       
  side at Anthropic: `/v1/messages` routes `tools`-carrying OAuth/Bearer                                                             
  requests from third-party clients into the overage billing lane, which is                                                          
  disabled on Max-only accounts with no pay-as-you-go credits.                                                                       
                                                                                                                                     
  ## Code path verified end-to-end                                                                                                   
                                                                                                                                     
  1. `hermes_cli/runtime_provider.py::resolve_runtime_provider`                                                                      
     Returns {provider: "anthropic", api_mode: "anthropic_messages",     
     base_url: "https://api.anthropic.com", api_key: <OAuth token>,                                                                  
     source: "claude_code", credential_pool: <pool>}.                                                                                
     Confirmed: source is `claude_code`, not a stale pool entry.                                                                     
                                                                                                                                     
  2. `run_agent.py` (around line 1963):                                                                                              
         self._is_anthropic_oauth = _is_oauth_token(effective_key) if _is_native_anthropic else False                                
     Confirmed `True` at the build-kwargs call site.                                                                                 
                                                                                                                                     
  3. `run_agent.py::_build_api_kwargs` (around line 6895-6915):                                                                      
     Delegates to `agent/transports/anthropic.py` with                                                                               
     `is_oauth=self._is_anthropic_oauth` — propagates True.                                                                          
                                                                                                                                     
  4. `agent/transports/anthropic.py::build_kwargs` → calls                                                                           
     `agent/anthropic_adapter.py::build_anthropic_kwargs(is_oauth=True)`.                                                            
                                                                                                                                     
  5. `agent/anthropic_adapter.py::build_anthropic_kwargs` (OAuth branch,                                                             
     around line 1510):                                                                                                              
     - Prepends `_CLAUDE_CODE_SYSTEM_PREFIX` block.                                                                                  
     - Runs `_sanitize_oauth_system_text` on each system text block.                                                                 
     - Prefixes tool names with `mcp_`.                                                                                              
     Confirmed via an env-gated post-sanitize dump: the sanitized text and                                                           
     mcp_-prefixed tool names are what hits the SDK.                                                                                 
                                                                                                                                     
  6. `build_anthropic_client` (around line 417) selects the OAuth branch:                                                            
     `auth_token=api_key`, headers {anthropic-beta: common + OAuth betas,                                                            
     user-agent: claude-cli/<version> (external, cli), x-app: cli}.                                                                  
                                                                                                                                     
  7. SDK call `client.messages.create(**kwargs)` issues a real request                                                               
     to https://api.anthropic.com/v1/messages. Server returns 400                                                                    
     invalid_request_error with message "You're out of extra usage."                                                                 
     whenever `tools` is present in the body, independent of header or   
     payload variations this client can realistically control.                                                                       
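
The request shaping described in step 5 is simple enough to sketch in isolation. This is an illustrative stand-in, not the code in `agent/anthropic_adapter.py`: `apply_oauth_shape` and `PREFIX_TEXT` are hypothetical, the real prefix text is not shown in this report, and the sanitizer step is omitted.

```python
import copy

# Placeholder only; the real Claude Code identity prefix is not reproduced here.
PREFIX_TEXT = "<Claude Code identity prefix>"


def apply_oauth_shape(kwargs: dict) -> dict:
    """Return a copy of kwargs with the prefix block and mcp_ tool names."""
    out = copy.deepcopy(kwargs)
    system = out.get("system", [])
    if isinstance(system, str):
        # Normalise a plain string into the block-list form.
        system = [{"type": "text", "text": system}] if system else []
    out["system"] = [{"type": "text", "text": PREFIX_TEXT}] + list(system)
    for tool in out.get("tools", []):
        # Prefix once; leave already-prefixed names alone.
        if not tool["name"].startswith("mcp_"):
            tool["name"] = "mcp_" + tool["name"]
    return out


shaped = apply_oauth_shape({"system": "Hermes system", "tools": [{"name": "read_file"}]})
print(len(shaped["system"]), [t["name"] for t in shaped["tools"]])  # 2 ['mcp_read_file']
```

The two resulting system blocks correspond to the "2 blocks" (Claude Code prefix + Hermes system) noted in the debug report.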
                                                                                                                                     
  ## Isolation confirms it is not Hermes                                                                                             
                                                                                                                                     
  Running the minimal reproduction script against the raw API with                                                                   
  Python stdlib — same token from ~/.claude/.credentials.json, same      
  headers the adapter sets — yields:                                                                                                 
                                                                                                                                     
      tools=False → 200                                                                                                              
      tools=True  → 400 "out of extra usage"                                                                                         
                                                                                                                                     
  There is no code change inside the Hermes process tree that converts                                                               
  the second line into a 200. The trigger is the `tools` parameter in the                                                            
  request body, as observed by Anthropic's infrastructure.                                                                           
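
For anyone who wants to repeat the isolation test, the two probes differ only in the `tools` key. The sketch below approximates the stdlib reproduction and is not the exact script used: `build_probe` and `send` are hypothetical names, the `mcp_noop` tool is made up, and the beta and user-agent headers the adapter sets must be supplied by the caller through `extra_headers`, since they are not listed in full in this report.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

API_URL = "https://api.anthropic.com/v1/messages"


def build_probe(token: str, with_tools: bool,
                extra_headers: Optional[dict] = None) -> urllib.request.Request:
    """Build one probe request; the only variable is the `tools` key."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 16,
        "messages": [{"role": "user", "content": "ping"}],
    }
    if with_tools:
        body["tools"] = [{
            "name": "mcp_noop",  # made-up tool, present only to trigger the tools path
            "description": "Does nothing.",
            "input_schema": {"type": "object", "properties": {}},
        }]
    headers = {
        "authorization": f"Bearer {token}",
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
        **(extra_headers or {}),
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def send(req: urllib.request.Request) -> int:
    """Send the probe and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `send(build_probe(token, False))` and `send(build_probe(token, True))` with the same token and headers gives the pair of status codes to compare.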
                                                                         
  ## Relationship to PR #10576                                                                                                       
                                                                         
  PR #10576 sanitizes Hermes-specific phrases out of the system prompt to                                                            
  bypass a different facet of the same classifier. It is correct and     
  should land. On accounts in this classification state it is, however,                                                              
  insufficient — the classifier also reacts to the mere presence of                                                                  
  `tools`, independent of any system-prompt content. I applied PR #10576                                                             
  locally plus additional rewrites for clearly content-filter-adjacent                                                               
  terms surfaced in the skill catalogue (`Jailbreak`, `godmode:`,                                                                    
  `obliteratus:`, `Remove refusal behaviors`, `red-teaming`). Verified via                                                           
  post-sanitize dump that all replacements reach the wire. 400 unchanged.                                                            
                                                                                                                                     
  ## Why Paperclip's same-account path succeeds                                                                                      
                                                                                                                                     
  `paperclip-company-runtime/packages/adapters/claude-local/` does not call                                                          
  /v1/messages directly. It spawns the official `claude` CLI as a subprocess
  and exchanges messages over its stdio/JSON protocol. The underlying HTTPS                                                          
  request that carries `tools: [...]` is issued by Anthropic's own CLI,                                                              
  which the infrastructure recognises as a first-class Claude Code session                                                           
  and keeps in the Max lane. No combination of headers sent from third-party
  Python code has so far reproduced that classification from outside the
  official client.
                                                                                                                                     
  ## Implication for fixes                                                                                                           
                                                                         
  The only changes Hermes itself can make are:                                                                                       
                                                                         
  - Detect this specific 400 shape (`invalid_request_error` +                                                                        
    `"out of extra usage"` on an OAuth/Claude-Max path where 5h-utilization
    is low) and emit an accurate, actionable message instead of the                                                                  
    misleading raw body.                                                                                                             
  - Do not retry; classify as deterministic.                                                                                         
  - Offer automatic fallback if one is configured, or recommend the exact                                                            
    `hermes model` command to switch.                                                                                                
  - Optionally ship a subprocess-style Anthropic adapter analogous to                                                                
    `paperclip-company-runtime/packages/adapters/claude-local/` so that                                                              
    Max-OAuth users can keep tool-use working on native Anthropic.                                                                   
                                                                                                                                     
  The first three can live in `agent/error_classifier.py` and                                                                        
  `agent/anthropic_adapter.py`. The fourth is a new adapter module                                                                   
  alongside the existing `bedrock_adapter.py` / `gemini_cloudcode_adapter.py`.                                                       
                                                                                                                                     
  ---                                                                                                                                
  Proposed Fix                                                                                                                       
                                                                         
  Two-part proposal, smallest-first.     
                                                                                                                                     
  ## Part 1 — accurate error surface (small, high value)                                                                             
                                                                                                                                     
  In `agent/error_classifier.py`, recognise the specific signature                                                                   
                                                                         
      status_code == 400                                                                                                             
      AND body.error.type == "invalid_request_error"
      AND "out of extra usage" in body.error.message                                                                                 
      AND request was on OAuth auth (Bearer, sk-ant-oat01-*)                                                                         
                                                                                                                                     
  and classify it as a distinct, non-retryable, non-fallback-recoverable                                                             
  error kind (e.g. `AnthropicOAuthToolsReclassified`). In the user-facing                                                            
  message, replace the raw upstream string with something like:                                                                      
                                                                         
      Anthropic rejected this tools-carrying OAuth request as overage even                                                           
      though the Max subscription is not exhausted. Your account currently
      routes external OAuth tool-use to the overage lane, which is disabled.                                                         
      Options:                                                                                                                       
        - Switch provider:  hermes config set model.provider openrouter                                                              
        - Add API credits:  https://claude.ai/settings/usage                                                                         
        - Use Claude Code directly for this task.                                                                                    
                                                                                                                                     
  In `run_agent.py`, skip the attempt-1/3 retry loop for this kind and
  abort immediately with the above message. This removes log noise and
  user confusion.
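
The signature check in Part 1 could look roughly like the sketch below. It is written without knowledge of how `agent/error_classifier.py` is actually structured, so `Classification` and `classify_oauth_tools_400` are hypothetical and the function returns `None` to hand every other error back to the existing logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    kind: str
    retryable: bool
    message: str


RECLASSIFIED_MESSAGE = (
    "Anthropic rejected this tools-carrying OAuth request as overage even "
    "though the Max subscription is not exhausted."
)


def classify_oauth_tools_400(status_code: int, body: dict,
                             is_oauth: bool) -> Optional[Classification]:
    """Match the specific 400 signature; return None for anything else."""
    error = body.get("error") or {}
    message = str(error.get("message", ""))
    if (
        status_code == 400
        and is_oauth
        and error.get("type") == "invalid_request_error"
        and "out of extra usage" in message.lower()
    ):
        return Classification("AnthropicOAuthToolsReclassified", False, RECLASSIFIED_MESSAGE)
    return None


body = {
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "message": "You're out of extra usage. Add more at claude.ai/settings/usage and keep going.",
    },
}
print(classify_oauth_tools_400(400, body, is_oauth=True).kind)  # AnthropicOAuthToolsReclassified
```

Requiring `is_oauth` keeps API-key users, who may be out of credit for ordinary reasons, on the existing error path.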
                                                                         
  ## Part 2 — subprocess adapter (larger, optional but durable)                                                                      
                                                                         
  New module `agent/anthropic_claude_local_adapter.py`, reachable via a                                                              
  config knob such as:                                                   
                                                                                                                                     
      model:                                                             
        provider: anthropic              
        api_mode: anthropic_messages
        transport: claude_local   # new; default "direct"                                                                            
                                                                                                                                     
  When `transport: claude_local`:                                                                                                    
                                                                                                                                     
  - Spawn `claude` via `subprocess` in a persistent session.                                                                         
  - Marshal Hermes's OpenAI-style messages + tools into the CLI's JSON
    stdio protocol (reuse the format Paperclip uses —                                                                                
    `paperclip-company-runtime/packages/adapters/claude-local/` is a                                                                 
    reference implementation).                                                                                                       
  - Stream CLI stdout back as Anthropic-shape deltas so existing consumers                                                           
    (`context_compressor`, `prompt_caching`, `usage_pricing`) keep working                                                           
    unchanged.                                                                                                                       
  - Fall back to the existing direct adapter if `claude` is not installed.                                                           
                                                                                                                                     
  Benefit: Max-OAuth users keep native Anthropic with tool-use. The direct                                                           
  adapter stays for API-key users (who don't hit this classifier path).                                                              
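
The fallback rule in the last bullet can be sketched independently of the adapter itself. This is a minimal sketch, assuming the `model` section of the config is available as a dict; `select_transport` is a hypothetical helper, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil
from typing import Callable, Optional


def select_transport(model_cfg: dict,
                     which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return 'claude_local' only if configured and the `claude` binary is on PATH."""
    configured = model_cfg.get("transport", "direct")
    if configured == "claude_local" and which("claude") is not None:
        return "claude_local"
    # Default, and fallback when the CLI is not installed.
    return "direct"


cfg = {"provider": "anthropic", "api_mode": "anthropic_messages", "transport": "claude_local"}
print(select_transport(cfg, which=lambda name: None))  # direct
```

Keeping `direct` as both the default and the fallback means existing configs behave exactly as before unless the new knob is set.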
                                                                                                                                     
  I'm happy to draft Part 1 as a PR. Part 2 is larger and probably deserves                                                          
  design discussion first.                                                    

### Are you willing to submit a PR for this?

- [ ] I'd like to fix this myself and submit a PR

[Bug]: On a Claude Max 20x subscription with a valid OAuth access token from ~/.claude/.credentials.json, every Hermes request to native Anthropic (provider: anthropic, https://api.anthropic.com/v1/messages) is rejected with HTTP 400 #15080
