fix(anthropic): work around OAuth third-party billing-lane classifier (#15080) by thundron · Pull Request #24250 · NousResearch/hermes-agent

thundron · 2026-05-12T07:56:39Z

note: I'm using this locally (both on Windows WSL2 + macOS); this is code generated by a Pi coding agent, that was fixed by Hermes initially, which then broke, and I made Pi fix-back Hermes. not sure how it happened, but it happened.

feel free to take inspiration, let me know things to change or just close it as noise

What does this PR do?

Adds a workaround for Anthropic's plan-vs-extra-usage classifier on OAuth (Pro/Max) requests. The classifier returns HTTP 400 ("Third-party apps now draw from your extra usage…") when a request fingerprints as a non-Claude-Code agent and the account has no overage credit configured. Issue #15080 has the original report and discussion; this PR addresses the two empirical triggers I bisected:

The tool-name set matches hermes's snake_case convention (terminal, read_file, session_search, …) rather than Claude Code's PascalCase canonicals (Bash, Read, Task, …).
The system prompt carries hermes-flavored multi-block content (SOUL.md, AGENTS.md, memory).

Renaming a single tool is sub-threshold; the whole set has to look canonical/neutral, AND the system prompt has to be the bare Claude Code identity line. With both mitigations the request passes the classifier on accounts that previously got 400 on every turn.

Default behavior is unchanged. Stealth mode starts off. The error classifier flags the specific 400 (FailoverReason.oauth_third_party_classifier), and run_agent escalates to full stealth on the first match and retries — same pattern as the existing
oauth_long_context_beta_forbidden recovery. Accounts with overage credit never escalate and see zero behavior change.

Related Issue

Fixes #15080

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

agent/oauth_compat.py (new): StealthMode enum (OFF/RENAME_ONLY/FULL_STEALTH), HERMES_TO_CLAUDE_CODE rename map, CLAUDE_CODE_TOOLS canonical set, ToolNameMap (per-session, thread-safe, idempotent, collision-detecting), is_third_party_classifier_rejection(), apply_to_kwargs().
agent/anthropic_adapter.py: build_anthropic_kwargs gains optional oauth_stealth_mode and oauth_tool_map params; legacy mcp_-prefix path preserved when mode==OFF.
agent/transports/anthropic.py: build_kwargs threads the new params through; normalize_response reverse-renames tool_use names via the map so the dispatcher sees original hermes names.
agent/error_classifier.py: new FailoverReason.oauth_third_party_classifier and classification rule. Narrow match (requires both "extra usage" and the claude.ai/settings/usage URL); won't collide with the existing 429 long-context-tier rule.
run_agent.py: reads agent.oauth_stealth: auto|on|off|rename_only|full_stealth from config.yaml (default auto); session state for mode + tool map; reactive retry handler mirrors the oauth_long_context_beta_forbidden pattern at the same call site.
tests/agent/test_oauth_compat.py (new): 33 unit tests.
tests/agent/test_anthropic_adapter.py: 4 integration tests confirming build_anthropic_kwargs honors the new kwargs across modes; non-OAuth requests untouched.
tests/agent/test_error_classifier.py: 5 classification tests (both error phrasings, status-code gate, non-collision with the 429 tier rule, enum-membership invariant).

How to Test

Reproduce the underlying classifier behavior without hermes (stdlib only, confirms the fix targets the right thing):

  import json, urllib.request, urllib.error
  TOKEN = "<your sk-ant-oat OAuth token>"
  H = {"authorization": f"Bearer {TOKEN}", "anthropic-version": "2023-06-01",
       "anthropic-beta": "claude-code-20250219,oauth-2025-04-20",
       "user-agent": "claude-cli/2.1.74 (external, cli)", "x-app": "cli",
       "content-type": "application/json"}
  HERMES = ["browser_back","browser_click","browser_console","browser_get_images",
      "browser_navigate","browser_press","browser_scroll","browser_snapshot","browser_type",
      "browser_vision","clarify","delegate_task","execute_code","memory","patch","process",
      "read_file","search_files","session_search","skill_manage","skill_view","skills_list",
      "terminal","text_to_speech","todo","vision_analyze","web_extract","web_search","write_file"]
  def call(names, label):
      tools = [{"name":n,"description":"x","input_schema":{"type":"object","properties":{}}} for n in names]
      p = {"model":"claude-opus-4-7","max_tokens":16,
           "system":[{"type":"text","text":"You are Claude Code, Anthropic's official CLI for Claude."}],
           "messages":[{"role":"user","content":"hi"}], "tools": tools}
      try:
          r = urllib.request.urlopen(urllib.request.Request(
              "https://api.anthropic.com/v1/messages",
              data=json.dumps(p).encode(), headers=H), timeout=30)
          print(f"{label:<20} → {r.status}")
      except urllib.error.HTTPError as e:
          print(f"{label:<20} → {e.code}")
  call([],                                "no tools")
  call(["Bash"],                          "single CC name")
  call([f"x{i}" for i in range(29)],      "anonymous names")
  call(HERMES,                            "hermes names")

On an affected account: first three return 200, the last returns 400 with the classifier error.

Verify the fix end-to-end with hermes, on a previously-failing account, default config:

  hermes -z "say hi in 3 words"

Without this PR: API call failed after 3 retries: HTTP 400: Third-party apps…
With this PR: response text, plus a one-line stderr notice on first turn explaining the escalation to full stealth and the config key to skip the probe.

Verify a tool call round-trips (model emits Bash, transport reverse-maps to terminal):

  hermes -z "run uname and tell me the output" --yolo

Should produce normal output, no tool not found errors.

Run the tests:

  pytest tests/agent/test_oauth_compat.py \
         tests/agent/test_anthropic_adapter.py \
         tests/agent/test_error_classifier.py -q

321 tests, all pass on my setup.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(anthropic):)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix
I've run pytest tests/agent/test_oauth_compat.py tests/agent/test_anthropic_adapter.py tests/agent/test_error_classifier.py -q - all pass. Full suite has 2 pre-existing failures on main (test_try_nous_uses_pool_entry, test_long_lived_prefix_cache_e2e_openrouter) unrelated to this change.
I've added tests for my changes
I've tested on my platform: Ubuntu/WSL2 (Linux dktp-monster 6.6.87.2-microsoft-standard-WSL2 x86_64). Mac verification pending.

Documentation & Housekeeping

Docs not yet updated. A short note belongs under website/docs/user-guide/ or the troubleshooting section explaining the agent.oauth_stealth config knob and the auto-recovery behavior. Happy to add in this PR or a follow-up.
cli-config.yaml.example not yet updated — should add oauth_stealth: auto under agent: with a one-line comment.
No architecture/workflow changes affecting CONTRIBUTING.md or AGENTS.md.
Cross-platform considered. The patch is pure Python with no platform-specific calls; same code path on macOS/Linux/Windows. Mac smoke test pending.
No user-visible tool-behavior changes (tool names are renamed only on the wire; the dispatcher sees originals).

────────────────────────────────────────────────────────────────────────────────

Disclosure

The bisection and patch were developed in a long debugging session assisted by an AI coding agent (Claude). I drove the investigation, made the architectural decisions (per-session map vs module-global, auto default with reactive recovery, narrow classifier rule), and own the code. Happy to discuss any line in review.

benmont · 2026-05-16T14:31:19Z

One gap I noticed: the fine-grained-tool-streaming beta header isn't stripped in any stealth mode. Per issue #15080, this header independently signals a tool-use request to the classifier — even in RENAME_ONLY mode, the header remains in extra_headers and may still route the request into the overage lane on Max-only accounts. Stripping it alongside the tool rename in apply_to_kwargs() when stealth mode is active (similar to how drop_context_1m_beta is handled) would close that gap.

thundron · 2026-05-17T16:59:25Z

One gap I noticed: the fine-grained-tool-streaming beta header isn't stripped in any stealth mode. Per issue #15080, this header independently signals a tool-use request to the classifier — even in RENAME_ONLY mode, the header remains in extra_headers and may still route the request into the overage lane on Max-only accounts. Stripping it alongside the tool rename in apply_to_kwargs() when stealth mode is active (similar to how drop_context_1m_beta is handled) would close that gap.

thank you for the heads up! I actually found some more things to adjust too thanks to that

…h#15080) Anthropic's plan-vs-extra-usage classifier on OAuth (Pro/Max) requests returns HTTP 400 ("Third-party apps now draw from your extra usage…") when a request fingerprints as non-Claude-Code and the account has no overage credit. Bisection (NousResearch#15080) isolates two independent triggers: 1. Tool-name set matches hermes snake_case (terminal, read_file, session_search, …) rather than Claude Code's PascalCase canonicals (Bash, Read, Task, …). 2. Multi-block system prompt with hermes-flavored content (SOUL.md, AGENTS.md, memory). New agent/oauth_compat owns both mitigations behind a StealthMode enum (OFF | RENAME_ONLY | FULL_STEALTH) and a per-session ToolNameMap with forward+reverse mapping, idempotency, and collision handling. Default behavior is unchanged: mode starts OFF, error_classifier flags the specific 400 as FailoverReason.oauth_third_party_classifier, and run_agent escalates to FULL_STEALTH on first match and retries once (mirrors the existing oauth_long_context_beta_forbidden recovery). Accounts with overage credit never escalate and see zero behavior change. Config: agent.oauth_stealth: auto|on|off|rename_only|full_stealth in config.yaml (default: auto). 321 tests pass — 33 new in test_oauth_compat, 4 integration in test_anthropic_adapter, 5 classification in test_error_classifier. Signed-off-by: thundron <la@thundron.dev> --- Rebase note: ported onto upstream main after the run_agent.py refactor (~14k → ~4k lines). The original commit's seven run_agent.py hunks now live in: - agent/agent_init.py (init: oauth_stealth_mode, ToolNameMap) - agent/chat_completion_helpers.py (build_kwargs threading: main path, summary path, retry path) - agent/conversation_loop.py (retry-state flag, reactive recovery block, normalize_response oauth_tool_map threading on both main + length-truncation paths) An additional sibling site discovered during the port — the length- truncation recovery in conversation_loop.py that rebuilds the assistant message from the truncated response — also receives the oauth_tool_map so stealth-renamed tool names are reversed in the rebuilt continuation message. Without that, the next iteration's tool registry would not recognize the renamed tool. This site did not exist in the same form in the pre-refactor file; the fix is preserved by virtue of porting both normalize_response callsites. 321 of 321 tests in the directly-affected modules pass (test_oauth_compat, test_anthropic_adapter, test_error_classifier). Broader sweep: 0 regressions attributable to this port (10 of the 12 sweep failures pre-exist on origin/main; the remaining 2 pass in isolation and are pre-existing test-isolation issues).

…Research#15080) Empirical inspection of the Claude Code 2.1.143 Windows binary confirms CC does NOT send the fine-grained-tool-streaming-2025-05-14 beta header. Grepping the binary for interned beta strings shows every other beta CC uses as a plain string, but fine-grained-tool-streaming-2025-05-14 only appears inside a bundled documentation skill, which itself instructs: Remove the effort-2025-11-24 and fine-grained-tool-streaming-2025-05-14 beta headers (GA on 4.6) CC opts in to fine-grained tool streaming at the per-tool level via `eager_input_streaming: true` gated on CLAUDE_CODE_ENABLE_FINE_GRAINED_TOOL_STREAMING and the tengu_fgts feature flag, NOT via the global beta header. Sending the beta on OAuth requests is a fingerprint divergence from real CC and may contribute to the plan-vs-extra-usage classifier rejection already worked around in commit 041957564 (NousResearch#15080). Even if not an independent classifier trigger, exact CC beta-set parity is the whole point of the OAuth path. Scope: - Strip the beta only on the OAuth code path (is_oauth_request=True); x-api-key callers may still target older Claude (4.5/4.1) endpoints that benefit from the explicit opt-in. - Threaded through both build_anthropic_client and the fast-mode extra_headers override in build_anthropic_kwargs, otherwise the per-request fast-mode header would silently reintroduce the beta we stripped at client level. Tests (324 pass, +3 new): - test_oauth_strips_fine_grained_tool_streaming_beta (new) - test_api_key_still_sends_fine_grained_tool_streaming_beta (new) - test_fast_mode_oauth_strips_fine_grained_tool_streaming_beta (new) - Updated test_setup_token_uses_auth_token and test_oauth_drop_context_1m_beta_strips_only_1m to assert the beta is absent (was incorrectly asserted present). Credit: PR reviewer flagged this gap; this commit confirms the wider scope (not just stealth mode) using direct evidence from the CC binary.

alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder provider/anthropic Anthropic native Messages API area/auth Authentication, OAuth, credential pools P2 Medium — degraded but workaround exists labels May 12, 2026

aldoeliacim mentioned this pull request May 12, 2026

fix(anthropic): harden Claude Code OAuth proxy request shape #23361

Open

thundron added 2 commits May 17, 2026 19:17

thundron force-pushed the fix/oauth-third-party-classifier-15080 branch from 0419575 to 168655a Compare May 18, 2026 08:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(anthropic): work around OAuth third-party billing-lane classifier (#15080)#24250

fix(anthropic): work around OAuth third-party billing-lane classifier (#15080)#24250
thundron wants to merge 2 commits into
NousResearch:mainfrom
thundron:fix/oauth-third-party-classifier-15080

thundron commented May 12, 2026 •

edited

Loading

Uh oh!

benmont commented May 16, 2026

Uh oh!

thundron commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

thundron commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code

Documentation & Housekeeping

Uh oh!

benmont commented May 16, 2026

Uh oh!

thundron commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

thundron commented May 12, 2026 •

edited

Loading