Sonnet 4.6 / Opus 4.7 return generic 429 on Claude Max OAuth while Haiku 4.5 succeeds — appears to be new Anthropic-side enforcement beyond header inspection

## Summary

As of 2026-04-28, requests to Sonnet 4.5/4.6 and Opus 4.6/4.7 from Hermes via the Claude Max OAuth credential return `HTTP 429 {"type":"error","error":{"type":"rate_limit_error","message":"Error"}}` with no `anthropic-ratelimit-*` response headers. Haiku 4.5 succeeds normally on the same token, same code path, same session.

The real `claude` CLI (v2.1.122) hits Sonnet successfully with the same OAuth token from the same machine in the same minute. So this is neither a quota issue nor a token issue; Anthropic appears to be distinguishing real Claude Code from third-party clients on Sonnet/Opus.

## Environment

- Hermes Agent: latest (post `hermes update` 2026-04-28)
- macOS arm64
- Anthropic provider, OAuth subscription token (sk-ant-oat01-…)
- Account: Claude Max 20x, organizationRateLimitTier `default_claude_max_20x`, hasExtraUsageEnabled: true
- Anthropic status page: All Systems Operational at time of testing

## Reproduction

Same OAuth token, two requests run within seconds of each other.

**Hermes-shaped request (curl) — 429:**

```bash
TOKEN=$(security find-generic-password -s "Claude Code-credentials" -w | jq -r .claudeAiOauth.accessToken)
curl -sD - -X POST 'https://api.anthropic.com/v1/messages?beta=true' \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,advanced-tool-use-2025-11-20,context-1m-2025-08-07,effort-2025-11-24,cache-diagnosis-2026-04-07" \
  -H "anthropic-dangerous-direct-browser-access: true" \
  -H "user-agent: claude-cli/2.1.122 (external, sdk-cli)" \
  -H "x-app: cli" \
  -H "x-claude-code-session-id: $(uuidgen)" \
  -H "x-client-request-id: $(uuidgen)" \
  -H "x-stainless-arch: arm64" \
  -H "x-stainless-lang: js" \
  -H "x-stainless-os: MacOS" \
  -H "x-stainless-package-version: 0.81.0" \
  -H "x-stainless-runtime: node" \
  -H "x-stainless-runtime-version: v24.3.0" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":10,"system":"x","messages":[{"role":"user","content":"hi"}]}'
```

Result: `HTTP 429`, body `{"type":"error","error":{"type":"rate_limit_error","message":"Error"},"request_id":"req_011CaWyhtmFS2ob5td4b1mmj"}`. Response has **no** `anthropic-ratelimit-*` headers — only generic Cloudflare headers and `x-should-retry: true`.
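The distinguishing signal here is the absence of `anthropic-ratelimit-*` headers on the 429. A minimal sketch of how a client could tell the two kinds of 429 apart when logging is below. The function name and the labels are hypothetical, and treating "429 with rate-limit headers" as an ordinary quota limit is an assumption drawn from this report, not documented behavior.

```python
from typing import Mapping

RATELIMIT_PREFIX = "anthropic-ratelimit-"


def classify_429(status: int, headers: Mapping[str, str]) -> str:
    """Label a response as 'not_429', 'quota_429', or 'generic_429'.

    Assumption: a 429 that carries anthropic-ratelimit-* headers is a
    normal quota limit, and one without them is the generic case
    described in this issue.
    """
    if status != 429:
        return "not_429"
    # Header names are case-insensitive, so compare in lowercase.
    has_ratelimit = any(
        name.lower().startswith(RATELIMIT_PREFIX) for name in headers
    )
    return "quota_429" if has_ratelimit else "generic_429"
```

Logging this label next to the `request_id` would make it easier to separate real quota exhaustion from the behavior reported here.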

**Real `claude` CLI — 200:**

```bash
claude -p "hi" --model claude-sonnet-4-6
# → "<friendly assistant reply>"
```

Same token. Same model. Same machine. Within seconds.

The successful response shows the account has plenty of quota:
- `anthropic-ratelimit-unified-5h-utilization: 0.09`
- `anthropic-ratelimit-unified-7d-utilization: 0.16`
- `anthropic-ratelimit-unified-7d_sonnet-utilization: 0.20`
- `anthropic-ratelimit-unified-overage-disabled-reason: org_level_disabled_until`
- `anthropic-ratelimit-unified-overage-status: rejected`

So overage is disabled at the org level, which is fine: even the most-utilized window (7-day Sonnet, 0.20) leaves 80% of base quota available. The 429 is therefore not a quota limit; the underlying gate is something else.
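The headroom arithmetic can be checked with a short sketch. It assumes each `anthropic-ratelimit-unified-*-utilization` value is the used fraction (0 to 1) of that window's quota, which is an interpretation of the header names rather than documented semantics. The function name is hypothetical.

```python
from typing import Dict, Mapping

PREFIX = "anthropic-ratelimit-unified-"
SUFFIX = "-utilization"


def remaining_quota(headers: Mapping[str, str]) -> Dict[str, float]:
    """Map each utilization window to its remaining fraction (1 - used)."""
    remaining = {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            window = key[len(PREFIX):-len(SUFFIX)]
            remaining[window] = round(1.0 - float(value), 4)
    return remaining


# Values copied from the successful response above.
observed = {
    "anthropic-ratelimit-unified-5h-utilization": "0.09",
    "anthropic-ratelimit-unified-7d-utilization": "0.16",
    "anthropic-ratelimit-unified-7d_sonnet-utilization": "0.20",
    "anthropic-ratelimit-unified-overage-status": "rejected",
}
print(remaining_quota(observed))
# → {'5h': 0.91, '7d': 0.84, '7d_sonnet': 0.8}
```

Under that assumption, the tightest window still has 80% remaining, which is consistent with the 429 not being quota-driven.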

## What's identical between real-claude and Hermes-spoofed

- Same OAuth token
- Same `?beta=true` URL
- All `anthropic-beta` values match
- `anthropic-dangerous-direct-browser-access: true`
- `user-agent: claude-cli/2.1.122 (external, sdk-cli)`
- All `x-stainless-*` values match (arch, lang, os, package-version 0.81.0, runtime, runtime-version v24.3.0)
- `x-app: cli`
- Synthetic `x-client-request-id` and `x-claude-code-session-id` UUIDs

## What's different

Things real claude sends that Hermes doesn't:

1. **Body shape**: real claude includes `metadata`, `output_config`, `thinking`, `context_management`, `diagnostics` top-level fields. Hermes sends only `model`, `messages`, `system`, `tools`, `max_tokens`.
2. **TLS fingerprint**: curl/openssl vs Bun/Node — different JA3/JA4 likely.
3. **Streaming**: real claude uses `stream: true` always.

## Hypothesis

Anthropic added a new enforcement layer for Sonnet/Opus on subscription OAuth, separate from the existing prompt-text content filter. It probably keys on either TLS fingerprint or required body fields (most likely the structured `metadata.user_id` / `output_config` / `context_management` fields that real Claude Code adds).

## Affected

- Hermes Anthropic native provider (`agent/anthropic_adapter.py`'s `build_anthropic_client`)
- Any user on Claude Max / Claude Pro OAuth selecting Sonnet 4.5+ or Opus 4.6+ as primary or fallback
- Telegram, Discord, webui, api_server — all platforms

## Workaround

Switch the primary model to `claude-haiku-4-5-20251001` (Haiku is unaffected). Sonnet/Opus can stay in fallback chain but they'll always 429 until this is fixed.

## What might fix it

1. Have `build_anthropic_client` always emit the body fields that real Claude Code emits: `metadata: {user_id: <hashed-account-uuid>}`, `output_config: {...}`, `thinking: {type: "adaptive"}`, `context_management: {...}`, plus `stream: true` by default.
2. If the gate is TLS-level, the SDK already uses Node's https stack — should match. But if Anthropic is fingerprinting handshake details specific to Bun, that's harder.

Happy to provide gateway logs, full request dumps, or run additional diagnostics. The `request_id` above can be cross-referenced server-side.

