fix(anthropic): reactive recovery for OAuth 1M-context beta rejection by teknium1 · Pull Request #17752 · NousResearch/hermes-agent

teknium1 · 2026-04-30T04:15:14Z

Summary

OAuth users with 1M-capable subscriptions keep the full 1M context window. OAuth users whose subscriptions don't include 1M context get one transparent retry with the beta stripped and a session-wide disable, instead of a hard failure.

Closes #17680 (thanks @JayGwod for the clean reproduction).

Why reactive instead of unconditional omit

#17680 proposes always omitting context-1m-2025-08-07 from OAuth requests. That protects affected subscriptions but silently downgrades every other OAuth user's context window from 1M to 200K — there's no user-visible signal, just truncated context on the next long session. Reactive recovery lets the provider's own error signal be the ground truth: attempt full fidelity, recover on actual rejection, persist within the session so we don't re-probe every turn.

Changes

| File | What |
| --- | --- |
| `agent/error_classifier.py` | `FailoverReason.oauth_long_context_beta_forbidden`: matches 400 + "long context beta" + "not yet available". Narrow enough to not collide with the 429 "extra usage" tier-gate pattern. |
| `agent/anthropic_adapter.py` | `_common_betas_for_base_url`, `build_anthropic_client`, `build_anthropic_kwargs` gain `drop_context_1m_beta` kwarg (default False). |
| `agent/transports/anthropic.py` | `build_kwargs` forwards the flag. |
| `run_agent.py` | `self._oauth_1m_beta_disabled` session flag + `oauth_1m_beta_retry_attempted` per-turn guard + recovery branch next to `image_shrink_retry_attempted`. `_rebuild_anthropic_client` honors the flag; main `build_kwargs` call threads it for fast-mode `extra_headers`. |
| `hermes_cli/doctor.py` | `hermes doctor` /v1/models probe retries once on the same error so affected subscriptions don't falsely report as unreachable. |
| `hermes_cli/models.py` | `_fetch_anthropic_models` retries once on the same error so OAuth model discovery works. |
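
For readers who want to see the shape of the classifier rule, here is a minimal sketch, not the code from `agent/error_classifier.py`. Only the `oauth_long_context_beta_forbidden` name and the three match conditions (400, "long context beta", "not yet available") come from this PR. The `classify` function, the `long_context_tier_gate` member, and the 429 message text are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class FailoverReason(Enum):
    # Name taken from the PR description.
    oauth_long_context_beta_forbidden = "oauth_long_context_beta_forbidden"
    # Hypothetical stand-in for the existing 429 tier-gate reason.
    long_context_tier_gate = "long_context_tier_gate"


def classify(status_code: int, message: str) -> Optional[FailoverReason]:
    """Map a provider error to a failover reason (illustrative only)."""
    text = message.lower()
    # All three conditions must hold, which keeps the rule narrow.
    if (
        status_code == 400
        and "long context beta" in text
        and "not yet available" in text
    ):
        return FailoverReason.oauth_long_context_beta_forbidden
    # The 429 "extra usage" tier gate keeps its own, separate reason.
    if status_code == 429 and "extra usage" in text:
        return FailoverReason.long_context_tier_gate
    return None
```

Requiring the status code together with both phrases is what prevents a generic 400, or a 429 that happens to mention long context, from triggering the beta-stripping path.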

Recovery flow

1. First request goes out with the 1M beta (same as today).
2. If Anthropic returns 400 with "long context beta is not yet available for this subscription", the classifier returns `oauth_long_context_beta_forbidden`.
3. The retry loop flips `self._oauth_1m_beta_disabled = True`, closes and rebuilds the Anthropic client (which reads the flag via `_rebuild_anthropic_client`), logs `🔕 OAuth subscription doesn't support the 1M-context beta — disabled for this session and retrying...`, and retries once.
4. Remaining turns in the session reuse the reduced-beta client, so there is no repeated probing.
5. Retry-once guard: if the second attempt still fails (shouldn't, but…), fall through to normal error handling.
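
The flow above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the loop in `run_agent.py`: the `Session` class, the `send` callable, the `LongContextBetaForbidden` exception, and the `example-other-beta` placeholder are all hypothetical. Only the `_oauth_1m_beta_disabled` flag, the `oauth_1m_beta_retry_attempted` guard, and the beta name come from the PR.

```python
from typing import Callable, List

CONTEXT_1M_BETA = "context-1m-2025-08-07"


class LongContextBetaForbidden(Exception):
    """Hypothetical stand-in for the classified 400 rejection."""


class Session:
    def __init__(self, send: Callable[[List[str]], str]) -> None:
        self._send = send  # hypothetical transport: send(betas) -> response
        self._oauth_1m_beta_disabled = False  # session-wide flag

    def _betas(self) -> List[str]:
        # The second entry is a placeholder for "every other beta".
        betas = [CONTEXT_1M_BETA, "example-other-beta"]
        if self._oauth_1m_beta_disabled:
            betas = [b for b in betas if b != CONTEXT_1M_BETA]
        return betas

    def run_turn(self) -> str:
        oauth_1m_beta_retry_attempted = False  # per-turn guard
        while True:
            try:
                return self._send(self._betas())
            except LongContextBetaForbidden:
                if oauth_1m_beta_retry_attempted:
                    raise  # second failure: normal error handling takes over
                oauth_1m_beta_retry_attempted = True
                # Persist the decision so later turns do not re-probe.
                self._oauth_1m_beta_disabled = True
```

In this model the flag outlives the turn while the guard does not, which is why an affected subscription pays for one rejected request per session rather than one per turn.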

Validation

- 2190 tests/agent/ tests pass.
- 94 adjacent integration tests pass (test_long_context_tier_429, test_bedrock_integration, test_ctx_halving_fix).
- New unit tests cover:
  - Classifier pattern match (positive case + collision-guard against the 429 tier-gate + generic-400 negative).
  - build_anthropic_client default keeps 1M for OAuth.
  - build_anthropic_client with drop_context_1m_beta=True strips only 1M, preserves every other beta.
  - build_anthropic_kwargs fast-mode extra_headers: default keeps 1M, flag strips only 1M.
- Interactive sanity-check: the exact error body from #17680 (fix(anthropic): omit 1m context beta for native OAuth) classifies correctly, tier-gate 429 unchanged, adapter helper produces the expected beta lists.
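
The adapter behavior those tests check ("default keeps 1M, flag strips only 1M") might look like the sketch below. `filter_betas` is a hypothetical helper and the `example-beta-*` names are placeholders. The real `_common_betas_for_base_url` signature is not shown in this PR description, so only the `drop_context_1m_beta` keyword and its default are taken from it.

```python
from typing import List

CONTEXT_1M_BETA = "context-1m-2025-08-07"


def filter_betas(betas: List[str], drop_context_1m_beta: bool = False) -> List[str]:
    """Return the beta list, removing only the 1M-context beta when asked."""
    if not drop_context_1m_beta:
        return list(betas)  # default: 1M stays
    # Preserve order and every other beta; strip just the 1M entry.
    return [b for b in betas if b != CONTEXT_1M_BETA]


# Placeholder list standing in for whatever betas the adapter normally sends.
example = ["example-beta-a", CONTEXT_1M_BETA, "example-beta-b"]
default_betas = filter_betas(example)
reduced_betas = filter_betas(example, drop_context_1m_beta=True)
```

Keeping the default at `False` means existing call sites behave exactly as before, and only the recovery path opts into the reduced list.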

Full live verification of the rejection path requires an OAuth subscription that doesn't include 1M context — not reproducible locally without access to one. Happy to gate on a user report if wanted, or merge and iterate.

@JayGwod

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once.

Addresses #17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry; previously they'd falsely report the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).


alt-glitch added the comp/agent, comp/cli, P1, provider/anthropic, and type/bug labels Apr 30, 2026

teknium1 merged commit 828d3a3 into main Apr 30, 2026
11 of 12 checks passed

teknium1 deleted the hermes/hermes-213f54eb branch April 30, 2026 04:56

teknium1 mentioned this pull request Apr 30, 2026

fix(anthropic): omit 1m context beta for native OAuth #17680

Closed

SkylineAtDayAndNight mentioned this pull request Apr 30, 2026

fix(anthropic): strip context-1m beta on OAuth path #17442

Closed

4 tasks

This was referenced Apr 30, 2026

fix(context): honor model.context_length for Ollama num_ctx and display paths (salvage #17725) #17921

Merged

fix(context): honor model.context_length for Ollama num_ctx and all display paths #17725

Closed

JayGwod mentioned this pull request May 2, 2026

fix(anthropic): use Claude Code-compatible MCP tool prefix for OAuth #17681

Open

alt-glitch mentioned this pull request May 3, 2026

fix: avoid unsupported Anthropic context beta by default #19166

Closed

aldoeliacim mentioned this pull request May 12, 2026

fix(anthropic): harden Claude Code OAuth proxy request shape #23361

Open

rodboev mentioned this pull request Jun 2, 2026

feat(anthropic): add anthropic.drop_context_1m_beta config flag and honor it in auxiliary clients (#21557) #37703

Open

4 tasks


teknium1 merged 1 commit into main from hermes/hermes-213f54eb
