fix(anthropic): reactive recovery for OAuth 1M-context beta rejection#17752
Merged
Conversation
Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses #17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
sunJose
pushed a commit
to sunJose/hermes-agent
that referenced
this pull request
Apr 30, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
4 tasks
This was referenced Apr 30, 2026
nickdlkk
pushed a commit
to nickdlkk/hermes-agent
that referenced
this pull request
May 11, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
jsboige
pushed a commit
to jsboige/hermes-agent
that referenced
this pull request
May 14, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
dannyJ848
pushed a commit
to dannyJ848/hermes-agent
that referenced
this pull request
May 17, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
4 tasks
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
…NousResearch#17752) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
OAuth users with 1M-capable subscriptions keep the full 1M context window. OAuth users whose subscriptions don't include 1M context get one transparent retry with the beta stripped and a session-wide disable, instead of a hard failure.
Closes #17680 (thanks @JayGwod for the clean reproduction).
Why reactive instead of unconditional omit
#17680 proposes always omitting
context-1m-2025-08-07from OAuth requests. That protects affected subscriptions but silently downgrades every other OAuth user's context window from 1M to 200K — there's no user-visible signal, just truncated context on the next long session. Reactive recovery lets the provider's own error signal be the ground truth: attempt full fidelity, recover on actual rejection, persist within the session so we don't re-probe every turn.Changes
FailoverReason.oauth_long_context_beta_forbidden— matches 400 + "long context beta" + "not yet available". Narrow enough to not collide with the 429 "extra usage" tier-gate pattern._common_betas_for_base_url,build_anthropic_client,build_anthropic_kwargsgaindrop_context_1m_betakwarg (default False).build_kwargsforwards the flag.self._oauth_1m_beta_disabledsession flag +oauth_1m_beta_retry_attemptedper-turn guard + recovery branch next toimage_shrink_retry_attempted._rebuild_anthropic_clienthonors the flag; mainbuild_kwargscall threads it for fast-modeextra_headers.hermes doctor/v1/models probe retries once on the same error so affected subscriptions don't falsely report as unreachable._fetch_anthropic_modelsretries once on the same error so OAuth model discovery works.Recovery flow
oauth_long_context_beta_forbidden.self._oauth_1m_beta_disabled = True, closes + rebuilds the Anthropic client (which reads the flag via_rebuild_anthropic_client), logs🔕 OAuth subscription doesn't support the 1M-context beta — disabled for this session and retrying..., retries once.Validation
tests/agent/tests pass.test_long_context_tier_429,test_bedrock_integration,test_ctx_halving_fix).build_anthropic_clientdefault keeps 1M for OAuth.build_anthropic_clientwithdrop_context_1m_beta=Truestrips only 1M, preserves every other beta.build_anthropic_kwargsfast-mode extra_headers: default keeps 1M, flag strips only 1M.Full live verification of the rejection path requires an OAuth subscription that doesn't include 1M context — not reproducible locally without access to one. Happy to gate on a user report if wanted, or merge and iterate.