Skip to content

fix(anthropic): omit 1m context beta for native OAuth#17680

Closed
JayGwod wants to merge 1 commit into
NousResearch:mainfrom
JayGwod:fix/anthropic-oauth-context-1m-beta
Closed

fix(anthropic): omit 1m context beta for native OAuth#17680
JayGwod wants to merge 1 commit into
NousResearch:mainfrom
JayGwod:fix/anthropic-oauth-context-1m-beta

Conversation

@JayGwod

@JayGwod JayGwod commented Apr 30, 2026

Copy link
Copy Markdown

Summary

  • Omit the context-1m-2025-08-07 beta header for native Anthropic OAuth / Claude Code-style auth.
  • Keep the existing common beta behavior for API-key auth and non-native Anthropic-compatible routes.
  • Also omit the 1M context beta from per-request fast-mode headers when the request is OAuth-authenticated.

Why

Some Anthropic OAuth subscriptions reject requests before model/tool execution when the request includes:

anthropic-beta: context-1m-2025-08-07

The observed provider error is:

The long context beta is not yet available for this subscription.

That failure is independent of the model/tool payload. Native Anthropic 1M context availability can be account/subscription gated differently from Bedrock/Azure beta-header requirements, so OAuth requests should not unconditionally inherit this beta header.

Behavior preserved

  • API-key auth still includes context-1m-2025-08-07 in the common beta list.
  • Bedrock/Azure and other non-OAuth paths continue using _common_betas_for_base_url(...) unchanged.
  • OAuth-only betas such as claude-code-20250219 and oauth-2025-04-20 remain present for OAuth requests.

Tests

./venv/bin/python -m py_compile agent/anthropic_adapter.py
./venv/bin/python -m pytest -o 'addopts=' tests/agent/test_anthropic_adapter.py -q

Result:

139 passed

@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/anthropic Anthropic native Messages API area/auth Authentication, OAuth, credential pools labels Apr 30, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #17442 — same fix: strip context-1m beta header on OAuth path to avoid 400s on consumer subscriptions.

1 similar comment
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #17442 — same fix: strip context-1m beta header on OAuth path to avoid 400s on consumer subscriptions.

@JayGwod

JayGwod commented Apr 30, 2026

Copy link
Copy Markdown
Author

Thanks — agreed the main fix overlaps with #17442.

One small difference in this PR is that it also covers the fast-mode per-request extra_headers path in build_anthropic_kwargs(). That path rebuilds anthropic-beta and overrides the client-level header, so if only the client construction path strips context-1m-2025-08-07, an OAuth + fast_mode=True request can still reintroduce the same beta header.

If maintainers prefer #17442 as the primary PR, I'm happy to close this one and either:

  1. add the fast-mode extra_headers coverage as a small follow-up on top of fix(anthropic): strip context-1m beta on OAuth path #17442, or
  2. keep this open only if you want the broader coverage/test here.

teknium1 added a commit that referenced this pull request Apr 30, 2026
…#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses #17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
@teknium1

Copy link
Copy Markdown
Contributor

Addressed via PR #17752 (merged as 828d3a3) — thanks @JayGwod for the clean reproduction and the exact error phrasing, both were crucial.

I took a reactive-recovery approach instead of unconditionally omitting the beta. The concern with the unconditional-omit path is that OAuth users whose subscriptions do include 1M context would silently lose the capability (their next long session gets truncated at 200K with no user-visible signal). Reactive recovery lets Anthropic's own error be the ground truth:

  1. Request goes out with context-1m-2025-08-07 as today.
  2. If Anthropic returns 400 with "long context beta is not yet available for this subscription", the classifier tags it oauth_long_context_beta_forbidden.
  3. The retry loop sets a session-wide _oauth_1m_beta_disabled flag, rebuilds the Anthropic client without the beta, logs 🔕 OAuth subscription doesn't support the 1M-context beta — disabled for this session and retrying..., retries once.
  4. Remaining turns in the session reuse the reduced-beta client — no repeated probing, one failed attempt total per session.

Result: 1M-capable subscriptions keep full 1M; 1M-ineligible subscriptions pay one silent failed call per session and then work transparently.

Also widened to the two sibling OAuth /v1/models probes you didn't touch (hermes_cli/doctor.py and hermes_cli/models.py) — they would have falsely reported Anthropic as unreachable for affected subscriptions.

Full diff: #17752

sunJose pushed a commit to sunJose/hermes-agent that referenced this pull request Apr 30, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…NousResearch#17752)

Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.

Addresses NousResearch#17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.

Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
  pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
  enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
  build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
  kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
  recovery branch next to the image-shrink path. _rebuild_anthropic_client
  honors the flag. The main build_kwargs call site threads it through for
  fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
  probes get the same reactive retry — previously they'd falsely report
  the Anthropic API as unreachable for affected subscriptions.

Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/auth Authentication, OAuth, credential pools comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/anthropic Anthropic native Messages API type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants