test(2/N): fix bedrock 1M-context beta + max_tokens retry hardening #8
Merged
Bedrock 1M cluster (3 tests): add context-1m-2025-08-07 to _COMMON_BETAS
so it reaches Bedrock + non-OAuth fast-mode requests, while keeping the
client-level native/custom default header (and OAuth fast-mode) stripping
it for subscriptions that 400 on the long-context beta. Fast-mode now
re-includes the common betas even on models without speed=fast support.
max_tokens cluster (2 tests): the unsupported-parameter retry branch popped
max_tokens but never re-added max_completion_tokens, so providers that
rename the param ("Unknown parameter: max_tokens") lost their cap. Re-add
under the renamed key on retry (sync + async), preserving the ZAI 1210
strip-entirely behavior.
Follows PR #6 (1/N: timezone-independent flow-peak scheduler tests).
🔎 Lint report:
| Rule | Count |
|---|---|
| invalid-argument-type | 3 |
First entries
run_agent.py:6649: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:12539: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:12542: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
✅ Fixed issues (3):
| Rule | Count |
|---|---|
| invalid-argument-type | 3 |
First entries
run_agent.py:12542: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:12539: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:6649: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
Unchanged: 4073 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
Context
Second PR in the test-suite-green series. Follows PR #6 (1/N) which made the
timezone-dependent flow-peak scheduler tests timezone-independent. This PR turns
5 more pre-existing failures green in the agent/model-API cluster.
Tests fixed (5)
Bedrock 1M-context beta cluster:
1. `tests/agent/test_bedrock_1m_context.py::TestBedrockContext1MBeta::test_common_betas_includes_1m`
2. `tests/agent/test_bedrock_1m_context.py::TestBedrockContext1MBeta::test_common_betas_for_native_anthropic_includes_1m`
3. `tests/agent/test_bedrock_1m_context.py::TestBedrockContext1MBeta::test_build_anthropic_kwargs_includes_1m_for_bedrock_fastmode`

max_tokens retry-hardening cluster:
4. `tests/agent/test_unsupported_parameter_retry.py::TestMaxTokensRetryHardening::test_sync_max_tokens_retry_matches_generic_phrasing`
5. `tests/agent/test_unsupported_parameter_retry.py::TestMaxTokensRetryHardening::test_async_max_tokens_retry_matches_generic_phrasing`

Root cause
Bedrock cluster: `context-1m-2025-08-07` was deliberately kept OUT of
`_COMMON_BETAS` and only appended for Azure, so neither the common-betas set nor
the non-OAuth fast-mode request path carried it, and Bedrock/native Opus 4.6/4.7
silently capped at 200K. The test suite now requires the beta to live in
`_COMMON_BETAS` (and flow to Bedrock + non-OAuth fast-mode) while still being
absent from the client-level native/custom default header and the OAuth
fast-mode path (subscriptions without long-context 400 on it).
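
To illustrate the requirement above, here is a minimal sketch of keeping the 1M beta in the common set while omitting it from the client-level default header for native/custom endpoints. The `Endpoint` enum, `client_default_betas`, and `OTHER_BETA` are hypothetical names for illustration, not the actual `anthropic_adapter` code, and the MiniMax bearer-auth strip is not modeled.

```python
from enum import Enum

CONTEXT_1M_BETA = "context-1m-2025-08-07"
# Hypothetical placeholder for the other entries in the common set.
OTHER_BETA = "example-other-beta"
COMMON_BETAS = [OTHER_BETA, CONTEXT_1M_BETA]


class Endpoint(Enum):
    """Hypothetical endpoint kinds; the real code may classify by base URL."""
    NATIVE = "native"
    CUSTOM = "custom"
    AZURE = "azure"
    BEDROCK = "bedrock"


def client_default_betas(endpoint: Endpoint) -> list[str]:
    """Betas for the client-level default header of a given endpoint kind."""
    # Native/custom subscriptions without long-context may 400 on the 1M beta,
    # so it is dropped from their client-level default header only.
    drop_1m = endpoint in (Endpoint.NATIVE, Endpoint.CUSTOM)
    return [b for b in COMMON_BETAS if not (drop_1m and b == CONTEXT_1M_BETA)]


for endpoint in Endpoint:
    print(endpoint.value, client_default_betas(endpoint))
```

The point of the sketch is that the common set is the single source of truth, and stripping is a per-endpoint decision applied on top of it.
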
max_tokens cluster: the unsupported-parameter retry branch did
`kwargs.pop("max_tokens"); kwargs.pop("max_completion_tokens")` but never
re-added the cap under the renamed key. Providers that reject the parameter as a
rename ("Unknown parameter: max_tokens — use max_completion_tokens") lost their
token cap entirely, and the test (which asserts
`max_completion_tokens == 512`) hit `KeyError`.
Fix
`agent/anthropic_adapter.py`
- Add `context-1m-2025-08-07` to `_COMMON_BETAS`.
- `_common_betas_for_base_url`: keep the MiniMax bearer-auth strip; rely on the
  common set for everything else.
- `build_anthropic_client`: compute an effective drop flag so the client-level
  default header omits 1M for native/custom endpoints but keeps it for Azure
  (and MiniMax strips it regardless). Dedupe the Bedrock client header join.
- `build_anthropic_kwargs` fast-mode: always re-emit the common betas in
  per-request `extra_headers` (they override client defaults), drop 1M only on
  the OAuth path, and gate `speed=fast` + the fast-mode beta to models that
  actually support Fast Mode (Opus 4.6), so Opus 4.7 fast-mode still carries 1M
  without 400'ing on `speed`.

`agent/auxiliary_client.py`
- Re-add `max_completion_tokens = max_tokens` after popping `max_tokens`, except
  for the ZAI 1210 case which wants the param removed entirely.
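
The retry change can be sketched roughly as follows. This is an illustration under assumptions, not the `auxiliary_client` code: `UnsupportedParameterError`, `call_with_max_tokens_retry`, and the `strip_entirely` flag (standing in for the ZAI 1210 case) are hypothetical names, and only the sync path is shown.

```python
from typing import Any, Callable


class UnsupportedParameterError(Exception):
    """Hypothetical stand-in for a provider's 'unknown parameter' error."""


def call_with_max_tokens_retry(
    send: Callable[..., Any],
    kwargs: dict[str, Any],
    strip_entirely: bool = False,
) -> Any:
    """Call send(**kwargs); on a max_tokens rejection, retry once."""
    try:
        return send(**kwargs)
    except UnsupportedParameterError as exc:
        if "max_tokens" not in str(exc):
            raise
        retry_kwargs = dict(kwargs)
        # Remember the cap before removing both spellings of the parameter.
        cap = retry_kwargs.pop("max_tokens", None)
        retry_kwargs.pop("max_completion_tokens", None)
        # Re-add the cap under the renamed key unless the provider wants the
        # parameter removed entirely (the strip_entirely case).
        if cap is not None and not strip_entirely:
            retry_kwargs["max_completion_tokens"] = cap
        return send(**retry_kwargs)


def fake_send(**kw: Any) -> dict[str, Any]:
    """Toy provider that rejects max_tokens and echoes accepted kwargs."""
    if "max_tokens" in kw:
        raise UnsupportedParameterError("Unknown parameter: max_tokens")
    return kw


print(call_with_max_tokens_retry(fake_send, {"model": "m", "max_tokens": 512}))
```

With the pre-fix behavior (pop both keys, re-add nothing), the echoed kwargs would carry no token cap at all, which is what the `max_completion_tokens == 512` assertion caught.
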
Validation
Before (clean `main`): `pytest tests/agent/` → 18 failed, 2520 passed, 7 skipped
(the 5 above + 13 environment-dependent OAuth/keychain/botocore-region tests).
After this PR:
- `pytest tests/agent/` → 13 failed, 2525 passed, 7 skipped.
- `pytest tests/agent/test_bedrock_1m_context.py tests/agent/test_unsupported_parameter_retry.py` → 23 passed (was 5 failed / 18 passed).
- The 13 remaining failures are the clean-`main` set (TestReadClaudeCodeCredentials ×5, TestResolveBedrocRegion ×3,
  TestResolveAnthropicToken ×2, TestRunOauthSetupToken ×3), all environment-dependent
  (keychain / credential files / botocore profile), untouched here.
- `TestBuildAnthropicClient`, `TestBuildAnthropicKwargs`, MiniMax/Azure/Bedrock header tests.

Remaining on main
After merge: ~33 pre-existing failures remain repo-wide (38 at series start
minus PR #6's batch minus these 5). The 13 still-red `agent/` tests in this run
are environment-dependent and out of scope for this PR.
Note
Medium Risk
Changes to default Anthropic beta headers and fast-mode/OAuth branching affect all Claude API paths; incorrect stripping could 400 auxiliary calls or cap context at 200K on Bedrock.
Overview
Moves the 1M context beta (`context-1m-2025-08-07`) into `_COMMON_BETAS` so Bedrock and non-OAuth fast-mode requests keep full beta headers when per-request `extra_headers` override client defaults, while native/custom Anthropic clients still omit that beta on the default header (subscriptions without long-context can 400). Azure keeps 1M on the client; MiniMax bearer endpoints strip tool-streaming and 1M betas. Fast mode always re-emits applicable betas; OAuth fast-mode drops the 1M beta; `speed: fast` and the fast-mode beta apply only on models that support fast mode (e.g. Opus 4.6), so Opus 4.7 fast-mode can retain 1M without invalid `speed`.
In `auxiliary_client`, sync/async retries after `max_tokens` is rejected as unknown/renamed now set `max_completion_tokens` to the prior cap instead of dropping the limit entirely (ZAI 1210 still strips both).
Reviewed by Cursor Bugbot for commit e7252d4.
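
The fast-mode branching described in the overview could look roughly like this. It is a sketch under assumptions: `fast_mode_overrides`, `OTHER_BETA`, and `FAST_MODE_BETA` are hypothetical names and placeholder values, and placing `speed` as a top-level key is an assumption about the request shape rather than the real `build_anthropic_kwargs` output.

```python
from typing import Any

CONTEXT_1M_BETA = "context-1m-2025-08-07"
OTHER_BETA = "example-other-beta"          # hypothetical placeholder
FAST_MODE_BETA = "example-fast-mode-beta"  # hypothetical placeholder
COMMON_BETAS = [OTHER_BETA, CONTEXT_1M_BETA]


def fast_mode_overrides(*, is_oauth: bool, model_supports_fast: bool) -> dict[str, Any]:
    """Per-request overrides for a fast-mode call."""
    # Per-request extra_headers override client defaults, so the common betas
    # are always re-emitted here; only the OAuth path drops the 1M beta.
    betas = [b for b in COMMON_BETAS if not (is_oauth and b == CONTEXT_1M_BETA)]
    overrides: dict[str, Any] = {}
    # speed=fast and the fast-mode beta are sent only to models that support it.
    if model_supports_fast:
        betas.append(FAST_MODE_BETA)
        overrides["speed"] = "fast"
    overrides["extra_headers"] = {"anthropic-beta": ",".join(betas)}
    return overrides


# A model without fast-mode support still carries the 1M beta, but no speed key.
print(fast_mode_overrides(is_oauth=False, model_supports_fast=False))
```

Separating the beta list from the `speed` gate is what lets a model without Fast Mode keep 1M context without being sent a parameter it would reject.
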