feat(core): improve Anthropic proxy compatibility and enable global prompt cache scope#4020
Conversation
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
…rompt cache scope - Use authToken instead of apiKey to send Authorization: Bearer header, avoiding dual-header conflicts with IdeaLab-style proxies - Set User-Agent to claude-cli format and add x-app header for proxy Team rule compatibility - Add adaptive thinking support for Claude 4.6+ models - Enable prompt-caching-scope-2026-01-05 beta header and scope: global on system prompt cache_control to improve cross-session cache hit rates
5327ad5 to
05a9c6e
Compare
wenshao
left a comment
There was a problem hiding this comment.
9 existing tests are broken by this PR — User-Agent, beta header, and cache_control scope assertions need updating.
wenshao
left a comment
There was a problem hiding this comment.
Review found 7 Critical and 9 Suggestion issues across 524 files (+74,929/-10,117 lines). Key Critical findings:
modelSupportsAdaptiveThinkingregex only matches single-digit versions[6-9]—claude-opus-4-10and beyond will not match- User-approved command silently rewritten before execution —
getConfirmationDetailsshows raw command butexecute()runs rewritten version; error messages actively hide the rewrite - Synchronous
execFileSyncblocks event loop in commit attribution —getGitHeadSync+ N+1validateAgainstcalls block 10-50ms+ each - Promoted shell orphan zombie — no exit listener; registry stays
runningforever; PID recycling risk isAmendCommit/findAttributableCommitSegmentmismatch — compoundgit commit && git commit --amendinjects trailer on wrong segmenthookRunnersilently downgrades deny→allow — non-zero hook exits that previously blocked are now allowed with warning- 172 tsc type errors — 164 TS4111 (index signature access), plus unused vars, missing return, missing types
Other Suggestion findings: beta header unconditional, User-Agent spoofed as claude-cli, Mistral provider false positive, shell.ts duplication (3725 lines), qwen-home-bootstrap.js sync hazard, priorReadEnforcement no dedicated test, chatCompressionService double JSON.stringify, WeChat detectImageMime incomplete WebP validation.
See inline comments for details.
Address review feedback on PR #4020 by narrowing each workaround to where it actually applies, instead of shipping it globally. - Gate `Authorization: Bearer` (`authToken`), `claude-cli` User-Agent, and `x-app: cli` to non-Anthropic-native baseURLs. Direct `api.anthropic.com` users keep the SDK-default `x-api-key` (`apiKey`) auth and a truthful `QwenCode` User-Agent so usage isn't misattributed in Anthropic's logs/quotas, and so a stricter Anthropic backend doesn't 401 on a `Bearer`-shaped header. - Gate the `prompt-caching-scope-2026-01-05` beta on `enableCacheControl`. When the converter isn't attaching `cache_control` to the body the beta is dead weight and risks 4xx responses from anthropic-compatible backends that don't recognize it. Restores the `betas.length === 0` early-return for the all-disabled case. - Detect adaptive-thinking models with numeric major/minor compare instead of `[6-9]`. The character class missed `claude-haiku-4-6` entirely and would silently fall through to `budget_tokens` on `claude-opus-4-10` / `claude-opus-5-1` once those ship — tripping HTTP 400 with a shape the server no longer accepts. - Honor explicit `reasoning.budget_tokens` before the adaptive branch. Adaptive omits `budget_tokens` from the wire shape, so checking it second silently dropped a user-supplied escape-hatch budget on Claude 4.6+ models. - Add `scope: 'global'` on the tool `cache_control` entry so the largest, slowest-changing prefix actually participates in cross-session caching under the new beta — the system-only attachment was capturing maybe half the available hit-rate improvement. - Replace the misleading `as { type: 'ephemeral' }` cast on the system block (which erased `scope` from the type while leaving it on the wire) with a `AnthropicTextBlockParam` type that mirrors the existing `AnthropicToolParam` widening, so types match the runtime shape.
wenshao
left a comment
There was a problem hiding this comment.
Review — deepseek-v4-pro
4 files, +322/−27. Build passes, all 89 tests green. Previous review comments all addressed.
Suggestion
**1. may diverge between generator and converter after hot **
,
The generator reads at request time (new code at line 316), but the converter captures it once at construction time (). If hot-switches mid-session, the beta header and body can disagree:
- : beta header dropped but body still carries
- : beta header sent but body has no
Suggested fix: have the converter read from the shared config object at convert time instead of caching at construction.
**2. Streaming and per-request beta tests don't assert **
Two existing tests use (where defaults to ) and verify / but not the new cache-scope beta flag. Both use so they pass even if the flag is accidentally removed.
Suggested fix: add at both locations.
Needs Human Review
- Cache-scope beta () is gated only by , not by . Non-Anthropic backends may 4xx on unknown beta flags. Mitigation exists: set .
- has two uncovered branches ( subdomain matching, URL parse failure catch).
Verdict
Comment — no Critical issues. All determinations pre-checked: self-PR, CI has Agent failure (unrelated to this diff).
— deepseek-v4-pro via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
Review — deepseek-v4-pro
4 files, +322/−27. Build passes, all 89 tests green. Previous review comments all addressed.
Suggestion
1. enableCacheControl may diverge between generator and converter after hot setModel()
packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.ts:316, converter.ts:91-94
The generator reads this.contentGeneratorConfig.enableCacheControl at request time (new code at line 316), but the converter captures it once at construction time (converter.ts:91-94). If Config.setModel() hot-switches enableCacheControl mid-session, the beta header and body cache_control can disagree:
true → false: beta header dropped but body still carriesscope: 'global'false → true: beta header sent but body has nocache_control
Suggested fix: have the converter read from the shared config object at convert time instead of caching at construction.
2. Streaming and per-request beta tests don't assert prompt-caching-scope-2026-01-05
packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.test.ts:463, 495
Two existing tests use baseConfig (where enableCacheControl defaults to true) and verify interleaved-thinking-2025-05-14 / effort-2025-11-24 but not the new cache-scope beta flag. Both use toContain so they pass even if the flag is accidentally removed.
Suggested fix: add expect(headers['anthropic-beta']).toContain('prompt-caching-scope-2026-01-05') at both locations.
Needs Human Review
- Cache-scope beta (
prompt-caching-scope-2026-01-05) is gated only byenableCacheControl, not byisAnthropicNativeBaseUrl. Non-Anthropic backends may 4xx on unknown beta flags. Mitigation exists: setenableCacheControl: false. isAnthropicNativeBaseUrlhas two uncovered branches (.anthropic.comsubdomain matching, URL parse failure catch).
Verdict
Comment — no Critical issues. All determinations pre-checked: self-PR, CI has Agent failure (unrelated to this diff).
— deepseek-v4-pro via Qwen Code /review
Follow-up on PR #4020 review: `Config.setModel()` mutates `enableCacheControl` in place (it's in `MODEL_GENERATION_CONFIG_FIELDS`), but the converter captured it once at construction. On a hot flip the generator's per-request `prompt-caching-scope-2026-01-05` beta gate would sample the new value while the converter still emitted the old body-side `cache_control` — beta-header and body could disagree. - Thread the live `contentGeneratorConfig.enableCacheControl` into the converter via a per-call options override on both `convertGeminiRequestToAnthropic` and `convertGeminiToolsToAnthropic`, falling back to the constructor-time default when the caller doesn't pass one. The generator samples the value once per `buildRequest` and forwards it to both convert calls so the body and beta header always agree within a request, even across `setModel` flips. - Regression test: hot-flip `enableCacheControl` from `true` to `false` on a live generator, verify the 2nd request drops both the beta header AND the body-side `cache_control` in lockstep. - Tighten two existing beta-header tests that used `toContain` only on `interleaved-thinking` / `effort` — they now also assert `prompt-caching-scope-2026-01-05` is present (per-request keep-default and streaming paths), so accidental removal trips the test. - Add coverage for the two previously-uncovered branches of `isAnthropicNativeBaseUrl`: `*.anthropic.com` subdomains (Anthropic-native) and a malformed baseURL (URL parse failure → proxy fallthrough). Also add an `anthropic.com.evil.com` hostname-spoof case mirroring the existing DeepSeek spoof test.
…e cases Follow-up on PR #4020 review: - Extract `AnthropicThinkingParam` type alias. The thinking union `{ type: 'enabled'; budget_tokens } | { type: 'adaptive' }` was repeated verbatim in three places: the `MessageCreateParamsWithThinking` field, the streaming-request intersection, and `buildThinkingConfig`'s return type. Once a third shape ships, forgetting one site would silently narrow a runtime value — single alias keeps them locked. - Compute `useProxyIdentity` once in the constructor and pass it into `buildHeaders`. Previously `useBearerAuth` and `useProxyIdentity` named the same predicate at two call sites; collapsing them clarifies that Bearer auth + `claude-cli` UA + `x-app: cli` are one bundle that should never be split. - Document that `modelSupportsAdaptiveThinking`'s regex is intentionally unanchored so reseller-prefixed names (`bedrock/claude-opus-4-7`, `vertex_ai/claude-sonnet-4-6@…`, `idealab:claude-opus-4-6`, …) keep matching. Tightening to `^claude-` would silently regress those. - Soften the `prompt-caching-scope` beta comment so it describes what the code enforces (gate on the `enableCacheControl` flag) rather than promising a stronger "only ship when cache_control is on the body" invariant — the converter still skips `cache_control` on niche shapes (e.g. no system text, no tools, last user block isn't text). The looser gate is intentional; Anthropic-native ignores unused betas. - Pin the wire shape for the `reasoning: undefined` + 4.6+ model corner. `resolveEffectiveEffort` returns undefined on `reasoning === undefined`, so `buildThinkingConfig` ships `{ type: 'adaptive' }` with no `output_config` and no `effort-2025-11-24` beta. If Anthropic ever starts requiring `output_config.effort` alongside adaptive, this test will fail at CI rather than at runtime as a server 400.
wenshao
left a comment
There was a problem hiding this comment.
-
addCacheControlToMessagesmissingscope: 'global'—converter.tsaddCacheControlToMessages(L~789) still writes{ type: 'ephemeral' }withoutscope: 'global', inconsistent with system/tool blocks which both addedscope: 'global'in this PR. User messages change per-request so cross-session caching here has lower value, but no comment explains the intentional omission — future maintainers may mistake it for a bug. -
Converter per-call
enableCacheControloverride untested —converter.test.tshas no tests passing{ enableCacheControl: false/true }toconvertGeminiRequestToAnthropicorconvertGeminiToolsToAnthropic. The??fallback path that enables hotConfig.setModel()flips is never verified at the converter level. -
Hot flip test doesn't verify tool path —
anthropicContentGenerator.test.ts"hot enableCacheControl flips" test only assertssystemoutput, not toolcache_controlfromconvertGeminiToolsToAnthropic.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
…gate Follow-up on PR #4020 review: the `prompt-caching-scope-2026-01-05` beta header and the body-side `scope: 'global'` field together comprise an Anthropic-only wire-shape extension. Shipping them to non-Anthropic backends (DeepSeek, IdeaLab) leaned on "unknown betas are ignored" — true on Anthropic-native, but unverified for proxies and silently inconsistent with the auth/identity gate, which already uses `isAnthropicNativeBaseUrl` to bind Bearer / claude-cli / x-app to the proxy path only. - Add `useGlobalCacheScope` predicate on the generator. True iff `enableCacheControl !== false` AND the resolved baseURL is Anthropic-native. Plumbed per-request into both `convertGeminiRequestToAnthropic` and `convertGeminiToolsToAnthropic`; the same predicate also gates the `prompt-caching-scope-2026-01-05` beta in `buildPerRequestHeaders` so beta + scope field always travel together. - Converter emits `cache_control: { type: 'ephemeral' }` (per-session) when scope is off and `{ type: 'ephemeral', scope: 'global' }` when on. Non-Anthropic baseURLs go back to their pre-PR per-session caching shape; existing prompt caching keeps working with no new beta. - Document the intentional `scope: 'global'` omission on `addCacheControlToMessages`. The last user message changes every turn (live prompt + immediate tool_results), so cross-session reuse has effectively zero hit rate; cross-session caching is concentrated on system + tool prefixes only. Tests: - DeepSeek baseURL pins the proxy auth/identity path (`authToken` / claude-cli UA / `x-app: cli`). Documents the contract assumption that DeepSeek's anthropic-compatible endpoint accepts `Authorization: Bearer` — any future deviation surfaces here rather than at runtime for users. - Non-Anthropic baseURL strips the cache-scope beta AND `scope: 'global'` from the wire shape, while keeping per-session `cache_control: { type: 'ephemeral' }` on system / tools. - Hot-flip test extended to assert tool `cache_control` flips alongside system / user / beta header. - Converter-level tests for per-call `enableCacheControl` and `useGlobalCacheScope` overrides — both directions of the constructor default (true→false, false→true) and the scope-independent-of-source case (cache on, scope off → per-session shape). - baseConfig in the per-request anthropic-beta block now targets `api.anthropic.com` so cache-scope assertions remain meaningful; the proxy-baseURL behavior is covered separately.
|
Thanks for catching this — confirmed the credential-leak path and fixed in What changedThe leak (your scenario): SDK destructures with defaults ( Fix: explicit ...(useProxyIdentity
? { authToken: contentGeneratorConfig.apiKey, apiKey: null }
: { apiKey: contentGeneratorConfig.apiKey, authToken: null }),Destructuring default no longer fires; env back-fill is dead. Inverse direction (you flagged it alongside): TestsFive new regression tests in the env-poison suite — all assert the exact constructor args:
Also updated the existing 7 identity tests from Re-requesting review. |
tanzhenxin
left a comment
There was a problem hiding this comment.
Review
Both code-level findings from the prior round are closed. The Bearer-auth env back-fill is fixed at the right place — passing apiKey: null (and the symmetric authToken: null on the native branch) suppresses the SDK's destructuring default, since defaults fire only for undefined. The new regression test does the right thing: stubs the env, asserts the explicit-null contract rather than toBeUndefined(), and would fail under the pre-fix code. The ANTHROPIC_BASE_URL classification fix mirrors the SDK's resolution order precisely (config → env → default), and the cache-scope gate inherits the env-aware predicate automatically, so identity and cache-scope stay paired.
Verdict
APPROVE — solid follow-up. The fix is minimal, the tests exercise the real failure mode rather than mocking it away, and the symmetric ANTHROPIC_AUTH_TOKEN guard is a nice defensive add even though it wasn't strictly requested.
…icate #4020 review (Copilot): the comment promised "beta and body-side scope: 'global' field always ship together" but the gate was just `useGlobalCacheScope()`. In the degenerate case where the predicate is true but the request body has no system text AND no tools, the beta would still ship without any matching `cache_control.scope: 'global'` on the wire — overstating the contract and shipping dead weight. - New `hasGlobalCacheScopeOnWire(req)` scans the assembled request body (system block when shaped as `TextBlockParam[]`, tools array) for any `cache_control: { …, scope: 'global' }` entry. `buildPerRequestHeaders` gates the `prompt-caching-scope-2026-01-05` beta on this scan, so the beta and the body field share a single source of truth. No window between sampling the predicate and emitting the body where the two could diverge. - `useGlobalCacheScope()` is still sampled once per `buildRequest` and threaded into the converter to decide whether to ATTACH `scope: 'global'` to the body. The body-scan downstream then derives the beta from what actually landed. Tests: - New: empty systemInstruction + no tools + Anthropic-native + cache on → beta NOT shipped (degenerate body-scan case). - New: empty systemInstruction + non-empty tools → beta shipped (tool scope:'global' triggers the scan). - Existing per-request beta tests now include a `systemInstruction` so the body has the scope field; degenerate case is covered by the new dedicated test. Also tightened two stale comments (#3217834451, #3217834505) that claimed `Config.setModel()` mutates both `enableCacheControl` and `baseUrl` in place — only `enableCacheControl` is hot-mutated (qwen-oauth path); non-qwen-oauth providers recreate the generator on refresh, so `baseUrl` is captured fresh at construct time. Comments now describe the real in-place mutation and note the qwen-oauth boundary.
#4020 review (Copilot): two low-stakes follow-ups on 491a441. - `resolveEffectiveBaseUrl` trimmed `ANTHROPIC_BASE_URL` env but returned `contentGeneratorConfig.baseUrl` as-is. A copy-pasted baseURL with leading/trailing whitespace would trip `new URL(...)` in `isAnthropicNativeBaseUrl` and fall through the catch branch to the proxy identity bundle — meaning real api.anthropic.com would receive Bearer auth + claude-cli UA and 401. Apply the same trim() + empty-as-missing normalization on the config side. New regression test pins the contract with `' https://api.anthropic.com '`. - `buildHeaders` docstring said constructor headers carry only User-Agent + customHeaders (excluding anthropic-beta). The PR also added `x-app: cli` on the proxy path; updated the comment so a future maintainer reading the "no duplicate headers" rationale doesn't miss the x-app addition.
| // half of the bundle without the other. | ||
| const useProxyIdentity = !isAnthropicNativeBaseUrl(contentGeneratorConfig); | ||
| const defaultHeaders = this.buildHeaders(useProxyIdentity); | ||
| const baseURL = contentGeneratorConfig.baseUrl; |
| /** | ||
| * Resolve the baseURL the Anthropic SDK will actually use, mirroring the | ||
| * SDK's own destructuring-default order: explicit config first, then | ||
| * `ANTHROPIC_BASE_URL` env, then the SDK default. Returns the SDK default | ||
| * literal when nothing is configured so callers can do hostname matching | ||
| * without a special case for the empty path. | ||
| * | ||
| * Both inputs get the SDK's `readEnv`-style normalization | ||
| * (whitespace-trim + empty-as-missing). Trimming the config side too | ||
| * prevents a copy-pasted baseURL with stray whitespace from tripping | ||
| * `new URL(...)` in `isAnthropicNativeBaseUrl`, which would otherwise | ||
| * fall through the catch branch to proxy identity and ship Bearer auth | ||
| * against the real Anthropic API. | ||
| */ | ||
| function resolveEffectiveBaseUrl( | ||
| contentGeneratorConfig: ContentGeneratorConfig, | ||
| ): string { | ||
| const fromConfig = contentGeneratorConfig.baseUrl?.trim(); | ||
| if (fromConfig) return fromConfig; | ||
| const fromEnv = process.env['ANTHROPIC_BASE_URL']?.trim(); | ||
| if (fromEnv) return fromEnv; | ||
| return 'https://api.anthropic.com'; | ||
| } |
tanzhenxin
left a comment
There was a problem hiding this comment.
Review
Re-reviewed the two follow-up commits since the dismissed approve. The cache-scope refactor is the substantive one — it replaces the predicate-based gate on prompt-caching-scope-2026-01-05 with a scan of the assembled request body, so the header now ships iff a block actually carries scope: 'global' on the wire. Single source of truth, locked in by tests on both the degenerate path (empty system + no tools → no beta) and the tools-only path. The baseUrl trim commit is small but adds parity with the SDK's readEnv whitespace handling and a regression test for whitespace-wrapped native URLs.
The minor observations from the prior round are unchanged or only partially closed (whitespace-only config now resolves correctly; strict baseUrl: '' still routes differently between the predicate and the SDK constructor; the PR description framing on the type widening still doesn't match what was on main). None of them gate merge — no caller passes an empty string, and the predicate/constructor divergence would require a mid-stack process.env mutation that doesn't happen in practice. One small commit-message nit on the trim commit: it says whitespace-wrapped URLs "trip new URL(...)", but Node's URL constructor tolerates leading/trailing whitespace fine — the trim is still useful for parity and normalization, just not for the failure mode the message describes.
Verdict
APPROVE — both new commits hold up; the body-scan refactor is structurally tighter than the predicate-only gate it replaces.
wenshao
left a comment
There was a problem hiding this comment.
Additional findings not mapped to specific diff lines:
[Suggestion] Missing test: ANTHROPIC_BASE_URL env-side whitespace trim — resolveEffectiveBaseUrl trims both config and env inputs, but only config-side whitespace has a direct test. An env value like ' https://api.anthropic.com ' going untrimmed would misclassify as proxy identity.
[Suggestion] Missing test: empty-string / whitespace-only baseUrl — resolveEffectiveBaseUrl treats '' and ' ' as missing (falls through to env/default). No test covers these edge cases.
[Suggestion] No diagnostic logging for auth path selection — The constructor makes an important security decision (apiKey vs authToken) silently. A debugLogger.debug line for the selected auth path + resolved baseURL would aid 401/403 incident debugging.
| // half of the bundle without the other. | ||
| const useProxyIdentity = !isAnthropicNativeBaseUrl(contentGeneratorConfig); | ||
| const defaultHeaders = this.buildHeaders(useProxyIdentity); | ||
| const baseURL = contentGeneratorConfig.baseUrl; |
There was a problem hiding this comment.
[Suggestion] SDK receives untrimmed baseURL while classification uses trimmed value. resolveEffectiveBaseUrl() trims whitespace and resolves env fallback for classification, but this line passes the raw contentGeneratorConfig.baseUrl to the SDK constructor. A copy-pasted URL like ' https://api.anthropic.com ' classifies correctly as Anthropic-native (correct auth/identity) but the SDK receives the whitespace-padded string — which may cause TypeError or connection failures.
| const baseURL = contentGeneratorConfig.baseUrl; | |
| const baseURL = resolveEffectiveBaseUrl(contentGeneratorConfig); |
— glm-5.1 via Qwen Code /review
| this.client = new Anthropic({ | ||
| apiKey: contentGeneratorConfig.apiKey, | ||
| ...(useProxyIdentity | ||
| ? { authToken: contentGeneratorConfig.apiKey, apiKey: null } |
There was a problem hiding this comment.
[Suggestion] SDK null auth contract lacks regression protection. Passing null to suppress env back-fill is correct and well-documented, but all tests mock the SDK. If a future @anthropic-ai/sdk version normalizes null to undefined in its constructor, the credential leak silently returns — ANTHROPIC_API_KEY env would back-fill the "unused" field, shipping the real key to a third-party proxy. Consider adding an integration test that runs against the real SDK (not a mock) with ANTHROPIC_API_KEY set, verifying the suppressed field remains unset.
— glm-5.1 via Qwen Code /review
…rompt cache scope (#4020) * feat(core): improve Anthropic proxy compatibility and enable global prompt cache scope - Use authToken instead of apiKey to send Authorization: Bearer header, avoiding dual-header conflicts with IdeaLab-style proxies - Set User-Agent to claude-cli format and add x-app header for proxy Team rule compatibility - Add adaptive thinking support for Claude 4.6+ models - Enable prompt-caching-scope-2026-01-05 beta header and scope: global on system prompt cache_control to improve cross-session cache hit rates * test: update tests for User-Agent, beta header, and cache_control changes * fix(core): scope anthropic proxy workarounds to non-native baseURLs only Address review feedback on PR #4020 by narrowing each workaround to where it actually applies, instead of shipping it globally. - Gate `Authorization: Bearer` (`authToken`), `claude-cli` User-Agent, and `x-app: cli` to non-Anthropic-native baseURLs. Direct `api.anthropic.com` users keep the SDK-default `x-api-key` (`apiKey`) auth and a truthful `QwenCode` User-Agent so usage isn't misattributed in Anthropic's logs/quotas, and so a stricter Anthropic backend doesn't 401 on a `Bearer`-shaped header. - Gate the `prompt-caching-scope-2026-01-05` beta on `enableCacheControl`. When the converter isn't attaching `cache_control` to the body the beta is dead weight and risks 4xx responses from anthropic-compatible backends that don't recognize it. Restores the `betas.length === 0` early-return for the all-disabled case. - Detect adaptive-thinking models with numeric major/minor compare instead of `[6-9]`. The character class missed `claude-haiku-4-6` entirely and would silently fall through to `budget_tokens` on `claude-opus-4-10` / `claude-opus-5-1` once those ship — tripping HTTP 400 with a shape the server no longer accepts. - Honor explicit `reasoning.budget_tokens` before the adaptive branch. Adaptive omits `budget_tokens` from the wire shape, so checking it second silently dropped a user-supplied escape-hatch budget on Claude 4.6+ models. - Add `scope: 'global'` on the tool `cache_control` entry so the largest, slowest-changing prefix actually participates in cross-session caching under the new beta — the system-only attachment was capturing maybe half the available hit-rate improvement. - Replace the misleading `as { type: 'ephemeral' }` cast on the system block (which erased `scope` from the type while leaving it on the wire) with a `AnthropicTextBlockParam` type that mirrors the existing `AnthropicToolParam` widening, so types match the runtime shape. * fix(core): keep enableCacheControl live in the converter Follow-up on PR #4020 review: `Config.setModel()` mutates `enableCacheControl` in place (it's in `MODEL_GENERATION_CONFIG_FIELDS`), but the converter captured it once at construction. On a hot flip the generator's per-request `prompt-caching-scope-2026-01-05` beta gate would sample the new value while the converter still emitted the old body-side `cache_control` — beta-header and body could disagree. - Thread the live `contentGeneratorConfig.enableCacheControl` into the converter via a per-call options override on both `convertGeminiRequestToAnthropic` and `convertGeminiToolsToAnthropic`, falling back to the constructor-time default when the caller doesn't pass one. The generator samples the value once per `buildRequest` and forwards it to both convert calls so the body and beta header always agree within a request, even across `setModel` flips. - Regression test: hot-flip `enableCacheControl` from `true` to `false` on a live generator, verify the 2nd request drops both the beta header AND the body-side `cache_control` in lockstep. - Tighten two existing beta-header tests that used `toContain` only on `interleaved-thinking` / `effort` — they now also assert `prompt-caching-scope-2026-01-05` is present (per-request keep-default and streaming paths), so accidental removal trips the test. - Add coverage for the two previously-uncovered branches of `isAnthropicNativeBaseUrl`: `*.anthropic.com` subdomains (Anthropic-native) and a malformed baseURL (URL parse failure → proxy fallthrough). Also add an `anthropic.com.evil.com` hostname-spoof case mirroring the existing DeepSeek spoof test. * refactor(core): consolidate anthropic generator shapes & document edge cases Follow-up on PR #4020 review: - Extract `AnthropicThinkingParam` type alias. The thinking union `{ type: 'enabled'; budget_tokens } | { type: 'adaptive' }` was repeated verbatim in three places: the `MessageCreateParamsWithThinking` field, the streaming-request intersection, and `buildThinkingConfig`'s return type. Once a third shape ships, forgetting one site would silently narrow a runtime value — single alias keeps them locked. - Compute `useProxyIdentity` once in the constructor and pass it into `buildHeaders`. Previously `useBearerAuth` and `useProxyIdentity` named the same predicate at two call sites; collapsing them clarifies that Bearer auth + `claude-cli` UA + `x-app: cli` are one bundle that should never be split. - Document that `modelSupportsAdaptiveThinking`'s regex is intentionally unanchored so reseller-prefixed names (`bedrock/claude-opus-4-7`, `vertex_ai/claude-sonnet-4-6@…`, `idealab:claude-opus-4-6`, …) keep matching. Tightening to `^claude-` would silently regress those. - Soften the `prompt-caching-scope` beta comment so it describes what the code enforces (gate on the `enableCacheControl` flag) rather than promising a stronger "only ship when cache_control is on the body" invariant — the converter still skips `cache_control` on niche shapes (e.g. no system text, no tools, last user block isn't text). The looser gate is intentional; Anthropic-native ignores unused betas. - Pin the wire shape for the `reasoning: undefined` + 4.6+ model corner. `resolveEffectiveEffort` returns undefined on `reasoning === undefined`, so `buildThinkingConfig` ships `{ type: 'adaptive' }` with no `output_config` and no `effort-2025-11-24` beta. If Anthropic ever starts requiring `output_config.effort` alongside adaptive, this test will fail at CI rather than at runtime as a server 400. * fix(core): gate cache-scope on Anthropic-native baseURL, mirror auth gate Follow-up on PR #4020 review: the `prompt-caching-scope-2026-01-05` beta header and the body-side `scope: 'global'` field together comprise an Anthropic-only wire-shape extension. Shipping them to non-Anthropic backends (DeepSeek, IdeaLab) leaned on "unknown betas are ignored" — true on Anthropic-native, but unverified for proxies and silently inconsistent with the auth/identity gate, which already uses `isAnthropicNativeBaseUrl` to bind Bearer / claude-cli / x-app to the proxy path only. - Add `useGlobalCacheScope` predicate on the generator. True iff `enableCacheControl !== false` AND the resolved baseURL is Anthropic-native. Plumbed per-request into both `convertGeminiRequestToAnthropic` and `convertGeminiToolsToAnthropic`; the same predicate also gates the `prompt-caching-scope-2026-01-05` beta in `buildPerRequestHeaders` so beta + scope field always travel together. - Converter emits `cache_control: { type: 'ephemeral' }` (per-session) when scope is off and `{ type: 'ephemeral', scope: 'global' }` when on. Non-Anthropic baseURLs go back to their pre-PR per-session caching shape; existing prompt caching keeps working with no new beta. - Document the intentional `scope: 'global'` omission on `addCacheControlToMessages`. The last user message changes every turn (live prompt + immediate tool_results), so cross-session reuse has effectively zero hit rate; cross-session caching is concentrated on system + tool prefixes only. Tests: - DeepSeek baseURL pins the proxy auth/identity path (`authToken` / claude-cli UA / `x-app: cli`). Documents the contract assumption that DeepSeek's anthropic-compatible endpoint accepts `Authorization: Bearer` — any future deviation surfaces here rather than at runtime for users. - Non-Anthropic baseURL strips the cache-scope beta AND `scope: 'global'` from the wire shape, while keeping per-session `cache_control: { type: 'ephemeral' }` on system / tools. - Hot-flip test extended to assert tool `cache_control` flips alongside system / user / beta header. - Converter-level tests for per-call `enableCacheControl` and `useGlobalCacheScope` overrides — both directions of the constructor default (true→false, false→true) and the scope-independent-of-source case (cache on, scope off → per-session shape). - baseConfig in the per-request anthropic-beta block now targets `api.anthropic.com` so cache-scope assertions remain meaningful; the proxy-baseURL behavior is covered separately. * docs(core): tighten useGlobalCacheScope JSDoc — baseUrl is NOT hot-mutated #4020 review: the JSDoc claimed `Config.setModel()` mutates both `enableCacheControl` AND `baseUrl` in place. Per the current Config implementation, only the qwen-oauth hot-update path mutates `enableCacheControl` in place; non-qwen-oauth providers go through the refresh path which recreates the ContentGenerator (so `baseUrl` is captured fresh at construct time, not mutated). Tightened the wording to reflect actual behavior + kept the read-both-each-request defense (cheap and avoids stale-cache surprises if the hot-update list ever expands). * fix(core)!: suppress env back-fill so proxy auth doesn't leak real Anthropic key #4020 review (tanzhenxin, severity high): the IdeaLab-proxy branch spread `{ authToken: <key> }` and omitted `apiKey` entirely. The Anthropic SDK constructor destructures with defaults (`apiKey = readEnv('ANTHROPIC_API_KEY') ?? null`), and destructuring defaults only fire for `undefined` — so an omitted `apiKey` lets `ANTHROPIC_API_KEY` back-fill it. The SDK's auth resolver then prefers `apiKey` over `authToken`, shipping `X-Api-Key` (not `Authorization: Bearer`) on the wire. Concrete impact: a user with `ANTHROPIC_API_KEY=sk-ant-…` exported (normal for anyone also running Claude Code in the same shell) configuring qwen-code with an IdeaLab proxy plus an IdeaLab token would leak their real Anthropic key as `X-Api-Key` to the third-party proxy endpoint. - Pass `apiKey: null` explicitly on the proxy branch and `authToken: null` on the Anthropic-native branch. Explicit `null` suppresses the destructuring default; the env back-fill no longer fires. - New helper `resolveEffectiveBaseUrl` mirrors the SDK's own destructuring order (config → `ANTHROPIC_BASE_URL` env → SDK default). `isAnthropicNativeBaseUrl` now consults the env too, so a user configuring the proxy purely through `ANTHROPIC_BASE_URL` (qwen-code `baseUrl` unset) gets the proxy identity bundle instead of silently shipping native auth + UA + cache-scope beta to the proxy. Tests: - ANTHROPIC_API_KEY env + proxy baseURL → ctor receives `apiKey: null` and `authToken: our-key`. Locks in the credential-leak fix. - ANTHROPIC_AUTH_TOKEN env + Anthropic-native baseURL → ctor receives `authToken: null` and `apiKey: our-key`. Symmetric guard for the inverse direction. - ANTHROPIC_BASE_URL env points to proxy, config.baseUrl unset → proxy identity bundle (claude-cli UA, x-app, Bearer auth) applies. - ANTHROPIC_BASE_URL unset → SDK default api.anthropic.com path keeps native identity (predicate doesn't misclassify the SDK default as a proxy). - config.baseUrl wins over ANTHROPIC_BASE_URL — mirrors the SDK's own resolution order. - Existing 7 identity tests updated from `toBeUndefined()` to `toBeNull()` to match the new explicit-suppression contract. * refactor(core): gate cache-scope beta on body presence, not just predicate #4020 review (Copilot): the comment promised "beta and body-side scope: 'global' field always ship together" but the gate was just `useGlobalCacheScope()`. In the degenerate case where the predicate is true but the request body has no system text AND no tools, the beta would still ship without any matching `cache_control.scope: 'global'` on the wire — overstating the contract and shipping dead weight. - New `hasGlobalCacheScopeOnWire(req)` scans the assembled request body (system block when shaped as `TextBlockParam[]`, tools array) for any `cache_control: { …, scope: 'global' }` entry. `buildPerRequestHeaders` gates the `prompt-caching-scope-2026-01-05` beta on this scan, so the beta and the body field share a single source of truth. No window between sampling the predicate and emitting the body where the two could diverge. - `useGlobalCacheScope()` is still sampled once per `buildRequest` and threaded into the converter to decide whether to ATTACH `scope: 'global'` to the body. The body-scan downstream then derives the beta from what actually landed. Tests: - New: empty systemInstruction + no tools + Anthropic-native + cache on → beta NOT shipped (degenerate body-scan case). - New: empty systemInstruction + non-empty tools → beta shipped (tool scope:'global' triggers the scan). - Existing per-request beta tests now include a `systemInstruction` so the body has the scope field; degenerate case is covered by the new dedicated test. Also tightened two stale comments (#3217834451, #3217834505) that claimed `Config.setModel()` mutates both `enableCacheControl` and `baseUrl` in place — only `enableCacheControl` is hot-mutated (qwen-oauth path); non-qwen-oauth providers recreate the generator on refresh, so `baseUrl` is captured fresh at construct time. Comments now describe the real in-place mutation and note the qwen-oauth boundary. * fix(core): trim config.baseUrl and document x-app in buildHeaders #4020 review (Copilot): two low-stakes follow-ups on 491a441. - `resolveEffectiveBaseUrl` trimmed `ANTHROPIC_BASE_URL` env but returned `contentGeneratorConfig.baseUrl` as-is. A copy-pasted baseURL with leading/trailing whitespace would trip `new URL(...)` in `isAnthropicNativeBaseUrl` and fall through the catch branch to the proxy identity bundle — meaning real api.anthropic.com would receive Bearer auth + claude-cli UA and 401. Apply the same trim() + empty-as-missing normalization on the config side. New regression test pins the contract with `' https://api.anthropic.com '`. - `buildHeaders` docstring said constructor headers carry only User-Agent + customHeaders (excluding anthropic-beta). The PR also added `x-app: cli` on the proxy path; updated the comment so a future maintainer reading the "no duplicate headers" rationale doesn't miss the x-app addition.
#4323) On the IdeaLab-style proxy branch, the Anthropic SDK is constructed with `authToken: <key>, apiKey: null` so it emits `Authorization: Bearer <key>` and suppresses the ANTHROPIC_API_KEY env back-fill (the #4020 leak fix). That covers IdeaLab and CherryStudio-style proxies, but standards- compliant Anthropic-compatible servers (OpenCode-Go, Claude proxy products) authenticate only on the canonical `x-api-key` header and reject the request with "Missing API key" even though the bearer token is present. Inject `x-api-key: <key>` into `defaultHeaders` on the proxy branch (post-`buildHeaders`, so customHeaders cannot override it). The value is the user's already-configured `apiKey` — never an env-resolved one — so the #4020 env-leak vector stays closed. The Anthropic-native branch is untouched: the SDK's apiKey path already emits the header, and duplicating it via defaultHeaders would risk stale-value drift. Verified: - new unit test pins `x-api-key: <key>` on every proxy-branch case (config-baseUrl, malformed baseUrl, DeepSeek anthropic-compat, ANTHROPIC_BASE_URL env-pointed-at-proxy); a negative test pins that the native branch does NOT add the header. - E2E: spun up a local `http.createServer`, pointed the SDK at it the same way `AnthropicContentGenerator` does, and dumped the captured wire headers — `Authorization: Bearer` and `x-api-key` both arrive alongside the existing X-Stainless-* / x-app / claude-cli UA trio. Fixes #4323
#4323) (#4342) * fix(core): set x-api-key alongside Authorization on Anthropic outbound (#4323) On the IdeaLab-style proxy branch, the Anthropic SDK is constructed with `authToken: <key>, apiKey: null` so it emits `Authorization: Bearer <key>` and suppresses the ANTHROPIC_API_KEY env back-fill (the #4020 leak fix). That covers IdeaLab and CherryStudio-style proxies, but standards- compliant Anthropic-compatible servers (OpenCode-Go, Claude proxy products) authenticate only on the canonical `x-api-key` header and reject the request with "Missing API key" even though the bearer token is present. Inject `x-api-key: <key>` into `defaultHeaders` on the proxy branch (post-`buildHeaders`, so customHeaders cannot override it). The value is the user's already-configured `apiKey` — never an env-resolved one — so the #4020 env-leak vector stays closed. The Anthropic-native branch is untouched: the SDK's apiKey path already emits the header, and duplicating it via defaultHeaders would risk stale-value drift. Verified: - new unit test pins `x-api-key: <key>` on every proxy-branch case (config-baseUrl, malformed baseUrl, DeepSeek anthropic-compat, ANTHROPIC_BASE_URL env-pointed-at-proxy); a negative test pins that the native branch does NOT add the header. - E2E: spun up a local `http.createServer`, pointed the SDK at it the same way `AnthropicContentGenerator` does, and dumped the captured wire headers — `Authorization: Bearer` and `x-api-key` both arrive alongside the existing X-Stainless-* / x-app / claude-cli UA trio. Fixes #4323 * fix(core): clarify x-api-key comment + cover guard branch & customHeaders ordering (#4323) Address review feedback on #4342: - Source comment claimed the apiKey value was "never an env-resolved one"; that's wrong — `resolveCredentialField` in content-generator-config.ts:178 falls through to env vars when the explicit and inherited values are unset. The security reasoning doesn't actually depend on that claim (the same value already ships as `Authorization: Bearer` via `authToken` on the same request), so re-anchor the comment on that fact and drop the misleading "never env-resolved" framing. - Add test pinning the `&& contentGeneratorConfig.apiKey` guard: a falsy apiKey on the proxy branch must NOT inject `x-api-key:` (empty string would otherwise ship a meaningless header). The TypeScript signature `apiKey?: string` keeps the guard needed at the type level, but a future loosen-the-type refactor would silently re-enable the empty ship; the test catches that. - Add test pinning the post-buildHeaders ordering: a user-supplied `customHeaders: { 'x-api-key': … }` must NOT win against the canonical key. The source comment promises this invariant but no test pinned it; a refactor that moved the injection above the customHeaders merge would silently let user config swap the auth header, defeating the dual-auth contract. Declined two suggestions: - Bot suggested extracting the 3-line injection into a `buildApiKeyHeader()` helper for consistency. Declined: adds indirection without abstraction win, and the inline form keeps the post-buildHeaders ordering visible at the call site (the ordering IS the invariant the comment promises). - Bot suggested asserting `Authorization` is absent from `defaultHeaders` on the native path. Declined: the constructor-options pins (`apiKey: 'test-key'`, `authToken: null`) already document the SDK-driven auth mode; asserting on the absence of a header we never set in defaultHeaders is redundant given the existing assertions. 68 tests pass (66 + 2 new). tsc + eslint clean.
… outbound (#4323) (#4342)" (#4385) This reverts commit 4b25f9c. PR #4342 unconditionally injected `x-api-key` alongside `Authorization: Bearer` on every proxy-branch request. This broke IdeaLab-style proxies (e.g. `idealab.alibaba-inc.com/api/anthropic`) which reject requests carrying both headers with HTTP 401: 鉴权header, x-api-key和Authorization不可以同时存在 (auth header: x-api-key and Authorization cannot coexist) The two proxy families have mutually exclusive header contracts — OpenCode-Go-style servers want `x-api-key` only, IdeaLab-style servers want `Authorization` only — so a one-size-fits-all default cannot satisfy both at once. Reverting restores the pre-#4342 default (Bearer only) so IdeaLab users are unblocked. OpenCode-Go-style users can opt in via `customHeaders`: { "customHeaders": { "x-api-key": "<key>" } } `buildHeaders` already merges customHeaders into defaultHeaders (only `anthropic-beta` is reserved for per-request handling), and on the proxy branch the SDK is constructed with `apiKey: null` so it does not emit its own `x-api-key` — the value on the wire comes solely from the user's explicit customHeaders entry, preserving the #4020 env-leak guard. Reopens #4323.
… outbound (#4323) (#4342)" (#4385) This reverts commit 4b25f9c. PR #4342 unconditionally injected `x-api-key` alongside `Authorization: Bearer` on every proxy-branch request. This broke IdeaLab-style proxies (e.g. `idealab.alibaba-inc.com/api/anthropic`) which reject requests carrying both headers with HTTP 401: 鉴权header, x-api-key和Authorization不可以同时存在 (auth header: x-api-key and Authorization cannot coexist) The two proxy families have mutually exclusive header contracts — OpenCode-Go-style servers want `x-api-key` only, IdeaLab-style servers want `Authorization` only — so a one-size-fits-all default cannot satisfy both at once. Reverting restores the pre-#4342 default (Bearer only) so IdeaLab users are unblocked. OpenCode-Go-style users can opt in via `customHeaders`: { "customHeaders": { "x-api-key": "<key>" } } `buildHeaders` already merges customHeaders into defaultHeaders (only `anthropic-beta` is reserved for per-request handling), and on the proxy branch the SDK is constructed with `apiKey: null` so it does not emit its own `x-api-key` — the value on the wire comes solely from the user's explicit customHeaders entry, preserving the #4020 env-leak guard. Reopens #4323.
…ompatibility Restore the x-api-key header alongside Authorization: Bearer for non-Anthropic-native baseURLs. Some Anthropic-compatible servers (OpenCode-Go, Claude proxy products — see #4323) authenticate only on the canonical x-api-key header, not Authorization: Bearer. The dual-header approach ships both authentication shapes side-by-side so either family of proxies accepts us. This doesn't widen the credential leak surface from #4020 because contentGeneratorConfig.apiKey is already shipped as Authorization: Bearer via authToken on the same request. Resolves review thread: PRRT_kwDOPB-92c6EdXgl
Summary
Improve compatibility with IdeaLab-style Anthropic-compatible proxies and enable cross-session prompt caching — while keeping the direct
api.anthropic.compath on its existing behavior.Proxy identity (non-Anthropic-native baseURLs only)
For baseURLs that are not
*.anthropic.com:authTokenso requests carryAuthorization: Bearer <key>instead of the SDK-defaultx-api-key. This avoids the dual-header conflict that IdeaLab-style proxies reject.User-Agent: claude-cli/<version> (external, cli)andx-app: clito satisfy proxy Team rules that gate by client identity.For Anthropic-native baseURLs (
api.anthropic.com/ unset):apiKey/x-api-keyauth (noBearerflip → no 401 against the real Anthropic API).User-Agent: QwenCode/<version> (<platform>; <arch>), nox-app, so usage isn't misattributed in Anthropic's logs/quotas.Adaptive thinking for Claude 4.6+
thinking: { type: 'adaptive' }forclaude-(opus|sonnet|haiku)-<major>-<minor>wheremajor > 4 || (major === 4 && minor >= 6). Numeric compare (not a[6-9]character class) so haiku,4-10+, and a future5-xaren't silently dropped onto the budget path and tripping HTTP 400.{ type: 'enabled', budget_tokens }ladder.reasoning.budget_tokensis honored before the adaptive branch — adaptive omits the budget field entirely, so checking it second would silently discard the user-supplied escape-hatch budget on 4.6+ models.Cross-session prompt caching
enableCacheControl !== false, sendanthropic-beta: prompt-caching-scope-2026-01-05and attachcache_control: { type: 'ephemeral', scope: 'global' }on both the system text block and the last tool entry. Tools tend to be the largest, slowest-changing prefix, so cross-session reuse there is where most of the hit-rate improvement shows up.enableCacheControldrops the beta header along with the body-sidecache_control, so backends that don't recognize the flag don't 4xx — and thebetas.length === 0early-return stays meaningful.AnthropicCacheControl/AnthropicTextBlockParamsoscope: 'global'is part of the static type instead of a misleadingas { type: 'ephemeral' }cast that erased it from the type while leaving it on the wire.Test plan
claude-opus-4-7through an IdeaLab-style proxy (no 401/400)api.anthropic.comrequest keepsx-api-key+ QwenCode UA + nox-appapi.deepseek.com/anthropic) unaffected — existing thinking-injection workaround still triggers;effort: 'max'still passes through unclampedenableCacheControl: falsedoes not ship the cache-scope beta; reasoning-only requests still sendinterleaved-thinking/effortbetasopus-4-6/sonnet-4-6/opus-4-7/haiku-4-6/opus-4-10/opus-5-1; budget path retained foropus-4-5reasoning.budget_tokensoverrides adaptive on 4.6+ modelspackages/coretest suite green (89 generator/converter tests, 7541 across the package)