fix(core): inject thinking blocks for DeepSeek anthropic-compatible provider#3788
Conversation
…rovider DeepSeek's anthropic-compatible endpoint (https://api.deepseek.com/anthropic) rejects follow-up requests with HTTP 400 ("The content[].thinking in the thinking mode must be passed back to the API.") whenever a prior assistant turn carrying tool_use omits a thinking block. The model can legitimately return a tool round without thinking text, so qwen-code stored no thought parts and rebuilt the next request with no thinking block, tripping the API's check. Mirroring the existing OpenAI-side fix (#3729, #3747), the converter now detects DeepSeek by base URL or model name and prepends an empty { type: 'thinking', thinking: '', signature: '' } block to assistant turns missing one. Other anthropic-protocol providers are unaffected. Verified against the live api.deepseek.com/anthropic endpoint: - assistant with tool_use, no thinking → 400 (reproduces #3786) - assistant with tool_use, empty thinking injected → 200 OK Refs #3786
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
wenshao
left a comment
There was a problem hiding this comment.
wenshao
left a comment
There was a problem hiding this comment.
2 条建议,无阻塞项。整体实现方向正确,测试覆盖良好。
| return true; | ||
| } | ||
| const model = (contentGeneratorConfig.model ?? '').toLowerCase(); | ||
| return model.includes('deepseek'); |
There was a problem hiding this comment.
[Suggestion] model.includes('deepseek') 匹配范围过宽——任何模型名包含 "deepseek" 都会触发 thinking 块注入,包括指向非 DeepSeek 后端的模型(如 my-deepseek-clone),可能导致非目标 API 返回 400。反之,如果 DeepSeek 发布不含 "deepseek" 前缀的模型,检测会静默失败。
建议使用更精确的匹配(如 model.startsWith('deepseek'))、维护已知前缀集合,或提供显式配置覆盖项(如 forceAssistantThinking)。
— deepseek-v4-pro via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
No review findings. Downgraded from Approve to Comment: self-PR; CI failing: CodeQL. — gpt-5.5 via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
Request changes: 空 thinking 块注入未与 thinking 模式联动,在 includeThoughts: false 的生产代码路径(suggestionGenerator/ArenaManager/forkedAgent)上会导致 HTTP 400。
Address PR review feedback:
1. (Critical) Gate empty-thinking injection on the same per-request
condition that emits the top-level `thinking` parameter. The previous
implementation injected unconditionally on DeepSeek providers, but
`buildThinkingConfig()` may omit `thinking` when reasoning=false or
`thinkingConfig.includeThoughts=false` — which is exactly what
suggestionGenerator / ArenaManager / forkedAgent do. Shipping
thinking blocks without enabling thinking mode is a protocol
violation that DeepSeek may reject. Move the option from converter
constructor to a per-request `convertGeminiRequestToAnthropic`
parameter so the generator can compute the gate correctly.
2. (CodeQL) Replace `baseUrl.includes('api.deepseek.com')` with
`new URL(baseUrl).hostname` exact-match. The substring check would
accept spoofed hosts like `api.deepseek.com.evil.com`.
3. Document the empty-signature workaround inline.
4. Rename the misleading "redacted_thinking" test case.
Review feedback addressed (commit 2212b18)
新增 4 个测试(gating × 2、URL 欺骗 × 1、重命名 × 1),共 57 个测试通过。 实测验证:DeepSeek 当前对 "thinking-off + 消息含 thinking 块" 这一组合宽容(200 OK),但门控仍是协议正确的改动,避免未来 DeepSeek 收紧校验时再回归。 |
There was a problem hiding this comment.
Pull request overview
This PR adds a DeepSeek-specific workaround in the Anthropic content generator so replayed assistant history can include synthetic empty thinking blocks when required by DeepSeek's anthropic-compatible endpoint. It fits into the provider-normalization layer that already adapts request/response payloads for vendor-specific protocol quirks.
Changes:
- Added an optional converter path to inject empty assistant
thinkingblocks during Gemini→Anthropic request conversion. - Added DeepSeek provider detection in
AnthropicContentGeneratorand gated the workaround on requests that actually enable Anthropic thinking mode. - Added unit tests covering converter injection behavior and DeepSeek-specific request building.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
packages/core/src/core/anthropicContentGenerator/converter.ts |
Adds request-conversion option and helper to prepend synthetic empty thinking blocks to assistant messages. |
packages/core/src/core/anthropicContentGenerator/converter.test.ts |
Adds unit tests for the new assistant-thinking injection behavior in the converter. |
packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.ts |
Detects DeepSeek Anthropic-compatible providers and enables the converter workaround only when thinking mode is active. |
packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.test.ts |
Adds request-building tests for DeepSeek detection and the thinking-enabled/disabled gating logic. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wenshao
left a comment
There was a problem hiding this comment.
No review findings. Downgraded from Approve to Comment: self-PR. — gpt-5.5 via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
No review findings.
5 forward-looking observations are recorded in the local review report (.claude/reviews/2026-05-02-180000-pr-3788.md) but not posted as inline comments — none indicate a current-behavior bug, all are low-confidence suggestions for future hardening (streaming-path test gap; trailing-dot FQDN bypass in URL detection; subdomain endsWith branch untested; missing debug log on detection; cached isDeepSeekProvider staleness via raw-override Config.setModel). Prior critical review feedback (gating on thinking mode, URL hostname strict-match, comment hygiene, test name accuracy) is fully addressed in commit 2212b18.
CI: all 13 checks green. Tests: 57/57 pass. — claude-opus-4-7 via Claude Code /qreview
wenshao
left a comment
There was a problem hiding this comment.
No Critical issues. 3 Suggestions for consideration:
-
[Suggestion] Empty thinking blocks silently dropped by response converter, causing perpetual re-injection —
convertAnthropicResponseToGeminidrops empty thinking blocks (if (thinking || signature)— both''are falsy), so the next request re-injects them. Functionally correct but creates an undocumented invariant between the response-side guard andapplyEmptyThinkingToAssistantMessages. Suggestion: add a cross-reference comment at the response-side guard. -
[Suggestion]
redacted_thinkinground-trip losesdatafield, bypasses injection — When the API returns aredacted_thinkingblock, the response converter discards thedatafield and stores{ text: '', thought: true }. On the next request,processContentproduces{ type: 'thinking', thinking: '' }with nosignaturekey at all (not even empty).hasThinkingsees athinkingblock and skips injection. If DeepSeek requires thesignaturefield in schema validation, this would still trigger HTTP 400. The existinghasThinkingcheck could be tightened to verify thesignaturefield exists, not just block type.
— glm-5.1 via Qwen Code /review
…domain test Address PR review round 2: 1. Narrow injection scope to assistant turns containing tool_use. Live verification against api.deepseek.com/anthropic showed plain-text assistant turns without thinking are accepted unchanged — only tool_use turns trigger the HTTP 400. Injecting on every assistant turn unnecessarily bloats replay history with synthetic blocks the API does not require. Existing thinking blocks on any turn are still preserved untouched. 2. Add test coverage for the subdomain hostname branch (us.api.deepseek.com → matches), addressing the gap noted in review. 3. Update existing negative-case tests (non-deepseek / spoofed / reasoning=false / includeThoughts=false) to use tool_use scenarios so they actually exercise the gating logic instead of trivially passing under the narrowed scope.
Review feedback round 2 addressed (commit 8721b41)
附带修整:原先的 negative-case 测试(non-deepseek / 欺骗主机 / reasoning=false / includeThoughts=false)都用纯文本 assistant,缩窄后纯文本本就不注入,那些测试会变成"自然通过"而不真正触及 gating。已全部改为 tool_use 场景,确保 gate 真的被验。 58 个单测通过;lint + typecheck 干净。 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…scope Address PR review round 3 (copilot-pull-request-reviewer × 3): comments in three locations still described the constraint as applying to "any prior assistant turn", which was true before commit 8721b41 but no longer matches the implementation. Update the doc comment on isDeepSeekAnthropicProvider and the two test-suite header comments to state the actual narrower contract: the API rejects only tool-use turns that omit thinking blocks; plain-text assistant turns are accepted unchanged. Comment-only change; 58 tests still pass.
Review feedback round 3 addressed (commit 487066a)3 个 copilot 评论指出三处注释还停留在缩窄前的语义("any prior assistant turn"),与缩窄后的实现("tool-use turns only")不一致。
仅注释改动,58 个单测通过。 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address PR review round 4 (copilot-pull-request-reviewer × 2): 1. Stale provider-detection cache (HCia). The constructor cached isDeepSeekProvider once, but Config.setModel() mutates contentGeneratorConfig.model in place. After a runtime /model switch from a non-DeepSeek model to a DeepSeek one on the same auth config, buildRequest() would keep using the stale flag. Move the detection into buildRequest so each call sees the current model. The detector is cheap (URL parse + string compare). 2. Real thought parts leak through when thinking is disabled (HCib). The previous gate only blocked synthetic injection — but the converter still replayed any existing `thought: true` parts in request.contents as thinking blocks. Code paths that disable thinking against a session whose history was built with thinking on (suggestionGenerator / ArenaManager / forkedAgent) would still emit thinking blocks alongside an absent top-level `thinking` config — the same protocol mismatch the gate was meant to avoid. Add a `stripAssistantThinking` converter option, set in buildRequest to `isDeepSeek && !thinking`. The converter strips thinking and redacted_thinking blocks from assistant messages before message construction completes. Mirror behavior is already proven safe by live verification (DeepSeek currently tolerates either shape, but stripping makes the request body internally consistent and robust to future validation tightening). 3 new tests: - converter strips thinking from assistant turns when option set - generator strips real thought parts when reasoning=false - generator reflects runtime model changes (no stale cache) 61 tests pass; lint + typecheck clean.
Review feedback round 4 addressed (commit ae1aae5)
3 个新测试:
61 个单测通过;lint + typecheck 干净。 |
|
I took another pass over the latest revision and the new Copilot comments. The main direction looks much safer now: the thinking-mode gating, Before merging, I think one small follow-up pass would still be useful:
None of these look like large changes, but they would close the remaining risk around this PR. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…existence test
Address PR review round 12 (copilot-pull-request-reviewer × 2):
1. (H6ws) The single `ensureThinkingOnToolUseTurns` option was
misleadingly narrow: the implementation also rewrote non-tool-use
turns by normalizing malformed thinking blocks. Future callers
could enable it expecting only the tool-use behavior. Split into
two precisely-named options:
- normalizeAssistantThinkingSignature: fill missing `signature`
on every assistant `thinking` block (cross-provider history
compat).
- injectThinkingOnToolUseTurns: prepend synthetic empty thinking
on tool_use turns missing one (issue #3786 trigger).
The generator wires both together for DeepSeek when thinking mode
is on; either can be used independently if a future caller needs
only one pass.
2. (H6w4) Add a test asserting that the per-request `headers` path
coexists correctly with `customHeaders`: User-Agent and unrelated
customHeaders entries stay in `defaultHeaders` while only the
computed `anthropic-beta` rides on the per-request path. Defends
against a future regression where header config might be routed
through a code path that wipes the constructor defaults.
72 tests pass; lint + typecheck clean.
Review feedback round 12 addressed (commit 4b81d67)
72 单测通过;lint + typecheck 干净。 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address yiliang114 review feedback (#3788). HTTP header names are case-insensitive by spec, and the Anthropic SDK lower-cases them during merge. Previously buildPerRequestHeaders only read the lower-case `anthropic-beta` key from customHeaders, so a user-configured `Anthropic-Beta` or `ANTHROPIC-BETA` would be silently overwritten by the per-request computed value. Replace the direct dict lookup with collectCustomBetaFlags() which walks all customHeaders entries and matches the key case-insensitively. Multiple matching entries (unlikely but possible) are concatenated; the existing dedupe pass handles any duplicates. Add a regression test for both `Anthropic-Beta` and `ANTHROPIC-BETA` key shapes. 73 tests pass; lint + typecheck clean.
|
@yiliang114 Thanks for the second pass — addressed in commit 0d8b5de:
73 single-package tests pass; lint + typecheck clean. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Round 13: copilot raised the case-insensitive customHeaders[anthropic-beta] concern, but it was based on commit 4b81d67 — the very next commit (0d8b5de) had already fixed it via the case-insensitive |
…antics + redacted_thinking strip test Address PR review round 14 (copilot-pull-request-reviewer × 4): 1. (IAl4) PR description still described "dropped here so synthetic injection takes over" but the implementation now normalizes signature-less thinking blocks in place (preserving text). PR description rewritten to describe the two-pass model: normalize-in-place + injection-when-truly-missing. 2. (IAl7) `injectThinkingOnToolUseTurns` option docstring claimed signature-less blocks would be "seen as missing" so the synthetic replaces them. Updated to describe the actual flow: the normalization pass runs first, blocks become compliant in place, the injector then sees them as already-satisfying and prepends nothing. Helper JSDoc on `injectEmptyThinkingOnToolUseTurns` fixed the same way. 3. (IAl8) Strip-path coverage missed `redacted_thinking` blocks. Added regression test that verifies both thinking and redacted_thinking blocks are removed when `stripAssistantThinking` is set. 4. (IAl-) Renamed the converter test suite from "thinking-mode injection + normalization (DeepSeek thinking on)" to "DeepSeek thinking-mode normalization, injection, and stripping" so the title accurately covers all behavior the block exercises (including `stripAssistantThinking` cases later in the same describe). 74 tests pass; lint + typecheck clean.
Review feedback round 14 addressed (commit f139e1a)
74 单测通过;lint + typecheck 干净。 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * the provider by base URL hostname or model name so the converter can inject | ||
| * empty thinking blocks on the affected turns. | ||
| * https://github.com/QwenLM/qwen-code/issues/3786 | ||
| */ | ||
| function isDeepSeekAnthropicProvider( | ||
| contentGeneratorConfig: ContentGeneratorConfig, | ||
| ): boolean { | ||
| const baseUrl = contentGeneratorConfig.baseUrl ?? ''; | ||
| if (baseUrl) { | ||
| try { | ||
| const hostname = new URL(baseUrl).hostname.toLowerCase(); | ||
| if ( | ||
| hostname === 'api.deepseek.com' || | ||
| hostname.endsWith('.api.deepseek.com') | ||
| ) { | ||
| return true; | ||
| } | ||
| } catch { | ||
| // Invalid URL — fall through to model-name detection. | ||
| } | ||
| } | ||
| const model = (contentGeneratorConfig.model ?? '').toLowerCase(); | ||
| return model.includes('deepseek'); |
…id wire duplication Address PR review round 15 (copilot-pull-request-reviewer #1). `buildHeaders()` previously spread the entire `customHeaders` map into the SDK's `defaultHeaders`. After moving anthropic-beta computation to the per-request path, a user-configured mixed-case `Anthropic-Beta` key would survive in defaultHeaders verbatim, while the per-request override added a lowercase `anthropic-beta`. The wire then carried two physical headers for the same logical name — SDK behavior on duplicate headers with different casings is undefined. `buildPerRequestHeaders()` already merges those user flags case-insensitively (commit 0d8b5de), so dropping the entry from defaultHeaders is the right boundary: the per-request path owns the header end-to-end. Other customHeaders entries continue to pass through. Add a regression test asserting no `Anthropic-Beta` (any casing) lands in defaultHeaders while unrelated customHeaders are kept. 75 tests pass; lint + typecheck clean.
Review feedback round 15 addressed (commit f17fcfc)
75 单测通过;lint + typecheck 干净。 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
yiliang114
left a comment
There was a problem hiding this comment.
Re-reviewed the latest revision. The previous concerns look addressed, and the focused Anthropic generator/converter tests pass locally.
No remaining blockers from my side. Approved pending CI.
…rovider (#3788) * fix(core): inject thinking blocks for DeepSeek anthropic-compatible provider DeepSeek's anthropic-compatible endpoint (https://api.deepseek.com/anthropic) rejects follow-up requests with HTTP 400 ("The content[].thinking in the thinking mode must be passed back to the API.") whenever a prior assistant turn carrying tool_use omits a thinking block. The model can legitimately return a tool round without thinking text, so qwen-code stored no thought parts and rebuilt the next request with no thinking block, tripping the API's check. Mirroring the existing OpenAI-side fix (#3729, #3747), the converter now detects DeepSeek by base URL or model name and prepends an empty { type: 'thinking', thinking: '', signature: '' } block to assistant turns missing one. Other anthropic-protocol providers are unaffected. Verified against the live api.deepseek.com/anthropic endpoint: - assistant with tool_use, no thinking → 400 (reproduces #3786) - assistant with tool_use, empty thinking injected → 200 OK Refs #3786 * fix(core): gate DeepSeek thinking-block injection on thinking mode Address PR review feedback: 1. (Critical) Gate empty-thinking injection on the same per-request condition that emits the top-level `thinking` parameter. The previous implementation injected unconditionally on DeepSeek providers, but `buildThinkingConfig()` may omit `thinking` when reasoning=false or `thinkingConfig.includeThoughts=false` — which is exactly what suggestionGenerator / ArenaManager / forkedAgent do. Shipping thinking blocks without enabling thinking mode is a protocol violation that DeepSeek may reject. Move the option from converter constructor to a per-request `convertGeminiRequestToAnthropic` parameter so the generator can compute the gate correctly. 2. (CodeQL) Replace `baseUrl.includes('api.deepseek.com')` with `new URL(baseUrl).hostname` exact-match. The substring check would accept spoofed hosts like `api.deepseek.com.evil.com`. 3. Document the empty-signature workaround inline. 4. Rename the misleading "redacted_thinking" test case. * fix(core): narrow DeepSeek thinking injection to tool_use turns + subdomain test Address PR review round 2: 1. Narrow injection scope to assistant turns containing tool_use. Live verification against api.deepseek.com/anthropic showed plain-text assistant turns without thinking are accepted unchanged — only tool_use turns trigger the HTTP 400. Injecting on every assistant turn unnecessarily bloats replay history with synthetic blocks the API does not require. Existing thinking blocks on any turn are still preserved untouched. 2. Add test coverage for the subdomain hostname branch (us.api.deepseek.com → matches), addressing the gap noted in review. 3. Update existing negative-case tests (non-deepseek / spoofed / reasoning=false / includeThoughts=false) to use tool_use scenarios so they actually exercise the gating logic instead of trivially passing under the narrowed scope. * docs(core): align DeepSeek thinking-injection comments with narrowed scope Address PR review round 3 (copilot-pull-request-reviewer × 3): comments in three locations still described the constraint as applying to "any prior assistant turn", which was true before commit 8721b41 but no longer matches the implementation. Update the doc comment on isDeepSeekAnthropicProvider and the two test-suite header comments to state the actual narrower contract: the API rejects only tool-use turns that omit thinking blocks; plain-text assistant turns are accepted unchanged. Comment-only change; 58 tests still pass. * fix(core): per-request DeepSeek detection + strip thinking when off Address PR review round 4 (copilot-pull-request-reviewer × 2): 1. Stale provider-detection cache (HCia). The constructor cached isDeepSeekProvider once, but Config.setModel() mutates contentGeneratorConfig.model in place. After a runtime /model switch from a non-DeepSeek model to a DeepSeek one on the same auth config, buildRequest() would keep using the stale flag. Move the detection into buildRequest so each call sees the current model. The detector is cheap (URL parse + string compare). 2. Real thought parts leak through when thinking is disabled (HCib). The previous gate only blocked synthetic injection — but the converter still replayed any existing `thought: true` parts in request.contents as thinking blocks. Code paths that disable thinking against a session whose history was built with thinking on (suggestionGenerator / ArenaManager / forkedAgent) would still emit thinking blocks alongside an absent top-level `thinking` config — the same protocol mismatch the gate was meant to avoid. Add a `stripAssistantThinking` converter option, set in buildRequest to `isDeepSeek && !thinking`. The converter strips thinking and redacted_thinking blocks from assistant messages before message construction completes. Mirror behavior is already proven safe by live verification (DeepSeek currently tolerates either shape, but stripping makes the request body internally consistent and robust to future validation tightening). 3 new tests: - converter strips thinking from assistant turns when option set - generator strips real thought parts when reasoning=false - generator reflects runtime model changes (no stale cache) 61 tests pass; lint + typecheck clean. * fix(core): preserve thinking-only assistant turns instead of emitting empty content Address PR review round 5 (copilot-pull-request-reviewer × 2 — code + test): stripThinkingFromAssistantMessages previously replaced message.content with the filtered array unconditionally. For an assistant turn whose only blocks are thinking/redacted_thinking (e.g. a round cut off by max_tokens before any text or tool_use was emitted), this left `content: []` — which Anthropic API rejects. Dropping the message entirely was considered but would break the required user/assistant alternation. Instead, fall back to leaving the original blocks in place when stripping would empty the message. DeepSeek empirically tolerates the residual `thinking-block + no-thinking-config` shape (verified against api.deepseek.com/anthropic in the V2/X scenarios), so leaving the message untouched is the safer choice than emitting invalid structure. Add regression test for the thinking-only turn shape. 62 tests pass; lint + typecheck clean. * fix(core): validate thinking-block signature, rename option, gate output_config Address PR review round 6 — five substantive items: 1. (Critical) Drop non-compliant thinking blocks lacking a `signature` field and replace them with a synthetic one. A `redacted_thinking` block round-tripped through Gemini Part format becomes `{ text: '', thought: true }` (no thoughtSignature) and converts back to `{ type: 'thinking', thinking: '' }` without `signature` — not spec-compliant. The previous `hasThinking` check accepted these as already-satisfying, leaving non-compliant blocks in the wire message. Tighten the check so they're filtered out and the synthetic injection runs. Live verification: DeepSeek currently tolerates both shapes (lenient), but normalizing is defensively correct against future tightening. 2. Rename converter option `ensureAssistantThinking` → `ensureThinkingOnToolUseTurns`. The new name reflects the actual contract (tool-use turns only, not every assistant turn). 3. Honor `thinkingConfig.includeThoughts: false` in `buildOutputConfig`. Previously a per-request opt-out dropped the top-level `thinking` parameter but still emitted `output_config.effort`, leaking a reasoning-shaped field into side queries that don't want it. 4. Add regression test for mixed text + tool_use assistant turns (common shape: model says something, then calls a tool). 5. Add explicit test for the signature-validation path: an existing compliant thinking block (with signature) is preserved untouched. 64 tests pass; lint + typecheck clean. * fix(core): clean up non-compliant thinking blocks on plain-text turns + assert output_config gating Address PR review round 7 (copilot-pull-request-reviewer × 2): 1. (Hp-x) Round-tripped redacted_thinking blocks were left malformed on assistant turns lacking tool_use. The previous structure only ran the cleanup pass when a tool_use block was present (early return on `!hasToolUse`), so plain-text turns kept the non-compliant `{ type: 'thinking', thinking: '' }` shape. Restructure into two sequential steps: a. Drop non-compliant thinking blocks (no `signature`) on every assistant turn — same fallback that avoids `content: []` if the message is thinking-only. b. Inject the synthetic empty thinking block on tool_use turns that still lack a compliant thinking block after step (a). 2. (Hp-7) The includeThoughts=false test asserted that the top-level `thinking` field is suppressed but didn't cover `output_config`, leaving regressions in the new `buildOutputConfig` gate uncaught. Tighten the assertion to also verify `output_config` is absent. 3. New converter test: cleanup runs on plain-text assistant turns too. 65 tests pass; lint + typecheck clean. * test(core): add explicit redacted_thinking injection-path coverage Address PR review round 8 (#30 — copilot reviewer). The converter treats `redacted_thinking` as already satisfying the thinking-block requirement (no synthetic injected), distinguished from a signature-less `thinking` block which is non-compliant and gets dropped/replaced. Existing tests covered the latter path; this adds explicit coverage of the former. processContent doesn't synthesize redacted_thinking from Gemini parts, so the test reaches into the private helper directly. (#31 — subdomain hostname coverage — already exists at line 602.) 66 tests pass; lint + typecheck clean. * fix(core): per-request anthropic-beta + normalize thinking-only turns Address PR review round 9 (copilot-pull-request-reviewer × 2): 1. (Hydz) thinking-only assistant turns (e.g. max_tokens cutoff or round-tripped redacted_thinking) hit the cleanup-empties fallback and kept the original non-compliant `{ type: 'thinking', thinking: '' }` block. The fallback now replaces the message with a synthetic empty thinking block (`signature: ''` included), which keeps the message non-empty AND spec-compliant. 2. (Hyd4) `anthropic-beta` was set once at construction from the global `reasoning` config, so requests with per-request `thinkingConfig.includeThoughts=false` still advertised interleaved-thinking / effort even though the body had dropped the matching fields. Move beta computation to a new `buildPerRequestHeaders` that derives the header from the actual `thinking` / `output_config` fields present in the request body, and pass it via `messages.create(..., { headers })`. The wire shape is now internally consistent. Test updates: - Drop the three constructor-time beta assertions; they no longer apply. - Add four per-request header tests covering: both betas present, only interleaved-thinking, reasoning=false (no betas), and per-request includeThoughts=false (no betas). 67 tests pass; lint + typecheck clean. * fix(core): preserve thinking text by normalizing in place + merge user beta flags Address PR review round 10 (copilot-pull-request-reviewer × 2): 1. (H0oF) The previous cleanup filtered out every thinking block missing a `signature` field. But that shape is the normal output from OpenAI/Gemini/agent-runtime generators, which only set `thought: true` without a signature. Users switching providers mid-session would silently lose preserved thinking text on the first DeepSeek request. Change Step 1 to NORMALIZE in place: when a thinking block has no signature, set `signature: ''` rather than dropping the block. The original `thinking` text is preserved; DeepSeek empirically accepts empty signatures so the wire shape stays valid. 2. (H0oL) `buildPerRequestHeaders()` overwrote `customHeaders['anthropic-beta']` whenever the per-request override fired, regressing the customHeaders escape hatch for unrelated Anthropic beta features. Merge the user's flags into the computed list (deduped) so users can stack their own betas alongside interleaved-thinking / effort. Test changes: - Renamed and rewrote "drops non-compliant... plain-text" test to assert in-place normalization that preserves thinking text. - Updated "replaces a non-compliant thinking block" comment + name to describe the normalization (the assertion was already correct because the test happened to use empty thinking text). - The empty-content fallback in Step 1 is no longer reachable under the new logic, so the dedicated thinking-only-turn test now exercises only the strip path (where it remains relevant). - Added 3 customHeaders[anthropic-beta] tests: merge with computed, passthrough when no thinking/effort, dedupe. 70 tests pass; lint + typecheck clean. * docs(core): align thinking-injection comments with normalize semantics + add stream test Address PR review round 11 (copilot-pull-request-reviewer × 3): 1. (H3Iw) Update the `ensureThinkingOnToolUseTurns` option docstring to describe in-place normalization (preserving thinking text by filling in `signature: ''`) instead of the old drop-and-replace semantics. 2. (H3I9) Same update on the `applyEmptyThinkingToToolUseTurns` helper JSDoc — clarify that signature-less thinking blocks are normalized in place (preserving original text), not dropped. Mention the common case of cross-provider history where non-Anthropic generators only set `thought: true`. 3. (H3I3) Add a streaming test asserting that `generateContentStream()` also attaches the per-request `anthropic-beta` header. The previous coverage only exercised `generateContent()`, leaving the streaming path's separate code path (line 144 in anthropicContentGenerator.ts) unverified. 71 tests pass; lint + typecheck clean. * refactor(core): split DeepSeek thinking option in two + add header coexistence test Address PR review round 12 (copilot-pull-request-reviewer × 2): 1. (H6ws) The single `ensureThinkingOnToolUseTurns` option was misleadingly narrow: the implementation also rewrote non-tool-use turns by normalizing malformed thinking blocks. Future callers could enable it expecting only the tool-use behavior. Split into two precisely-named options: - normalizeAssistantThinkingSignature: fill missing `signature` on every assistant `thinking` block (cross-provider history compat). - injectThinkingOnToolUseTurns: prepend synthetic empty thinking on tool_use turns missing one (issue #3786 trigger). The generator wires both together for DeepSeek when thinking mode is on; either can be used independently if a future caller needs only one pass. 2. (H6w4) Add a test asserting that the per-request `headers` path coexists correctly with `customHeaders`: User-Agent and unrelated customHeaders entries stay in `defaultHeaders` while only the computed `anthropic-beta` rides on the per-request path. Defends against a future regression where header config might be routed through a code path that wipes the constructor defaults. 72 tests pass; lint + typecheck clean. * fix(core): case-insensitive customHeaders[anthropic-beta] merge Address yiliang114 review feedback (#3788). HTTP header names are case-insensitive by spec, and the Anthropic SDK lower-cases them during merge. Previously buildPerRequestHeaders only read the lower-case `anthropic-beta` key from customHeaders, so a user-configured `Anthropic-Beta` or `ANTHROPIC-BETA` would be silently overwritten by the per-request computed value. Replace the direct dict lookup with collectCustomBetaFlags() which walks all customHeaders entries and matches the key case-insensitively. Multiple matching entries (unlikely but possible) are concatenated; the existing dedupe pass handles any duplicates. Add a regression test for both `Anthropic-Beta` and `ANTHROPIC-BETA` key shapes. 73 tests pass; lint + typecheck clean. * docs(core): align thinking-injection docs with normalize-in-place semantics + redacted_thinking strip test Address PR review round 14 (copilot-pull-request-reviewer × 4): 1. (IAl4) PR description still described "dropped here so synthetic injection takes over" but the implementation now normalizes signature-less thinking blocks in place (preserving text). PR description rewritten to describe the two-pass model: normalize-in-place + injection-when-truly-missing. 2. (IAl7) `injectThinkingOnToolUseTurns` option docstring claimed signature-less blocks would be "seen as missing" so the synthetic replaces them. Updated to describe the actual flow: the normalization pass runs first, blocks become compliant in place, the injector then sees them as already-satisfying and prepends nothing. Helper JSDoc on `injectEmptyThinkingOnToolUseTurns` fixed the same way. 3. (IAl8) Strip-path coverage missed `redacted_thinking` blocks. Added regression test that verifies both thinking and redacted_thinking blocks are removed when `stripAssistantThinking` is set. 4. (IAl-) Renamed the converter test suite from "thinking-mode injection + normalization (DeepSeek thinking on)" to "DeepSeek thinking-mode normalization, injection, and stripping" so the title accurately covers all behavior the block exercises (including `stripAssistantThinking` cases later in the same describe). 74 tests pass; lint + typecheck clean. * fix(core): exclude anthropic-beta variants from defaultHeaders to avoid wire duplication Address PR review round 15 (copilot-pull-request-reviewer #1). `buildHeaders()` previously spread the entire `customHeaders` map into the SDK's `defaultHeaders`. After moving anthropic-beta computation to the per-request path, a user-configured mixed-case `Anthropic-Beta` key would survive in defaultHeaders verbatim, while the per-request override added a lowercase `anthropic-beta`. The wire then carried two physical headers for the same logical name — SDK behavior on duplicate headers with different casings is undefined. `buildPerRequestHeaders()` already merges those user flags case-insensitively (commit 0d8b5de), so dropping the entry from defaultHeaders is the right boundary: the per-request path owns the header end-to-end. Other customHeaders entries continue to pass through. Add a regression test asserting no `Anthropic-Beta` (any casing) lands in defaultHeaders while unrelated customHeaders are kept. 75 tests pass; lint + typecheck clean.
…rovider (QwenLM#3788) * fix(core): inject thinking blocks for DeepSeek anthropic-compatible provider DeepSeek's anthropic-compatible endpoint (https://api.deepseek.com/anthropic) rejects follow-up requests with HTTP 400 ("The content[].thinking in the thinking mode must be passed back to the API.") whenever a prior assistant turn carrying tool_use omits a thinking block. The model can legitimately return a tool round without thinking text, so qwen-code stored no thought parts and rebuilt the next request with no thinking block, tripping the API's check. Mirroring the existing OpenAI-side fix (QwenLM#3729, QwenLM#3747), the converter now detects DeepSeek by base URL or model name and prepends an empty { type: 'thinking', thinking: '', signature: '' } block to assistant turns missing one. Other anthropic-protocol providers are unaffected. Verified against the live api.deepseek.com/anthropic endpoint: - assistant with tool_use, no thinking → 400 (reproduces QwenLM#3786) - assistant with tool_use, empty thinking injected → 200 OK Refs QwenLM#3786 * fix(core): gate DeepSeek thinking-block injection on thinking mode Address PR review feedback: 1. (Critical) Gate empty-thinking injection on the same per-request condition that emits the top-level `thinking` parameter. The previous implementation injected unconditionally on DeepSeek providers, but `buildThinkingConfig()` may omit `thinking` when reasoning=false or `thinkingConfig.includeThoughts=false` — which is exactly what suggestionGenerator / ArenaManager / forkedAgent do. Shipping thinking blocks without enabling thinking mode is a protocol violation that DeepSeek may reject. Move the option from converter constructor to a per-request `convertGeminiRequestToAnthropic` parameter so the generator can compute the gate correctly. 2. (CodeQL) Replace `baseUrl.includes('api.deepseek.com')` with `new URL(baseUrl).hostname` exact-match. The substring check would accept spoofed hosts like `api.deepseek.com.evil.com`. 3. Document the empty-signature workaround inline. 4. Rename the misleading "redacted_thinking" test case. * fix(core): narrow DeepSeek thinking injection to tool_use turns + subdomain test Address PR review round 2: 1. Narrow injection scope to assistant turns containing tool_use. Live verification against api.deepseek.com/anthropic showed plain-text assistant turns without thinking are accepted unchanged — only tool_use turns trigger the HTTP 400. Injecting on every assistant turn unnecessarily bloats replay history with synthetic blocks the API does not require. Existing thinking blocks on any turn are still preserved untouched. 2. Add test coverage for the subdomain hostname branch (us.api.deepseek.com → matches), addressing the gap noted in review. 3. Update existing negative-case tests (non-deepseek / spoofed / reasoning=false / includeThoughts=false) to use tool_use scenarios so they actually exercise the gating logic instead of trivially passing under the narrowed scope. * docs(core): align DeepSeek thinking-injection comments with narrowed scope Address PR review round 3 (copilot-pull-request-reviewer × 3): comments in three locations still described the constraint as applying to "any prior assistant turn", which was true before commit 5388ded but no longer matches the implementation. Update the doc comment on isDeepSeekAnthropicProvider and the two test-suite header comments to state the actual narrower contract: the API rejects only tool-use turns that omit thinking blocks; plain-text assistant turns are accepted unchanged. Comment-only change; 58 tests still pass. * fix(core): per-request DeepSeek detection + strip thinking when off Address PR review round 4 (copilot-pull-request-reviewer × 2): 1. Stale provider-detection cache (HCia). The constructor cached isDeepSeekProvider once, but Config.setModel() mutates contentGeneratorConfig.model in place. After a runtime /model switch from a non-DeepSeek model to a DeepSeek one on the same auth config, buildRequest() would keep using the stale flag. Move the detection into buildRequest so each call sees the current model. The detector is cheap (URL parse + string compare). 2. Real thought parts leak through when thinking is disabled (HCib). The previous gate only blocked synthetic injection — but the converter still replayed any existing `thought: true` parts in request.contents as thinking blocks. Code paths that disable thinking against a session whose history was built with thinking on (suggestionGenerator / ArenaManager / forkedAgent) would still emit thinking blocks alongside an absent top-level `thinking` config — the same protocol mismatch the gate was meant to avoid. Add a `stripAssistantThinking` converter option, set in buildRequest to `isDeepSeek && !thinking`. The converter strips thinking and redacted_thinking blocks from assistant messages before message construction completes. Mirror behavior is already proven safe by live verification (DeepSeek currently tolerates either shape, but stripping makes the request body internally consistent and robust to future validation tightening). 3 new tests: - converter strips thinking from assistant turns when option set - generator strips real thought parts when reasoning=false - generator reflects runtime model changes (no stale cache) 61 tests pass; lint + typecheck clean. * fix(core): preserve thinking-only assistant turns instead of emitting empty content Address PR review round 5 (copilot-pull-request-reviewer × 2 — code + test): stripThinkingFromAssistantMessages previously replaced message.content with the filtered array unconditionally. For an assistant turn whose only blocks are thinking/redacted_thinking (e.g. a round cut off by max_tokens before any text or tool_use was emitted), this left `content: []` — which Anthropic API rejects. Dropping the message entirely was considered but would break the required user/assistant alternation. Instead, fall back to leaving the original blocks in place when stripping would empty the message. DeepSeek empirically tolerates the residual `thinking-block + no-thinking-config` shape (verified against api.deepseek.com/anthropic in the V2/X scenarios), so leaving the message untouched is the safer choice than emitting invalid structure. Add regression test for the thinking-only turn shape. 62 tests pass; lint + typecheck clean. * fix(core): validate thinking-block signature, rename option, gate output_config Address PR review round 6 — five substantive items: 1. (Critical) Drop non-compliant thinking blocks lacking a `signature` field and replace them with a synthetic one. A `redacted_thinking` block round-tripped through Gemini Part format becomes `{ text: '', thought: true }` (no thoughtSignature) and converts back to `{ type: 'thinking', thinking: '' }` without `signature` — not spec-compliant. The previous `hasThinking` check accepted these as already-satisfying, leaving non-compliant blocks in the wire message. Tighten the check so they're filtered out and the synthetic injection runs. Live verification: DeepSeek currently tolerates both shapes (lenient), but normalizing is defensively correct against future tightening. 2. Rename converter option `ensureAssistantThinking` → `ensureThinkingOnToolUseTurns`. The new name reflects the actual contract (tool-use turns only, not every assistant turn). 3. Honor `thinkingConfig.includeThoughts: false` in `buildOutputConfig`. Previously a per-request opt-out dropped the top-level `thinking` parameter but still emitted `output_config.effort`, leaking a reasoning-shaped field into side queries that don't want it. 4. Add regression test for mixed text + tool_use assistant turns (common shape: model says something, then calls a tool). 5. Add explicit test for the signature-validation path: an existing compliant thinking block (with signature) is preserved untouched. 64 tests pass; lint + typecheck clean. * fix(core): clean up non-compliant thinking blocks on plain-text turns + assert output_config gating Address PR review round 7 (copilot-pull-request-reviewer × 2): 1. (Hp-x) Round-tripped redacted_thinking blocks were left malformed on assistant turns lacking tool_use. The previous structure only ran the cleanup pass when a tool_use block was present (early return on `!hasToolUse`), so plain-text turns kept the non-compliant `{ type: 'thinking', thinking: '' }` shape. Restructure into two sequential steps: a. Drop non-compliant thinking blocks (no `signature`) on every assistant turn — same fallback that avoids `content: []` if the message is thinking-only. b. Inject the synthetic empty thinking block on tool_use turns that still lack a compliant thinking block after step (a). 2. (Hp-7) The includeThoughts=false test asserted that the top-level `thinking` field is suppressed but didn't cover `output_config`, leaving regressions in the new `buildOutputConfig` gate uncaught. Tighten the assertion to also verify `output_config` is absent. 3. New converter test: cleanup runs on plain-text assistant turns too. 65 tests pass; lint + typecheck clean. * test(core): add explicit redacted_thinking injection-path coverage Address PR review round 8 (QwenLM#30 — copilot reviewer). The converter treats `redacted_thinking` as already satisfying the thinking-block requirement (no synthetic injected), distinguished from a signature-less `thinking` block which is non-compliant and gets dropped/replaced. Existing tests covered the latter path; this adds explicit coverage of the former. processContent doesn't synthesize redacted_thinking from Gemini parts, so the test reaches into the private helper directly. (QwenLM#31 — subdomain hostname coverage — already exists at line 602.) 66 tests pass; lint + typecheck clean. * fix(core): per-request anthropic-beta + normalize thinking-only turns Address PR review round 9 (copilot-pull-request-reviewer × 2): 1. (Hydz) thinking-only assistant turns (e.g. max_tokens cutoff or round-tripped redacted_thinking) hit the cleanup-empties fallback and kept the original non-compliant `{ type: 'thinking', thinking: '' }` block. The fallback now replaces the message with a synthetic empty thinking block (`signature: ''` included), which keeps the message non-empty AND spec-compliant. 2. (Hyd4) `anthropic-beta` was set once at construction from the global `reasoning` config, so requests with per-request `thinkingConfig.includeThoughts=false` still advertised interleaved-thinking / effort even though the body had dropped the matching fields. Move beta computation to a new `buildPerRequestHeaders` that derives the header from the actual `thinking` / `output_config` fields present in the request body, and pass it via `messages.create(..., { headers })`. The wire shape is now internally consistent. Test updates: - Drop the three constructor-time beta assertions; they no longer apply. - Add four per-request header tests covering: both betas present, only interleaved-thinking, reasoning=false (no betas), and per-request includeThoughts=false (no betas). 67 tests pass; lint + typecheck clean. * fix(core): preserve thinking text by normalizing in place + merge user beta flags Address PR review round 10 (copilot-pull-request-reviewer × 2): 1. (H0oF) The previous cleanup filtered out every thinking block missing a `signature` field. But that shape is the normal output from OpenAI/Gemini/agent-runtime generators, which only set `thought: true` without a signature. Users switching providers mid-session would silently lose preserved thinking text on the first DeepSeek request. Change Step 1 to NORMALIZE in place: when a thinking block has no signature, set `signature: ''` rather than dropping the block. The original `thinking` text is preserved; DeepSeek empirically accepts empty signatures so the wire shape stays valid. 2. (H0oL) `buildPerRequestHeaders()` overwrote `customHeaders['anthropic-beta']` whenever the per-request override fired, regressing the customHeaders escape hatch for unrelated Anthropic beta features. Merge the user's flags into the computed list (deduped) so users can stack their own betas alongside interleaved-thinking / effort. Test changes: - Renamed and rewrote "drops non-compliant... plain-text" test to assert in-place normalization that preserves thinking text. - Updated "replaces a non-compliant thinking block" comment + name to describe the normalization (the assertion was already correct because the test happened to use empty thinking text). - The empty-content fallback in Step 1 is no longer reachable under the new logic, so the dedicated thinking-only-turn test now exercises only the strip path (where it remains relevant). - Added 3 customHeaders[anthropic-beta] tests: merge with computed, passthrough when no thinking/effort, dedupe. 70 tests pass; lint + typecheck clean. * docs(core): align thinking-injection comments with normalize semantics + add stream test Address PR review round 11 (copilot-pull-request-reviewer × 3): 1. (H3Iw) Update the `ensureThinkingOnToolUseTurns` option docstring to describe in-place normalization (preserving thinking text by filling in `signature: ''`) instead of the old drop-and-replace semantics. 2. (H3I9) Same update on the `applyEmptyThinkingToToolUseTurns` helper JSDoc — clarify that signature-less thinking blocks are normalized in place (preserving original text), not dropped. Mention the common case of cross-provider history where non-Anthropic generators only set `thought: true`. 3. (H3I3) Add a streaming test asserting that `generateContentStream()` also attaches the per-request `anthropic-beta` header. The previous coverage only exercised `generateContent()`, leaving the streaming path's separate code path (line 144 in anthropicContentGenerator.ts) unverified. 71 tests pass; lint + typecheck clean. * refactor(core): split DeepSeek thinking option in two + add header coexistence test Address PR review round 12 (copilot-pull-request-reviewer × 2): 1. (H6ws) The single `ensureThinkingOnToolUseTurns` option was misleadingly narrow: the implementation also rewrote non-tool-use turns by normalizing malformed thinking blocks. Future callers could enable it expecting only the tool-use behavior. Split into two precisely-named options: - normalizeAssistantThinkingSignature: fill missing `signature` on every assistant `thinking` block (cross-provider history compat). - injectThinkingOnToolUseTurns: prepend synthetic empty thinking on tool_use turns missing one (issue QwenLM#3786 trigger). The generator wires both together for DeepSeek when thinking mode is on; either can be used independently if a future caller needs only one pass. 2. (H6w4) Add a test asserting that the per-request `headers` path coexists correctly with `customHeaders`: User-Agent and unrelated customHeaders entries stay in `defaultHeaders` while only the computed `anthropic-beta` rides on the per-request path. Defends against a future regression where header config might be routed through a code path that wipes the constructor defaults. 72 tests pass; lint + typecheck clean. * fix(core): case-insensitive customHeaders[anthropic-beta] merge Address yiliang114 review feedback (QwenLM#3788). HTTP header names are case-insensitive by spec, and the Anthropic SDK lower-cases them during merge. Previously buildPerRequestHeaders only read the lower-case `anthropic-beta` key from customHeaders, so a user-configured `Anthropic-Beta` or `ANTHROPIC-BETA` would be silently overwritten by the per-request computed value. Replace the direct dict lookup with collectCustomBetaFlags() which walks all customHeaders entries and matches the key case-insensitively. Multiple matching entries (unlikely but possible) are concatenated; the existing dedupe pass handles any duplicates. Add a regression test for both `Anthropic-Beta` and `ANTHROPIC-BETA` key shapes. 73 tests pass; lint + typecheck clean. * docs(core): align thinking-injection docs with normalize-in-place semantics + redacted_thinking strip test Address PR review round 14 (copilot-pull-request-reviewer × 4): 1. (IAl4) PR description still described "dropped here so synthetic injection takes over" but the implementation now normalizes signature-less thinking blocks in place (preserving text). PR description rewritten to describe the two-pass model: normalize-in-place + injection-when-truly-missing. 2. (IAl7) `injectThinkingOnToolUseTurns` option docstring claimed signature-less blocks would be "seen as missing" so the synthetic replaces them. Updated to describe the actual flow: the normalization pass runs first, blocks become compliant in place, the injector then sees them as already-satisfying and prepends nothing. Helper JSDoc on `injectEmptyThinkingOnToolUseTurns` fixed the same way. 3. (IAl8) Strip-path coverage missed `redacted_thinking` blocks. Added regression test that verifies both thinking and redacted_thinking blocks are removed when `stripAssistantThinking` is set. 4. (IAl-) Renamed the converter test suite from "thinking-mode injection + normalization (DeepSeek thinking on)" to "DeepSeek thinking-mode normalization, injection, and stripping" so the title accurately covers all behavior the block exercises (including `stripAssistantThinking` cases later in the same describe). 74 tests pass; lint + typecheck clean. * fix(core): exclude anthropic-beta variants from defaultHeaders to avoid wire duplication Address PR review round 15 (copilot-pull-request-reviewer QwenLM#1). `buildHeaders()` previously spread the entire `customHeaders` map into the SDK's `defaultHeaders`. After moving anthropic-beta computation to the per-request path, a user-configured mixed-case `Anthropic-Beta` key would survive in defaultHeaders verbatim, while the per-request override added a lowercase `anthropic-beta`. The wire then carried two physical headers for the same logical name — SDK behavior on duplicate headers with different casings is undefined. `buildPerRequestHeaders()` already merges those user flags case-insensitively (commit 414546d), so dropping the entry from defaultHeaders is the right boundary: the per-request path owns the header end-to-end. Other customHeaders entries continue to pass through. Add a regression test asserting no `Anthropic-Beta` (any casing) lands in defaultHeaders while unrelated customHeaders are kept. 75 tests pass; lint + typecheck clean.
Summary
Closes #3786.
DeepSeek's anthropic-compatible endpoint (
https://api.deepseek.com/anthropic) rejects follow-up requests with HTTP 400 when an assistant turn carryingtool_useis missing athinkingblock:The model can legitimately return a tool round without thinking text, so qwen-code stored no thought parts and rebuilt the next request with no
thinkingblock, tripping the API's check.This mirrors the existing OpenAI-side fix (#3729, #3747) at the request boundary. Three coordinated converter behaviors keep DeepSeek requests internally consistent:
When thinking mode is on (
thinkingconfig present): two passes run together.normalizeAssistantThinkingSignaturefills insignature: ''on every assistantthinkingblock that lacks the requiredsignaturefield, preserving the original thinking text in place. Common case: cross-provider history where non-Anthropic generators (OpenAI / Gemini / agent-runtime) only setthought: true, orredacted_thinkingblocks that lost theirdatafield through the Gemini-Part round trip.injectThinkingOnToolUseTurnsthen prepends a synthetic empty thinking block on assistant turns containingtool_usebut lacking any thinking block. Live verification showed only tool-use turns trigger DeepSeek's 400 — plain-text assistant turns are accepted unchanged, so injection is scoped to that case to avoid bloating replay history. Because normalization ran first, any pre-existingthinkingblock is now compliant and the injector treats it as already-satisfying (no synthetic prepended).When thinking mode is off (e.g.
suggestionGenerator/forkedAgent/ArenaManagerpaths that passthinkingConfig.includeThoughts: false): strip pre-existing thinking and redacted_thinking blocks from assistant messages so the request body matches the absent top-levelthinkingconfig. The same per-request opt-out also dropsoutput_config.effortso reasoning-shaped fields don't leak into thinking-off side queries. Thinking-only assistant turns (rare, e.g. max_tokens cutoff) are left untouched to avoid emittingcontent: [], which Anthropic API rejects.Provider detection runs per request (DeepSeek by hostname or model name) so a runtime
/modelswitch viaConfig.setModel()doesn't see a stale provider flag. Theanthropic-betaheader is also computed per-request from the actual fields in the request body, with user-suppliedcustomHeaders['anthropic-beta'](case-insensitive lookup) merged in and deduped.Detection scope
Detection matches
api.deepseek.com(and subdomains likeus.api.deepseek.com) by hostname, or any model name containingdeepseek. The model-name path is consistent with the OpenAI-sideDeepSeekOpenAICompatibleProvider.isDeepSeekProviderand intentionally covers self-hosted DeepSeek deployments (sglang/vllm) that may sit behind generic anthropic-compatible endpoints. Side effect: a model named e.g.my-deepseek-cloneon a non-DeepSeek host also enables the workaround. Trade-off accepted because false positives are harmless on lenient providers, while a stricter prefix check would silently break the self-hosted DeepSeek scenarios the OpenAI path already supports.Verification
Reproduced and verified directly against the live
api.deepseek.com/anthropicendpoint:assistant: [tool_use](no thinking)assistant: [thinking, tool_use]assistant: [text](no tool_use)assistant: [thinking-no-signature, tool_use]Test plan
npm test— all core tests pass (74 anthropic-side tests covering normalization on every assistant turn, injection on tool_use turns, stripping when thinking-off, runtime model switch, hostname detection variants including subdomains and spoof rejection, gating, edge cases like thinking-only turns and mixed text+tool_use, customHeaders coexistence and case-insensitive merge, streaming-path beta headers)npm run lintcleannpm run typecheckcleanapi.deepseek.com/anthropic(issue reproduced pre-fix, gone post-fix; signature-less and signature-empty thinking blocks both currently accepted by DeepSeek)