fix(core): preserve reasoning_content during session resume and active sessions (GH#3579) by fyc09 · Pull Request #3590 · QwenLM/qwen-code

fyc09 · 2026-04-24T09:07:40Z

Summary

Fixes DeepSeek API 400 error (reasoning_content in the thinking mode must be passed back). Two independent root causes, both fixed.

Changes

1. Preserve `reasoning_content` on session resume (`sessionService.ts`)

buildApiHistoryFromConversation() was stripping thought parts from history by default (stripThoughtsFromHistory: true). When resuming a session, the reasoning_content from prior assistant turns was silently removed, causing DeepSeek to reject the subsequent API call.

Fix: Changed the default of stripThoughtsFromHistory from true to false. Thought parts (reasoning_content) are now preserved by default.

2. Remove `thinkingClearLatched` idle-cleanup mechanism (`client.ts`)

A sticky latch (thinkingClearLatched) was triggered after 5 minutes of idle time. Once latched, it permanently stripped old thinking blocks from history via stripThoughtsFromHistoryKeepRecent(1) on every subsequent user message. This caused intermittent 400 errors during long tasks — the user pauses briefly, the latch fires, old reasoning_content is removed, and the next API call fails.

Fix: Removed the thinkingClearLatched field, its reset in resetChat(), and the idle-cleanup block that called stripThoughtsFromHistoryKeepRecent(1).

3. Clean up dead config (`thinkingThresholdMinutes`)

With the latch removed, thinkingThresholdMinutes in ClearContextOnIdleSettings is unused. Removed from the interface, config constructor, settings schemas (CLI + VS Code), docs, and test fixtures.

Trade-offs

Removing the latch means reasoning_content from all prior turns accumulates in the conversation history indefinitely. The latch was originally designed to save tokens for reasoning models that don't require passing back thought content (e.g., OpenAI o-series). Approach A (remove latch, accept token bloat) was chosen after discussion in the issue — correctness on session resume is the priority, and a provider-aware strategy can be added later if token bloat becomes a problem.

Review

The PR addresses the DeepSeek resume and idle-session failure by preserving reasoning_content across reconstructed history and removing the idle thinking cleanup that could silently break required reasoning passback. The implementation is focused, and the added tests cover both default preservation and the explicit stripping option.

Verdict

APPROVE — the behavior change is intentional for correctness with reasoning models, and the implementation is sufficiently scoped.

…enLM#3590, QwenLM#3505, QwenLM#3467) (#113) * fix: backport upstream trivial bug fixes (QwenLM#3499, QwenLM#3630, QwenLM#3320) Cherry-picked from QwenLM/qwen-code: - QwenLM#3499 fix(core): use empty string instead of null for reasoning-only assistant content. Some OpenAI-compatible providers (e.g. Ollama qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is also present. Tool-call-only messages keep null per OpenAI spec. - QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor back-reference that forms a cycle and crashed --telemetry-outfile users. - QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip .git / special file types. Prevents FD exhaustion when a skill dir contains node_modules etc., which silently broke node-pty I/O. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(core): scope StreamingToolCallParser per-stream (QwenLM#3525) Backports upstream PR QwenLM#3525 + extends the per-stream context to also cover our fork's <think>-tag parser state. Bug: every caller of Config.getContentGenerator() — foreground turns, fork subagents, run_in_background subagents, ACP concurrent Agent calls (after QwenLM#3463) — shared a single OpenAIContentConverter, which held the StreamingToolCallParser as an instance field. Concurrent streams corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT. Fix: - New ConverterStreamContext interface holds toolCallParser, thinkBuffer, inThinkTag — one per stream. - createStreamContext() factory replaces resetStreamingToolCalls(). - convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx) thread the context through every parser/think-buffer access. - ContentGenerationPipeline.processStreamWithLogging creates one context at stream entry. The error path no longer manually resets — the context is GC'd when the generator unwinds. Our protoInternal recovery-note logic is preserved on the new shape. Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes the concurrency race. Tests: - New createStreamContext describe replaces resetStreamingToolCalls suite - Streaming <think> tests use a per-test context - pipeline.test.ts mock updated to match the new API - pipeline.concurrent.test.ts (from upstream commit 38edd9d) drives two real concurrent streams and asserts neither corrupts the other's tool-call output (positive control: pre-fix, this test fails with exactly the user-reported bug shape). Refs upstream QwenLM#3516, QwenLM#3525. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(core): strip thinking blocks from history on model switch (QwenLM#3304) When switching models mid-session, reasoning_content fields from thinking-capable models leaked into API requests sent to the new provider, causing 422 errors on strict OpenAI-compatible endpoints. Call stripThoughtsFromHistory() in handleModelChange() so thought parts are removed before the next request is built for the new model. * fix(core): reject truncated subagent write_file calls (QwenLM#3505) Backport of upstream QwenLM#3505. Propagates MAX_TOKENS truncation from subagent responses into tool requests and rejects truncated edit calls before schema validation can surface misleading missing-parameter errors. Adapted to our fork's coreToolScheduler.ts which already had the truncation rejection block — kept both, dropped the unused clearRetryCountsForTool() call (we don't have that retry-counter machinery yet). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(core): prevent malformed permission rules from becoming tool-wide catch-alls (QwenLM#3467) Backport of upstream QwenLM#3467. A permission rule with unbalanced parens was silently parsed with specifier: undefined, causing matchesRule to treat it as a catch-all. For deny rules this blocked all commands; for allow rules a typo could silently auto-approve everything. - Adds an invalid flag to PermissionRule - parseRule marks unbalanced-paren rules as invalid - matchesRule short-circuits invalid rules to never match - parseRules / addSession*Rule / addPersistentRule warn on malformed input - listRules filters invalid rules from /permissions UI Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(core): preserve reasoning_content during session resume and active sessions (GH#3579) * test(config): drop fork-incompatible QwenLM#3304 strip-thoughts test The test from upstream QwenLM#3304 backport assumed an in-place qwen-oauth model switch path that our fork doesn't have; the source-side fix in config.ts (stripThoughtsFromHistory call in handleModelChange) is preserved. Coverage will be re-added when the fork's switch flow stabilizes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Automaker <automaker@localhost> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com> Co-authored-by: Yuchen Fu <fuyuchen0904@163.com>

Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…e sessions (GH#3579) (QwenLM#3590) * fix(core): preserve reasoning_content during session resume and active sessions (GH#3579) * chore(core): remove dead thinkingThresholdMinutes config after latch removal (GH#3579)

fyc09 added 2 commits April 24, 2026 16:35

fix(core): preserve reasoning_content during session resume and activ…

9cad2e3

…e sessions (GH#3579)

chore(core): remove dead thinkingThresholdMinutes config after latch …

3efd308

…removal (GH#3579)

tanzhenxin approved these changes Apr 24, 2026

View reviewed changes

tanzhenxin merged commit 93cbad2 into QwenLM:main Apr 24, 2026
11 of 12 checks passed

This was referenced Apr 26, 2026

Deepseek V4 调用错误 API Error: 400 和 reasoning_content有关 #3619

Closed

BUG: DeepSeek API 400 error — reasoning_content in thinking mode must be passed back #3579

Closed

mabry1985 mentioned this pull request Apr 26, 2026

fix: backport upstream reasoning + correctness fixes (#3315, #3590, #3505, #3467) protoLabsAI/protoCLI#113

Merged

2 tasks

mabry1985 mentioned this pull request Apr 26, 2026

promote: dev -> main (5 upstream backport PRs through v0.26.9) protoLabsAI/protoCLI#118

Merged

fyc09 mentioned this pull request Apr 29, 2026

fix(core): preserve reasoning_content in rewind, compression, and merge paths (#3579) #3737

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(core): preserve reasoning_content during session resume and active sessions (GH#3579)#3590

fix(core): preserve reasoning_content during session resume and active sessions (GH#3579)#3590
tanzhenxin merged 2 commits into
QwenLM:mainfrom
fyc09:fix/issue-3579-preserve-reasoning-content-on-resume

fyc09 commented Apr 24, 2026

Uh oh!

tanzhenxin left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

fyc09 commented Apr 24, 2026

Summary

Changes

1. Preserve reasoning_content on session resume (sessionService.ts)

2. Remove thinkingClearLatched idle-cleanup mechanism (client.ts)

3. Clean up dead config (thinkingThresholdMinutes)

Trade-offs

Related

Uh oh!

tanzhenxin left a comment

Choose a reason for hiding this comment

Review

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Preserve `reasoning_content` on session resume (`sessionService.ts`)

2. Remove `thinkingClearLatched` idle-cleanup mechanism (`client.ts`)

3. Clean up dead config (`thinkingThresholdMinutes`)