Skip to content

fix(core): use empty string instead of null for reasoning-only assistant content#3499

Merged
wenshao merged 1 commit into
QwenLM:mainfrom
LaZzyMan:fix/reasoning-only-content-null-400
Apr 21, 2026
Merged

fix(core): use empty string instead of null for reasoning-only assistant content#3499
wenshao merged 1 commit into
QwenLM:mainfrom
LaZzyMan:fix/reasoning-only-content-null-400

Conversation

@LaZzyMan

Copy link
Copy Markdown
Collaborator

TLDR

When a model (e.g. Ollama + qwen3.5:9b) returns a streaming response that contains reasoning/thinking content but an empty text body, the assistant message in the conversation history was serialised with content: null. On the next turn this null is sent verbatim to the provider, and Ollama (and other OpenAI-compatible servers) reject the request with HTTP 400 – invalid message content type: <nil>, making the session unrecoverable without a restart.

Screenshots / Video Demo

N/A — no user-facing change. The fix prevents an HTTP 400 error from being thrown mid-session, so the session continues working transparently.

Dive Deeper

Root cause

In processContent() (converter.ts), when converting a Gemini Content object back to an OpenAI assistant message:

content: assistantTextContent || null,
// "" || null = null  ← sent to Ollama → HTTP 400

If the model produced only a thought part (no visible text), assistantTextContent is "" and the expression evaluates to null. The message is then:

{ "role": "assistant", "content": null, "reasoning_content": "Done…" }

Ollama rejects this because it does not accept content: null when other fields like reasoning_content are present.

Why resume works but live sessions don't

On resume, buildApiHistoryFromConversation() calls stripThoughtsFromContent(), which drops the entire assistant Content node that only contained thought parts. So the bad message is silently excluded and the request succeeds. This guard is absent for in-flight conversation history.

Fix

content: assistantTextContent || (reasoningParts.length > 0 ? '' : null),
  • reasoning-only messages → content: "" (accepted by Ollama)
  • tool-call-only messages (no reasoning) → content: null (OpenAI spec-compliant)
  • normal text messages → unchanged

Reviewer Test Plan

  1. Use an Ollama-compatible backend (e.g. ollama serve with qwen3.5:9b or any model that emits reasoning tokens).
  2. Configure Qwen Code to use http://localhost:11434/v1 with that model.
  3. Have a multi-turn conversation where the model's reasoning step produces no visible text (this happens intermittently with smaller models).
  4. Without the fix the session hangs with API Error: 400 invalid message content type: <nil>; with the fix the session continues normally.

Testing Matrix

�� 🪟 🐧
npm run
npx
Docker
Podman - - -
Seatbelt - -

Linked issues / bugs

Fixes #3421

…ant content

When a model (e.g. Ollama qwen3.5:9b) returns a response with
reasoning/thinking content but an empty text body, the assistant message
was constructed with `content: null`. Some OpenAI-compatible providers
(e.g. Ollama) reject such requests with HTTP 400 when reasoning_content
is also present.

Fix: in processContent(), when assistantTextContent is empty but
reasoningParts is non-empty, use '' instead of null. Tool-call-only
messages (no reasoning) retain `content: null` to stay spec-compliant.

Adds three regression tests covering:
- reasoning-only → content is ''
- tool-call-only → content is null (OpenAI spec)
- reasoning + text → content is the actual text

Fixes QwenLM#3421

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found. LGTM! ✅ — gpt-5.4 via Qwen Code /review

@wenshao

wenshao commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator

Tested locally against main:

  • Checked out the branch — all 285 tests in packages/core/src/core/openaiContentGenerator pass, including the 3 new regression tests in converter.test.ts.
  • Verified the regression suite actually catches [API Error: 400 invalid message content type: <nil>] (Press Ctrl+Y to retry) #3421: temporarily reverted the one-line fix and re-ran the new tests — the reasoning-only case fails with expected null to be '', confirming the guard works.
  • Reviewed the two downstream consumers that inspect content: deepseek.ts:51-54 and dashscope.ts:183-184 both short-circuit on typeof content === 'string' / skip only null/undefined, so the empty-string payload flows through without regression.
  • Branch status: CLEAN / MERGEABLE, all CI jobs green across macOS / Ubuntu / Windows × Node 20/22/24.

LGTM, merging.

@wenshao wenshao merged commit 8a04896 into QwenLM:main Apr 21, 2026
24 of 25 checks passed
chiga0 pushed a commit that referenced this pull request Apr 24, 2026
…ant content (#3499)

When a model (e.g. Ollama qwen3.5:9b) returns a response with
reasoning/thinking content but an empty text body, the assistant message
was constructed with `content: null`. Some OpenAI-compatible providers
(e.g. Ollama) reject such requests with HTTP 400 when reasoning_content
is also present.

Fix: in processContent(), when assistantTextContent is empty but
reasoningParts is non-empty, use '' instead of null. Tool-call-only
messages (no reasoning) retain `content: null` to stay spec-compliant.

Adds three regression tests covering:
- reasoning-only → content is ''
- tool-call-only → content is null (OpenAI spec)
- reasoning + text → content is the actual text

Fixes #3421
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request Apr 26, 2026
…wenLM#3320) (#109)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request Apr 26, 2026
* fix: backport upstream trivial bug fixes (QwenLM#3499, QwenLM#3630, QwenLM#3320)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): scope StreamingToolCallParser per-stream (QwenLM#3525)

Backports upstream PR QwenLM#3525 + extends the per-stream context to also
cover our fork's <think>-tag parser state.

Bug: every caller of Config.getContentGenerator() — foreground turns,
fork subagents, run_in_background subagents, ACP concurrent Agent calls
(after QwenLM#3463) — shared a single OpenAIContentConverter, which held the
StreamingToolCallParser as an instance field. Concurrent streams
corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT.

Fix:
- New ConverterStreamContext interface holds toolCallParser, thinkBuffer,
  inThinkTag — one per stream.
- createStreamContext() factory replaces resetStreamingToolCalls().
- convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx)
  thread the context through every parser/think-buffer access.
- ContentGenerationPipeline.processStreamWithLogging creates one context
  at stream entry. The error path no longer manually resets — the
  context is GC'd when the generator unwinds.

Our protoInternal recovery-note logic is preserved on the new shape.

Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is
deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes
the concurrency race.

Tests:
- New createStreamContext describe replaces resetStreamingToolCalls suite
- Streaming <think> tests use a per-test context
- pipeline.test.ts mock updated to match the new API
- pipeline.concurrent.test.ts (from upstream commit 38edd9d)
  drives two real concurrent streams and asserts neither corrupts the
  other's tool-call output (positive control: pre-fix, this test fails
  with exactly the user-reported bug shape).

Refs upstream QwenLM#3516, QwenLM#3525.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request Apr 26, 2026
…enLM#3590, QwenLM#3505, QwenLM#3467) (#113)

* fix: backport upstream trivial bug fixes (QwenLM#3499, QwenLM#3630, QwenLM#3320)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): scope StreamingToolCallParser per-stream (QwenLM#3525)

Backports upstream PR QwenLM#3525 + extends the per-stream context to also
cover our fork's <think>-tag parser state.

Bug: every caller of Config.getContentGenerator() — foreground turns,
fork subagents, run_in_background subagents, ACP concurrent Agent calls
(after QwenLM#3463) — shared a single OpenAIContentConverter, which held the
StreamingToolCallParser as an instance field. Concurrent streams
corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT.

Fix:
- New ConverterStreamContext interface holds toolCallParser, thinkBuffer,
  inThinkTag — one per stream.
- createStreamContext() factory replaces resetStreamingToolCalls().
- convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx)
  thread the context through every parser/think-buffer access.
- ContentGenerationPipeline.processStreamWithLogging creates one context
  at stream entry. The error path no longer manually resets — the
  context is GC'd when the generator unwinds.

Our protoInternal recovery-note logic is preserved on the new shape.

Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is
deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes
the concurrency race.

Tests:
- New createStreamContext describe replaces resetStreamingToolCalls suite
- Streaming <think> tests use a per-test context
- pipeline.test.ts mock updated to match the new API
- pipeline.concurrent.test.ts (from upstream commit 38edd9d)
  drives two real concurrent streams and asserts neither corrupts the
  other's tool-call output (positive control: pre-fix, this test fails
  with exactly the user-reported bug shape).

Refs upstream QwenLM#3516, QwenLM#3525.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): strip thinking blocks from history on model switch (QwenLM#3304)

When switching models mid-session, reasoning_content fields from
thinking-capable models leaked into API requests sent to the new
provider, causing 422 errors on strict OpenAI-compatible endpoints.

Call stripThoughtsFromHistory() in handleModelChange() so thought parts
are removed before the next request is built for the new model.

* fix(core): reject truncated subagent write_file calls (QwenLM#3505)

Backport of upstream QwenLM#3505. Propagates MAX_TOKENS truncation from
subagent responses into tool requests and rejects truncated edit calls
before schema validation can surface misleading missing-parameter errors.

Adapted to our fork's coreToolScheduler.ts which already had the
truncation rejection block — kept both, dropped the unused
clearRetryCountsForTool() call (we don't have that retry-counter
machinery yet).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): prevent malformed permission rules from becoming tool-wide catch-alls (QwenLM#3467)

Backport of upstream QwenLM#3467. A permission rule with unbalanced parens
was silently parsed with specifier: undefined, causing matchesRule to
treat it as a catch-all. For deny rules this blocked all commands; for
allow rules a typo could silently auto-approve everything.

- Adds an invalid flag to PermissionRule
- parseRule marks unbalanced-paren rules as invalid
- matchesRule short-circuits invalid rules to never match
- parseRules / addSession*Rule / addPersistentRule warn on malformed input
- listRules filters invalid rules from /permissions UI

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): preserve reasoning_content during session resume and active sessions (GH#3579)

* test(config): drop fork-incompatible QwenLM#3304 strip-thoughts test

The test from upstream QwenLM#3304 backport assumed an in-place qwen-oauth
model switch path that our fork doesn't have; the source-side fix in
config.ts (stripThoughtsFromHistory call in handleModelChange) is
preserved. Coverage will be re-added when the fork's switch flow
stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com>
Co-authored-by: Yuchen Fu <fuyuchen0904@163.com>
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request Apr 26, 2026
…wenLM#3574) (#115)

* fix: backport upstream trivial bug fixes (QwenLM#3499, QwenLM#3630, QwenLM#3320)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): scope StreamingToolCallParser per-stream (QwenLM#3525)

Backports upstream PR QwenLM#3525 + extends the per-stream context to also
cover our fork's <think>-tag parser state.

Bug: every caller of Config.getContentGenerator() — foreground turns,
fork subagents, run_in_background subagents, ACP concurrent Agent calls
(after QwenLM#3463) — shared a single OpenAIContentConverter, which held the
StreamingToolCallParser as an instance field. Concurrent streams
corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT.

Fix:
- New ConverterStreamContext interface holds toolCallParser, thinkBuffer,
  inThinkTag — one per stream.
- createStreamContext() factory replaces resetStreamingToolCalls().
- convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx)
  thread the context through every parser/think-buffer access.
- ContentGenerationPipeline.processStreamWithLogging creates one context
  at stream entry. The error path no longer manually resets — the
  context is GC'd when the generator unwinds.

Our protoInternal recovery-note logic is preserved on the new shape.

Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is
deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes
the concurrency race.

Tests:
- New createStreamContext describe replaces resetStreamingToolCalls suite
- Streaming <think> tests use a per-test context
- pipeline.test.ts mock updated to match the new API
- pipeline.concurrent.test.ts (from upstream commit 38edd9d)
  drives two real concurrent streams and asserts neither corrupts the
  other's tool-call output (positive control: pre-fix, this test fails
  with exactly the user-reported bug shape).

Refs upstream QwenLM#3516, QwenLM#3525.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): strip thinking blocks from history on model switch (QwenLM#3304)

When switching models mid-session, reasoning_content fields from
thinking-capable models leaked into API requests sent to the new
provider, causing 422 errors on strict OpenAI-compatible endpoints.

Call stripThoughtsFromHistory() in handleModelChange() so thought parts
are removed before the next request is built for the new model.

* fix(core): reject truncated subagent write_file calls (QwenLM#3505)

Backport of upstream QwenLM#3505. Propagates MAX_TOKENS truncation from
subagent responses into tool requests and rejects truncated edit calls
before schema validation can surface misleading missing-parameter errors.

Adapted to our fork's coreToolScheduler.ts which already had the
truncation rejection block — kept both, dropped the unused
clearRetryCountsForTool() call (we don't have that retry-counter
machinery yet).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): prevent malformed permission rules from becoming tool-wide catch-alls (QwenLM#3467)

Backport of upstream QwenLM#3467. A permission rule with unbalanced parens
was silently parsed with specifier: undefined, causing matchesRule to
treat it as a catch-all. For deny rules this blocked all commands; for
allow rules a typo could silently auto-approve everything.

- Adds an invalid flag to PermissionRule
- parseRule marks unbalanced-paren rules as invalid
- matchesRule short-circuits invalid rules to never match
- parseRules / addSession*Rule / addPersistentRule warn on malformed input
- listRules filters invalid rules from /permissions UI

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): preserve reasoning_content during session resume and active sessions (GH#3579)

* test(config): drop fork-incompatible QwenLM#3304 strip-thoughts test

The test from upstream QwenLM#3304 backport assumed an in-place qwen-oauth
model switch path that our fork doesn't have; the source-side fix in
config.ts (stripThoughtsFromHistory call in handleModelChange) is
preserved. Coverage will be re-added when the fork's switch flow
stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(acp): run Agent tool calls concurrently + graceful degrade (QwenLM#3463)

Backport of upstream QwenLM#3463. When the model returns multiple Agent tool
calls in a single turn, ACP Session was executing them sequentially in
a for-loop, multiplying latency by sub-agent count.

- Add private runToolCalls() helper that mirrors coreToolScheduler's
  partition logic: consecutive Agent calls form a parallel batch (safe
  because sub-agents have no shared mutable state); other tools form
  sequential batches.
- Replace 2 for-loops in Session.ts with runToolCalls() calls.
- Switch the AgentTool eventEmitter guard from key-presence check to
  truthy check (commit 651979c) — the key-presence check passed for
  { eventEmitter: undefined } and crashed inside SubAgentTracker.setup.

Note: upstream replaced 3 for-loops; our fork only had 2 in those code
paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(acp): support SSE and HTTP MCP servers in ACP mode

In ACP mode, the Mcp server list sent by the IDE client can include
SSE (type: "sse") and HTTP (type: "http") transports, but the previous
implementation only handled stdio servers via toStdioServer(). Non-stdio
servers were silently skipped (continue), so any SSE/HTTP-configured
MCP server would never be registered.

Changes:
- Add toSseServer() helper: detects type=="sse" servers and maps them
  to MCPServerConfig(url=..., headers=...)
- Add toHttpServer() helper: detects type=="http" servers and maps them
  to MCPServerConfig(httpUrl=..., headers=...)
- Refactor newSessionConfig() loop to handle all three transport types
- Declare mcpCapabilities: { sse: true, http: true } in agentCapabilities
  so IDE clients know this agent supports these transports without needing
  a transparent proxy
- Export the three helper functions for unit testing

Tests:
- Unit tests for toStdioServer / toSseServer / toHttpServer helpers
  (type discrimination, mutual exclusion)
- Integration-style tests for QwenAgent.initialize() mcpCapabilities
- Integration-style tests for newSession() with SSE/HTTP MCP servers,
  verifying MCPServerConfig is constructed with the correct arguments
  (url vs httpUrl, headers passthrough, empty-headers → undefined)

Fixes QwenLM#3472

---------

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com>
Co-authored-by: Yuchen Fu <fuyuchen0904@163.com>
Co-authored-by: LaZzyMan <zeusdream7@gmail.com>
mabry1985 added a commit to protoLabsAI/protoCLI that referenced this pull request Apr 26, 2026
* fix: backport upstream trivial bug fixes (QwenLM#3499, QwenLM#3630, QwenLM#3320)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): scope StreamingToolCallParser per-stream (QwenLM#3525)

Backports upstream PR QwenLM#3525 + extends the per-stream context to also
cover our fork's <think>-tag parser state.

Bug: every caller of Config.getContentGenerator() — foreground turns,
fork subagents, run_in_background subagents, ACP concurrent Agent calls
(after QwenLM#3463) — shared a single OpenAIContentConverter, which held the
StreamingToolCallParser as an instance field. Concurrent streams
corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT.

Fix:
- New ConverterStreamContext interface holds toolCallParser, thinkBuffer,
  inThinkTag — one per stream.
- createStreamContext() factory replaces resetStreamingToolCalls().
- convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx)
  thread the context through every parser/think-buffer access.
- ContentGenerationPipeline.processStreamWithLogging creates one context
  at stream entry. The error path no longer manually resets — the
  context is GC'd when the generator unwinds.

Our protoInternal recovery-note logic is preserved on the new shape.

Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is
deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes
the concurrency race.

Tests:
- New createStreamContext describe replaces resetStreamingToolCalls suite
- Streaming <think> tests use a per-test context
- pipeline.test.ts mock updated to match the new API
- pipeline.concurrent.test.ts (from upstream commit 38edd9d)
  drives two real concurrent streams and asserts neither corrupts the
  other's tool-call output (positive control: pre-fix, this test fails
  with exactly the user-reported bug shape).

Refs upstream QwenLM#3516, QwenLM#3525.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): strip thinking blocks from history on model switch (QwenLM#3304)

When switching models mid-session, reasoning_content fields from
thinking-capable models leaked into API requests sent to the new
provider, causing 422 errors on strict OpenAI-compatible endpoints.

Call stripThoughtsFromHistory() in handleModelChange() so thought parts
are removed before the next request is built for the new model.

* fix(core): reject truncated subagent write_file calls (QwenLM#3505)

Backport of upstream QwenLM#3505. Propagates MAX_TOKENS truncation from
subagent responses into tool requests and rejects truncated edit calls
before schema validation can surface misleading missing-parameter errors.

Adapted to our fork's coreToolScheduler.ts which already had the
truncation rejection block — kept both, dropped the unused
clearRetryCountsForTool() call (we don't have that retry-counter
machinery yet).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): prevent malformed permission rules from becoming tool-wide catch-alls (QwenLM#3467)

Backport of upstream QwenLM#3467. A permission rule with unbalanced parens
was silently parsed with specifier: undefined, causing matchesRule to
treat it as a catch-all. For deny rules this blocked all commands; for
allow rules a typo could silently auto-approve everything.

- Adds an invalid flag to PermissionRule
- parseRule marks unbalanced-paren rules as invalid
- matchesRule short-circuits invalid rules to never match
- parseRules / addSession*Rule / addPersistentRule warn on malformed input
- listRules filters invalid rules from /permissions UI

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): preserve reasoning_content during session resume and active sessions (GH#3579)

* test(config): drop fork-incompatible QwenLM#3304 strip-thoughts test

The test from upstream QwenLM#3304 backport assumed an in-place qwen-oauth
model switch path that our fork doesn't have; the source-side fix in
config.ts (stripThoughtsFromHistory call in handleModelChange) is
preserved. Coverage will be re-added when the fork's switch flow
stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(acp): run Agent tool calls concurrently + graceful degrade (QwenLM#3463)

Backport of upstream QwenLM#3463. When the model returns multiple Agent tool
calls in a single turn, ACP Session was executing them sequentially in
a for-loop, multiplying latency by sub-agent count.

- Add private runToolCalls() helper that mirrors coreToolScheduler's
  partition logic: consecutive Agent calls form a parallel batch (safe
  because sub-agents have no shared mutable state); other tools form
  sequential batches.
- Replace 2 for-loops in Session.ts with runToolCalls() calls.
- Switch the AgentTool eventEmitter guard from key-presence check to
  truthy check (commit 651979c) — the key-presence check passed for
  { eventEmitter: undefined } and crashed inside SubAgentTracker.setup.

Note: upstream replaced 3 for-loops; our fork only had 2 in those code
paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(acp): support SSE and HTTP MCP servers in ACP mode

In ACP mode, the Mcp server list sent by the IDE client can include
SSE (type: "sse") and HTTP (type: "http") transports, but the previous
implementation only handled stdio servers via toStdioServer(). Non-stdio
servers were silently skipped (continue), so any SSE/HTTP-configured
MCP server would never be registered.

Changes:
- Add toSseServer() helper: detects type=="sse" servers and maps them
  to MCPServerConfig(url=..., headers=...)
- Add toHttpServer() helper: detects type=="http" servers and maps them
  to MCPServerConfig(httpUrl=..., headers=...)
- Refactor newSessionConfig() loop to handle all three transport types
- Declare mcpCapabilities: { sse: true, http: true } in agentCapabilities
  so IDE clients know this agent supports these transports without needing
  a transparent proxy
- Export the three helper functions for unit testing

Tests:
- Unit tests for toStdioServer / toSseServer / toHttpServer helpers
  (type discrimination, mutual exclusion)
- Integration-style tests for QwenAgent.initialize() mcpCapabilities
- Integration-style tests for newSession() with SSE/HTTP MCP servers,
  verifying MCPServerConfig is constructed with the correct arguments
  (url vs httpUrl, headers passthrough, empty-headers → undefined)

Fixes QwenLM#3472

* fix(openai): when samplingParams is set, pass it through verbatim

Previously pipeline.ts always hardcoded max_tokens as the output-token parameter name on the OpenAI-compatible path, falling back from samplingParams.max_tokens to request.config.maxOutputTokens to provider defaults. This broke GPT-5 / o-series on OpenAI and Azure OpenAI, which require max_completion_tokens and reject max_tokens with a 400 error.

Fix: when the user provides samplingParams explicitly, treat it as the complete source of truth for the wire shape and pass its keys through verbatim. No client-injected defaults, no request fallbacks, no hardcoded parameter names. The user describes what the provider wants; the client trusts them.

When samplingParams is absent, the historical default behavior (request fallback through temperature/top_p/.../max_tokens plus provider defaults) is preserved unchanged — existing users see no difference.

Concretely, users can now set any of:

  samplingParams: { max_tokens: 4096 }                # GPT-4 / Qwen / DeepSeek

  samplingParams: { max_completion_tokens: 4096 }    # GPT-5 / o-series

  samplingParams: { reasoning_effort: 'medium' }     # future knobs

without waiting for a qwen-code release that adds model-specific branches.

Signed-off-by: Gordon Lam (SH) <yeelam@microsoft.com>

---------

Signed-off-by: Gordon Lam (SH) <yeelam@microsoft.com>
Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com>
Co-authored-by: Yuchen Fu <fuyuchen0904@163.com>
Co-authored-by: LaZzyMan <zeusdream7@gmail.com>
Co-authored-by: Gordon Lam (SH) <yeelam@microsoft.com>
wenshao pushed a commit that referenced this pull request Apr 26, 2026
…easoning is present

Audit follow-up to the previous commit. mergeConsecutiveAssistantMessages
already had a latent bug independent of reasoning_content: when both
sides had empty content, combinedContent landed at null. The previous
commit makes this more frequent because reasoning_content now survives
the merge, and Ollama (and other strict OpenAI-compatible providers)
reject `content: null` when `reasoning_content` is set, returning HTTP
400 (issue #3499).

processContent already enforces this rule for single messages via
`assistantTextContent || (reasoningParts.length > 0 ? '' : null)` —
mirror it in the merge so the merged result stays compatible.

Add a regression test for the reasoning-only merge case.

Refs #3619 #3499
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
…ant content (QwenLM#3499)

When a model (e.g. Ollama qwen3.5:9b) returns a response with
reasoning/thinking content but an empty text body, the assistant message
was constructed with `content: null`. Some OpenAI-compatible providers
(e.g. Ollama) reject such requests with HTTP 400 when reasoning_content
is also present.

Fix: in processContent(), when assistantTextContent is empty but
reasoningParts is non-empty, use '' instead of null. Tool-call-only
messages (no reasoning) retain `content: null` to stay spec-compliant.

Adds three regression tests covering:
- reasoning-only → content is ''
- tool-call-only → content is null (OpenAI spec)
- reasoning + text → content is the actual text

Fixes QwenLM#3421
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/bug Something isn't working as expected

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[API Error: 400 invalid message content type: <nil>] (Press Ctrl+Y to retry)

3 participants