Skip to content

fix: backport upstream reasoning + correctness fixes (#3315, #3590, #3505, #3467)#113

Merged
mabry1985 merged 7 commits into
devfrom
chore/backport-reasoning-correctness
Apr 26, 2026
Merged

fix: backport upstream reasoning + correctness fixes (#3315, #3590, #3505, #3467)#113
mabry1985 merged 7 commits into
devfrom
chore/backport-reasoning-correctness

Conversation

@mabry1985

Copy link
Copy Markdown

Summary

Backports 4 upstream bug fixes:

Adaptations to fork

Test plan

  • 5326 core tests passing locally
  • CI green

🤖 Generated with Claude Code

Automaker and others added 7 commits April 26, 2026 00:08
…wenLM#3320)

Cherry-picked from QwenLM/qwen-code:

- QwenLM#3499 fix(core): use empty string instead of null for reasoning-only
  assistant content. Some OpenAI-compatible providers (e.g. Ollama
  qwen3.5:9b) reject content: null with HTTP 400 when reasoning_content is
  also present. Tool-call-only messages keep null per OpenAI spec.

- QwenLM#3630 fix(telemetry): switch FileExporter.serialize from JSON.stringify
  to safeJsonStringify. OTel ReadableSpans hold a BatchSpanProcessor
  back-reference that forms a cycle and crashed --telemetry-outfile users.

- QwenLM#3320 fix(core): cap chokidar depth at 2 in SkillManager and skip
  .git / special file types. Prevents FD exhaustion when a skill dir
  contains node_modules etc., which silently broke node-pty I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backports upstream PR QwenLM#3525 + extends the per-stream context to also
cover our fork's <think>-tag parser state.

Bug: every caller of Config.getContentGenerator() — foreground turns,
fork subagents, run_in_background subagents, ACP concurrent Agent calls
(after QwenLM#3463) — shared a single OpenAIContentConverter, which held the
StreamingToolCallParser as an instance field. Concurrent streams
corrupted each other's tool-call buffers, surfacing as NO_RESPONSE_TEXT.

Fix:
- New ConverterStreamContext interface holds toolCallParser, thinkBuffer,
  inThinkTag — one per stream.
- createStreamContext() factory replaces resetStreamingToolCalls().
- convertOpenAIChunkToGemini(chunk, ctx) and processThinkChunk(chunk, ctx)
  thread the context through every parser/think-buffer access.
- ContentGenerationPipeline.processStreamWithLogging creates one context
  at stream entry. The error path no longer manually resets — the
  context is GC'd when the generator unwinds.

Our protoInternal recovery-note logic is preserved on the new shape.

Note: upstream's follow-up QwenLM#3550 (full stateless converter refactor) is
deferred — it's hygiene without a functional bug; QwenLM#3525 alone fixes
the concurrency race.

Tests:
- New createStreamContext describe replaces resetStreamingToolCalls suite
- Streaming <think> tests use a per-test context
- pipeline.test.ts mock updated to match the new API
- pipeline.concurrent.test.ts (from upstream commit 38edd9d)
  drives two real concurrent streams and asserts neither corrupts the
  other's tool-call output (positive control: pre-fix, this test fails
  with exactly the user-reported bug shape).

Refs upstream QwenLM#3516, QwenLM#3525.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#3304)

When switching models mid-session, reasoning_content fields from
thinking-capable models leaked into API requests sent to the new
provider, causing 422 errors on strict OpenAI-compatible endpoints.

Call stripThoughtsFromHistory() in handleModelChange() so thought parts
are removed before the next request is built for the new model.
Backport of upstream QwenLM#3505. Propagates MAX_TOKENS truncation from
subagent responses into tool requests and rejects truncated edit calls
before schema validation can surface misleading missing-parameter errors.

Adapted to our fork's coreToolScheduler.ts which already had the
truncation rejection block — kept both, dropped the unused
clearRetryCountsForTool() call (we don't have that retry-counter
machinery yet).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… catch-alls (QwenLM#3467)

Backport of upstream QwenLM#3467. A permission rule with unbalanced parens
was silently parsed with specifier: undefined, causing matchesRule to
treat it as a catch-all. For deny rules this blocked all commands; for
allow rules a typo could silently auto-approve everything.

- Adds an invalid flag to PermissionRule
- parseRule marks unbalanced-paren rules as invalid
- matchesRule short-circuits invalid rules to never match
- parseRules / addSession*Rule / addPersistentRule warn on malformed input
- listRules filters invalid rules from /permissions UI

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test from upstream QwenLM#3304 backport assumed an in-place qwen-oauth
model switch path that our fork doesn't have; the source-side fix in
config.ts (stripThoughtsFromHistory call in handleModelChange) is
preserved. Coverage will be re-added when the fork's switch flow
stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 26, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@mabry1985 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 0 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 11 minutes and 0 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 88658af1-a486-4008-bbae-da2930efa210

📥 Commits

Reviewing files that changed from the base of the PR and between 368f5d3 and 7a012a2.

📒 Files selected for processing (21)
  • packages/core/src/agents/runtime/agent-core.ts
  • packages/core/src/agents/runtime/agent-headless.test.ts
  • packages/core/src/config/config.ts
  • packages/core/src/core/client.ts
  • packages/core/src/core/coreToolScheduler.test.ts
  • packages/core/src/core/coreToolScheduler.ts
  • packages/core/src/core/openaiContentGenerator/converter.test.ts
  • packages/core/src/core/openaiContentGenerator/converter.ts
  • packages/core/src/core/openaiContentGenerator/pipeline.concurrent.test.ts
  • packages/core/src/core/openaiContentGenerator/pipeline.test.ts
  • packages/core/src/core/openaiContentGenerator/pipeline.ts
  • packages/core/src/permissions/permission-manager.test.ts
  • packages/core/src/permissions/permission-manager.ts
  • packages/core/src/permissions/rule-parser.ts
  • packages/core/src/permissions/types.ts
  • packages/core/src/services/sessionService.test.ts
  • packages/core/src/services/sessionService.ts
  • packages/core/src/skills/skill-manager.test.ts
  • packages/core/src/skills/skill-manager.ts
  • packages/core/src/telemetry/file-exporters.test.ts
  • packages/core/src/telemetry/file-exporters.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/backport-reasoning-correctness

Comment @coderabbitai help to get the list of available commands and usage tips.

@mabry1985 mabry1985 merged commit dd9c9ab into dev Apr 26, 2026
1 of 3 checks passed
@mabry1985 mabry1985 deleted the chore/backport-reasoning-correctness branch April 26, 2026 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants