fix(agents): use correct Ollama thinking field instead of reasoning#34765
jnMetaCode wants to merge 1 commit into openclaw:main
Conversation
Greptile Summary: This PR fixes a field name mismatch in the Ollama native API stream handler: the code was reading `message.reasoning`, but the native API returns thinking content in `message.thinking`. The fix is surgical and correct.
The logic is sound: the streaming accumulator only falls back to `thinking` when `content` is empty. Confidence Score: 5/5
Last reviewed commit: 1bdbc24
jnMetaCode force-pushed from 3058e30 to f84a116
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f84a116a3b
```ts
} else if (chunk.message?.thinking) {
  // Qwen 3 thinking mode: content may be empty, output in thinking
  accumulatedContent += chunk.message.thinking;
```
Apply thinking fallback only after final content is known
In createOllamaStreamFn, the loop appends chunk.message.thinking as soon as a chunk has empty content, so any model that streams thinking first and final answer text later will produce a merged output containing internal thinking plus the answer. This turns a fallback path into a default concatenation path for mixed streams, which can leak reasoning text and degrade response quality; instead, keep thinking/content in separate buffers and only use thinking if no content was produced by the end of the stream.
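The separate-buffer approach described above could be sketched as follows. This is a minimal illustration with hypothetical names (`StreamChunk`, `collectStream`), not code from this PR:

```typescript
// Hypothetical sketch: accumulate content and thinking in separate buffers,
// and only fall back to thinking once the whole stream has been consumed.
interface StreamChunk {
  message?: { content?: string; thinking?: string };
}

function collectStream(chunks: StreamChunk[]): string {
  let content = "";
  let thinking = "";
  for (const chunk of chunks) {
    if (chunk.message?.content) content += chunk.message.content;
    if (chunk.message?.thinking) thinking += chunk.message.thinking;
  }
  // Use thinking only when no answer text arrived at all (e.g. Qwen 3
  // thinking mode, where content stays empty); never interleave the two.
  return content.length > 0 ? content : thinking;
}
```

With a mixed stream (thinking chunks first, answer text later) this returns only the answer text, avoiding the concatenation the review flags.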
Good observation. However, this PR is strictly a field-name correction (reasoning → thinking) to match the actual Ollama API response — the accumulation logic is unchanged from the existing code on main.
The separate-buffer approach for mixed thinking+content streams is a valid improvement, but it's orthogonal to this rename fix and should be addressed in a separate PR to keep the scope tight.
jnMetaCode force-pushed from f84a116 to b8f155e
Closing this as superseded by #39292, which merged on 2026-03-08 and covers the same native Ollama `thinking` field fix. Thank you for the original fix direction here. The merged path kept the broader compatibility handling and added follow-up coverage for mixed fallback/content streaming. If there is behavior in this PR that you believe did not land in #39292, reply here and we can reopen or spin out the missing piece explicitly.
Summary
The Ollama native API (`/api/chat`) returns thinking content in `message.thinking`, but the stream handler reads `message.reasoning`, a field that doesn't exist in the native API schema. This causes thinking-capable models (Qwen 3, DeepSeek R1, kimi-k2.5, glm-5) to silently discard all thinking content when streamed through the native Ollama endpoint, potentially producing empty responses.

Fixes #34722
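For illustration, here is roughly what a native `/api/chat` streaming chunk looks like for a thinking model, and why reading the wrong field silently loses the output. The chunk shape follows the Ollama native API; the sample values and the `OllamaChunk` name are made up for this sketch:

```typescript
// Representative native /api/chat streaming chunk from a thinking-capable model.
// Reasoning tokens arrive in `message.thinking`; `message.reasoning` does not exist.
type OllamaChunk = {
  model: string;
  message?: { role: string; content: string; thinking?: string };
  done: boolean;
};

const chunk: OllamaChunk = {
  model: "qwen3:32b",
  message: { role: "assistant", content: "", thinking: "Let me think..." },
  done: false,
};

// Reading the nonexistent field yields undefined, so the thinking text is
// dropped; the correct field carries the tokens.
const wrong = (chunk.message as any)?.reasoning; // undefined
const right = chunk.message?.thinking;
```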
Changes
- `OllamaChatResponse` interface: `reasoning` → `thinking`
- `buildAssistantMessage`: fallback field `reasoning` → `thinking`
- Stream handler: `chunk.message?.reasoning` → `chunk.message?.thinking`

Test plan
- `pnpm vitest run src/agents/ollama-stream.test.ts`: 20/20 tests pass
- Run a thinking-capable model (`qwen3:32b`) via the Ollama native API and verify thinking content appears