
fix(agents): use correct Ollama thinking field instead of reasoning#34765

Closed
jnMetaCode wants to merge 1 commit into openclaw:main from jnMetaCode:fix/ollama-thinking-field-name

Conversation

@jnMetaCode
Contributor

Summary

The Ollama native API (/api/chat) returns thinking content in message.thinking, but the stream handler reads message.reasoning — a field that doesn't exist in the native API schema. This causes thinking-capable models (Qwen 3, DeepSeek R1, kimi-k2.5, glm-5) to silently discard all thinking content when streamed through the native Ollama endpoint, potentially producing empty responses.

Fixes #34722

Changes

  • OllamaChatResponse interface: reasoning → thinking
  • buildAssistantMessage: fallback field reasoning → thinking
  • Streaming accumulator: chunk.message?.reasoning → chunk.message?.thinking
  • Updated tests to match the corrected field name
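The rename above can be sketched as follows. These are hypothetical, simplified types for illustration; the project's actual OllamaChatResponse interface and helper names carry more fields than shown here.

```typescript
// Hypothetical, simplified types illustrating the rename; the project's
// actual OllamaChatResponse interface has more fields.
interface OllamaChatMessage {
  role: string;
  content: string;
  // Was `reasoning?` before this fix; Ollama's native /api/chat never
  // populates a `reasoning` field, so the old read always yielded undefined.
  thinking?: string;
}

interface OllamaChatResponse {
  message: OllamaChatMessage;
  done: boolean;
}

// Mirrors the buildAssistantMessage fallback: prefer content, otherwise
// fall back to the thinking text so the response is not silently empty.
function extractText(response: OllamaChatResponse): string {
  return response.message.content || response.message.thinking || "";
}
```

With the old field name, `extractText` would return an empty string for any thinking-only response, which is exactly the silent-discard behavior reported in #34722.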

Test plan

  • pnpm vitest run src/agents/ollama-stream.test.ts — 20/20 tests pass
  • Manual: stream a thinking model (e.g. qwen3:32b) via Ollama native API and verify thinking content appears

openclaw-barnacle (bot) added the agents (Agent runtime and tooling) and size: XS labels on Mar 4, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Mar 4, 2026

Greptile Summary

This PR fixes a field name mismatch in the Ollama native API stream handler: the code was reading message.reasoning to capture thinking/chain-of-thought output, but the Ollama /api/chat endpoint actually returns this content under message.thinking. This caused thinking-capable models (Qwen 3, DeepSeek R1, etc.) to silently discard all thinking content when streamed through the native Ollama endpoint.

The fix is surgical and correct:

  • OllamaChatResponse interface type updated: reasoning? → thinking?
  • buildAssistantMessage fallback updated to read response.message.thinking
  • The streaming accumulator's else if branch updated to check chunk.message?.thinking
  • All affected tests updated to match the corrected field name

The logic is sound: the streaming accumulator only falls back to thinking content when content is empty in a given chunk, which matches Ollama's actual streaming behavior where thinking and response content arrive in separate chunks. After streaming completes, the fully accumulated string (including any thinking content) is placed into finalResponse.message.content before buildAssistantMessage is called, so the thinking fallback in buildAssistantMessage serves the non-streaming (single-response) path.
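The accumulation behavior described above can be sketched as a minimal loop. The chunk type and function shape are assumptions for illustration, not the project's actual stream-handler code.

```typescript
// Simplified sketch of the streaming accumulator's fallback logic:
// content takes priority; thinking is only appended for chunks whose
// content is empty, matching how Ollama streams thinking-model output.
type Chunk = { message?: { content?: string; thinking?: string }; done?: boolean };

function accumulate(chunks: Chunk[]): string {
  let accumulatedContent = "";
  for (const chunk of chunks) {
    if (chunk.message?.content) {
      accumulatedContent += chunk.message.content;
    } else if (chunk.message?.thinking) {
      // Thinking-mode chunks arrive with empty content; the text is in `thinking`
      accumulatedContent += chunk.message.thinking;
    }
  }
  return accumulatedContent;
}
```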

Confidence Score: 5/5

  • This PR is safe to merge — it's a minimal, well-tested fix that corrects a clearly-documented API field name mismatch with no behavioral regressions.
  • The change is a one-field rename throughout a single file and its tests. The fix directly matches the Ollama native API specification (message.thinking). The 20 existing tests all continue to pass, and the streaming accumulation logic (using else if to fall back to thinking only when content is empty) correctly reflects how Ollama streams thinking model output. No new edge cases are introduced.
  • No files require special attention.

Last reviewed commit: 1bdbc24

@jnMetaCode force-pushed the fix/ollama-thinking-field-name branch 4 times, most recently from 3058e30 to f84a116 on March 7, 2026 00:56

chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f84a116a3b


Comment on lines +477 to +479
} else if (chunk.message?.thinking) {
// Qwen 3 thinking mode: content may be empty, output in thinking
accumulatedContent += chunk.message.thinking;


P1: Apply thinking fallback only after final content is known

In createOllamaStreamFn, the loop appends chunk.message.thinking as soon as a chunk has empty content, so any model that streams thinking first and final answer text later will produce a merged output containing internal thinking plus the answer. This turns a fallback path into a default concatenation path for mixed streams, which can leak reasoning text and degrade response quality; instead, keep thinking/content in separate buffers and only use thinking if no content was produced by the end of the stream.
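The separate-buffer alternative suggested here could look like the following sketch. This is an assumption of how the suggestion might be implemented, not code from this PR or from #39292.

```typescript
// Separate-buffer variant: thinking and content accumulate independently,
// and thinking is used only if the whole stream produced no content,
// so reasoning text never leaks into a mixed thinking+answer response.
type StreamChunk = { message?: { content?: string; thinking?: string } };

function accumulateSeparately(chunks: StreamChunk[]): string {
  let content = "";
  let thinking = "";
  for (const chunk of chunks) {
    if (chunk.message?.content) content += chunk.message.content;
    if (chunk.message?.thinking) thinking += chunk.message.thinking;
  }
  // Decide only once the stream is complete: fall back to thinking
  // solely when there is no final answer text at all.
  return content || thinking;
}
```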


Contributor Author


Good observation. However, this PR is strictly a field-name correction (reasoning → thinking) to match the actual Ollama API response — the accumulation logic is unchanged from the existing code on main.

The separate-buffer approach for mixed thinking+content streams is a valid improvement, but it's orthogonal to this rename fix and should be addressed in a separate PR to keep the scope tight.

@vincentkoc
Member

Closing this as superseded by #39292, which merged on 2026-03-08 and covers the same native Ollama message.thinking path while preserving message.reasoning as a compatibility fallback.

Thank you for the original fix direction here. The merged path kept the broader compatibility handling and added follow-up coverage for mixed fallback/content streaming.

If there is behavior in this PR that you believe did not land in #39292, reply here and we can reopen or spin out the missing piece explicitly.

@vincentkoc vincentkoc closed this Mar 8, 2026



Development

Successfully merging this pull request may close these issues.

bug: Ollama native API stream reader checks message.reasoning but field is message.thinking
