## Summary

OpenClaw's Ollama native stream handler reads `chunk.message?.reasoning` to accumulate thinking/reasoning tokens, but the Ollama native API (`/api/chat`) uses `chunk.message.thinking`, not `reasoning`. As a result, thinking-capable models (kimi-k2.5, glm-5, minimax-m2.5, etc.) used via the native Ollama API silently discard all thinking content.
## Affected file

`src/` → compiled into `dist/model-selection-*.js` (multiple copies)
The specific code (line ~14749 in the compiled bundle):
```js
// Current (wrong):
else if (chunk.message?.reasoning) accumulatedContent += chunk.message.reasoning;

// Should be:
else if (chunk.message?.thinking) accumulatedContent += chunk.message.thinking;
```
Also the non-streaming path (line ~14659):
```js
// Current:
const text = response.message.content || response.message.reasoning || "";

// Should be:
const text = response.message.content || response.message.thinking || response.message.reasoning || "";
```
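If keeping `reasoning` as a fallback is worthwhile (some OpenAI-compatible endpoints do use that name), both paths could go through one helper. A minimal sketch, not the project's actual code:

```js
// Prefer the native Ollama field name; fall back to `reasoning` for
// endpoints that use that name instead.
function extractThinking(message) {
  return message?.thinking ?? message?.reasoning ?? "";
}

// Streaming path:
//   accumulatedContent += extractThinking(chunk.message);
// Non-streaming path:
//   const text = response.message.content || extractThinking(response.message);
```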
## Steps to reproduce

- Configure a custom provider with `api: "ollama"` pointing to Ollama Cloud or any Ollama instance running a thinking model (see the config sketch after this list)
- Add a model that uses thinking (e.g. `kimi-k2.5`, `glm-5`)
- Do NOT set `api` at the model level (let it inherit the provider-level `api: "ollama"`)
- Send a message
- The response arrives correctly, but thinking tokens are silently dropped
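For reference, a hypothetical provider entry matching these steps. Only the `api` field and the model name come from the report; every other key is illustrative and must be adapted to OpenClaw's actual config schema:

```js
// Hypothetical config shape -- key names other than `api` are assumptions.
const provider = {
  api: "ollama",                 // provider-level native Ollama API
  baseUrl: "https://ollama.com", // or any Ollama instance
  models: [
    { id: "kimi-k2.5" },         // no model-level `api`: inherits "ollama"
  ],
};
```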
## Verification

A direct API call to https://ollama.com/api/chat with `stream: true` returns chunks like:

```json
{"model":"kimi-k2.5","message":{"role":"assistant","content":"","thinking":"The user said..."},"done":false}
```

The field is `thinking`, not `reasoning`. With `stream: false`:

```json
{"message":{"role":"assistant","content":"Hi there!","thinking":"The user said hi..."}}
```
## Impact

Thinking content is completely dropped. Worse, if a model emits only thinking tokens and leaves `content` empty until the final chunk (which some models do), the result can appear as an entirely empty response.
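Roughly what such a stream looks like; with the handler reading `reasoning`, nothing at all is accumulated:

```json
{"message":{"role":"assistant","content":"","thinking":"Let me think..."},"done":false}
{"message":{"role":"assistant","content":"","thinking":"...done."},"done":false}
{"message":{"role":"assistant","content":""},"done":true}
```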
## Workaround

Patch the compiled `model-selection-*.js` files manually until this is fixed upstream; a sketch of a patch script follows. Note there are three copies of the file (`model-selection-Zb7eBzSY.js`, `model-selection-ikt2OC4j.js`, `model-selection-CjMYMtR0.js`) serving different embedded contexts, and all three must be patched.
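The one-line replacements can be scripted. A hedged sketch: it assumes the minified bundles contain exactly the expressions quoted above, and that `dist` is the actual install location, so verify (and back up the files) before running:

```js
// patch-thinking.mjs: swap the misread field in all compiled
// model-selection bundles found in the dist directory.
import { readdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

const distDir = "dist"; // adjust to the actual install location

for (const name of readdirSync(distDir)) {
  if (!/^model-selection-.*\.js$/.test(name)) continue;
  const file = join(distDir, name);
  const src = readFileSync(file, "utf8");
  const patched = src
    // Streaming path: read `thinking` instead of `reasoning`.
    .replaceAll("chunk.message?.reasoning", "chunk.message?.thinking")
    .replaceAll("chunk.message.reasoning", "chunk.message.thinking")
    // Non-streaming path: try `thinking` before the `reasoning` fallback.
    .replaceAll(
      "response.message.content || response.message.reasoning",
      "response.message.content || response.message.thinking || response.message.reasoning"
    );
  if (patched !== src) {
    writeFileSync(file, patched);
    console.log("patched", name);
  }
}
```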
## References
Ollama docs confirm the field name: https://github.com/ollama/ollama/blob/main/docs/api.md