fix(ollama): send think=false for thinking models when thinking is off (#53200)
Greptile Summary: This PR fixes a bug where Ollama thinking-capable models (e.g. qwen3.5, kimi-k2.5) continued generating reasoning tokens even when thinking was set to off.
Confidence Score: 5/5
Force-pushed from d46af03 to 3838686.
Ollama thinking-capable models default to think=true when the parameter is absent. When OpenClaw has thinking set to off, the request never included think=false, so models continued generating thinking tokens that were then discarded by the response parser, producing empty responses. Wire onPayload into the Ollama stream path so payload wrappers can mutate the request body, and add an Ollama-specific wrapper that sets top-level think=false when thinkingLevel is off. Fixes openclaw#46680, openclaw#50702, openclaw#50712 Co-Authored-By: SnowSky1 <126348592+snowsky1@users.noreply.github.com>
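The two changes named in the commit message can be sketched as follows. This is an illustrative sketch only: the type names (`ThinkLevel`, `ChatPayload`, `PayloadHook`) and the helper signatures are assumptions, not OpenClaw's real API. The one detail mirrored from the PR is that `think` must sit at the request body root, never under `options`.

```typescript
// Sketch under assumed names — not the actual OpenClaw code.
type ThinkLevel = "off" | "low" | "medium" | "high";

interface ChatPayload {
  model: string;
  messages: { role: string; content: string }[];
  think?: boolean;
  options?: Record<string, unknown>;
}

type PayloadHook = (payload: ChatPayload) => ChatPayload;

// Ollama-specific wrapper: when thinking is off, force top-level think=false.
// Non-Ollama models and non-off levels pass through untouched.
function ollamaThinkOffWrapper(api: string, level: ThinkLevel): PayloadHook {
  return (payload) =>
    api === "ollama" && level === "off" ? { ...payload, think: false } : payload;
}

// The stream path hands the body to onPayload before serialization,
// so wrappers like the one above can mutate it.
function serializeRequestBody(payload: ChatPayload, onPayload?: PayloadHook): string {
  return JSON.stringify(onPayload ? onPayload(payload) : payload);
}
```

With this shape, `serializeRequestBody({ model: "qwen3.5", messages: [] }, ollamaThinkOffWrapper("ollama", "off"))` yields a body containing `"think":false` at the root.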
Force-pushed from 3838686 to 4fb8c01.
Landed via temp rebase onto main.
Thanks @BruceMacD!
Summary
When `thinkingLevel` is `off`, OpenClaw never sends `think: false` in the Ollama `/api/chat` request body. Ollama defaults to `think: true` for thinking-capable models, so they continue generating thinking tokens that the response parser discards. This wastes tokens and, for some badly behaved models, can produce empty responses.

This PR makes four changes: (1) wire `onPayload` into `createOllamaStreamFn` so payload wrappers can mutate the request body before serialization; (2) add an Ollama-specific wrapper in `extra-params.ts` that sets top-level `think: false` when `thinkingLevel === "off"`; (3) add regression tests covering the payload injection, non-Ollama passthrough, and non-off passthrough; (4) widen the test-support `thinkingLevel` type to use the canonical `ThinkLevel` union.

The response parser (`buildAssistantMessage`) is untouched. No changes to other providers. No new files beyond the test.

Note: @SnowSky1 has been added as a collaborator. This builds on their work in #50741 but aims to be as minimal as possible. Their PR had a bug where `think` was nested under `options` instead of being set as a top-level request body field, where Ollama expects it.

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause / Regression History (if applicable)
The Ollama stream path (`createOllamaStreamFn`) never called `onPayload`, so payload wrappers had no way to mutate the request body. Additionally, no wrapper existed to map `thinkingLevel=off` to Ollama's native `think: false` parameter. Other thinking-aware providers (SiliconFlow, Moonshot, Google) already have a dedicated wrapper.

Regression Test Plan (if applicable)
New test: `src/agents/pi-embedded-runner/extra-params.ollama.test.ts`. With `thinkingLevel=off` and `model.api=ollama`, the request payload must have top-level `think: false`; non-Ollama models and non-off thinking levels must not be affected. The test invokes `applyExtraParamsToAgent` and asserts the final payload shape.

User-visible / Behavior Changes
Ollama thinking-capable models now return actual content immediately when thinking is set to off. No config changes needed.
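For reference, `think` is a top-level field in Ollama's `/api/chat` request body, not an entry under `options`. A sketch of the body OpenClaw now sends when thinking is off (the model name is the one from the repro; any thinking-capable model works):

```shell
# Request body with "think" at the root; nesting it under "options"
# has no effect, which was the bug in the earlier PR #50741.
BODY='{
  "model": "qwen3.5:cloud",
  "messages": [{"role": "user", "content": "hi"}],
  "think": false,
  "stream": false
}'
# Assuming a local Ollama server at the default address with the model
# pulled, the request would be:
#   curl -s http://localhost:11434/api/chat -d "$BODY"
echo "$BODY"
```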
Security Impact (required)
- No
- No
- No (same Ollama `/api/chat` endpoint, one additional field in body)
- No
- No

Repro + Verification
Environment
- `qwen3.5:cloud` via Ollama
- `thinking: "off"`

Steps
- Use a thinking-capable model (`qwen3.5:cloud`)
- Turn thinking off (`/think off`)

Expected
Actual (before fix)
Evidence
Human Verification (required)
Verified that top-level `think: false` is received only when thinking is off. Ran `pnpm test`. Typecheck and all lint/format checks green.

Review Conversations
Compatibility / Migration
- Yes
- No
- No

Failure Recovery (if this breaks)
The wrapper early-returns when `model.api !== "ollama"`, so only Ollama is affected. Files touched: `src/agents/ollama-stream.ts`, `src/agents/pi-embedded-runner/extra-params.ts`.

Risks and Mitigations
Risk: servers or models that do not understand the `think` field. Mitigation: it is only sent when `thinkingLevel=off`, which is already a no-op for non-thinking models.