fix(ollama): send think=false when thinking is off #50741

SnowSky1 wants to merge 16 commits into openclaw:main from …
Conversation
@codex review
**Greptile Summary**

This PR fixes a two-part problem with Ollama's "thinking" feature: (1) the native Ollama path did not send `think: false` when the resolved thinking level was `off`, and (2) `onPayload` mutations were not applied before the request body was serialized. The fix is clean and well-scoped.

Confidence Score: 5/5
Last reviewed commit: "fix(ollama): send th..."
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e3f01c0d43
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```ts
}

// Ollama emits reasoning-only responses unless thinking is explicitly disabled.
options.think = false;
```
**Put Ollama's `think` flag at the top level**

For native `model.api == "ollama"` runs with `thinkingLevel === "off"`, this writes `think` into `payload.options`, but Ollama's chat API expects `think` as a top-level request field. `src/agents/ollama-stream.ts` serializes `body.options` unchanged, so Qwen/DeepSeek requests still go out without an effective disable flag and continue returning reasoning-only chunks. The new tests assert the nested shape too, so the bug stays undetected.
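If that reading of the API is correct, the fix would look roughly like the sketch below; `OllamaChatBody` and `hoistThink` are illustrative names, not code from this PR.

```ts
// Hypothetical sketch: place `think` beside `options` in the outgoing
// /api/chat body instead of nesting it inside `options`.
type OllamaChatBody = {
  model: string;
  messages: Array<{ role: string; content: string }>;
  think?: boolean | "low" | "medium" | "high"; // top-level request field
  options?: Record<string, unknown>; // sampler params only (temperature, etc.)
};

function hoistThink(body: OllamaChatBody): OllamaChatBody {
  // Drop any mistakenly nested options.think and keep the rest of the options.
  const { think: _nested, ...rest } = body.options ?? {};
  // Disable reasoning at the top level, where Ollama reads it.
  return { ...body, think: false, options: rest };
}
```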
```ts
if (thinkingLevel === "off") {
  agent.streamFn = createOllamaThinkingPayloadWrapper(agent.streamFn, thinkingLevel);
```
**Exclude GPT-OSS from the blanket `think=false` wrapper**

This unconditional wrapper now runs for every Ollama request when thinking is off, including the `ollama/gpt-oss:*` models that OpenClaw already documents and supports. Ollama's thinking docs say GPT-OSS ignores boolean `think` values and only accepts `low|medium|high`, so `--thinking off` still has no effect for that model family even after this change.
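One way to implement the exclusion, sketched under the assumption that GPT-OSS models are identifiable by id; `resolveThink` is a hypothetical helper, not code from this PR.

```ts
// Hypothetical guard: GPT-OSS accepts only "low" | "medium" | "high" for
// `think`, so a boolean false is ignored there; omit the field instead.
type ThinkLevel = "off" | "low" | "medium" | "high";

function resolveThink(modelId: string, level: ThinkLevel): boolean | ThinkLevel | undefined {
  const isGptOss = /(^|\/)gpt-oss/.test(modelId); // e.g. "ollama/gpt-oss:20b"
  if (level === "off") {
    // Boolean disable works for Qwen/DeepSeek-style models but not GPT-OSS.
    return isGptOss ? undefined : false;
  }
  return level;
}
```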
The targeted Ollama regressions pass locally. From the current CI run, the 3 failing jobs look unrelated to the files touched here. This PR only updates the Ollama payload path and adds focused regression coverage in `src/agents/ollama-stream.test.ts` and `src/agents/pi-embedded-runner/extra-params.ollama.test.ts`.
💡 Codex Review
Reviewed commit: da13ec7c96
```ts
  ...(ollamaTools.length > 0 ? { tools: ollamaTools } : {}),
  options: ollamaOptions,
};
options?.onPayload?.(body as unknown as Record<string, unknown>, model);
```
**Prevent OpenAI-format Moonshot patches from hitting native Ollama**

This new `onPayload` call makes every existing payload wrapper run against native `/api/chat` bodies. That regresses supported Ollama Cloud Kimi setups (docs/providers/ollama.md:80-86 and docs/providers/ollama.md:199-203): `applyExtraParamsToAgent` still applies Moonshot compatibility to `provider="ollama"` / `modelId="kimi-k2.5:cloud"` (src/agents/pi-embedded-runner/moonshot-stream-wrappers.ts:58-76, exercised in src/agents/pi-embedded-runner-extraparams.test.ts:874-895), and that wrapper injects OpenAI-style `thinking` / `tool_choice` fields. Before this change those mutations were inert on the native path; after it, native Ollama Kimi Cloud requests can be malformed whenever thinking or a pinned tool choice is used.
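A sketch of one possible gate, assuming hooks can be tagged with the payload format they expect; `PayloadHook`, `formatGuard`, and the shape heuristic are all hypothetical, not OpenClaw's real API.

```ts
// Hypothetical gate: only run a payload hook when the body matches the
// format it was written for, so OpenAI-style Moonshot rewrites skip native
// /api/chat bodies.
type PayloadHook = (payload: Record<string, unknown>, model: string) => void;

function formatGuard(hook: PayloadHook, expectsNative: boolean): PayloadHook {
  return (payload, model) => {
    // Assumption: native /api/chat bodies carry `options` while OpenAI-style
    // bodies carry `max_tokens`; any real check would use a reliable marker.
    const isNative = "options" in payload && !("max_tokens" in payload);
    if (isNative !== expectsNative) return; // wrong format: leave body alone
    hook(payload, model);
  };
}
```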
The latest rerun narrowed the remaining CI failures down to the two Linux unit-fast OOM lanes. I followed up by isolating the top offenders called out in the shard summaries. The updated behavior file parses cleanly.
💡 Codex Review
Reviewed commit: 3d758bfafc
```ts
}

// Ollama emits reasoning-only responses unless thinking is explicitly disabled.
options.think = false;
```
**Send Ollama's `think` flag at the request top level**

For native `/api/chat` requests with `thinkingLevel === "off"`, this wrapper writes `think` into `payload.options`, but `createOllamaStreamFn` now serializes that mutated object unchanged in src/agents/ollama-stream.ts:462-469. Ollama's chat API reads `think` as a top-level request field, so qwen/deepseek-style native runs still go out without an effective disable flag and continue returning reasoning-only chunks.
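Since the earlier review notes the new tests assert the nested shape, a regression test pinned to the wire format would catch this. A vitest-style sketch; `buildChatBody` is a stand-in for whatever helper produces the outgoing request in the real suite.

```ts
import { describe, expect, it } from "vitest";

type ChatBody = { think?: boolean; options?: Record<string, unknown> };

// Minimal stand-in for the production serializer, encoding the intended shape.
function buildChatBody({ thinkingLevel }: { thinkingLevel: string }): ChatBody {
  const body: ChatBody = { options: { temperature: 0 } };
  if (thinkingLevel === "off") {
    body.think = false; // top level, never options.think
  }
  return body;
}

describe("ollama think=off wire format", () => {
  it("puts think at the top level, not under options", () => {
    const body = buildChatBody({ thinkingLevel: "off" });
    expect(body.think).toBe(false);
    expect(body.options?.think).toBeUndefined();
  });
});
```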
```ts
if (thinkingLevel === "off") {
  agent.streamFn = createOllamaThinkingPayloadWrapper(agent.streamFn, thinkingLevel);
```
**Skip boolean `think=false` for native GPT-OSS runs**

This wrapper is installed for every `thinkingLevel === "off"` run, and src/agents/pi-embedded-runner/run.ts:451-452 makes `off` the default when the user does not pick a thinking level. That means the documented default Ollama model (docs/providers/ollama.md:222-223, `ollama/gpt-oss:20b`) now gets a boolean `think=false` on every native request, but Ollama only accepts `low|medium|high` for GPT-OSS, so `/thinking off` still cannot disable reasoning for that supported setup.
```ts
  ...(ollamaTools.length > 0 ? { tools: ollamaTools } : {}),
  options: ollamaOptions,
};
options?.onPayload?.(body as unknown as Record<string, unknown>, model);
```
**Keep Moonshot payload rewrites off native Ollama bodies**

This new `onPayload` hook now activates existing payload wrappers on native Ollama requests. `applyExtraParamsToAgent` still installs the Moonshot compatibility path for `provider="ollama"` + `kimi-k*:cloud`, and that wrapper emits OpenAI-style `thinking`/`tool_choice` fields (see src/agents/pi-embedded-runner/moonshot-stream-wrappers.ts, exercised by src/agents/pi-embedded-runner-extraparams.test.ts:859-910). Before this change those mutations were inert on the native path; now native Kimi Cloud calls with thinking or a pinned tool choice can be serialized as malformed `/api/chat` payloads.
**Follow-up**

I pushed a small CI-focused follow-up for the remaining failures and verified it locally. This should address the last Windows runtime.integration failure and reduce the two Linux unit-fast OOM lanes.
**Follow-up**

I pushed one more small CI-only follow-up based on the latest Linux shard OOM summaries, adding the current top retainers. Re-validated locally.
**Follow-up**

I pushed one more tiny CI-only follow-up for the last remaining Linux shard, adding the latest top retainers. Re-validated locally.
**Follow-up**

I pushed one more tiny CI-only follow-up for the latest shard 2 summaries, adding these new top retainers. Re-validated locally.
💡 Codex Review
Reviewed commit: 564c1f0979
```ts
if (thinkingLevel === "off") {
  agent.streamFn = createOllamaThinkingPayloadWrapper(agent.streamFn, thinkingLevel);
```
**Skip `think=false` for Ollama models without thinking support**

Because src/agents/pi-embedded-runner/run.ts:451 defaults an unset thinking level to `off`, these lines now wrap every native `model.api === "ollama"` request and force `think=false`. That also hits ordinary Ollama models such as the documented `ollama/llama3.3` fallback in docs/providers/ollama.md:223, even though Ollama only accepts `think` on thinking-capable models and returns HTTP 400 ("does not support thinking") for unsupported ones. In other words, a standard non-reasoning native Ollama setup can start failing on every request instead of behaving as before.
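If capability metadata is not available up front, a defensive fallback could retry once without the flag on that 400. A sketch only: the endpoint and 400-on-unsupported-`think` behavior follow Ollama's documented API, while the retry helper itself is hypothetical.

```ts
// Hypothetical fallback: if a model rejects `think` with HTTP 400, retry the
// same request once without the flag instead of failing every call.
async function chatWithOptionalThink(body: Record<string, unknown>): Promise<Response> {
  const post = (b: Record<string, unknown>) =>
    fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(b),
    });

  const res = await post(body);
  if (res.status === 400 && "think" in body) {
    // Model does not support thinking; drop the flag and retry once.
    const { think: _dropped, ...withoutThink } = body;
    return post(withoutThink);
  }
  return res;
}
```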
```
…think-false

# Conflicts:
#	scripts/test-parallel.mjs
```
Synced this branch onto the latest main. I also dropped the earlier CI-only follow-up edits that are now covered upstream, so the net diff is back to the Ollama fix plus focused regressions only. Local verification on the rebased branch passes.

@codex review
💡 Codex Review
Reviewed commit: ffe1d6101e
```ts
  payloadObj.options = options;
}

// Ollama emits reasoning-only responses unless thinking is explicitly disabled.
options.think = false;
```
**Move the Ollama `think` flag out of `options`**

This patch writes `think` to `payload.options`, but Ollama's native `/api/chat` request schema uses `think` as a top-level field, separate from `options`. In practice that means `thinkingLevel=off` still reaches Ollama without the disabling flag, so reasoning-capable native Ollama models keep emitting thinking-only traces and the regression remains. The new tests stay green because they assert the same nested shape instead of the wire format Ollama actually accepts.
Synced this branch onto the latest main. Local verification after the sync passes, so the focused Ollama fix and regression coverage still pass cleanly on top of the current main.

@codex review
💡 Codex Review
Reviewed commit: 78be5af097
```ts
// Ollama emits reasoning-only responses unless thinking is explicitly disabled.
options.think = false;
```
**Send `think` as a top-level Ollama field**

When `thinkingLevel === "off"` on native Ollama models, this wrapper writes `options.think = false`. I checked Ollama's `/api/chat` schema (docs.ollama.com/api/chat), and `think` is a top-level request field next to `options`, not nested inside it. Because `createOllamaStreamFn` now serializes `onPayload` mutations into the outgoing body (src/agents/ollama-stream.ts:462-469), the new flag still lands in the wrong place, so the intended `/think off` fix remains ineffective on the native Ollama path.
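For reference, the wire shape that reading implies, as a sketch of the body only; the model id and option values are illustrative.

```ts
// Sketch of the body shape implied by the review: `think` sits beside
// `options`, not inside it.
const body = {
  model: "qwen3:8b", // hypothetical model id
  messages: [{ role: "user", content: "hello" }],
  think: false, // top level: disables reasoning on thinking-capable models
  options: { temperature: 0 }, // sampler params stay nested here
};
```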
```ts
  ...(ollamaTools.length > 0 ? { tools: ollamaTools } : {}),
  options: ollamaOptions,
};
options?.onPayload?.(body as unknown as Record<string, unknown>, model);
```
**Skip Moonshot `thinking` rewrites on native Ollama**

This new `onPayload` hook makes existing payload wrappers mutate native `/api/chat` requests too. That includes the legacy Moonshot fallback: `applyExtraParamsToAgent` still installs `createMoonshotThinkingWrapper` for `ollama/kimi-k*:cloud`, and that wrapper writes `payload.thinking = { type: ... }` in OpenAI/Moonshot format (src/agents/pi-embedded-runner/moonshot-stream-wrappers.ts:121-128). The bundled Ollama provider defaults `*:cloud` models to `api: "ollama"` (extensions/ollama/index.ts:78, src/plugins/provider-ollama-setup.ts:21), so default Kimi cloud runs now send a non-native `thinking` object over `/api/chat` as soon as this line starts honoring `onPayload` mutations.
@BunsDev when you have a moment, could you take a quick look? This branch is now synced to the latest main. Thanks!
@vincentkoc when you have a moment, could you take a quick look at this one as well? It is now synced to the current main, the scope is back to the focused Ollama `think:false` fix plus regression coverage, and the latest CI run is green/skipped across the board. Thanks.
Closing this to reduce my active PR queue and focus on the smallest, most mergeable fixes. Happy to reopen if this is still useful later.
Summary

- Set `options.think = false` for native Ollama requests when the resolved thinking level is `off`.
- Apply `onPayload` mutations before serializing the request body.

Testing

- `pnpm test -- src/agents/ollama-stream.test.ts`
- `pnpm test -- src/agents/pi-embedded-runner/extra-params.ollama.test.ts`
- `pnpm exec oxfmt --check src/agents/ollama-stream.ts src/agents/ollama-stream.test.ts src/agents/pi-embedded-runner/extra-params.ts src/agents/pi-embedded-runner/extra-params.test-support.ts src/agents/pi-embedded-runner/ollama-stream-wrappers.ts src/agents/pi-embedded-runner/extra-params.ollama.test.ts`

Closes #50712