
fix(ollama): send think=false for thinking models when thinking is off#53200

Merged
steipete merged 2 commits into openclaw:main from BruceMacD:brucemacd/empty-response-thinking-false
Mar 27, 2026

Conversation

@BruceMacD
Contributor

@BruceMacD BruceMacD commented Mar 23, 2026

Summary

  • Problem: When thinkingLevel is off, OpenClaw never sends think: false in the Ollama /api/chat request body. Ollama defaults to think: true for thinking-capable models, so they continue generating thinking tokens that the response parser discards. This wastes tokens and, for some badly behaved models, can produce empty responses.
  • Why it matters: all thinking-capable Ollama models (qwen3.5, kimi-k2.5, glm-5, etc.) continued to think silently when thinking was off, delaying their responses.
  • What changed: (1) Wire onPayload into createOllamaStreamFn so payload wrappers can mutate the request body before serialization. (2) Add an Ollama-specific wrapper in extra-params.ts that sets top-level think: false when thinkingLevel === "off". (3) Regression tests covering the payload injection, non-Ollama passthrough, and non-off passthrough. (4) Widen test-support thinkingLevel type to use the canonical ThinkLevel union.
  • What did NOT change (scope boundary): Response-side parsing (buildAssistantMessage) is untouched. No changes to other providers. No new files beyond the test.

Note: @SnowSky1 has been added as a collaborator. This builds on their work in #50741 but aims to be as minimal as possible. Their PR had a bug where think was nested under options instead of being set as a top-level request body field where Ollama expects it.
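The wrapper change in (2) can be sketched as follows. This is a simplified illustration, not the actual source: the names createOllamaThinkingOffWrapper and onPayload come from this PR's description, while the Model and payload types here are stand-ins.

```typescript
// Sketch of the payload wrapper described above (simplified types;
// not the real OpenClaw source).
type Model = { api: string };
type OnPayload = (payload: Record<string, unknown>, model: Model) => void;

function createOllamaThinkingOffWrapper(next?: OnPayload): OnPayload {
  return (payload, model) => {
    // Guarded internally: a no-op for every provider except Ollama.
    if (model.api === "ollama") {
      // Set `think` at the TOP LEVEL of the request body, not under
      // `options`; nesting it under options was the bug in #50741.
      payload.think = false;
    }
    next?.(payload, model);
  };
}

// Usage: the stream path invokes the hook before serializing the body.
const body: Record<string, unknown> = { model: "qwen3.5", messages: [] };
createOllamaThinkingOffWrapper()(body, { api: "ollama" });
// body now carries think: false
```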

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause / Regression History (if applicable)

  • Root cause: The Ollama stream path (createOllamaStreamFn) never called onPayload, so payload wrappers had no way to mutate the request body. Additionally, no wrapper existed to map thinkingLevel=off to Ollama's native think: false parameter. Other thinking-aware providers (SiliconFlow, Moonshot, Google) have a dedicated wrapper already.
  • Prior context: The Ollama native stream path was added in [Feature]: Add native Ollama API provider for streaming + tool calling support #11828. Provider-specific thinking wrappers were added incrementally for SiliconFlow, Moonshot/Kimi, and Google, but Ollama was never wired up.
  • If unknown, what was ruled out: N/A, root cause is clear.
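The hook wiring in the fix amounts to one call placed between body assembly and serialization. A minimal sketch, assuming simplified types (createOllamaStreamFn's real signature is not shown on this page):

```typescript
// Minimal sketch of the onPayload wiring (simplified; the real code
// lives in createOllamaStreamFn in src/agents/ollama-stream.ts).
interface StreamOptions {
  onPayload?: (body: Record<string, unknown>, model: { api: string }) => void;
}

function buildRequestBody(
  model: { api: string; name: string },
  messages: unknown[],
  options?: StreamOptions,
): string {
  const body: Record<string, unknown> = {
    model: model.name,
    messages,
    stream: true,
  };
  // The fix: let payload wrappers mutate the body before it is
  // serialized, so injected fields like `think` reach the wire.
  options?.onPayload?.(body, model);
  return JSON.stringify(body);
}
```

Because the hook runs before JSON.stringify, any field a wrapper sets on the body is serialized into the /api/chat request.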

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/extra-params.ollama.test.ts
  • Scenario the test should lock in: When thinkingLevel=off and model.api=ollama, the request payload must have top-level think: false. Non-Ollama models and non-off thinking levels must not be affected.
  • Why this is the smallest reliable guardrail: The test exercises the wrapper through applyExtraParamsToAgent and asserts the final payload shape.
  • Existing test that already covers this (if any): None.

User-visible / Behavior Changes

Ollama thinking-capable models now return actual content immediately when thinking is set to off. No config changes needed.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No (same Ollama /api/chat endpoint, one additional field in body)
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22, Ollama 0.18.0+
  • Model/provider: qwen3.5:cloud via Ollama
  • Integration/channel (if any): N/A
  • Relevant config (redacted): thinking: "off"

Steps

  1. Run Ollama with a thinking-capable model (e.g. qwen3.5:cloud)
  2. Set thinking to off (/think off)
  3. Send a message

Expected

  • Model returns response without thinking, user sees response

Actual (before fix)

  • The model still generates thinking tokens (which the parser discards), so the response is delayed, sometimes empty, and the user gets no indication why

Evidence

  • Failing test/log before + passing after

Human Verification (required)

  • Verified scenarios: Confirmed via debug logging in an ollama server that think: false is received only when think is off. Ran pnpm test. Typecheck and all lint/format checks green.
  • Edge cases checked: Non-Ollama models unaffected (test). Thinking levels other than off unaffected (end-to-end).
  • What you did not verify:

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert the single commit; the wrapper is guarded by model.api !== "ollama" so only Ollama is affected.
  • Files/config to restore: src/agents/ollama-stream.ts, src/agents/pi-embedded-runner/extra-params.ts
  • Known bad symptoms reviewers should watch for:

Risks and Mitigations

  • Risk: Older Ollama versions (pre-0.18.0) may not recognize the think field.
    • Mitigation: Ollama ignores unknown top-level fields; verified no error on older versions. The field is only injected when thinkingLevel=off, which is already a no-op for non-thinking models.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Mar 23, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Mar 23, 2026

Greptile Summary

This PR fixes a bug where Ollama thinking-capable models (e.g. qwen3.5, kimi-k2.5) continued generating reasoning tokens even when thinking was set to off, because the Ollama stream path never called onPayload and no wrapper existed to map thinkingLevel=off to Ollama's native think: false parameter.

  • ollama-stream.ts: One-liner that wires options?.onPayload?.(body, model) after the request body is assembled and before JSON.stringify — mutations from wrappers now reach the serialised body.
  • extra-params.ts: New createOllamaThinkingOffWrapper that intercepts the onPayload callback and injects think: false at the top level (not under options) when model.api === "ollama". Applied unconditionally when thinkingLevel === "off" but guarded internally to be a no-op for non-Ollama models.
  • extra-params.ollama.test.ts: Three regression tests — payload injection, non-Ollama passthrough, and non-off passthrough — including an explicit assertion that think is not placed under options (avoiding the bug from fix(ollama): send think=false when thinking is off #50741).
  • extra-params.test-support.ts: thinkingLevel parameter widened to the canonical ThinkLevel union so "off" and "xhigh" are usable in tests.
  • Minor type gap: OllamaChatRequest in ollama-stream.ts does not yet declare think?: boolean, so the wrapper relies on a Record<string, unknown> cast to inject the field. Functionally correct at runtime, but updating the interface would make the type a faithful model of the Ollama API.

Confidence Score: 5/5

  • PR is safe to merge; the fix is minimal, well-tested, and backward-compatible with older Ollama versions.
  • The change is small and tightly scoped: one hook call in the stream path, one wrapper function, and three regression tests that cover all the cases described in the PR. The think field is only injected when thinkingLevel === "off" and model.api === "ollama", so all other providers and thinking levels are unaffected. The only non-blocking finding is that OllamaChatRequest doesn't declare think?: boolean, which is a type-safety nicety rather than a correctness issue.
  • No files require special attention.

Comments Outside Diff (1)

  1. src/agents/ollama-stream.ts, line 41-47 (link)

    P2 OllamaChatRequest missing think field

    The OllamaChatRequest interface doesn't include a think?: boolean field, even though the new wrapper now injects it at the top level. The workaround — casting payload to Record<string, unknown> inside createOllamaThinkingOffWrapper — gets the job done at runtime (JavaScript objects are mutable and JSON.stringify will serialize the extra property), but it bypasses the type system.

    Adding think?: boolean here would let TypeScript verify the field at the injection site and make the interface a faithful description of the Ollama /api/chat body.

```suggestion
interface OllamaChatRequest {
  model: string;
  messages: OllamaChatMessage[];
  stream: boolean;
  tools?: OllamaTool[];
  options?: Record<string, unknown>;
  think?: boolean;
}
```


Reviews (1): Last reviewed commit: "fix(ollama): send think=false for thinki..."

BruceMacD and others added 2 commits March 27, 2026 23:40
Ollama thinking-capable models default to think=true when the parameter
is absent. When OpenClaw has thinking set to off, the request never
included think=false, so models continued generating thinking tokens
that were then discarded by the response parser, producing empty
responses.

Wire onPayload into the Ollama stream path so payload wrappers can
mutate the request body, and add an Ollama-specific wrapper that sets
top-level think=false when thinkingLevel is off.

Fixes openclaw#46680, openclaw#50702, openclaw#50712

Co-Authored-By: SnowSky1 <126348592+snowsky1@users.noreply.github.com>
@steipete steipete force-pushed the brucemacd/empty-response-thinking-false branch from 3838686 to 4fb8c01 on March 27, 2026 at 23:49
@steipete steipete merged commit adb78fa into openclaw:main Mar 27, 2026
9 checks passed
@openclaw-barnacle openclaw-barnacle Bot added the maintainer Maintainer-authored PR label Mar 27, 2026
@steipete
Contributor

Landed via temp rebase onto main.

  • Gate: scoped Ollama payload verification plus repo gate checks; pnpm check / pnpm build still hit unrelated existing main failures in src/agents/compaction* and src/agents/skills*
  • Land commit: 4fb8c01
  • Merge commit: adb78fa

Thanks @BruceMacD!

johnkhagler pushed a commit to johnkhagler/openclaw that referenced this pull request Mar 29, 2026
alexcode-cc pushed a commit to alexcode-cc/clawdbot that referenced this pull request Mar 30, 2026
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026