
Ollama (and other local models): streaming breaks tool calling — need stream:false fallback #5769

@code-compliant

Description


Summary

When using Ollama-hosted models (e.g. Mistral Small 3.2 24B) via the openai-completions provider, tool calling silently fails because OpenClaw always sends stream: true. Ollama's streaming implementation does not properly emit tool_calls delta chunks — the model decides to call a tool, but the streaming response returns empty content with finish_reason: "stop", losing the tool call entirely.

As a result, no local model served this way can use tools through OpenClaw, even though the same models handle tool calling perfectly in non-streaming mode.

Evidence

Non-streaming (works perfectly):

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"mistral-small:24b","messages":[{"role":"user","content":"List open PRs"}],"tools":[{"type":"function","function":{"name":"exec","description":"Run a command","parameters":{"type":"object","properties":{"command":{"type":"string"}},"required":["command"]}}}]}'

Response: proper tool_calls array with finish_reason: "tool_calls"
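For reference, the non-streaming response has roughly this shape (abridged; the `id` and `arguments` values are illustrative, not captured output):

```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "exec",
              "arguments": "{\"command\": \"...\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```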

Streaming (broken):

Same request with "stream": true added:

data: {"choices":[{"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}
data: [DONE]

Tool call completely lost. Empty response. ❌
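For comparison, an OpenAI-compatible server that streams tool calls correctly would emit `tool_calls` deltas and finish with `finish_reason: "tool_calls"`, something like (illustrative, abridged):

```text
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_abc123","type":"function","function":{"name":"exec","arguments":""}}]},"finish_reason":null}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"command\":\"...\"}"}}]},"finish_reason":null}]}
data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}
data: [DONE]
```

Ollama instead collapses this into the single empty `finish_reason: "stop"` chunk shown above.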

Root Cause

In @mariozechner/pi-ai/dist/providers/openai-completions.js, line ~316:

const params = {
    model: model.id,
    messages,
    stream: true,  // ← hardcoded, no way to disable
};

This is a known Ollama limitation affecting multiple models (Mistral, Qwen, etc.) — tracked in ollama/ollama#9632 and ollama/ollama#12557. The recommended workaround from the Ollama community is to use stream: false when tools are present.

Impact

  • All Ollama models configured as custom providers cannot use tools (exec, web_search, browser, etc.)
  • Cron jobs and subagents configured with local models get narrative essays instead of tool execution
  • Heartbeat sessions with local models cannot perform any actions

Proposed Fix

Add a per-provider or per-model config option to disable streaming, or auto-detect when the provider is Ollama (baseUrl contains :11434 or provider is explicitly ollama) and fall back to non-streaming when tools are present in the request.

Something like:

const shouldStream = !(context.tools?.length && isOllamaProvider(model));
const params = {
    model: model.id,
    messages,
    stream: shouldStream,
};
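`isOllamaProvider` does not exist in pi-ai today; a minimal sketch of what it could look like, assuming the model object carries its provider name and `baseUrl` from the provider config:

```javascript
// Hypothetical helper — not part of pi-ai. Assumes the model object
// exposes `provider` and/or `baseUrl` from the custom provider config.
function isOllamaProvider(model) {
    if (model.provider === "ollama") return true;
    if (!model.baseUrl) return false;
    try {
        const url = new URL(model.baseUrl);
        // 11434 is Ollama's default port; matching on it is a heuristic
        // and misses Ollama instances served on other ports.
        return url.port === "11434";
    } catch {
        return false;
    }
}
```

The port heuristic is why an explicit config flag (below) is probably the more robust option.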

Or a model-level config option:

{
  "id": "mistral-small:24b",
  "streamToolCalls": false
}
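Either way, the request builder would only need to consult the flag when tools are present; a sketch of that decision, with hypothetical names:

```javascript
// Hypothetical — sketches how a per-model `streamToolCalls: false`
// flag could gate streaming. Streams by default, and only falls back
// to non-streaming when tools are in the request AND the model has
// opted out of streamed tool calls.
function shouldStream(model, tools) {
    const hasTools = Array.isArray(tools) && tools.length > 0;
    if (hasTools && model.streamToolCalls === false) return false;
    return true;
}
```

This keeps streaming intact for plain chat turns on the same model, so the fallback only costs latency on turns that actually carry tools.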

Environment

  • OpenClaw: v2026.1.29
  • Ollama: v0.15.1
  • Model: mistral-small:24b (Mistral Small 3.2)
  • OS: Linux 6.14.0-37-generic (x64)
  • Provider config: api: "openai-completions", baseUrl: "http://localhost:11434/v1"

Labels: bug (Something isn't working) · dedupe:parent (Primary canonical item in dedupe cluster) · stale (Marked as stale due to inactivity)
