Summary

OpenClaw currently connects to Ollama through the OpenAI compatibility layer (`/v1/chat/completions`), which silently drops tool calls when streaming is enabled. Since OpenClaw hardcodes `stream: true`, no Ollama model can use tools — the model decides to call a tool, but the streaming response returns empty content with `finish_reason: "stop"`, losing the tool call entirely.

Meanwhile, Ollama's native API (`/api/chat`) has fully supported streaming + tool calling since May 2025 (blog post, PR ollama/ollama#10415). The problem isn't Ollama — it's that OpenClaw routes through a broken compatibility layer instead of using the native endpoint.
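To make the contrast concrete, here is a rough sketch of the same tool-enabled streaming request sent to both endpoints. The model name, example tool, and localhost URL are illustrative assumptions, not OpenClaw's actual values:

```ts
// The same streaming request with one tool, sent to both Ollama endpoints.
const payload = {
  model: "qwen3:32b",
  stream: true,
  messages: [{ role: "user", content: "What's the weather in Berlin?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
};

// OpenAI compatibility layer (what OpenClaw uses today): with stream: true the
// stream ends with empty content and finish_reason: "stop" - the tool call is
// dropped.
await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});

// Native API (proposed): the NDJSON stream carries the tool call in
// message.tool_calls on intermediate done:false chunks.
await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});
```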
Root causes identified (3 issues)

| # | Problem | Impact |
|---|---------|--------|
| 1 | OpenAI compat endpoint drops `tool_calls` when streaming | Tool calls silently lost — model produces them, response doesn't contain them |
| 2 | Ollama sends `tool_calls` in intermediate chunks (`done: false`), not the final `done: true` chunk | Native API client must accumulate `tool_calls` across all chunks (see the sketch below) |
| 3 | Ollama defaults `num_ctx` to 4096 tokens regardless of the model's actual context window | Large system prompts + 23 tool definitions get silently truncated, so the model never sees the tool schemas |
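For reference, a rough sketch of the chunk shape row 2 refers to, with the option row 3 is about alongside. Field names follow Ollama's public API docs; this is an illustration, not OpenClaw code:

```ts
// Rough shape of one NDJSON chunk from the native /api/chat stream (row 2).
interface OllamaChatChunk {
  model: string;
  created_at: string;
  message: {
    role: "assistant";
    content: string;
    // Appears on intermediate chunks where done is false; the final
    // done: true chunk does not repeat them, so a client has to collect
    // tool calls as the chunks arrive.
    tool_calls?: Array<{
      function: { name: string; arguments: Record<string, unknown> };
    }>;
  };
  done: boolean;
}

// Row 3: the context window is a per-request option; left unset, Ollama's
// 4096-token default silently truncates large prompts and tool schemas.
const requestOptions = { options: { num_ctx: 65536 } };
```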
Proposed solution
Add a dedicated `ollama` API provider type that talks to Ollama's native `/api/chat` endpoint directly, with proper streaming chunk handling and context window configuration.
Config example:
{ "models": { "providers": { "ollama": { "baseUrl": "http://localhost:11434", "api": "ollama", "models": [{ "id": "qwen3:32b", "name": "Qwen3 32B", "reasoning": true, "input": ["text"], "contextWindow": 131072, "maxTokens": 16384 }] } } } }What this enables:
| | `openai-completions` (current) | `ollama` (proposed) |
|---|---|---|
| Endpoint | `/v1/chat/completions` | `/api/chat` |
| Context window | 4096 (Ollama default) | `num_ctx` from model config |

Implementation scope:
"ollama"to theApitype uniontool_callsfrom intermediatedone:falsechunksnum_ctxfrom model'scontextWindowconfig (default 65536) to prevent prompt truncationVerified behavior
Verified behavior

Tested with `qwen3:32b` (32B parameters) on a MacBook Pro M4 Pro 48GB: `num_ctx=65536` + 23 tools + system prompt → `tool_calls` generated correctly.

Alternatives considered
- `streamToolCalls: false` fallback: Disables streaming when tools are present. This works but sacrifices the streaming UX — users see no output until the full response is ready.
- Wait for Ollama to fix `/v1/chat/completions`: Tracked in ollama#12557, but no timeline. The native API already works, so there's no reason to wait.
- jokelord's `supportedParameters` patch: Adds config-level tool support declaration for local models (sglang/vLLM). Solves a different problem (tool detection) but doesn't fix the streaming issue with Ollama.

Additional context
- `num_ctx` problem reported in the OpenCode project

Tested environment: