[Feature]: Add native Ollama API provider for streaming + tool calling support #11828

@BrokenFinger98

Description

Summary

OpenClaw currently connects to Ollama through the OpenAI compatibility layer (/v1/chat/completions), which silently drops tool calls when streaming is enabled. Since OpenClaw hardcodes stream: true, no Ollama model can use tools — the model decides to call a tool, but the streaming response returns empty content with finish_reason: "stop", losing the tool call entirely.
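To make the failure mode concrete, the final streamed chunk from the compat endpoint looks roughly like the hand-written literal below (an approximation of the described behavior, not captured output): the tool call the model produced never appears as a `tool_calls` delta, and the stream ends as if the model simply stopped.

```typescript
// Illustrative only: the shape of the broken final SSE chunk from
// /v1/chat/completions with stream: true when the model chose a tool.
const finalCompatChunk = {
  choices: [
    {
      delta: { content: "" },   // no tool_calls delta ever arrives
      finish_reason: "stop",    // reported as a normal stop, not "tool_calls"
    },
  ],
};
```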

Meanwhile, Ollama's native API (/api/chat) has fully supported streaming + tool calling since May 2025 (blog post, PR ollama/ollama#10415). The problem isn't Ollama — it's that OpenClaw routes through a broken compatibility layer instead of using the native endpoint.

Root causes identified (3 issues)

| # | Problem | Impact |
|---|---------|--------|
| 1 | OpenAI compat endpoint drops `tool_calls` when streaming | Tool calls silently lost — model produces them, response doesn't contain them |
| 2 | Ollama sends `tool_calls` in intermediate chunks (`done: false`), not the final `done: true` chunk | Native API client must accumulate `tool_calls` across all chunks |
| 3 | Ollama defaults `num_ctx` to 4096 tokens regardless of model's actual context window | Large system prompts + 23 tool definitions get silently truncated; model never sees the tool schemas |
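Issue 2 is the subtle one, so here is a minimal sketch of the required accumulation logic. The type and function names (`OllamaChunk`, `accumulate`) are illustrative, not OpenClaw's actual API; the chunk shape follows Ollama's native `/api/chat` NDJSON format, where `message.tool_calls` can arrive on chunks with `done: false`.

```typescript
interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}

interface OllamaChunk {
  message: { role: string; content: string; tool_calls?: OllamaToolCall[] };
  done: boolean;
}

// Merge a stream of native /api/chat chunks into one assistant turn.
function accumulate(chunks: OllamaChunk[]): {
  content: string;
  toolCalls: OllamaToolCall[];
} {
  let content = "";
  const toolCalls: OllamaToolCall[] = [];
  for (const chunk of chunks) {
    content += chunk.message.content;
    // Key point: collect tool_calls from every chunk, including the
    // intermediate done:false ones — reading only the final done:true
    // chunk would lose them.
    if (chunk.message.tool_calls) toolCalls.push(...chunk.message.tool_calls);
  }
  return { content, toolCalls };
}
```

A client that only inspects the terminal chunk reproduces exactly the silent-drop behavior seen on the compat endpoint.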

Proposed solution

Add a dedicated ollama API provider type that talks to Ollama's native /api/chat endpoint directly, with proper streaming chunk handling and context window configuration.

Config example:

{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434",
        "api": "ollama",
        "models": [{
          "id": "qwen3:32b",
          "name": "Qwen3 32B",
          "reasoning": true,
          "input": ["text"],
          "contextWindow": 131072,
          "maxTokens": 16384
        }]
      }
    }
  }
}
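The provider would translate that model config into a native request roughly as follows. This is a sketch, not the proposed implementation: `ModelConfig` and `buildChatRequest` are hypothetical names, while `options.num_ctx` and `options.num_predict` are Ollama's documented request options (the `num_predict` mapping from `maxTokens` is an assumption).

```typescript
interface ModelConfig {
  id: string;
  contextWindow?: number;
  maxTokens?: number;
}

// Build a native /api/chat request body from the OpenClaw model config.
function buildChatRequest(model: ModelConfig, messages: object[]) {
  return {
    model: model.id,
    messages,
    stream: true,
    options: {
      // Override Ollama's 4096-token default so large system prompts and
      // tool schemas are not silently truncated (proposal default: 65536).
      num_ctx: model.contextWindow ?? 65536,
      // -1 is Ollama's "no limit" sentinel for generated tokens.
      num_predict: model.maxTokens ?? -1,
    },
  };
}
```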

What this enables:

| Aspect | openai-completions (current) | ollama (proposed) |
|--------|------------------------------|-------------------|
| Endpoint | `/v1/chat/completions` | `/api/chat` |
| Streaming + tools | ❌ Broken | ✅ Works |
| Response format | OpenAI schema | Ollama native schema |
| Context window | Not configurable | Set via `num_ctx` from model config |
| Tool call parsing | N/A (dropped) | Accumulated from intermediate chunks |

Implementation scope:

  1. Add "ollama" to the Api type union
  2. Create native Ollama API client (request/response mapping)
  3. Handle streaming chunks — accumulate tool_calls from intermediate done:false chunks
  4. Set num_ctx from model's contextWindow config (default 65536) to prevent prompt truncation
  5. Convert messages/tools between SDK format and Ollama native format
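Step 5 is mostly envelope translation. A minimal sketch, assuming an SDK-side tool shape (`SdkTool` is a stand-in for OpenClaw's internal type, not its real name); the output matches Ollama's documented `tools` format, which mirrors OpenAI's `{type: "function", function: {...}}` envelope:

```typescript
interface SdkTool {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema for the tool's parameters
}

// Convert an SDK tool definition to Ollama's native tools format.
function toOllamaTool(tool: SdkTool) {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.inputSchema, // JSON Schema passes through unchanged
    },
  };
}
```

One asymmetry worth noting when mapping responses back: Ollama returns `tool_calls[].function.arguments` as a parsed JSON object, whereas the OpenAI schema delivers it as a JSON string, so the reverse direction needs serialization.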

Verified behavior

Tested with qwen3:32b (32B parameters) on MacBook Pro M4 Pro 48GB:

  • curl → Ollama native API with num_ctx=65536 + 23 tools + system prompt → tool_calls generated correctly
  • Streaming text works with all Ollama models
  • Tool call accumulation from intermediate chunks works
  • ✅ All 13 unit tests pass

Alternatives considered

  • PR fix(ollama): add streamToolCalls fallback for tool calling #5783 (streamToolCalls: false fallback): Disables streaming when tools are present. This works but sacrifices the streaming UX — users see no output until the full response is ready.
  • Wait for Ollama to fix /v1/chat/completions: Tracked in ollama#12557, but no timeline. The native API already works, so there's no reason to wait.
  • jokelord's supportedParameters patch: Adds config-level tool support declaration for local models (sglang/vLLM). Solves a different problem (tool detection) but doesn't fix the streaming issue with Ollama.

Additional context

Tested environment:

  • OpenClaw v2026.1.29
  • Ollama v0.15.4
  • Models: qwen3:32b, glm-4.7-flash, mistral-small3.1:24b, devstral
  • OS: macOS (Apple Silicon M4 Pro, 48GB)

Metadata

Labels: enhancement (New feature or request)