
Ollama silently truncates context to 4096 tokens - num_ctx not passed via OpenAI-compatible API #4028

@adriangrassi


Summary

When using Ollama with the openai-completions API, the context is silently truncated to Ollama's default num_ctx (typically 4096 tokens), even when contextWindow is configured to a larger value in clawdbot config. This causes bootstrap files (SOUL.md, USER.md, IDENTITY.md, etc.) to be cut off from the system prompt.

Environment

  • Clawdbot version: 2026.1.24-3
  • Ollama version: latest
  • Model: qwen2.5:32b
  • OS: Windows 10
  • API mode: openai-completions

Problem

Ollama's OpenAI-compatible API (/v1/chat/completions) does not respect the OLLAMA_NUM_CTX environment variable or the model's configured context window. Instead, it uses Ollama's internal default (often 4096 tokens).

Ollama logs show:

level=WARN msg="truncating input prompt" limit=4096 prompt=10573 keep=4 new=4096

This means a 10,573-token prompt is truncated to 4,096 tokens, cutting off the Project Context section that contains the bootstrap files.
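For illustration, the request that goes to the OpenAI-compatible endpoint looks roughly like the sketch below (illustrative only, not pi-ai's actual code). Note that a standard chat-completions body has no field that carries num_ctx, so the server falls back to its default context window.

// Sketch of an OpenAI-compatible request to Ollama (illustrative, not pi-ai's actual code).
// The standard body has no field for num_ctx, so Ollama loads the model with its default
// context window and truncates long prompts, producing the warning above.
const systemPrompt = "...";   // full system prompt, including the Project Context section
const res = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: "What does SOUL.md say?" },
    ],
  }),
});
console.log((await res.json()).choices[0].message.content);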

Root Cause

The pi-ai package's openai-completions.js does not pass the options.num_ctx parameter to Ollama. Ollama's OpenAI-compatible API requires this parameter to use context windows larger than the default.

The Ollama native API supports:

{
  "model": "qwen2.5:32b",
  "messages": [...],
  "options": {
    "num_ctx": 32768
  }
}

But the OpenAI-compatible wrapper does not pass this option through. (In my setup, 16k is a good balance and is what I am using now.)
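For comparison, here is a minimal sketch of sending the same request straight to the native /api/chat endpoint (request shape as shown above; error handling omitted), where options.num_ctx is honored:

// Sketch: the native /api/chat endpoint accepts options.num_ctx, so the model is
// loaded with a 32k context window instead of the 4096-token default.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:32b",
    messages: [{ role: "user", content: "What does SOUL.md say?" }],
    options: { num_ctx: 32768 },
    stream: false,            // return one JSON object instead of a stream
  }),
});
console.log((await res.json()).message.content);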

Reproduction

  1. Configure clawdbot with Ollama using openai-completions API
  2. Set contextWindow to a large value (e.g., 131072)
  3. Create bootstrap files (SOUL.md, USER.md) in workspace
  4. Send a message asking about content from bootstrap files
  5. Model will not have access to bootstrap content (truncated)

Workaround

Create a custom Ollama model with num_ctx baked in:

cat > Modelfile << 'EOF'
FROM qwen2.5:32b
PARAMETER num_ctx 32768
EOF

ollama create qwen2.5-32k -f Modelfile

Then update clawdbot config to use the new model.
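To confirm the parameter was actually baked in, a quick check like the sketch below (using Ollama's /api/show endpoint; treat the exact response fields as an assumption based on the API docs) should list num_ctx for the new model:

// Sketch: ask Ollama which parameters are baked into the custom model.
const res = await fetch("http://localhost:11434/api/show", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "qwen2.5-32k" }),
});
const info = await res.json();
console.log(info.parameters);  // expect a line containing: num_ctx 32768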

Suggested Fix

Pass num_ctx from the model's contextWindow config to Ollama's API. Options (a rough sketch follows the list):

  1. Include options.num_ctx in the OpenAI-compatible request body
  2. Use Ollama's native /api/chat endpoint with proper options parameter
  3. Allow users to specify extraParams.num_ctx in model config
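A rough sketch of options 1 and 3 combined on the provider side (function and type names below are illustrative, not pi-ai's actual code; only contextWindow and extraParams.num_ctx come from the existing config):

// Illustrative sketch, not pi-ai's actual code: forward the configured context
// window to Ollama as options.num_ctx when building the request body.
interface ModelConfig {
  id: string;                          // e.g. "qwen2.5:32b"
  contextWindow?: number;              // existing clawdbot setting
  extraParams?: { num_ctx?: number };  // option 3: explicit user override
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildOllamaChatBody(config: ModelConfig, messages: ChatMessage[]) {
  const body: Record<string, unknown> = { model: config.id, messages };

  // Prefer an explicit extraParams override, else fall back to contextWindow.
  const numCtx = config.extraParams?.num_ctx ?? config.contextWindow;
  if (numCtx) {
    // Option 1: attach Ollama's native option to the OpenAI-compatible request body.
    // If the /v1 endpoint turns out to ignore unknown fields, the same options object
    // can be sent to the native /api/chat endpoint instead (option 2).
    body.options = { num_ctx: numCtx };
  }
  return body;
}

// Usage sketch: body.options.num_ctx === 32768
const body = buildOllamaChatBody(
  { id: "qwen2.5:32b", contextWindow: 32768 },
  [{ role: "user", content: "hello" }],
);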

Test Results

Model                               Prompt Tokens              Result
qwen2.5:32b (default)               19 (truncated from 9071)   No access to context
qwen2.5-32k (custom with num_ctx)   9071                       Full context preserved
