Summary
When using Ollama with the openai-completions API, the context is silently truncated to Ollama's default num_ctx (typically 4096 tokens), even when contextWindow is configured to a larger value in clawdbot config. This causes bootstrap files (SOUL.md, USER.md, IDENTITY.md, etc.) to be cut off from the system prompt.
Environment
- Clawdbot version: 2026.1.24-3
- Ollama version: latest
- Model: qwen2.5:32b
- OS: Windows 10
- API mode: `openai-completions`
Problem
Ollama's OpenAI-compatible API (/v1/chat/completions) does not respect the OLLAMA_NUM_CTX environment variable or the model's configured context window. Instead, it uses Ollama's internal default (often 4096 tokens).
Ollama logs show:
```
level=WARN msg="truncating input prompt" limit=4096 prompt=10573 keep=4 new=4096
```
This means a 10,573 token prompt gets truncated to 4,096 tokens, cutting off the Project Context section containing bootstrap files.
Root Cause
The pi-ai package's openai-completions.js does not pass the options.num_ctx parameter to Ollama. Ollama's OpenAI-compatible API requires this parameter to use context windows larger than the default.
The Ollama native API supports:
```json
{
  "model": "qwen2.5:32b",
  "messages": [...],
  "options": {
    "num_ctx": 32768
  }
}
```
A good balance is 16k, which is what I am using now.
But the OpenAI-compatible wrapper doesn't pass through this option.
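The shape of the native payload can be sketched in JavaScript. `buildNativeChatRequest` is a hypothetical helper for illustration, not part of pi-ai; it only shows where `options.num_ctx` has to sit for Ollama's native `/api/chat` endpoint, a field the standard OpenAI-compatible body has no place for.

```javascript
// Hypothetical helper: builds a request body for Ollama's native /api/chat
// endpoint, carrying num_ctx inside the "options" object. Without this field,
// Ollama falls back to its internal default context window (often 4096).
function buildNativeChatRequest(model, messages, numCtx) {
  return {
    model,
    messages,
    options: {
      num_ctx: numCtx,
    },
  };
}

const body = buildNativeChatRequest(
  "qwen2.5:32b",
  [{ role: "user", content: "hello" }],
  32768
);
console.log(JSON.stringify(body.options)); // {"num_ctx":32768}
```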
Reproduction
1. Configure clawdbot with Ollama using the openai-completions API
2. Set contextWindow to a large value (e.g., 131072)
3. Create bootstrap files (SOUL.md, USER.md) in the workspace
4. Send a message asking about content from the bootstrap files
5. The model will not have access to the bootstrap content (truncated)
Workaround
Create a custom Ollama model with num_ctx baked in:
```shell
cat > Modelfile << 'EOF'
FROM qwen2.5:32b
PARAMETER num_ctx 32768
EOF
ollama create qwen2.5-32k -f Modelfile
```
Then update clawdbot config to use the new model.
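For example, the config change might look like the fragment below. The exact clawdbot schema is assumed here, not verified; only the model name swap and the `contextWindow` value (matching the baked-in `num_ctx`) are the point.

```json
{
  "model": "qwen2.5-32k",
  "api": "openai-completions",
  "contextWindow": 32768
}
```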
Suggested Fix
Pass num_ctx from the model's contextWindow config to Ollama's API. Options:
- Include `options.num_ctx` in the OpenAI-compatible request body
- Use Ollama's native `/api/chat` endpoint with the proper `options` parameter
- Allow users to specify `extraParams.num_ctx` in model config
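A minimal sketch of the first option, assuming the provider code can see the model's configured contextWindow when building the request (the function and field names here are illustrative, not pi-ai's actual internals, and assume Ollama's wrapper is extended to honor the extra field):

```javascript
// Illustrative sketch: merge num_ctx into an OpenAI-style request body when a
// contextWindow is configured for the model. Bodies without a configured
// contextWindow are returned unchanged.
function withNumCtx(requestBody, modelConfig) {
  if (!modelConfig || !modelConfig.contextWindow) return requestBody;
  return {
    ...requestBody,
    options: {
      ...(requestBody.options || {}),
      num_ctx: modelConfig.contextWindow,
    },
  };
}

const patched = withNumCtx(
  { model: "qwen2.5:32b", messages: [] },
  { contextWindow: 32768 }
);
console.log(patched.options.num_ctx); // 32768
```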
Test Results
| Model | Prompt Tokens | Result |
| --- | --- | --- |
| qwen2.5:32b (default) | 19 (truncated from 9071) | No access to context |
| qwen2.5-32k (custom with num_ctx) | 9071 | Full context preserved |