Bug type
Behavior bug (incorrect output/state without crash)
Environment
- OpenClaw: 2026.3.2
- OS: Windows 11
- Node: 24.14.0
- Ollama model: qwen3:4b (Q4_K_M, 2.5GB)
- RAM: 16GB system + 11GB swap
- GPU: RTX 2060 6GB VRAM
Summary
OpenClaw hardcodes 128K context (KvSize=262144) in the agent bootstrap and sends it directly to the Ollama runner, regardless of any config values. This causes Ollama to request 36GB of memory for a 2.5GB model, making it completely unusable on any machine with less than ~40GB RAM.
Error
model requires more system memory (34.4 GiB) than is available (17.7 GiB)
Ollama runner log confirming hardcoded value
msg=load request="{Operation:fit ... KvSize:262144 ...}"
msg="kv cache" device=CUDA0 size="36.0 GiB"
msg="model weights" device=CUDA0 size="2.3 GiB"
msg="total memory" size="39.5 GiB"
Config attempts that were all ignored
All of the following were tried and had zero effect:
"models.providers.ollama.models[].contextWindow": 4096
"models.providers.ollama.models[].numCtx": 4096
"models.providers.ollama.modelOptions.num_ctx": 4096
$env:OLLAMA_MAX_NUM_CTX = "4096"
Changing the api type from openai-completions to ollama likewise had no effect.
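For concreteness, here is a sketch of the provider block as tested. The key paths come from the attempts above; the exact shape of the models[] entry (e.g. the name key) is an assumption:

"models": {
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434",
      "models": [
        { "name": "qwen3:4b", "contextWindow": 4096, "numCtx": 4096 }
      ]
    }
  }
}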
Workaround (working)
HTTP proxy on port 11435 that intercepts requests and rewrites num_ctx before forwarding to Ollama on 11434:
// ollama-proxy.js
const http = require('http');
const PROXY_PORT = 11435;
const OLLAMA_PORT = 11434;
const MAX_CTX = 16000; // OpenClaw minimum is 16000
const server = http.createServer((req, res) => {
  let body = '';
  req.on('data', chunk => { body += chunk.toString(); });
  req.on('end', () => {
    let modifiedBody = body;
    // Force num_ctx on every JSON request body before it reaches Ollama.
    if (body && (req.headers['content-type'] || '').includes('application/json')) {
      try {
        const parsed = JSON.parse(body);
        if (!parsed.options) parsed.options = {};
        parsed.options.num_ctx = MAX_CTX;
        modifiedBody = JSON.stringify(parsed);
      } catch (e) {
        // Not valid JSON; forward the body unchanged.
      }
    }
    const options = {
      hostname: '127.0.0.1',
      port: OLLAMA_PORT,
      path: req.url,
      method: req.method,
      headers: {
        ...req.headers,
        'host': `127.0.0.1:${OLLAMA_PORT}`,
        // Recompute the length, since the rewritten body may differ in size.
        'content-length': Buffer.byteLength(modifiedBody)
      }
    };
    const proxyReq = http.request(options, (proxyRes) => {
      // Stream Ollama's response back to OpenClaw unmodified.
      res.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.pipe(res, { end: true });
    });
    proxyReq.on('error', (err) => { res.writeHead(502); res.end('Proxy error: ' + err.message); });
    proxyReq.write(modifiedBody);
    proxyReq.end();
  });
});
server.listen(PROXY_PORT, '127.0.0.1', () => {
  console.log(`Proxy running on ${PROXY_PORT}, forwarding to ${OLLAMA_PORT}`);
});
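Start the proxy with node ollama-proxy.js before launching the gateway and keep it running for the session.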
Then set baseUrl in openclaw.json to point to the proxy:
"models": {
"providers": {
"ollama": {
"baseUrl": "http://localhost:11435"
}
}
}
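With this in place, OpenClaw still emits its oversized num_ctx, but the proxy clamps it to 16000 before Ollama sees the request, and the model loads within normal memory limits.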
Expected behavior
contextWindow or an equivalent config key under models.providers.ollama should control the num_ctx sent to Ollama.
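For example, with contextWindow: 4096, the request body reaching Ollama would be expected to carry options.num_ctx = 4096. A sketch of such a body (endpoint and field names per Ollama's chat API; the exact payload OpenClaw builds is an assumption):

POST /api/chat
{
  "model": "qwen3:4b",
  "messages": [{ "role": "user", "content": "..." }],
  "options": { "num_ctx": 4096 }
}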
Impact
This bug makes OpenClaw completely unusable with Ollama on any normal consumer PC. The workaround requires running a separate proxy process on every startup, which is not acceptable for a production setup.
Steps to reproduce
- Install Ollama and pull any model (tested: qwen3:4b)
- Add Ollama as a provider in openclaw.json with contextWindow: 4096
- Start openclaw gateway
- Send any message in TUI or Telegram
- Observe Ollama logs (a direct-request sanity check is sketched below)
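To isolate the problem to the value OpenClaw sends, Ollama can be queried directly with a small num_ctx; the working proxy (which clamps to 16000) already shows the model loads at modest contexts. A minimal sketch (hypothetical helper, not part of OpenClaw; save as check-ctx.mjs and run with node, Node 18+ for global fetch):

// check-ctx.mjs: ask Ollama directly with a small context window.
// If this succeeds while OpenClaw fails, the inflated num_ctx is the cause.
const res = await fetch('http://127.0.0.1:11434/api/generate', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    model: 'qwen3:4b',
    prompt: 'ping',
    stream: false,
    options: { num_ctx: 4096 }
  })
});
console.log((await res.json()).response);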
Expected behavior
OpenClaw should pass num_ctx=4096 (or whatever contextWindow is set to)
to the Ollama runner.
Actual behavior
OpenClaw sends KvSize=262144 (128K tokens) to the Ollama runner regardless
of any config. Ollama requests 36GB RAM for a 2.5GB model and fails with:
"model requires more system memory (34.4 GiB) than is available (17.7 GiB)"
OpenClaw version
2026.3.2 (build 85377a2)
Operating system
Windows 11
Install method
npm install -g openclaw
Logs, screenshots, and evidence
msg=load request="{Operation:fit KvSize:262144 ...}"
msg="kv cache" device=CUDA0 size="36.0 GiB"
msg="model weights" device=CUDA0 size="2.3 GiB"
msg="total memory" size="39.5 GiB"
msg="Load failed" error="model requires more system memory (34.4 GiB) than is available (17.7 GiB)"
Impact and severity
Blocks workflow — completely. Any user running Ollama on a machine with
less than ~40GB RAM cannot use OpenClaw with local models at all.
This affects the majority of consumer hardware (most PCs have 16-32GB RAM).
Happens every time, 100% reproducible. Workaround requires running a
separate Node.js proxy process on every startup.
Additional information
Tried all of these config keys — all ignored:
- models.providers.ollama.models[].contextWindow: 4096
- models.providers.ollama.models[].numCtx: 4096
- OLLAMA_MAX_NUM_CTX=4096 env var
Workaround: HTTP proxy on port 11435 that rewrites num_ctx in the
request body before forwarding to Ollama on 11434. See proxy code in
the body above.