
[Bug]: Ollama: agent bootstrap hardcodes KvSize=262144 (256K context), ignoring all config attempts to lower num_ctx — unusable on <32GB RAM #35436

@lohithburra01

Description

Bug type

Behavior bug (incorrect output/state without crash)

Environment

  • OpenClaw: 2026.3.2
  • OS: Windows 11
  • Node: 24.14.0
  • Ollama model: qwen3:4b (Q4_K_M, 2.5GB)
  • RAM: 16GB system + 11GB swap
  • GPU: RTX 2060 6GB VRAM

Summary

OpenClaw hardcodes a 256K-token KV cache (KvSize=262144) in the agent bootstrap and sends it directly to the Ollama runner, regardless of any config values. This causes Ollama to request roughly 36 GiB of memory for a 2.5 GB model, making it completely unusable on any machine with less than ~40 GB of RAM.

Error

model requires more system memory (34.4 GiB) than is available (17.7 GiB)

Ollama runner log confirming hardcoded value

msg=load request="{Operation:fit ... KvSize:262144 ...}"
msg="kv cache" device=CUDA0 size="36.0 GiB"
msg="model weights" device=CUDA0 size="2.3 GiB"
msg="total memory" size="39.5 GiB"

Config attempts that were all ignored

All of the following were tried and had zero effect:

"models.providers.ollama.models[].contextWindow": 4096
"models.providers.ollama.models[].numCtx": 4096
"models.providers.ollama.modelOptions.num_ctx": 4096
$env:OLLAMA_MAX_NUM_CTX = "4096"

Switching the provider's api type from openai-completions to ollama also had no effect.
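
For reference, a minimal sketch of the provider block as tested (the id field name and the surrounding structure are assumptions; the context keys follow the paths listed above):

"models": {
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434",
      "models": [
        { "id": "qwen3:4b", "contextWindow": 4096, "numCtx": 4096 }
      ],
      "modelOptions": { "num_ctx": 4096 }
    }
  }
}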

Workaround (working)

HTTP proxy on port 11435 that intercepts requests and rewrites num_ctx before forwarding to Ollama on 11434:

// ollama-proxy.js
const http = require('http');
const PROXY_PORT = 11435;
const OLLAMA_PORT = 11434;
const MAX_CTX = 16000; // OpenClaw minimum is 16000

const server = http.createServer((req, res) => {
  let body = '';
  req.on('data', chunk => { body += chunk.toString(); });
  req.on('end', () => {
    let modifiedBody = body;
    if (body && (req.headers['content-type'] || '').includes('application/json')) {
      try {
        const parsed = JSON.parse(body);
        if (parsed.options && parsed.options.num_ctx) {
          parsed.options.num_ctx = MAX_CTX;
        } else {
          if (!parsed.options) parsed.options = {};
          parsed.options.num_ctx = MAX_CTX;
        }
        modifiedBody = JSON.stringify(parsed);
      } catch (e) {}
    }
    const options = {
      hostname: '127.0.0.1',
      port: OLLAMA_PORT,
      path: req.url,
      method: req.method,
      headers: { ...req.headers, 'host': `127.0.0.1:${OLLAMA_PORT}`, 'content-length': Buffer.byteLength(modifiedBody) }
    };
    const proxyReq = http.request(options, (proxyRes) => {
      res.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.pipe(res, { end: true });
    });
    proxyReq.on('error', (err) => { res.writeHead(502); res.end('Proxy error: ' + err.message); });
    proxyReq.write(modifiedBody);
    proxyReq.end();
  });
});

server.listen(PROXY_PORT, '127.0.0.1', () => {
  console.log(`Proxy running on ${PROXY_PORT}, forwarding to ${OLLAMA_PORT}`);
});

Then set baseUrl in openclaw.json to point to the proxy:

"models": {
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11435"
    }
  }
}
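
Start the proxy with node ollama-proxy.js before launching the gateway. If the rewrite takes effect, the load request in the Ollama runner log should show a KvSize derived from MAX_CTX (16000) rather than 262144.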

Expected behavior

contextWindow or an equivalent config key under models.providers.ollama should control the num_ctx sent to Ollama.
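
Concretely, with contextWindow: 4096 the payload forwarded to Ollama's /api/chat would be expected to carry the value in the standard options field, something like (illustrative request; the message content is a placeholder):

{
  "model": "qwen3:4b",
  "messages": [{ "role": "user", "content": "..." }],
  "options": { "num_ctx": 4096 }
}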

Impact

This bug makes OpenClaw completely unusable with Ollama on any normal consumer PC. The workaround requires running a separate proxy process on every startup, which is not acceptable for a production setup.

Steps to reproduce

  1. Install Ollama and pull any model (tested: qwen3:4b)
  2. Add Ollama as a provider in openclaw.json with contextWindow: 4096
  3. Start openclaw gateway
  4. Send any message in TUI or Telegram
  5. Observe Ollama logs

Expected behavior

OpenClaw should pass num_ctx=4096 (or whatever contextWindow is set to)
to the Ollama runner.

Actual behavior

OpenClaw sends KvSize=262144 (a 256K-token KV cache) to the Ollama runner regardless
of any config. Ollama requests 36 GiB of RAM for a 2.5 GB model and fails with:
"model requires more system memory (34.4 GiB) than is available (17.7 GiB)"

OpenClaw version

2026.3.2 (build 85377a2)

Operating system

Windows 11

Install method

npm install -g openclaw

Logs, screenshots, and evidence

msg=load request="{Operation:fit KvSize:262144 ...}"
msg="kv cache" device=CUDA0 size="36.0 GiB"
msg="model weights" device=CUDA0 size="2.3 GiB"
msg="total memory" size="39.5 GiB"
msg="Load failed" error="model requires more system memory (34.4 GiB) than is available (17.7 GiB)"

Impact and severity

Blocks workflow — completely. Any user running Ollama on a machine with
less than ~40GB RAM cannot use OpenClaw with local models at all.
This affects the majority of consumer hardware (most PCs have 16-32GB RAM).
Happens every time, 100% reproducible. Workaround requires running a
separate Node.js proxy process on every startup.

Additional information

Tried all of these config keys — all ignored:

  • models.providers.ollama.models[].contextWindow: 4096
  • models.providers.ollama.models[].numCtx: 4096
  • OLLAMA_MAX_NUM_CTX=4096 env var

Workaround: HTTP proxy on port 11435 that rewrites num_ctx in the
request body before forwarding to Ollama on 11434. See proxy code in
the body above.
