
fix(agents): inject num_ctx for Ollama OpenAI-compat API to prevent 4096 token cap #27292

Closed
Sid-Qin wants to merge 1 commit into openclaw:main from Sid-Qin:fix/27278-ollama-context-window-override

Conversation

Contributor

Sid-Qin commented Feb 26, 2026

Summary

  • Problem: Ollama models always cap input at 4096 tokens when configured with api: "openai-completions", causing conversation history to be lost after a few messages. The native api: "ollama" path sends num_ctx but the OpenAI-compat path does not.
  • Why it matters: Users lose all conversation context — the model forgets everything said earlier in the same conversation.
  • What changed: (1) Detect Ollama providers using the OpenAI-compat API (by provider name or baseUrl pattern) and inject num_ctx into the request payload via onPayload. (2) Fix fallback model resolution to match by model ID instead of blindly using models[0].
  • What did NOT change (scope boundary): The native api: "ollama" path, non-Ollama providers, and the model registry lookup are unaffected.
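The payload injection described above can be pictured with a small wrapper. This is a hedged sketch: the `Payload` and `OnPayload` types, the `withNumCtx` name, and the placement of `num_ctx` under an `options` field are illustrative assumptions, not openclaw's actual internals.

```typescript
type Payload = { model: string; options?: Record<string, unknown>; [key: string]: unknown };
type OnPayload = (payload: Payload) => void;

// Wrap any previously registered onPayload callback so num_ctx is injected
// first and the original callback still runs afterwards (chained, not replaced).
function withNumCtx(contextWindow: number, prev?: OnPayload): OnPayload {
  return (payload) => {
    // Without num_ctx, the Ollama server falls back to its 4096-token
    // default context and silently truncates the prompt.
    payload.options = { ...payload.options, num_ctx: contextWindow };
    prev?.(payload);
  };
}
```

Whether `num_ctx` ends up under `options` or at the top level of the JSON body is an assumption here; the essential behavior is that the wrapper mutates the payload before the request is sent while preserving any existing callback.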

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Ollama models configured with api: "openai-completions" now respect the configured contextWindow value. Conversation history is maintained up to the configured limit instead of being silently truncated at 4096 tokens.

Security Impact (required)

  • New permissions/capabilities: None — only adds a parameter to the existing HTTP request body
  • Auth/token changes: None
  • Data exposure risk: None

Testing

  • npx vitest run src/agents/pi-embedded-runner/run/attempt — 9 tests ✓
  • npx vitest run src/agents/pi-embedded-runner/model — 20 tests ✓

Rollback Plan

Revert the single commit. Non-Ollama providers are unaffected. Ollama reverts to the 4096 default.

fix(agents): inject num_ctx for Ollama OpenAI-compat API to prevent 4096 token cap

Ollama defaults to num_ctx=4096, so conversations lose history after a
few messages.  The native "ollama" API already sends num_ctx via the
options field, but when users configure api: "openai-completions" the
parameter is never sent and the server silently truncates input.

1. Detect Ollama providers using the OpenAI-compat API (by provider name
   or baseUrl pattern) and wrap the stream function to inject num_ctx
   into the request payload via onPayload.

2. Fix fallback model resolution to match models by ID instead of
   blindly using models[0], so the correct contextWindow/maxTokens
   values are used when the model isn't in the registry.

Closes openclaw#27278
Contributor

greptile-apps bot commented Feb 26, 2026

Greptile Summary

Fixes two related bugs affecting Ollama models configured with api: "openai-completions":

Root Cause 1: Missing num_ctx parameter - When using Ollama via OpenAI-compatible API, the num_ctx parameter was never sent, causing Ollama's server to default to 4096 tokens regardless of the configured contextWindow. The fix detects Ollama providers (by name or baseUrl heuristics) and injects num_ctx into the request payload via the onPayload callback, matching the behavior of the native Ollama API path.

Root Cause 2: Incorrect fallback model resolution - When a model wasn't found in the registry, the fallback code blindly used models[0] config instead of matching by model ID, potentially using the wrong contextWindow value. The fix now attempts to find the correct model by ID before falling back to models[0].
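The corrected fallback resolution can be sketched as follows. The `ModelConfig` shape and the `resolveModelConfig` name are assumptions for illustration; the actual registry types live in openclaw.

```typescript
interface ModelConfig {
  id: string;
  contextWindow: number;
  maxTokens: number;
}

// Match by model ID first; only fall back to models[0] when nothing
// matches, preserving the old behavior as a last resort.
function resolveModelConfig(models: ModelConfig[], modelId: string): ModelConfig | undefined {
  return models.find((m) => m.id === modelId) ?? models[0];
}
```

The key change is the `find` by ID: previously `models[0]` was used unconditionally, so a registry-miss could pick up the `contextWindow` of an unrelated model.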

The implementation is backward-compatible, follows existing patterns (consistent with ollama-stream.ts:431), and properly chains existing onPayload callbacks. The detection logic uses provider name and baseUrl heuristics (port :11434 or path /ollama) which could theoretically false-positive, but the injected num_ctx parameter is harmless to non-Ollama providers.
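The detection heuristic the review describes (provider name, port :11434, or an /ollama path) might look like the sketch below; the function and parameter names are illustrative, not the PR's actual identifiers.

```typescript
function looksLikeOllama(providerName: string, baseUrl: string): boolean {
  if (providerName.toLowerCase().includes("ollama")) return true;
  // URL heuristics: Ollama's default port or an /ollama path segment.
  let pathname = "";
  try {
    pathname = new URL(baseUrl).pathname;
  } catch {
    pathname = baseUrl; // baseUrl may not be absolute; inspect the raw string
  }
  return baseUrl.includes(":11434") || pathname.includes("/ollama");
}
```

As the review notes, a heuristic like this can false-positive on non-Ollama gateways that happen to match, but it judges the injected `num_ctx` parameter harmless to non-Ollama providers, so the worst case is an extra ignored field.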

Confidence Score: 5/5

  • Safe to merge - well-structured bug fix following existing patterns with minimal risk
  • The changes are narrowly scoped, backward-compatible, and consistent with existing codebase patterns. The Ollama detection logic is defensive, the payload injection properly chains callbacks, and the model resolution fix is a clear improvement. No new dependencies, security issues, or breaking changes.
  • No files require special attention

Last reviewed commit: d2cd136

@vincentkoc
Member

Superseded by #29205, which carries forward the num_ctx OpenAI-compatible Ollama fix together with fallback model-ID token-limit resolution and additional tests.

Credit: @Sid-Qin for the core num_ctx direction and problem report captured in this PR.


Labels

agents (Agent runtime and tooling), size: XS


Development

Successfully merging this pull request may close these issues.

[Bug]: 4096 token hard cap on input when using Ollama local models - conversation history never passed to model

2 participants