fix(ollama): query per-model context window via /api/show#24146

Closed
Sid-Qin wants to merge 1 commit into openclaw:main from Sid-Qin:fix/ollama-context-window-detection

Conversation


@Sid-Qin (Contributor) commented Feb 23, 2026

Summary

  • Problem: discoverOllamaModels() hard-codes contextWindow: 128000 for every model. While this doesn't trigger the context-window guard, it shows misleading info in the TUI (e.g. "tokens ?/128k" for a 32k model). Additionally, users configuring Ollama as a Custom Provider get DEFAULT_CONTEXT_WINDOW=4096, which triggers "Model context window too small (4096 tokens)" and blocks the agent entirely.
  • Fix: After discovering models via /api/tags, query each model's actual context_length via POST /api/show and use it as contextWindow. Queries are parallelised with a 3s per-model timeout; on failure, the 128k fallback is preserved.
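The `model_info` key lookup described above varies by architecture, so a minimal sketch might scan for any architecture-prefixed key (the helper name `extractContextLength` and the exact response handling are assumptions for illustration, not the PR's actual code):

```typescript
// Ollama's /api/show response includes a model_info object whose
// context-length key is prefixed with the model architecture,
// e.g. "qwen2.context_length" or "llama.context_length".
type ModelInfo = Record<string, unknown>;

function extractContextLength(modelInfo: ModelInfo): number | undefined {
  for (const [key, value] of Object.entries(modelInfo)) {
    if (key.endsWith(".context_length") && typeof value === "number") {
      return value;
    }
  }
  return undefined; // caller keeps the 128k fallback
}
```

Returning `undefined` rather than throwing lets the caller treat a missing key the same as a failed request: both preserve the existing default.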

Change Type

  • Bug fix

Scope

  • Providers / Ollama

Linked Issue

  • Closes openclaw#24068

Changes

  • src/agents/models-config.providers.ts:
    • Added queryOllamaContextWindow() — queries /api/show and extracts context_length from model_info (key varies by architecture, e.g. qwen2.context_length, llama.context_length)
    • Updated discoverOllamaModels() to query each model in parallel and use real context window values
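The 3s per-model timeout mentioned above can be sketched as a generic race against a timer (an illustrative helper only; the actual implementation may instead pass an `AbortSignal` to `fetch`):

```typescript
// Race a promise against a timer; on timeout, reject so the caller
// can fall back to the 128k default for that model.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

Clearing the timer on settlement matters here: otherwise each discovery call would keep the process alive for the full timeout window.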

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? Yes — additional POST /api/show per model during Ollama discovery (local network only, 3s timeout)
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Compatibility

  • Backward compatible? Yes — falls back to 128k default if /api/show fails
  • Config/env changes? No
  • Migration needed? No

Greptile Summary

This PR fixes Ollama model discovery by querying each model's actual context window via /api/show instead of hardcoding 128k for all models. The fix resolves two issues: misleading TUI display for models with smaller context windows, and blocking errors when Ollama is configured as a Custom Provider (which defaults to 4096 tokens, below the 16k minimum).

  • Adds queryOllamaContextWindow() that queries /api/show and extracts context_length from model_info (architecture-specific key like qwen2.context_length)
  • Updates discoverOllamaModels() to query models in parallel with a 3s timeout per model
  • Falls back to the existing 128k default if the query fails
  • Uses Promise.allSettled to ensure one failing model doesn't block discovery of others
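The `Promise.allSettled` fallback pattern the summary describes might look like the following sketch (the function name `resolveContextWindows` and the injected `query` callback are hypothetical, stand-ins for the PR's per-model `/api/show` lookup):

```typescript
// For each discovered model, attempt the per-model query; any failure
// (timeout, network error, missing key) falls back to the existing
// 128k default without blocking discovery of the other models.
const FALLBACK_CONTEXT_WINDOW = 128_000;

async function resolveContextWindows(
  models: string[],
  query: (model: string) => Promise<number>,
): Promise<Map<string, number>> {
  const results = await Promise.allSettled(models.map((m) => query(m)));
  const windows = new Map<string, number>();
  models.forEach((model, i) => {
    const r = results[i];
    windows.set(
      model,
      r.status === "fulfilled" ? r.value : FALLBACK_CONTEXT_WINDOW,
    );
  });
  return windows;
}
```

Unlike `Promise.all`, `Promise.allSettled` never rejects, which is what makes one failing model unable to abort discovery of the rest.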

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation is well-designed with proper error handling, graceful fallbacks, and parallel execution with timeouts. It addresses a real bug without introducing breaking changes or security risks. The 3s timeout and Promise.allSettled pattern ensure resilient discovery even if some models fail.
  • No files require special attention

Last reviewed commit: c6abbbc

discoverOllamaModels() previously hard-coded contextWindow to 128000
for every model.  While 128k is safe for the context-window guard, it
does not reflect the model's actual capability and leads to misleading
TUI display (e.g. "tokens ?/128k" for a 32k model).

More critically, users who configure Ollama as a Custom Provider get
DEFAULT_CONTEXT_WINDOW=4096 from onboard-custom, which triggers
"Model context window too small (4096 tokens)" — blocking the agent.

Now each discovered model is queried via POST /api/show to extract
the real context_length from model_info (e.g. qwen2.context_length).
Requests are parallelised with a 3s per-model timeout; on failure the
128k fallback is preserved.

Closes openclaw#24068

Co-authored-by: Cursor <cursoragent@cursor.com>
@vincentkoc (Member) commented:

Superseded by #29205, which consolidates this fix with related Ollama context-window patches.

Credit: @Sid-Qin for the /api/show per-model context discovery approach and fallback design in this PR.


Labels

agents (Agent runtime and tooling) · size: S

Development

Successfully merging this pull request may close these issues.

[Bug]: Context window not detected correctly from Ollama API (detects 4096 instead of actual value)

2 participants