fix(ollama): query per-model context window via /api/show #24146
Closed
Sid-Qin wants to merge 1 commit into openclaw:main from
Conversation
discoverOllamaModels() previously hard-coded contextWindow to 128000 for every model. While 128k is safe for the context-window guard, it does not reflect the model's actual capability and leads to a misleading TUI display (e.g. "tokens ?/128k" for a 32k model). More critically, users who configure Ollama as a Custom Provider get DEFAULT_CONTEXT_WINDOW=4096 from onboard-custom, which triggers "Model context window too small (4096 tokens)" and blocks the agent.

Now each discovered model is queried via POST /api/show to extract the real context_length from model_info (e.g. qwen2.context_length). Requests are parallelised with a 3s per-model timeout; on failure the 128k fallback is preserved.

Closes openclaw#24068

Co-authored-by: Cursor <cursoragent@cursor.com>
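The lookup-with-fallback described above could look roughly like the following. This is an illustrative sketch, not the PR's actual code: the helper name, the injectable fetchFn parameter (added here so the behaviour can be exercised without a live Ollama server), and the request shape are assumptions.

```typescript
// Illustrative sketch of a per-model /api/show lookup with a 3s timeout
// and a 128k fallback; names and signatures are hypothetical.
const FALLBACK_CONTEXT_WINDOW = 128_000;

interface ShowResponse {
  model_info?: Record<string, unknown>;
}

async function queryOllamaContextWindow(
  baseUrl: string,
  model: string,
  fetchFn: typeof fetch = fetch, // injectable for testing
): Promise<number> {
  try {
    const res = await fetchFn(`${baseUrl}/api/show`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model }),
      // Abort slow models so discovery never hangs on one of them.
      signal: AbortSignal.timeout(3000),
    });
    if (!res.ok) return FALLBACK_CONTEXT_WINDOW;
    const data = (await res.json()) as ShowResponse;
    // The key is architecture-prefixed, e.g. "qwen2.context_length".
    for (const [key, value] of Object.entries(data.model_info ?? {})) {
      if (key.endsWith(".context_length") && typeof value === "number") {
        return value;
      }
    }
    return FALLBACK_CONTEXT_WINDOW;
  } catch {
    // Network error or timeout: keep the safe 128k default.
    return FALLBACK_CONTEXT_WINDOW;
  }
}
```

Catching all errors (rather than only AbortError) matters here: a model that fails to answer should degrade to the old hard-coded value, never break discovery.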
Summary
discoverOllamaModels() hard-codes contextWindow: 128000 for every model. While this doesn't trigger the context-window guard, it shows misleading info in the TUI (e.g. "tokens ?/128k" for a 32k model). Additionally, users configuring Ollama as a Custom Provider get DEFAULT_CONTEXT_WINDOW=4096, which triggers "Model context window too small (4096 tokens)" and blocks the agent entirely.

After listing models via /api/tags, query each model's actual context_length via POST /api/show and use it as contextWindow. Queries are parallelised with a 3s per-model timeout; on failure, the 128k fallback is preserved.

Change Type
Scope
Linked Issue

Closes openclaw#24068
Changes
src/agents/models-config.providers.ts:

- queryOllamaContextWindow(): queries /api/show and extracts context_length from model_info (key varies by architecture, e.g. qwen2.context_length, llama.context_length)
- discoverOllamaModels(): updated to query each model in parallel and use real context window values

Security Impact
- No
- No
- Yes: additional POST /api/show per model during Ollama discovery (local network only, 3s timeout)
- No
- No

Compatibility
- Yes: falls back to the 128k default if /api/show fails
- No
- No

Greptile Summary
This PR fixes Ollama model discovery by querying each model's actual context window via /api/show instead of hardcoding 128k for all models. The fix resolves two issues: misleading TUI display for models with smaller context windows, and blocking errors when Ollama is configured as a Custom Provider (which defaults to 4096 tokens, below the 16k minimum).

- Adds queryOllamaContextWindow(), which queries /api/show and extracts context_length from model_info (architecture-specific key like qwen2.context_length)
- Updates discoverOllamaModels() to query models in parallel with a 3s timeout per model
- Uses Promise.allSettled to ensure one failing model doesn't block discovery of others

Confidence Score: 5/5
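The architecture-specific key lookup mentioned in the review can be isolated as a small pure function. This is a hypothetical helper written for illustration, assuming Ollama's model_info carries a general.architecture entry alongside the architecture-prefixed metadata keys:

```typescript
// Hypothetical helper: resolve the context length from an Ollama
// model_info object, whose key is architecture-prefixed
// (e.g. "qwen2.context_length", "llama.context_length").
function extractContextLength(
  modelInfo: Record<string, unknown>,
): number | undefined {
  // Prefer the exact key derived from the reported architecture.
  const arch = modelInfo["general.architecture"];
  if (typeof arch === "string") {
    const direct = modelInfo[`${arch}.context_length`];
    if (typeof direct === "number") return direct;
  }
  // Otherwise scan for any "*.context_length" key.
  for (const [key, value] of Object.entries(modelInfo)) {
    if (key.endsWith(".context_length") && typeof value === "number") {
      return value;
    }
  }
  return undefined; // caller applies the 128k fallback
}
```

Returning undefined rather than a default keeps the fallback policy in one place, in the discovery code.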
Last reviewed commit: c6abbbc
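The parallel, failure-tolerant discovery pattern the review describes (Promise.allSettled with a per-model fallback) can be sketched as below. The function and its injected query callback are illustrative assumptions, not the PR's actual API:

```typescript
// Sketch of parallel context-window discovery: one rejected lookup
// must not sink the whole discovery run.
interface DiscoveredModel {
  id: string;
  contextWindow: number;
}

async function discoverWithContextWindows(
  modelIds: string[],
  queryContextWindow: (id: string) => Promise<number>, // injected lookup
  fallback = 128_000,
): Promise<DiscoveredModel[]> {
  // allSettled resolves once every lookup has either fulfilled or
  // rejected, so failures are handled per model, not globally.
  const results = await Promise.allSettled(modelIds.map(queryContextWindow));
  return modelIds.map((id, i) => {
    const r = results[i];
    return {
      id,
      contextWindow: r.status === "fulfilled" ? r.value : fallback,
    };
  });
}
```

With Promise.all instead, a single timed-out /api/show call would reject the whole discovery; allSettled is what lets the 128k fallback apply per model.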