fix: correct context-length resolution for kimi-k2.6 on Ollama Cloud and Kimi Coding #23980
Merged
kshitijk4poor merged 3 commits on May 11, 2026
…and Kimi Coding
Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K,
tripping the 64K minimum-context guard and preventing use of the model on
Ollama Cloud and Kimi Coding / Moonshot providers.
Three fixes in the context-length resolution chain:
1. Ollama Cloud native /api/show query: new _query_ollama_api_show()
queries the Ollama native API for authoritative GGUF model_info
context_length. For hosted Ollama, prefers model_info over num_ctx
since users can't set their own num_ctx on Cloud. Added at step 5e
in get_model_context_length(), before the models.dev fallback.
2. models.dev :cloud/-cloud suffix fallback: lookup_models_dev_context()
now also tries appending :cloud and -cloud suffixes when the bare
model name doesn't match. models.dev stores 'kimi-k2.6:cloud' but
users and the live API use bare 'kimi-k2.6'.
3. Kimi-family 32K guard: after the OpenRouter metadata step, reject
exactly 32768 for Kimi-named models (kimi-*, moonshot*) and fall
through to hardcoded defaults ('kimi': 262144). OpenRouter reports
32768 for moonshotai/kimi-k2.6 but the model actually supports 262K.
Narrow filter — only 32768, only Kimi-family — becomes dead code
when OpenRouter updates its metadata.
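The three fixes above can be sketched roughly as follows. This is an illustrative reconstruction, not the merged code: the helper names (`context_from_show_payload`, `lookup_with_cloud_suffix`, `drop_stale_kimi_32k`) are hypothetical, the preference for `num_ctx` on non-hosted servers is an assumption inferred from the wording above, and the Kimi name check only approximates the `kimi-*` / `moonshot*` patterns.

```python
from typing import Mapping, Optional

CLOUD_SUFFIXES = (":cloud", "-cloud")  # suffixes named in the commit message


def context_from_show_payload(payload: Mapping, hosted: bool) -> Optional[int]:
    """Pick a context length out of an Ollama /api/show response body."""
    # GGUF metadata keys have the form "<architecture>.context_length".
    model_info = payload.get("model_info") or {}
    gguf_ctx = next(
        (int(v) for k, v in model_info.items()
         if k.endswith(".context_length") and isinstance(v, (int, float))),
        None,
    )
    # "parameters" is a newline-separated string of "name value" pairs.
    num_ctx = None
    for line in (payload.get("parameters") or "").splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "num_ctx" and parts[1].isdigit():
            num_ctx = int(parts[1])
    # Hosted: users cannot set num_ctx, so GGUF metadata wins.
    # Non-hosted ordering is an assumption of this sketch.
    return (gguf_ctx or num_ctx) if hosted else (num_ctx or gguf_ctx)


def lookup_with_cloud_suffix(table: Mapping[str, int], model: str) -> Optional[int]:
    """Try the bare name first, then the :cloud and -cloud variants."""
    for candidate in (model, *(model + suffix for suffix in CLOUD_SUFFIXES)):
        if candidate in table:
            return table[candidate]
    return None


def drop_stale_kimi_32k(model: str, ctx: Optional[int]) -> Optional[int]:
    """Reject exactly 32768 for Kimi-family names; pass everything else through."""
    name = model.lower().rsplit("/", 1)[-1]  # strip a vendor prefix if present
    is_kimi = name.startswith("kimi-") or name.startswith("moonshot")
    return None if (is_kimi and ctx == 32768) else ctx
```

Returning `None` from any of these helpers is meant to signal "fall through to the next step in the chain", which is how the hardcoded `'kimi': 262144` default would eventually be reached.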
---
ParthSareen approved these changes on May 11, 2026
…onshot to PROVIDER_TO_MODELS_DEV
Based on PR NousResearch#23950 by @nicoechaniz.
- Add "kimi" and "moonshot" to PROVIDER_TO_MODELS_DEV → kimi-for-coding
- Gate OpenRouter metadata step behind "if not effective_provider": known providers should not be overridden by community-maintained OR data
- Keep the targeted Kimi-family 32k guard as a secondary safety net inside the OR gate (for unknown providers with Kimi models)
Co-authored-by: nicoechaniz <nicoechaniz@altermundi.net>
Summary
Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K, tripping the 64K minimum-context guard and preventing use on Ollama Cloud, Kimi Coding / Moonshot, and any custom Ollama endpoint.
Closes #23949.
Root cause
The context-length resolution chain had three gaps:
- models.dev stores the entry as kimi-k2.6:cloud, but lookup_models_dev_context only searched for bare names, missing the :cloud-suffixed entry.
- … DEFAULT_CONTEXT_LENGTHS table.
Changes
1. _query_ollama_api_show(): provider-agnostic Ollama native API probe
Queries /api/show at two points in the resolution chain. For non-Ollama servers, the POST returns 404/405 quickly. Results are cached per model+URL. For hosted Ollama, prefers GGUF model_info.*.context_length over num_ctx.
2. :cloud/-cloud suffix fallback in lookup_models_dev_context()
When exact lookup fails, also tries appending :cloud and -cloud suffixes. Makes bare kimi-k2.6 match the kimi-k2.6:cloud entry in models.dev.
3. Gate OpenRouter metadata behind "if not effective_provider"
Based on PR #23950 by @nicoechaniz. Known providers should not be overridden by community-maintained OpenRouter data. When a provider is known (inferred from URL or set in config), skip OR and fall through to models.dev + curated defaults.
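A minimal sketch of the gating order described here, assuming the lookups are passed in as plain callables. `resolve_context` and its parameters are hypothetical names for illustration, and matching curated defaults by substring is an assumption of this sketch rather than something the PR states.

```python
from typing import Callable, Mapping, Optional

Lookup = Callable[[str], Optional[int]]


def resolve_context(
    model: str,
    effective_provider: Optional[str],
    openrouter_lookup: Lookup,
    models_dev_lookup: Lookup,
    defaults: Mapping[str, int],
) -> Optional[int]:
    """Resolve a context length, consulting OpenRouter only for unknown providers."""
    # Known provider (inferred from URL or set in config): skip OpenRouter.
    if not effective_provider:
        ctx = openrouter_lookup(model)
        if ctx:
            return ctx
    # models.dev comes next for every caller.
    ctx = models_dev_lookup(model)
    if ctx:
        return ctx
    # Curated defaults last; substring matching is an assumption here.
    lowered = model.lower()
    for key, value in defaults.items():
        if key in lowered:
            return value
    return None
```

With a known provider, a stale OpenRouter value is never consulted and the curated default is reached. With no provider, the OpenRouter value is still returned as-is, which is the case the Kimi-family guard is meant to cover.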
4. Kimi-family 32K guard (inside the OR gate)
For unknown providers where OR is still consulted: if OR returns exactly 32768 and _model_name_suggests_kimi() matches, reject and fall through to hardcoded defaults ("kimi": 262144).
5. Add "kimi" and "moonshot" to PROVIDER_TO_MODELS_DEV
Maps these bare provider names to kimi-for-coding, consistent with existing kimi-coding and kimi-coding-cn entries. From PR #23950 by @nicoechaniz.
Test plan