fix: correct context-length resolution for kimi-k2.6 on Ollama Cloud and Kimi Coding by kshitijk4poor · Pull Request #23980 · NousResearch/hermes-agent

kshitijk4poor · 2026-05-11T19:36:42Z

Summary

Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K, tripping the 64K minimum-context guard and preventing use on Ollama Cloud, Kimi Coding / Moonshot, and any custom Ollama endpoint.

Closes #23949.

Root cause

The context-length resolution chain had three gaps:

1. Ollama native /api/show was never queried: the OpenAI-compat /v1/models correctly omits context_length, but Hermes never called the Ollama native endpoint, which returns authoritative GGUF metadata (262144 for kimi-k2.6).
2. models.dev stores kimi-k2.6:cloud: lookup_models_dev_context only searched for bare names, so it missed the :cloud-suffixed entry.
3. OpenRouter reports 32768 for moonshotai/kimi-k2.6: this stale metadata was accepted as truth, overriding the project's own curated DEFAULT_CONTEXT_LENGTHS table.

Changes

1. `_query_ollama_api_show()` — provider-agnostic Ollama native API probe

Queries /api/show at two points in the resolution chain:

Step 2b — for custom/unknown endpoints (after /v1/models fails)
Step 5e — for known providers with any base_url

For non-Ollama servers, the POST returns 404/405 quickly. Results are cached per model+URL. For hosted Ollama, prefers GGUF model_info.*.context_length over num_ctx.
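
A minimal sketch of such a probe, assuming the Ollama native `POST /api/show` response carries a `model_info` mapping with an architecture-prefixed `*.context_length` key and a whitespace-separated `parameters` string. The helper names (`extract_context_length`, `query_api_show`), the cache, and the timeout are illustrative and are not the code added in this PR.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

# Hypothetical cache keyed by (base_url, model); the PR only states that
# results are cached per model+URL.
_SHOW_CACHE = {}


def extract_context_length(payload: dict, prefer_model_info: bool = True) -> Optional[int]:
    """Pull a context length out of an /api/show-style payload."""
    from_info = None
    for key, value in (payload.get("model_info") or {}).items():
        # Keys look like "<architecture>.context_length".
        if key.endswith(".context_length") and isinstance(value, int):
            from_info = value
            break

    from_params = None
    for line in (payload.get("parameters") or "").splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "num_ctx" and parts[1].isdigit():
            from_params = int(parts[1])
            break

    # Hosted Ollama: prefer GGUF metadata; otherwise prefer num_ctx.
    if prefer_model_info:
        return from_info or from_params
    return from_params or from_info


def query_api_show(base_url: str, model: str, timeout: float = 3.0) -> Optional[int]:
    """POST to <base_url>/api/show; base_url is assumed to be the server root."""
    cache_key = (base_url, model)
    if cache_key in _SHOW_CACHE:
        return _SHOW_CACHE[cache_key]
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/show",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            result = extract_context_length(json.load(response))
    except (urllib.error.URLError, ValueError, OSError):
        # Non-Ollama servers typically answer 404/405; treat as "unknown".
        result = None
    _SHOW_CACHE[cache_key] = result
    return result
```

Splitting the parsing from the network call keeps the preference rule (model_info versus num_ctx) testable without a live server.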

2. `:cloud`/`-cloud` suffix fallback in `lookup_models_dev_context()`

When exact lookup fails, also tries appending :cloud and -cloud suffixes. Makes bare kimi-k2.6 match the kimi-k2.6:cloud entry in models.dev.
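
The fallback can be sketched as follows. This is an illustration under the assumption that the models.dev data is available as a plain name-to-context mapping; `lookup_with_cloud_suffix` and `entries` are hypothetical names, not the real `lookup_models_dev_context()` signature.

```python
from typing import Mapping, Optional


def lookup_with_cloud_suffix(entries: Mapping[str, int], model: str) -> Optional[int]:
    """Try the exact model name first, then the :cloud and -cloud variants."""
    if model in entries:
        return entries[model]
    for suffix in (":cloud", "-cloud"):
        candidate = model + suffix
        if candidate in entries:
            return entries[candidate]
    return None


# Example data mirroring the case described in this PR.
entries = {"kimi-k2.6:cloud": 262144}
resolved = lookup_with_cloud_suffix(entries, "kimi-k2.6")
```

Because the exact name is checked first, an entry stored under the bare name still takes precedence over a suffixed one.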

3. Gate OpenRouter metadata behind "if not effective_provider"

Based on PR #23950 by @nicoechaniz. Known providers should not be overridden by community-maintained OpenRouter (OR) data. When a provider is known (inferred from the URL or set in config), skip OpenRouter and fall through to models.dev and the curated defaults.
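
A simplified sketch of the gate, not the actual `get_model_context_length()` body. The lookup callables, the function name, and the substring match against the curated table are assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional

Lookup = Callable[[str], Optional[int]]


def resolve_context_length(
    model: str,
    effective_provider: Optional[str],
    openrouter_lookup: Lookup,
    models_dev_lookup: Lookup,
    curated_defaults: Mapping[str, int],
) -> Optional[int]:
    # OpenRouter metadata is consulted only when no provider is known.
    if not effective_provider:
        value = openrouter_lookup(model)
        if value:
            return value

    # Known providers fall through to models.dev ...
    value = models_dev_lookup(model)
    if value:
        return value

    # ... and then to the curated table (substring match is an assumption).
    lowered = model.lower()
    for key, length in curated_defaults.items():
        if key in lowered:
            return length
    return None
```

With a stale OpenRouter value of 32768 and a curated `"kimi": 262144` entry, a known provider resolves to 262144 while an unknown provider still receives the OpenRouter value.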

4. Kimi-family 32K guard (inside the OR gate)

For unknown providers where OR is still consulted: if OR returns exactly 32768 and _model_name_suggests_kimi() matches, reject and fall through to hardcoded defaults ("kimi": 262144).
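
The guard can be sketched like this. The matching rule (a `kimi-` or `moonshot` prefix on any path segment of the model name) is an assumption based on the commit message, and both function names are hypothetical stand-ins for the PR's `_model_name_suggests_kimi()` and its surrounding logic.

```python
from typing import Optional

STALE_KIMI_CONTEXT = 32768  # the exact value the guard rejects


def model_name_suggests_kimi(model: str) -> bool:
    """Return True when any path segment looks like a Kimi/Moonshot name."""
    segments = model.lower().split("/")
    return any(s.startswith("kimi-") or s.startswith("moonshot") for s in segments)


def accept_openrouter_value(model: str, value: Optional[int]) -> Optional[int]:
    """Drop the stale 32768 for Kimi-family names; pass everything else through."""
    if value == STALE_KIMI_CONTEXT and model_name_suggests_kimi(model):
        return None  # caller falls through to the hardcoded defaults
    return value
```

Restricting the rejection to one exact value and one model family keeps the guard from masking legitimate 32K models from other vendors.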

5. Add `"kimi"` and `"moonshot"` to PROVIDER_TO_MODELS_DEV

Maps these bare provider names to kimi-for-coding, consistent with existing kimi-coding and kimi-coding-cn entries. From PR #23950 by @nicoechaniz.
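
An illustrative fragment of the resulting mapping. Only the four keys named in this PR are shown, and the values for the two pre-existing entries are assumed from the description above rather than copied from the source file.

```python
# Sketch of the relevant PROVIDER_TO_MODELS_DEV entries (other providers omitted).
PROVIDER_TO_MODELS_DEV = {
    "kimi-coding": "kimi-for-coding",     # existing entry (value assumed)
    "kimi-coding-cn": "kimi-for-coding",  # existing entry (value assumed)
    "kimi": "kimi-for-coding",            # added in this PR
    "moonshot": "kimi-for-coding",        # added in this PR
}
```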

Test plan

All 177 existing tests pass (test_model_metadata, test_models_dev, test_ollama_num_ctx, test_ollama_cloud_provider)
E2E verified: ollama-cloud, kimi-coding, and kimi-coding-cn all resolve kimi-k2.6 to 262144

…and Kimi Coding

Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K, tripping the 64K minimum-context guard and preventing use of the model on Ollama Cloud and Kimi Coding / Moonshot providers.

Three fixes in the context-length resolution chain:

1. Ollama Cloud native /api/show query: new _query_ollama_api_show() queries the Ollama native API for authoritative GGUF model_info context_length. For hosted Ollama, prefers model_info over num_ctx since users can't set their own num_ctx on Cloud. Added at step 5e in get_model_context_length(), before the models.dev fallback.
2. models.dev :cloud/-cloud suffix fallback: lookup_models_dev_context() now also tries appending :cloud and -cloud suffixes when the bare model name doesn't match. models.dev stores 'kimi-k2.6:cloud' but users and the live API use bare 'kimi-k2.6'.
3. Kimi-family 32K guard: after the OpenRouter metadata step, reject exactly 32768 for Kimi-named models (kimi-*, moonshot*) and fall through to hardcoded defaults ('kimi': 262144). OpenRouter reports 32768 for moonshotai/kimi-k2.6 but the model actually supports 262K. Narrow filter (only 32768, only Kimi-family); becomes dead code when OpenRouter updates its metadata.

@nicoechaniz

…onshot to PROVIDER_TO_MODELS_DEV

Based on PR NousResearch#23950 by @nicoechaniz.

- Add "kimi" and "moonshot" to PROVIDER_TO_MODELS_DEV → kimi-for-coding
- Gate OpenRouter metadata step behind "if not effective_provider": known providers should not be overridden by community-maintained OR data
- Keep the targeted Kimi-family 32k guard as a secondary safety net inside the OR gate (for unknown providers with Kimi models)

Co-authored-by: nicoechaniz <nicoechaniz@altermundi.net>

alt-glitch added labels May 11, 2026:
- type/bug (Something isn't working)
- comp/agent (Core agent loop, run_agent.py, prompt builder)
- provider/kimi (Kimi / Moonshot)
- provider/ollama (Ollama / local models)
- P2 (Medium: degraded but workaround exists)

kshitijk4poor force-pushed the fix/kimi-context-length-resolution branch from b362822 to d63de45 Compare May 11, 2026 19:52

kshitijk4poor force-pushed the fix/kimi-context-length-resolution branch from d63de45 to a20971d Compare May 11, 2026 19:55

ParthSareen approved these changes May 11, 2026


kshitijk4poor mentioned this pull request May 11, 2026

fix(model-metadata): prioritize curated defaults over OpenRouter for known providers #23950

Closed

chore: add nicoechaniz to AUTHOR_MAP

45d9319

kshitijk4poor merged commit 9a63b5f into NousResearch:main May 11, 2026
6 of 7 checks passed

briandevans mentioned this pull request May 12, 2026

fix(model-metadata): extend Kimi 32k guard to Nous OpenRouter suffix-match #24066

Closed

4 tasks

alt-glitch mentioned this pull request May 12, 2026

Wrong context length for kimi-k2.6 family: OpenRouter returns 32K, overrides correct hardcoded 256K default #24268

Closed

kshitijk4poor merged 3 commits into NousResearch:main from kshitijk4poor:fix/kimi-context-length-resolution
