kimi-k2.6 on Ollama Cloud detected as 32K context despite API reporting 256K #23949

@Lord-Kedaar

Bug Description

Hermes Agent rejects kimi-k2.6 on Ollama Cloud with the following error:

ValueError: Model kimi-k2.6 has a context window of 32,768 tokens,
below the minimum 64,000 required by Hermes Agent.

However, the Ollama Cloud API correctly reports a context length of 262,144 (256K), and DEFAULT_CONTEXT_LENGTHS["kimi"] in the Hermes source code is also set to 262144.

Evidence

1. Ollama Cloud API returns the correct value

Endpoint: POST https://ollama.com/api/show (model name in the JSON request body)

The response includes:

"model_info": {
  "kimi-k2.context_length": 262144
}
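
To make this check reproducible, here is a minimal sketch of querying the endpoint and pulling the context length out of model_info. It assumes the cloud endpoint follows the documented local Ollama API, where /api/show is a POST with the model name in the JSON body, and that an API key is sent as a Bearer token. The helper names fetch_show and extract_context_length are hypothetical and are not Hermes functions.

```python
import json
import urllib.request
from typing import Optional


def extract_context_length(show_response: dict) -> Optional[int]:
    """Return the first '<arch>.context_length' value found in model_info."""
    model_info = show_response.get("model_info") or {}
    for key, value in model_info.items():
        # The key is prefixed with the architecture name, e.g. "kimi-k2.context_length".
        if key.endswith(".context_length") and isinstance(value, int):
            return value
    return None


def fetch_show(base_url: str, model: str, api_key: Optional[str] = None) -> dict:
    """POST to <base_url>/api/show (assumed request shape, see lead-in)."""
    body = json.dumps({"model": model}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/show",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Offline check against the payload fragment quoted above.
sample = {"model_info": {"kimi-k2.context_length": 262144}}
print(extract_context_length(sample))
```

Against a live endpoint, extract_context_length(fetch_show("https://ollama.com", "kimi-k2.6", api_key)) should return the same number if the API behaves as reported.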

2. Server type detection succeeds

Endpoint: GET https://ollama.com/api/tags

Returns a valid model list. detect_local_server_type() should therefore identify the provider as "ollama".
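
For reference, a rough sketch of what "a valid model list" means here. It assumes the /api/tags payload has the documented Ollama shape, a top-level "models" array whose entries carry a "name" field. Both helpers are hypothetical and say nothing about how detect_local_server_type() is actually implemented.

```python
from typing import List


def looks_like_ollama_tags(payload: object) -> bool:
    """True if the payload has the top-level 'models' list that /api/tags returns."""
    return isinstance(payload, dict) and isinstance(payload.get("models"), list)


def model_names(payload: dict) -> List[str]:
    """Collect the 'name' field of each entry, skipping malformed ones."""
    return [
        entry["name"]
        for entry in payload.get("models", [])
        if isinstance(entry, dict) and isinstance(entry.get("name"), str)
    ]


# Illustrative payload, not captured from the real endpoint.
sample_tags = {"models": [{"name": "kimi-k2.6"}, {"name": "other-model"}]}
print(looks_like_ollama_tags(sample_tags), model_names(sample_tags))
```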

3. Hermes source already knows the correct value

DEFAULT_CONTEXT_LENGTHS in model_metadata.py contains:

"kimi": 262144

4. Despite the above, run_agent.py throws

Lines ~2000–2011 raise:

ValueError: Model kimi-k2.6 has a context window of 32,768 tokens...

Root Cause Hypothesis

detect_local_server_type() may fail to identify https://ollama.com/v1 as an "ollama" provider because it is a remote/cloud endpoint rather than a local server. Alternatively, query_ollama_num_ctx() may not be called for remote Ollama instances at all.

A hardcoded fallback of 32,768 appears to exist somewhere in the context-resolution chain. This value appears neither in DEFAULT_CONTEXT_LENGTHS nor in the API response, so its origin is unclear.
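
To illustrate the hypothesis, here is a purely hypothetical sketch of a resolution chain that would produce the observed error. It is not taken from the Hermes source. It assumes (a) the API value is consulted only for local hosts and (b) the defaults table is looked up by exact model name, so "kimi-k2.6" misses the "kimi" entry. Either assumption may be wrong; the point is only to show how a correct API value and a correct default can both be bypassed.

```python
from typing import Optional
from urllib.parse import urlparse

MINIMUM_CONTEXT = 64_000
DEFAULT_CONTEXT_LENGTHS = {"kimi": 262_144}
HYPOTHETICAL_FALLBACK = 32_768  # origin unknown in the real code


def is_local_endpoint(base_url: str) -> bool:
    host = urlparse(base_url).hostname or ""
    return host in {"localhost", "127.0.0.1", "::1"}


def resolve_context_length(model: str, base_url: str,
                           api_context_length: Optional[int]) -> int:
    # Assumption (a): the API-reported value is trusted only for local servers.
    if is_local_endpoint(base_url) and api_context_length:
        return api_context_length
    # Assumption (b): exact-key lookup, so "kimi-k2.6" does not match "kimi".
    return DEFAULT_CONTEXT_LENGTHS.get(model, HYPOTHETICAL_FALLBACK)


def check_minimum(model: str, context_length: int) -> None:
    if context_length < MINIMUM_CONTEXT:
        raise ValueError(
            f"Model {model} has a context window of {context_length:,} tokens, "
            f"below the minimum {MINIMUM_CONTEXT:,} required by Hermes Agent."
        )


resolved = resolve_context_length("kimi-k2.6", "https://ollama.com/v1", 262_144)
try:
    check_minimum("kimi-k2.6", resolved)
except ValueError as error:
    print(error)
```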

Workaround

Add the following to config.yaml under the model section:

model:
  context_length: 262144

This bypasses automatic detection and allows the session to start normally.

Environment

Hermes Agent: ~0.11.0
Provider: Ollama Cloud (https://ollama.com/v1)
Model: kimi-k2.6
OS: macOS 26.3
Affected configs: Global (~/.hermes/config.yaml) and profile (~/.hermes/profiles/<profile>/config.yaml)

Suggested Fix

Investigate the context-length resolution path in model_metadata.py for remote ollama providers. Ensure query_ollama_num_ctx() is called and its result is used, rather than silently falling back to 32,768.
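
One possible shape for such a fix, sketched under assumptions: the resolver returns the value together with its source, so a fallback is never silent. The function name, the priority order (config, then API, then family defaults, then last resort), and the substring match on the model family are all hypothetical choices, not the existing Hermes behaviour.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger("context_resolution")

DEFAULT_CONTEXT_LENGTHS = {"kimi": 262_144}
LAST_RESORT = 32_768


def resolve_with_provenance(model: str,
                            config_value: Optional[int] = None,
                            api_value: Optional[int] = None) -> Tuple[int, str]:
    """Return (context_length, source) so callers can see where the value came from."""
    if config_value:
        return config_value, "config"
    if api_value:
        # Used regardless of whether the Ollama endpoint is local or remote.
        return api_value, "api"
    for family, length in DEFAULT_CONTEXT_LENGTHS.items():
        if family in model.lower():
            return length, f"default:{family}"
    logger.warning("No context length found for %s; falling back to %d",
                   model, LAST_RESORT)
    return LAST_RESORT, "fallback"


print(resolve_with_provenance("kimi-k2.6", api_value=262_144))
print(resolve_with_provenance("kimi-k2.6"))
```

Including the source in the eventual ValueError message would also make reports like this one easier to triage.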

Labels

P2: Medium, degraded but workaround exists
comp/agent: Core agent loop, run_agent.py, prompt builder
provider/kimi: Kimi / Moonshot
provider/ollama: Ollama / local models
type/bug: Something isn't working
