provider: nous falls back to 32,768-token context, blocking boot with model.context_length workaround required #24000

@millerc79

Summary

A model configured under provider: nous (e.g. moonshotai/kimi-k2.6) cannot boot Hermes Agent because agent.model_metadata.get_model_context_length() returns the hardcoded 32,768 fallback, which is below the 64,000-token minimum enforced at run_agent.py:2254.

Repro

~/.hermes/config.yaml:

model:
  default: moonshotai/kimi-k2.6
  provider: nous
  base_url: https://inference-api.nousresearch.com/v1

Result on startup:

Model moonshotai/kimi-k2.6 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent.  Choose a model with at least 64K context, or set model.context_length in config.yaml to override.

Root cause

agent/model_metadata.py::get_model_context_length() reads ~/.hermes/models_dev_cache.json. The cache has no nous provider entries at top level. Lookups for provider: nous therefore miss and the function returns its hardcoded 32_768 default.

Per the providers doc, Nous Portal suffix-matches Nous model IDs against OpenRouter metadata; that is, there is no standalone nous provider in models.dev, and it piggybacks on OpenRouter. But when the user writes provider: nous literally in config, the context-length lookup does not exercise that suffix-match path, so it falls through to the 32k default.

The 32k figure is not a real upstream API cap — moonshotai/kimi-k2.6 is 262,144 tokens per Moonshot's spec, and OpenRouter's cache correctly carries limit.context: 262144 for the same model.
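
To make the miss concrete, here is a minimal sketch of a provider-keyed lookup. It is not the actual Hermes code: the function name lookup_context_length and the cache shape (provider id, then "models", then model id, then limit.context) are assumptions based on the description above.

```python
# Hypothetical, simplified model of the lookup; not the real get_model_context_length().
DEFAULT_CONTEXT_LENGTH = 32_768   # hardcoded fallback quoted in this issue
MINIMUM_CONTEXT_LENGTH = 64_000   # minimum enforced at startup

# Assumed cache shape: provider id -> "models" -> model id -> "limit" -> "context".
# Note there is no top-level "nous" key.
cache = {
    "openrouter": {
        "models": {
            "moonshotai/kimi-k2.6": {"limit": {"context": 262_144}},
        }
    }
}


def lookup_context_length(cache, provider, model):
    """Return the cached context length, or the fallback when the lookup misses."""
    entry = cache.get(provider, {}).get("models", {}).get(model)
    if entry is None:
        return DEFAULT_CONTEXT_LENGTH
    return entry["limit"]["context"]


nous_length = lookup_context_length(cache, "nous", "moonshotai/kimi-k2.6")
openrouter_length = lookup_context_length(cache, "openrouter", "moonshotai/kimi-k2.6")
```

Under these assumptions the same model ID resolves to the fallback for provider nous and to the real window for provider openrouter, which is the mismatch reported here.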

Workaround

Add model.context_length to config.yaml:

model:
  default: moonshotai/kimi-k2.6
  provider: nous
  base_url: https://inference-api.nousresearch.com/v1
  context_length: 262144

The override path at run_agent.py:2220 (config_context_length=_config_context_length) feeds this directly into get_model_context_length(), bypassing the missing-metadata fallback.
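
The precedence this workaround relies on can be sketched as follows. The helper below is a hypothetical stand-in, not the real signature of get_model_context_length(); only the config_context_length name comes from the call site quoted above.

```python
DEFAULT_CONTEXT_LENGTH = 32_768  # fallback quoted in this issue


def get_context_length(metadata_length=None, config_context_length=None):
    """Hypothetical stand-in: an explicit config value wins over metadata and fallback."""
    if config_context_length is not None:
        return int(config_context_length)
    if metadata_length is not None:
        return metadata_length
    return DEFAULT_CONTEXT_LENGTH


# Parsed form of the config.yaml shown above (a plain dict, to stay stdlib-only).
config = {
    "model": {
        "default": "moonshotai/kimi-k2.6",
        "provider": "nous",
        "context_length": 262144,
    }
}

override = config["model"].get("context_length")
# metadata_length=None models the cache miss for provider: nous.
effective = get_context_length(metadata_length=None, config_context_length=override)
```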

Suggested fix (one or more)

  1. When provider: nous, have get_model_context_length() apply the same OpenRouter suffix-match fallback the wizard/providers doc describes — so the context lookup matches the model-resolution path.
  2. Or: populate models_dev_cache.json with a nous provider section that mirrors OpenRouter's Moonshot/Kimi entries.
  3. At minimum, surface a clearer error: distinguish "model's real context is below 64k" from "metadata not found for provider, falling back to 32,768" — the current message implies the model has a 32k window, which is wrong and misleads users into switching models unnecessarily.
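
Options 1 and 3 could be combined roughly as in the sketch below. Everything here is illustrative: find_context_length and check_minimum are hypothetical names, the cache shape is assumed, and the suffix rule (an OpenRouter ID matches if it equals the Nous ID or ends with "/" plus the Nous ID) is a guess at what the providers doc means, not the documented algorithm.

```python
DEFAULT_CONTEXT_LENGTH = 32_768
MINIMUM_CONTEXT_LENGTH = 64_000

# Assumed cache shape; there is no top-level "nous" entry.
cache = {
    "openrouter": {
        "models": {"moonshotai/kimi-k2.6": {"limit": {"context": 262_144}}}
    }
}


def find_context_length(cache, provider, model):
    """Return (context_length, metadata_found), trying OpenRouter for provider nous."""
    providers_to_try = [provider]
    if provider == "nous":
        providers_to_try.append("openrouter")  # option 1: reuse OpenRouter metadata
    for name in providers_to_try:
        models = cache.get(name, {}).get("models", {})
        for model_id, entry in models.items():
            # Assumed suffix rule: exact match, or match on the trailing "/<id>".
            if model_id == model or model_id.endswith("/" + model):
                context = entry.get("limit", {}).get("context")
                if context:
                    return context, True
    return DEFAULT_CONTEXT_LENGTH, False


def check_minimum(model, provider, length, found):
    """Return None if the window is large enough, else a message naming the real cause."""
    if length >= MINIMUM_CONTEXT_LENGTH:
        return None
    if not found:
        # Option 3: say that metadata is missing instead of blaming the model.
        return (
            f"No context-length metadata found for {model} under provider "
            f"{provider}; fell back to {length:,} tokens. "
            "Set model.context_length in config.yaml to override."
        )
    return (
        f"Model {model} has a context window of {length:,} tokens, which is "
        f"below the minimum {MINIMUM_CONTEXT_LENGTH:,} required."
    )


resolved = find_context_length(cache, "nous", "moonshotai/kimi-k2.6")
missing = find_context_length({}, "nous", "moonshotai/kimi-k2.6")
message = check_minimum("moonshotai/kimi-k2.6", "nous", *missing)
```

Returning a found flag alongside the length keeps the fallback value usable while letting the startup check tell a missing-metadata case apart from a genuinely small context window.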

Labels

P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/nous (Nous Research API (OAuth)), type/bug (Something isn't working)
