provider: nous falls back to 32,768-token context, blocking boot unless model.context_length is set

## Summary

A model configured under `provider: nous` (e.g. `moonshotai/kimi-k2.6`) cannot boot Hermes Agent because `agent.model_metadata.get_model_context_length()` returns the hardcoded `32,768` fallback, which is below the 64,000-token minimum enforced at `run_agent.py:2254`.

## Repro

`~/.hermes/config.yaml`:

```yaml
model:
  default: moonshotai/kimi-k2.6
  provider: nous
  base_url: https://inference-api.nousresearch.com/v1
```

Result on startup:

```
Model moonshotai/kimi-k2.6 has a context window of 32,768 tokens, which is below the minimum 64,000 required by Hermes Agent.  Choose a model with at least 64K context, or set model.context_length in config.yaml to override.
```

## Root cause

`agent/model_metadata.py::get_model_context_length()` reads `~/.hermes/models_dev_cache.json`. The cache has **no `nous` provider entries** at top level. Lookups for `provider: nous` therefore miss and the function returns its hardcoded `32_768` default.

According to the providers doc, Nous Portal suffix-matches Nous model IDs against OpenRouter metadata: there is no standalone `nous` provider in models.dev, and it piggybacks on OpenRouter. When the user writes `provider: nous` literally in config, however, the context-length lookup does not exercise that suffix-match path and falls through to the 32k default.

The 32k figure is **not** a real upstream API cap: `moonshotai/kimi-k2.6` has a 262,144-token context window per Moonshot's spec, and OpenRouter's cache correctly carries `limit.context: 262144` for the same model.
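The miss-then-fallback behaviour described above can be illustrated with a minimal sketch. This is not the actual code in `agent/model_metadata.py`: the cache layout (provider, then `models`, then model ID) and the helper name `lookup_context_length` are assumptions made for illustration only.

```python
DEFAULT_CONTEXT_LENGTH = 32_768  # fallback value named in this issue
MIN_CONTEXT_LENGTH = 64_000      # minimum named in the startup error

# Hypothetical cache shape; the real models_dev_cache.json layout may differ.
# Note that there is no top-level "nous" key.
cache = {
    "openrouter": {
        "models": {
            "moonshotai/kimi-k2.6": {"limit": {"context": 262144}},
        }
    }
}


def lookup_context_length(cache, provider, model_id):
    """Illustrative stand-in for the lookup, not the real function."""
    models = cache.get(provider, {}).get("models", {})
    entry = models.get(model_id)
    if entry is None:
        # Provider or model missing: silently return the default.
        return DEFAULT_CONTEXT_LENGTH
    return entry["limit"]["context"]


nous_length = lookup_context_length(cache, "nous", "moonshotai/kimi-k2.6")
openrouter_length = lookup_context_length(cache, "openrouter", "moonshotai/kimi-k2.6")
```

Under these assumptions the `nous` lookup yields the default, which is below the minimum, while the same model ID under `openrouter` yields the real value.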

## Workaround

Add `model.context_length` to `config.yaml`:

```yaml
model:
  default: moonshotai/kimi-k2.6
  provider: nous
  base_url: https://inference-api.nousresearch.com/v1
  context_length: 262144
```

The override path at `run_agent.py:2220` (`config_context_length=_config_context_length`) feeds this directly into `get_model_context_length()`, bypassing the missing-metadata fallback.
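A rough sketch of why the override works, assuming the config value simply takes precedence over any metadata lookup. The function `resolve_context_length` and its parameters are hypothetical stand-ins; the real signature of `get_model_context_length()` is not shown in this issue.

```python
from typing import Callable, Optional

DEFAULT_CONTEXT_LENGTH = 32_768  # fallback value named in this issue


def resolve_context_length(
    config_context_length: Optional[int],
    metadata_lookup: Callable[[], Optional[int]],
) -> int:
    """Hypothetical precedence: config override, then metadata, then default."""
    if config_context_length is not None:
        return int(config_context_length)
    found = metadata_lookup()
    return found if found is not None else DEFAULT_CONTEXT_LENGTH


# Mirrors the workaround config above, already parsed into a dict.
config = {
    "model": {
        "default": "moonshotai/kimi-k2.6",
        "provider": "nous",
        "context_length": 262144,
    }
}

override = config["model"].get("context_length")

# The lambda simulates the metadata miss for provider "nous".
with_override = resolve_context_length(override, lambda: None)
without_override = resolve_context_length(None, lambda: None)
```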

## Suggested fix (one or more)

1. When `provider: nous`, have `get_model_context_length()` apply the same OpenRouter suffix-match fallback the wizard/providers doc describes — so the context lookup matches the model-resolution path.
2. Or: populate `models_dev_cache.json` with a `nous` provider section that mirrors OpenRouter's Moonshot/Kimi entries.
3. At minimum, surface a clearer error: distinguish "model's real context is below 64k" from "metadata not found for provider, falling back to 32,768" — the current message implies the model has a 32k window, which is wrong and misleads users into switching models unnecessarily.
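Options 1 and 3 could be combined roughly as follows. This is a sketch under stated assumptions, not a patch: the cache layout, the function names, the `source` labels, and the matching rule (comparing the last path segment of the model ID) are all hypothetical, and the suffix-match semantics in the providers doc may differ. The `example/small-model` entry is invented purely to exercise the "real context is too small" branch.

```python
from typing import Optional, Tuple

DEFAULT_CONTEXT_LENGTH = 32_768
MIN_CONTEXT_LENGTH = 64_000

# Hypothetical cache shape; "example/small-model" is invented for the demo.
cache = {
    "openrouter": {
        "models": {
            "moonshotai/kimi-k2.6": {"limit": {"context": 262144}},
            "example/small-model": {"limit": {"context": 8192}},
        }
    }
}


def resolve(cache: dict, provider: str, model_id: str) -> Tuple[int, str]:
    """Return (context_length, source) so callers can tell a miss from a hit."""
    models = cache.get(provider, {}).get("models", {})
    if model_id in models:
        return models[model_id]["limit"]["context"], "provider"
    if provider == "nous":
        # Assumed suffix match: compare the final segment of each model ID.
        wanted = model_id.split("/")[-1]
        for or_id, entry in cache.get("openrouter", {}).get("models", {}).items():
            if or_id.split("/")[-1] == wanted:
                return entry["limit"]["context"], "openrouter-suffix-match"
    return DEFAULT_CONTEXT_LENGTH, "fallback"


def startup_error(provider: str, model_id: str, length: int, source: str) -> Optional[str]:
    """Return None if the model can boot, else a message naming the real cause."""
    if length >= MIN_CONTEXT_LENGTH:
        return None
    if source == "fallback":
        return (
            f"No context-length metadata found for {model_id} under provider "
            f"{provider}; falling back to {length:,} tokens. "
            "Set model.context_length in config.yaml to override."
        )
    return (
        f"Model {model_id} has a context window of {length:,} tokens, "
        f"which is below the minimum {MIN_CONTEXT_LENGTH:,} required."
    )


kimi = resolve(cache, "nous", "moonshotai/kimi-k2.6")
unknown = resolve(cache, "nous", "vendor/unknown-model")
small = resolve(cache, "openrouter", "example/small-model")

kimi_error = startup_error("nous", "moonshotai/kimi-k2.6", *kimi)
unknown_error = startup_error("nous", "vendor/unknown-model", *unknown)
small_error = startup_error("openrouter", "example/small-model", *small)
```

Returning the source alongside the length is one way to keep the two failure cases distinguishable at the point where the error is raised.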

## Related

- #5173 — same pattern (cache returns bogus 32k for a model that should be much bigger), different provider; PR #5179.
- #12440 — same error string, but root-caused to `delegate_task` ignoring model config; PR #12503.
- #31 — Nous Portal not first-class in the setup wizard; relates to the missing-provider gap.

