fix(gateway): read nested per-model context_length in session info banner by pinch-claw · Pull Request #14382 · NousResearch/hermes-agent

pinch-claw · 2026-04-23T04:59:37Z

Summary

Fixes #5089 (also partially addresses concerns raised in #2513, #12977 for the banner-display path).

The session-reset / info banner in _format_session_info (gateway/run.py) resolved the context window only from the top-level model.context_length key. When users configure context_length under either the newer providers.<name>.models.<model> dict schema or the legacy custom_providers[].models.<model> list schema, the banner fell through to get_model_context_length()'s probe chain. Remote OpenAI-compatible proxies frequently omit context_length from /models responses, so the probe failed silently and the banner displayed:

◆ Context: 128K tokens (default — set model.context_length in config to override)

The banner showed this default even though the runtime ContextCompressor was already budgeting with the correct value (resolved by AIAgent.__init__ at run_agent.py:1608-1643). This is a display-vs-truth inconsistency rather than a functional bug, but it is confusing and has been reported repeatedly (#5089, #2513).

Fix

After the existing top-level lookup, walk get_compatible_custom_providers(cfg), match the entry by base_url against the resolved runtime base_url, and read entry["models"][model]["context_length"]. Pass that through to get_model_context_length() as config_context_length; step 0 short-circuits the probe.

This mirrors the exact resolution AIAgent.__init__ already performs, so the banner's displayed value matches what the compressor actually uses.
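The nested lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `resolve_nested_context_length` and `_normalize_url` are hypothetical names, the entry shape (`base_url`, `models`) is assumed from the description of what `get_compatible_custom_providers(cfg)` returns, and the trailing-slash and case normalization is an assumption about how the URLs might be compared.

```python
from typing import Any, Dict, List, Optional


def _normalize_url(url: Optional[str]) -> str:
    """Hypothetical helper: compare URLs ignoring trailing slashes and case."""
    return (url or "").rstrip("/").lower()


def resolve_nested_context_length(
    entries: List[Dict[str, Any]],
    runtime_base_url: str,
    model: str,
) -> Optional[int]:
    """Return the per-model context_length from the entry matching base_url.

    `entries` stands in for the list returned by the compat shim; the
    exact entry shape is an assumption for this sketch.
    """
    target = _normalize_url(runtime_base_url)
    if not target:
        return None
    for entry in entries:
        if _normalize_url(entry.get("base_url")) != target:
            continue
        models = entry.get("models")
        if not isinstance(models, dict):
            continue
        model_cfg = models.get(model)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("context_length")
        # Reject bools (a subclass of int) and non-positive values.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


example_entries = [
    {
        "name": "my-proxy",
        "base_url": "https://my-openai-proxy.example/v1",
        "models": {"kimi-k2.6": {"context_length": 262144}},
    }
]
print(resolve_nested_context_length(
    example_entries, "https://my-openai-proxy.example/v1/", "kimi-k2.6"
))  # 262144
```

Returning `None` when nothing matches leaves the caller free to fall back to the probe chain unchanged.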

Why this is an improvement over existing open PRs

Four other open PRs target #5089:

- #5096 (fix(gateway): honor custom provider context length), #8240 (fix(gateway): read custom provider context length in status), #10690 (fix: honor custom-provider context overrides in session info): each hand-rolls traversal of the legacy custom_providers: list only. They do not cover the newer providers: dict schema that users land on via hermes model (and which config.yaml templates now use).
- #12380 (fix: pass config_context_length to /model display and gateway fallback paths): broader in scope (touches /model display + gateway fallback paths) but still traverses lists manually.

This PR routes through get_compatible_custom_providers() — the existing compat shim at hermes_cli/config.py:2078 that already bridges both schemas for the rest of the runtime. One lookup, both schemas, stays consistent with AIAgent.__init__, and eliminates future drift between the banner and the compressor.
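To illustrate why a single shim covers both schemas, here is a rough sketch of how such a normalization could work. It is not the implementation of `get_compatible_custom_providers()`: the function name `normalize_provider_entries` is hypothetical, and the key names (`api` or `base_url` in the dict schema, `base_url` in the legacy list) are assumptions based on the reproduction config and the matching step described in the Fix section.

```python
from typing import Any, Dict, List


def normalize_provider_entries(cfg: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten both schemas into one list of {name, base_url, models} dicts."""
    entries: List[Dict[str, Any]] = []

    # Newer schema: `providers` is a dict keyed by provider name.
    providers = cfg.get("providers")
    if isinstance(providers, dict):
        for name, body in providers.items():
            if not isinstance(body, dict):
                continue
            models = body.get("models")
            entries.append({
                "name": body.get("name", name),
                # Assumption: the URL may live under "base_url" or "api".
                "base_url": body.get("base_url") or body.get("api") or "",
                "models": models if isinstance(models, dict) else {},
            })

    # Legacy schema: `custom_providers` is a list of provider dicts.
    legacy = cfg.get("custom_providers")
    if isinstance(legacy, list):
        for body in legacy:
            if not isinstance(body, dict):
                continue
            models = body.get("models")
            entries.append({
                "name": body.get("name", ""),
                "base_url": body.get("base_url") or "",
                "models": models if isinstance(models, dict) else {},
            })

    return entries
```

With both schemas reduced to the same entry shape, any caller (banner or compressor) can run one lookup and get the same answer.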

Reproduction

~/.hermes/config.yaml (new-schema form — the legacy list form reproduces identically):

model:
  default: kimi-k2.6
  provider: my-proxy
providers:
  my-proxy:
    name: my-proxy
    api: https://my-openai-proxy.example/v1
    api_key: ${MY_PROXY_KEY}
    models:
      kimi-k2.6:
        context_length: 262144

Restart gateway, trigger /reset:

Before: ◆ Context: 128K tokens (default — set model.context_length in config to override)
After: ◆ Context: 262K tokens (config)

Verification

Simulated the patched block against a real-world config (exact code shape of the fix, not a standalone re-implementation):

after top-level: None
after nested fallback: 262144
Banner would display: 262K tokens (config)

No functional change to the compressor or any runtime code path — this only affects what _format_session_info displays.
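The precedence implied by the trace above (top-level value first, nested value second, probe last) might look like the sketch below. The function names and the `probe` callable are hypothetical stand-ins; in the PR the chosen value is passed to `get_model_context_length()` as `config_context_length`, whose internals are not reproduced here.

```python
from typing import Callable, Optional, Tuple


def pick_config_context_length(
    top_level: Optional[int],
    nested: Optional[int],
) -> Optional[int]:
    """Top-level model.context_length wins; the nested value is the fallback."""
    for candidate in (top_level, nested):
        if isinstance(candidate, int) and not isinstance(candidate, bool) and candidate > 0:
            return candidate
    return None


def resolve_for_banner(
    top_level: Optional[int],
    nested: Optional[int],
    probe: Callable[[], int],
) -> Tuple[int, str]:
    """Return (context_length, source); the probe runs only if config is empty."""
    configured = pick_config_context_length(top_level, nested)
    if configured is not None:
        return configured, "config"
    return probe(), "probe"


# The probe here is a stand-in lambda; its return value is illustrative only.
print(resolve_for_banner(None, 262144, lambda: 128000))  # (262144, 'config')
print(resolve_for_banner(None, None, lambda: 128000))    # (128000, 'probe')
```

Keeping the probe as the last step preserves the existing behavior for configs that set no context_length anywhere.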

Test plan

- Existing tests in tests/gateway/test_session_info.py still pass.
- Manual: gateway restart with new-schema providers: config → banner shows nested context_length.
- Manual: gateway restart with legacy custom_providers: list config → banner shows nested context_length (unchanged behavior vs #5096 et al.).
- Manual: gateway restart without any nested context_length → probe still runs as before.

…nner

The session-reset / info banner in `_format_session_info` resolved the context window only from the top-level `model.context_length` key. When users configured context_length under the new `providers.<name>.models.<model>` dict schema (or the legacy `custom_providers[].models.<model>` list schema), the banner fell through to `get_model_context_length()`'s probe chain. Remote OpenAI-compatible proxies frequently omit `context_length` from `/models` responses, so the probe failed silently and the banner displayed `128K tokens (default — set model.context_length in config to override)`, even though the runtime `ContextCompressor` was already budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`, match the entry by `base_url` against the resolved runtime base_url, and read `entry["models"][model]["context_length"]`. This mirrors the exact resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):

- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-rolls traversal of the legacy `custom_providers:` list only. They do not cover the newer `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` (the existing compat shim at `hermes_cli/config.py:2078`) gives a single lookup that covers both schemas and stays consistent with every other runtime caller, eliminating future drift between display and compressor.

alt-glitch · 2026-04-23T05:14:00Z

Supersedes #5096, #8240, #10690 by covering both dict schema and legacy list schema. Those PRs only handle the legacy path.

alt-glitch · 2026-04-23T05:14:58Z

Supersedes #5096, #8240, #10690.

pinch-claw force-pushed the fix/banner-nested-context-length branch from e102354 to c47e863 on April 23, 2026 05:01

pinch-claw force-pushed the fix/banner-nested-context-length branch from c47e863 to 6d54494 on April 23, 2026 05:05

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), and area/config (Config system, migrations, profiles) labels on Apr 23, 2026

pinch-claw wants to merge 1 commit into NousResearch:main from pinch-claw:fix/banner-nested-context-length