fix(gateway): read nested per-model context_length in session info banner#14382

Open
pinch-claw wants to merge 1 commit into NousResearch:main from pinch-claw:fix/banner-nested-context-length

@pinch-claw

Summary

Fixes #5089 (also partially addresses concerns raised in #2513, #12977 for the banner-display path).

The session-reset / info banner in _format_session_info (gateway/run.py) resolved the context window only from the top-level model.context_length key. When users configured context_length under either the newer providers.<name>.models.<model> dict schema or the legacy custom_providers[].models.<model> list schema, the banner fell through to the probe chain in get_model_context_length(). Remote OpenAI-compatible proxies frequently omit context_length from /models responses, so the probe failed silently and the banner displayed:

◆ Context: 128K tokens (default — set model.context_length in config to override)

This happened even though the runtime ContextCompressor was already budgeting with the correct value (resolved by AIAgent.__init__ at run_agent.py:1608-1643). It is a display-vs-truth inconsistency rather than a functional bug, but it is confusing and has been reported repeatedly (#5089, #2513).

Fix

After the existing top-level lookup, walk get_compatible_custom_providers(cfg), match the entry by base_url against the resolved runtime base_url, and read entry["models"][model]["context_length"]. Pass that value to get_model_context_length() as config_context_length; step 0 of its resolution chain then short-circuits the probe.

This mirrors the exact resolution AIAgent.__init__ already performs, so the banner's displayed value matches what the compressor actually uses.
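
As a rough illustration of the lookup described above, here is a minimal sketch. It is not the patch itself: `nested_context_length` and `_normalize_url` are hypothetical names, the entry shape (`base_url` plus a `models` dict) is assumed from the description, and the case/trailing-slash normalization is an added assumption about how URLs might be compared.

```python
from typing import Any, Optional


def _normalize_url(url: Optional[str]) -> str:
    """Compare URLs without regard to case or a trailing slash (assumption)."""
    return (url or "").strip().rstrip("/").lower()


def nested_context_length(
    entries: list[dict[str, Any]], base_url: str, model: str
) -> Optional[int]:
    """Return the per-model context_length from the entry matching base_url."""
    target = _normalize_url(base_url)
    for entry in entries:
        if _normalize_url(entry.get("base_url")) != target:
            continue
        models = entry.get("models")
        if not isinstance(models, dict):
            continue
        model_cfg = models.get(model)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("context_length")
        # Accept only positive integers; bool is a subclass of int, so exclude it.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Hypothetical entry, shaped like the reproduction config in this PR.
entries = [
    {
        "name": "my-proxy",
        "base_url": "https://my-openai-proxy.example/v1",
        "models": {"kimi-k2.6": {"context_length": 262144}},
    }
]
print(nested_context_length(entries, "https://my-openai-proxy.example/v1/", "kimi-k2.6"))
```

A `None` result would leave the existing probe path untouched, which is the behavior the test plan expects when no nested value is configured.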

Why this is an improvement over existing open PRs

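Four other open PRs target #5089:

  • #5096, #8240, #10690: each hand-rolls traversal of the legacy custom_providers: list only. They do not cover the newer providers: dict schema users land on via hermes model.
  • #12380: touches /model display and gateway fallback paths; it overlaps in spirit but still traverses lists manually.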

This PR routes through get_compatible_custom_providers() — the existing compat shim at hermes_cli/config.py:2078 that already bridges both schemas for the rest of the runtime. One lookup, both schemas, stays consistent with AIAgent.__init__, and eliminates future drift between the banner and the compressor.
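
To make the "one lookup, both schemas" idea concrete, the sketch below shows how a compat layer might flatten the two schemas into a single list of entries. This is not the code of get_compatible_custom_providers(); `compatible_entries` is a hypothetical stand-in, and the assumption that legacy entries carry `base_url` while new-schema entries carry the URL under `api` is inferred from the reproduction config only.

```python
from typing import Any


def compatible_entries(cfg: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten both provider schemas into one list of entry dicts (hypothetical)."""
    entries: list[dict[str, Any]] = []

    # Legacy schema: a list of dicts, assumed to carry base_url directly.
    for item in cfg.get("custom_providers") or []:
        if isinstance(item, dict):
            entries.append(dict(item))

    # Newer schema: a dict keyed by provider name. The URL sits under "api"
    # in the reproduction config, so it is copied to "base_url" here.
    providers = cfg.get("providers") or {}
    if isinstance(providers, dict):
        for name, item in providers.items():
            if not isinstance(item, dict):
                continue
            entry = dict(item)
            entry.setdefault("name", name)
            entry.setdefault("base_url", item.get("api"))
            entries.append(entry)
    return entries


models = {"kimi-k2.6": {"context_length": 262144}}
new_cfg = {"providers": {"my-proxy": {"api": "https://my-openai-proxy.example/v1", "models": models}}}
legacy_cfg = {
    "custom_providers": [
        {"name": "my-proxy", "base_url": "https://my-openai-proxy.example/v1", "models": models}
    ]
}

# Both schemas yield an entry with the same base_url and nested context_length.
for cfg in (new_cfg, legacy_cfg):
    entry = compatible_entries(cfg)[0]
    print(entry["base_url"], entry["models"]["kimi-k2.6"]["context_length"])
```

With entries normalized this way, a caller only needs one traversal, which is the property that keeps the banner and the compressor from drifting apart.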

Reproduction

~/.hermes/config.yaml (new-schema form — the legacy list form reproduces identically):

model:
  default: kimi-k2.6
  provider: my-proxy
providers:
  my-proxy:
    name: my-proxy
    api: https://my-openai-proxy.example/v1
    api_key: ${MY_PROXY_KEY}
    models:
      kimi-k2.6:
        context_length: 262144

Restart gateway, trigger /reset:

  • Before: ◆ Context: 128K tokens (default — set model.context_length in config to override)
  • After: ◆ Context: 262K tokens (config)
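
The two banner lines above differ in both the value and the source label. A minimal sketch of that rendering follows, assuming the token count is shortened by integer division by 1000 and that the fallback is 128000; `format_context_line` is a hypothetical helper, not the formatter in gateway/run.py, and the real banner may have other source labels.

```python
def format_context_line(tokens: int, from_config: bool) -> str:
    """Render a banner line shaped like the ones in the reproduction."""
    # Assumed rounding: 262144 is shown as 262K, so divide by 1000 and floor.
    label = f"{tokens // 1000}K" if tokens >= 1000 else str(tokens)
    if from_config:
        return f"◆ Context: {label} tokens (config)"
    return (
        f"◆ Context: {label} tokens "
        "(default — set model.context_length in config to override)"
    )


print(format_context_line(262144, from_config=True))
print(format_context_line(128000, from_config=False))
```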

Verification

Simulated the patched block against a real-world config (exact code shape of the fix, not a standalone re-implementation):

after top-level: None
after nested fallback: 262144
Banner would display: 262K tokens (config)

No functional change to the compressor or any runtime code path — this only affects what _format_session_info displays.

Test plan

  • Existing tests in tests/gateway/test_session_info.py still pass.
  • Manual: gateway restart with new-schema providers: config → banner shows nested context_length.
  • Manual: gateway restart with legacy custom_providers: list config → banner shows nested context_length (unchanged behavior vs #5096 et al.).
  • Manual: gateway restart without any nested context_length → probe still runs as before.
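
The manual cases above could also be pinned down as assertions on the resolution order. The sketch below is only an illustration of that order (top-level value, then nested value, then probe); `resolve_context_length` and `fake_probe` are hypothetical, and the 128000 probe result is an assumed default rather than a value taken from the codebase.

```python
from typing import Callable, Optional


def resolve_context_length(
    top_level: Optional[int],
    nested: Optional[int],
    probe: Callable[[], int],
) -> tuple[int, str]:
    """Pick the first configured value; otherwise defer to the probe."""
    if top_level:
        return top_level, "config"
    if nested:
        return nested, "config"
    return probe(), "probe"


def fake_probe() -> int:
    # Stand-in for the probe chain's fallback; 128000 is an assumed default.
    return 128000


# Nested value only: the banner should report the configured value.
assert resolve_context_length(None, 262144, fake_probe) == (262144, "config")
# Top-level value still wins when both are present.
assert resolve_context_length(200000, 262144, fake_probe) == (200000, "config")
# Nothing configured: the probe still runs as before.
assert resolve_context_length(None, None, fake_probe) == (128000, "probe")
```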

@pinch-claw force-pushed the fix/banner-nested-context-length branch from e102354 to c47e863 on April 23, 2026 05:01
…nner

The session-reset / info banner in `_format_session_info` resolved the
context window only from the top-level `model.context_length` key. When
users configured context_length under the new `providers.<name>.models.<model>`
dict schema (or the legacy `custom_providers[].models.<model>` list
schema), the banner fell through to `get_model_context_length()`'s probe
chain. Remote OpenAI-compatible proxies frequently omit `context_length`
from `/models` responses, so the probe failed silently and the banner
displayed `128K tokens (default — set model.context_length in config to
override)` — even though the runtime `ContextCompressor` was already
budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`,
match the entry by `base_url` against the resolved runtime base_url, and
read `entry["models"][model]["context_length"]`. This mirrors the exact
resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the
banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):
- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-rolls traversal of the legacy
  `custom_providers:` list only. They do not cover the newer
  `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps
  in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` — the existing
compat shim at `hermes_cli/config.py:2078` — gives a single lookup that
covers both schemas and stays consistent with every other runtime
caller, eliminating future drift between display and compressor.
@pinch-claw force-pushed the fix/banner-nested-context-length branch from c47e863 to 6d54494 on April 23, 2026 05:05
@alt-glitch added the type/bug, P2, comp/gateway, and area/config labels on Apr 23, 2026
@alt-glitch

Collaborator

Supersedes #5096, #8240, #10690 by covering both dict schema and legacy list schema. Those PRs only handle the legacy path.

@alt-glitch

Collaborator

Supersedes #5096, #8240, #10690.

Labels

  • area/config: Config system, migrations, profiles
  • comp/gateway: Gateway runner, session dispatch, delivery
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

Bug: Status banner fails to detect context_length from custom_providers

2 participants