fix: pass config_context_length to /model display and gateway fallback paths#12380

Closed
hsy5571615 wants to merge 1 commit into
NousResearch:mainfrom
hsy5571615:fix/custom-provider-context-length-display

@hsy5571615

Problem

/model display and several other runtime paths call get_model_context_length() without passing config_context_length. For custom providers whose /models endpoint doesn't return a context_length field, this silently falls back to the hardcoded default (128,000 tokens), ignoring the user's explicit custom_providers[].models.<model>.context_length in config.yaml.
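The fallback described above can be illustrated with a minimal sketch. `resolve_context_length` is a hypothetical stand-in, not the real `get_model_context_length()`; its signature and the assumption that an explicit config value outranks the probed value are illustrative only. The 128,000 default is the figure cited in this description.

```python
from typing import Optional

DEFAULT_CONTEXT_LENGTH = 128_000  # hardcoded default cited in the PR description


def resolve_context_length(
    probed: Optional[int],
    config_context_length: Optional[int] = None,
) -> int:
    """Hypothetical stand-in for the resolution order; not the real function."""
    # Assumption: an explicit, positive value from config.yaml takes priority.
    if config_context_length is not None and config_context_length > 0:
        return config_context_length
    # Otherwise use whatever the provider's /models endpoint reported.
    if probed is not None and probed > 0:
        return probed
    # Nothing is known, so fall back to the default.
    return DEFAULT_CONTEXT_LENGTH


# A custom provider whose /models response omits context_length.
probed_from_endpoint = None
user_configured = 32_768  # illustrative value a user might set in config.yaml

# Call site that forgets to pass the config value (the bug).
buggy = resolve_context_length(probed_from_endpoint)
# Call site that passes it through (the fix).
fixed = resolve_context_length(
    probed_from_endpoint, config_context_length=user_configured
)
print(buggy, fixed)
```

With the config value omitted, the call has no way to learn the user's setting, so the display shows the default even though the configuration is correct.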

This affects:

Fix

  1. New helper read_config_context_length() in agent/model_metadata.py. It reads context_length from config.yaml, checking both model.context_length and custom_providers[].models.<model>.context_length (matching the resolution order in run_agent.py).

  2. Updated all get_model_context_length() call sites that were missing config_context_length:

    • cli.py × 2 (both /model display fallback paths)
    • gateway/run.py × 2 (/model display + context references)

  3. 8 new tests in test_model_metadata.py covering:

    • Top-level model.context_length
    • custom_providers per-model context_length
    • Priority (model.context_length wins)
    • Edge cases: zero, invalid, missing config, URL mismatch, trailing slash normalization
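A rough sketch of the lookup the helper performs, reconstructed only from the behaviours listed above. The function name `read_config_context_length_sketch`, its parameters, the already-parsed `cfg` dict, and the `base_url` key on each provider entry are assumptions for illustration; the real helper in agent/model_metadata.py may differ in all of these.

```python
from typing import Any, Dict, Optional


def _positive_int(value: Any) -> Optional[int]:
    """Return value as a positive int, or None if it is zero or invalid."""
    if isinstance(value, bool):
        return None
    try:
        number = int(value)
    except (TypeError, ValueError):
        return None
    return number if number > 0 else None


def read_config_context_length_sketch(
    cfg: Any, model: str, base_url: Optional[str] = None
) -> Optional[int]:
    """Hypothetical sketch; cfg is the already-parsed config.yaml mapping."""
    if not isinstance(cfg, dict):
        return None  # missing or malformed config

    # 1. Top-level model.context_length wins when it is valid.
    model_cfg = cfg.get("model")
    if isinstance(model_cfg, dict):
        top_level = _positive_int(model_cfg.get("context_length"))
        if top_level is not None:
            return top_level

    # 2. Otherwise look for a custom provider with a matching base URL.
    target = (base_url or "").rstrip("/")  # trailing slash normalization
    providers = cfg.get("custom_providers")
    if not target or not isinstance(providers, list):
        return None
    for entry in providers:
        if not isinstance(entry, dict):
            continue
        if str(entry.get("base_url", "")).rstrip("/") != target:
            continue  # URL mismatch
        models = entry.get("models")
        model_entry = models.get(model) if isinstance(models, dict) else None
        if isinstance(model_entry, dict):
            found = _positive_int(model_entry.get("context_length"))
            if found is not None:
                return found
    return None


example_cfg: Dict[str, Any] = {
    "custom_providers": [
        {
            "base_url": "http://localhost:8000/v1/",
            "models": {"my-model": {"context_length": 32768}},
        }
    ]
}
print(read_config_context_length_sketch(example_cfg, "my-model", "http://localhost:8000/v1"))
```

Returning None rather than a default keeps the fallback decision in the caller, which can then pass the result straight through as config_context_length.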

Testing

tests/agent/test_model_metadata.py — 88 passed (8 new)

…k paths

When switching models via /model command or processing context references
in the gateway, get_model_context_length() was called without the
config_context_length parameter. For custom providers whose /models
endpoint does not return a context_length field, this silently fell back
to the hardcoded default (128K), ignoring the user's explicit
custom_providers[].models.<model>.context_length in config.yaml.

Changes:
- Add read_config_context_length() helper to agent/model_metadata.py
- Fix /model display fallback in cli.py (2 locations)
- Fix /model display fallback in gateway/run.py
- Fix context references call in gateway/run.py
- Add 8 tests for read_config_context_length()

Fixes NousResearch#5089
Fixes NousResearch#8785
pinch-claw added a commit to pinch-claw/hermes-agent that referenced this pull request Apr 23, 2026
…nner

The session-reset / info banner in `_format_session_info` resolved the
context window only from the top-level `model.context_length` key. When
users configured context_length under the new `providers.<name>.models.<model>`
dict schema (or the legacy `custom_providers[].models.<model>` list
schema), the banner fell through to `get_model_context_length()`'s probe
chain. Remote OpenAI-compatible proxies frequently omit `context_length`
from `/models` responses, so the probe failed silently and the banner
displayed `128K tokens (default — set model.context_length in config to
override)` — even though the runtime `ContextCompressor` was already
budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`,
match the entry by `base_url` against the resolved runtime base_url, and
read `entry["models"][model]["context_length"]`. This mirrors the exact
resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the
banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):
- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-roll traversal of the legacy
  `custom_providers:` list only. They do not cover the newer
  `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps
  in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` — the existing
compat shim at `hermes_cli/config.py:2078` — gives a single lookup that
covers both schemas and stays consistent with every other runtime
caller, eliminating future drift between display and compressor.
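The single-lookup idea in the commit message above can be sketched as follows. This is not the code of `get_compatible_custom_providers()`; `compatible_providers_sketch` and `banner_context_length` are hypothetical names, and the shape assumed for the `providers:` dict (each entry carrying `base_url` and `models`) is a guess based on the description. The sketch covers only the provider walk, not the top-level `model.context_length` lookup that precedes it.

```python
from typing import Any, Dict, List, Optional


def compatible_providers_sketch(cfg: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten both schemas into one list of provider entries (assumed shapes)."""
    entries: List[Dict[str, Any]] = []
    # Newer dict schema: providers.<name>.models.<model>
    providers = cfg.get("providers")
    if isinstance(providers, dict):
        for name, entry in providers.items():
            if isinstance(entry, dict):
                entries.append({"name": name, **entry})
    # Legacy list schema: custom_providers[].models.<model>
    legacy = cfg.get("custom_providers")
    if isinstance(legacy, list):
        entries.extend(e for e in legacy if isinstance(e, dict))
    return entries


def banner_context_length(
    cfg: Dict[str, Any], model: str, runtime_base_url: str
) -> Optional[int]:
    """Match an entry by base_url, then read models[model]['context_length']."""
    target = runtime_base_url.rstrip("/")
    for entry in compatible_providers_sketch(cfg):
        if str(entry.get("base_url", "")).rstrip("/") != target:
            continue
        models = entry.get("models")
        model_entry = models.get(model) if isinstance(models, dict) else None
        if isinstance(model_entry, dict):
            value = model_entry.get("context_length")
            if isinstance(value, int) and not isinstance(value, bool) and value > 0:
                return value
    return None


new_schema = {
    "providers": {
        "proxy": {
            "base_url": "https://proxy.example/v1",
            "models": {"big-model": {"context_length": 200000}},
        }
    }
}
legacy_schema = {
    "custom_providers": [
        {
            "base_url": "https://proxy.example/v1/",
            "models": {"big-model": {"context_length": 200000}},
        }
    ]
}
print(banner_context_length(new_schema, "big-model", "https://proxy.example/v1"))
print(banner_context_length(legacy_schema, "big-model", "https://proxy.example/v1"))
```

Because both schemas are normalized before the match, the display path and any other caller share one traversal instead of each hand-rolling its own.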
@alt-glitch added the type/bug, P2, comp/cli, and comp/gateway labels Apr 23, 2026
@alt-glitch

Collaborator

Likely duplicate of #12630 — same root cause: config_context_length not passed to get_model_context_length() call sites.

@teknium1

Contributor

Thanks for the contribution @hsy5571615! This is a valid bug report and the fix direction is correct.

This is an automated hermes-sweeper review. The root cause you identified — get_model_context_length() call sites in cli.py and gateway/run.py silently ignoring custom_providers[].models.<model>.context_length — has since been fixed on main by a broader implementation:

Your PR also noted #5089 and #8785 as related — those are covered by the same fix. The duplicate flagged by @alt-glitch (#12630) is also superseded by the same commit.

@teknium1 teknium1 closed this Apr 27, 2026

Labels

comp/cli: CLI entry point, hermes_cli/, setup wizard
comp/gateway: Gateway runner, session dispatch, delivery
P2: Medium, degraded but workaround exists
type/bug: Something isn't working
