fix: pass config_context_length to /model display and gateway fallback paths #12380
Closed
hsy5571615 wants to merge 1 commit into
…k paths

When switching models via the /model command or processing context references in the gateway, get_model_context_length() was called without the config_context_length parameter. For custom providers whose /models endpoint does not return a context_length field, this silently fell back to the hardcoded default (128K), ignoring the user's explicit custom_providers[].models.<model>.context_length in config.yaml.

Changes:
- Add read_config_context_length() helper to agent/model_metadata.py
- Fix /model display fallback in cli.py (2 locations)
- Fix /model display fallback in gateway/run.py
- Fix context references call in gateway/run.py
- Add 8 tests for read_config_context_length()

Fixes NousResearch#5089
Fixes NousResearch#8785
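The fallback described in this commit message can be illustrated with a small sketch. The function below is a hypothetical stand-in, not the project's real `get_model_context_length()`: its `probed` parameter and its exact precedence are assumptions made only to show why a call that omits `config_context_length` ends up at the 128K default.

```python
from typing import Optional

# The hardcoded default named in the commit message (128K, i.e. 128,000 tokens).
DEFAULT_CONTEXT_LENGTH = 128_000


def get_model_context_length(
    model: str,
    probed: Optional[int] = None,
    config_context_length: Optional[int] = None,
) -> int:
    """Hypothetical stand-in: explicit config wins, then the probe, then the default."""
    if config_context_length:
        return config_context_length
    if probed:
        return probed
    return DEFAULT_CONTEXT_LENGTH


# A custom provider whose /models response omits context_length, so the probe
# yields nothing (modelled here as probed=None).
before_fix = get_model_context_length("my-model", probed=None)
after_fix = get_model_context_length(
    "my-model", probed=None, config_context_length=32_768
)
```

In this sketch `before_fix` is the default and `after_fix` is the configured value, which is the difference the call-site changes are meant to produce.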
pinch-claw added a commit to pinch-claw/hermes-agent that referenced this pull request Apr 23, 2026
…nner

The session-reset / info banner in `_format_session_info` resolved the context window only from the top-level `model.context_length` key. When users configured context_length under the new `providers.<name>.models.<model>` dict schema (or the legacy `custom_providers[].models.<model>` list schema), the banner fell through to `get_model_context_length()`'s probe chain. Remote OpenAI-compatible proxies frequently omit `context_length` from `/models` responses, so the probe failed silently and the banner displayed `128K tokens (default — set model.context_length in config to override)`, even though the runtime `ContextCompressor` was already budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`, match the entry by `base_url` against the resolved runtime base_url, and read `entry["models"][model]["context_length"]`. This mirrors the exact resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):
- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-roll traversal of the legacy `custom_providers:` list only. They do not cover the newer `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` (the existing compat shim at `hermes_cli/config.py:2078`) gives a single lookup that covers both schemas and stays consistent with every other runtime caller, eliminating future drift between display and compressor.
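The lookup this commit message describes (top-level key first, then a walk over provider entries matched by `base_url`) might look roughly like the sketch below. Everything here is illustrative: `get_compatible_custom_providers()` is a simplified stand-in for the real shim, `resolve_banner_context_length()` is a hypothetical name, and the trailing-slash normalisation is an assumption rather than documented behaviour.

```python
from typing import Any, Optional


def get_compatible_custom_providers(cfg: dict[str, Any]) -> list[dict[str, Any]]:
    """Simplified stand-in for the compat shim: flatten both schemas into one list."""
    entries: list[dict[str, Any]] = []
    # Newer dict schema: providers.<name>.{base_url, models}
    for name, entry in (cfg.get("providers") or {}).items():
        entries.append({"name": name, **entry})
    # Legacy list schema: custom_providers[].{name, base_url, models}
    entries.extend(cfg.get("custom_providers") or [])
    return entries


def resolve_banner_context_length(
    cfg: dict[str, Any], model: str, runtime_base_url: str
) -> Optional[int]:
    """Hypothetical helper: top-level key first, then the entry matching base_url."""
    model_cfg = cfg.get("model")
    if isinstance(model_cfg, dict):
        top = model_cfg.get("context_length")
        if isinstance(top, int) and top > 0:
            return top
    target = runtime_base_url.rstrip("/")  # assumption: ignore trailing slashes
    for entry in get_compatible_custom_providers(cfg):
        if str(entry.get("base_url", "")).rstrip("/") != target:
            continue
        per_model = (entry.get("models") or {}).get(model) or {}
        value = per_model.get("context_length")
        if isinstance(value, int) and value > 0:
            return value
    return None  # caller would fall through to the probe chain


# Example configs using a made-up endpoint and model name.
cfg_new = {
    "providers": {
        "proxy": {
            "base_url": "https://llm.example.test/v1/",
            "models": {"my-model": {"context_length": 65536}},
        }
    }
}
cfg_legacy = {
    "custom_providers": [
        {
            "name": "proxy",
            "base_url": "https://llm.example.test/v1",
            "models": {"my-model": {"context_length": 32768}},
        }
    ]
}
```

Matching on `base_url` rather than on the provider name is what lets one lookup serve both schemas in this sketch, since the two schemas store the name differently but share the endpoint field.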
Collaborator
Likely duplicate of #12630, same root cause:
This was referenced Apr 24, 2026
Contributor
Thanks for the contribution @hsy5571615! This is a valid bug report and the fix direction is correct. This is an automated hermes-sweeper review. The root cause you identified —
Your PR also noted #5089 and #8785 as related; those are covered by the same fix. The duplicate flagged by @alt-glitch (#12630) is also superseded by the same commit.
Problem
`/model` display and several other runtime paths call `get_model_context_length()` without passing `config_context_length`. For custom providers whose `/models` endpoint doesn't return a `context_length` field, this silently falls back to the hardcoded default (128,000 tokens), ignoring the user's explicit `custom_providers[].models.<model>.context_length` in config.yaml.

This affects:
- `/model` command display (CLI + gateway): always shows 128K regardless of config
- Context references (@references): uses wrong context window

Fix
- New helper `read_config_context_length()` in `agent/model_metadata.py`: reads context_length from config.yaml, checking both `model.context_length` and `custom_providers[].models.<model>.context_length` (matching the resolution order in `run_agent.py`).
- Updated all `get_model_context_length()` call sites that were missing `config_context_length`:
  - `cli.py` × 2 (both `/model` display fallback paths)
  - `gateway/run.py` × 2 (`/model` display + context references)
- 8 new tests in `test_model_metadata.py` covering:

Testing
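A minimal sketch of the kind of check this implies, assuming a simplified helper that receives an already-parsed config dict and a model name. The signature, the validation of values, and the test names are all hypothetical; the only thing taken from the description above is the resolution order (`model.context_length` first, then `custom_providers[].models.<model>.context_length`).

```python
from typing import Any, Optional


def read_config_context_length(cfg: dict[str, Any], model: str) -> Optional[int]:
    """Hypothetical version of the helper operating on a parsed config dict."""
    # 1. Top-level model.context_length
    model_cfg = cfg.get("model")
    if isinstance(model_cfg, dict):
        value = model_cfg.get("context_length")
        if isinstance(value, int) and value > 0:
            return value
    # 2. custom_providers[].models.<model>.context_length
    for provider in cfg.get("custom_providers") or []:
        entry = (provider.get("models") or {}).get(model)
        if isinstance(entry, dict):
            value = entry.get("context_length")
            if isinstance(value, int) and value > 0:
                return value
    return None


def test_top_level_takes_precedence() -> None:
    cfg = {
        "model": {"context_length": 200_000},
        "custom_providers": [{"models": {"m": {"context_length": 32_768}}}],
    }
    assert read_config_context_length(cfg, "m") == 200_000


def test_custom_provider_entry_is_used() -> None:
    cfg = {"custom_providers": [{"models": {"m": {"context_length": 32_768}}}]}
    assert read_config_context_length(cfg, "m") == 32_768


def test_missing_value_returns_none() -> None:
    assert read_config_context_length({}, "m") is None
    assert read_config_context_length({"custom_providers": [{"models": {}}]}, "m") is None
```

Returning `None` rather than a default keeps the decision about the 128,000-token fallback in `get_model_context_length()`, so the helper only reports what the user actually configured.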