Bug Description
When a running gateway session switches to a named custom provider via /model, Hermes ignores the per-model context_length configured under custom_providers[].models.<model>.context_length and falls back to the default 128,000 token window.
This is reproducible even when the same config is already sufficient for the startup/session-reset path to resolve the correct context window.
Related but not identical: #5089.
Steps to Reproduce
- Configure a named custom provider in ~/.hermes/config.yaml with a per-model context override:
model:
  default: MiniMax-M2.7
  provider: minimax-cn
  base_url: https://api.minimaxi.com/v1
custom_providers:
  - name: my-custom-endpoint
    base_url: https://example.invalid/v1
    api_key: <redacted>
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1050000
- Start a fresh gateway chat session with /new.
- In that running session, switch to the named custom provider:
/model gpt-5.5 --provider custom:my-custom-endpoint
(Equivalent triple syntax also reproduces it: /model custom:my-custom-endpoint:gpt-5.5)
- Observe the /model switch confirmation.
Expected Behavior
After the switch, Hermes should use the configured per-model context window from custom_providers and report something equivalent to:
Model switched to gpt-5.5
Provider: my-custom-endpoint
Context: 1,050,000 tokens
Actual Behavior
Hermes instead reports the fallback window:
Model switched to gpt-5.5
Provider: my-custom-endpoint
Context: 128,000 tokens
(session only -- add --global to persist)
Affected Component
Gateway model switching (/model), custom provider context resolution
Messaging Platform (if gateway-related)
Feishu
Debug Report
Report https://paste.rs/cXlQQ
agent.log https://paste.rs/i86Gq
gateway.log https://paste.rs/0ESpk
Operating System
macOS
Python Version
No response
Hermes Version
Observed on v2026.4.23 (commit bf196a3)
Additional Logs / Traceback (optional)
No traceback. The issue is in the resolved context window after `/model`.
Root Cause Analysis (optional)
There appear to be two different context-resolution paths:
- Startup / session reset path
  run_agent.py reads per-model context_length from custom_providers during agent initialization.
- Mid-session /model switch path
  gateway/run.py builds the /model switch confirmation by calling get_model_context_length(...) without carrying the same per-model override.
  run_agent.py's model-switch update path similarly forwards only _config_context_length, which comes from top-level model.context_length, not from custom_providers[].models.<id>.context_length.
Relevant locations observed while debugging:
run_agent.py:1619-1640 — startup path reads custom_providers per-model context_length
run_agent.py:1994-2004 — model-switch path recalculates context using only _config_context_length
gateway/run.py:5753-5760 — /model response path calls get_model_context_length(...) directly
This makes the behavior diverge:
- startup/reset can use the configured custom-provider context window
- /model switch falls back to 128,000
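To make the divergence concrete, here is a toy model of the two paths. It is not the actual Hermes code: the function names startup_context_length and switch_context_length are hypothetical, endpoint probing is left out, and the config dict simply mirrors the YAML from the reproduction steps.

```python
# Toy model of the two resolution paths described above; NOT the Hermes source.
# Function names are hypothetical and endpoint probing is omitted.
DEFAULT_CONTEXT_LENGTH = 128_000  # fallback window reported in this issue

config = {
    "model": {"default": "MiniMax-M2.7", "provider": "minimax-cn"},
    "custom_providers": [
        {
            "name": "my-custom-endpoint",
            "base_url": "https://example.invalid/v1",
            "model": "gpt-5.5",
            "models": {"gpt-5.5": {"context_length": 1_050_000}},
        }
    ],
}


def startup_context_length(cfg, provider_name, model):
    """Startup-like path: consults custom_providers[].models.<model>."""
    for entry in cfg.get("custom_providers") or []:
        if entry.get("name") != provider_name:
            continue
        per_model = (entry.get("models") or {}).get(model) or {}
        if per_model.get("context_length"):
            return per_model["context_length"]
    return cfg.get("model", {}).get("context_length") or DEFAULT_CONTEXT_LENGTH


def switch_context_length(cfg, provider_name, model):
    """Switch-like path: only sees the top-level model.context_length."""
    return cfg.get("model", {}).get("context_length") or DEFAULT_CONTEXT_LENGTH


print(startup_context_length(config, "my-custom-endpoint", "gpt-5.5"))
print(switch_context_length(config, "my-custom-endpoint", "gpt-5.5"))
```

With the same config, the first call finds the per-model override while the second never looks at custom_providers, which matches the 1,050,000 versus 128,000 difference reported here.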
Proposed Fix (optional)
Make the /model switch path reuse the same custom-provider override resolution that startup already uses.
Concretely, when config_context_length is absent at the top-level model: block and the active provider/base URL matches an entry in custom_providers, the switch path should look up:
custom_providers[].models.<resolved_model>.context_length
before falling back to endpoint probing or the default 128,000 window.
One way to do this would be to centralize that lookup so both startup and /model switch paths share the same resolution helper instead of duplicating partial logic.
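A minimal sketch of such a shared helper follows, under several assumptions. The names custom_provider_context_length, resolve_context_length, and the probe callback are hypothetical and do not exist in Hermes; the config is assumed to be an already-parsed dict; and stripping a leading "custom:" from the provider name is a guess based on the /model syntax shown in the reproduction steps.

```python
from __future__ import annotations

from collections.abc import Callable, Mapping
from typing import Any, Optional

DEFAULT_CONTEXT_LENGTH = 128_000  # fallback window named in this issue


def _positive_int(value: Any) -> Optional[int]:
    """Return value if it is a positive int (bools excluded), else None."""
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return None


def custom_provider_context_length(
    config: Mapping[str, Any],
    model: str,
    provider_name: Optional[str] = None,
    base_url: Optional[str] = None,
) -> Optional[int]:
    """Look up custom_providers[].models.<model>.context_length (hypothetical helper).

    An entry matches when its name equals provider_name (an optional
    "custom:" prefix is stripped, which is an assumption) or when its
    base_url equals base_url, ignoring a trailing slash.
    """
    name = provider_name or ""
    if name.startswith("custom:"):
        name = name[len("custom:"):]
    url = (base_url or "").rstrip("/")

    for entry in config.get("custom_providers") or []:
        if not isinstance(entry, Mapping):
            continue
        name_matches = bool(name) and entry.get("name") == name
        entry_url = str(entry.get("base_url") or "").rstrip("/")
        url_matches = bool(url) and entry_url == url
        if not (name_matches or url_matches):
            continue
        models = entry.get("models")
        per_model = models.get(model) if isinstance(models, Mapping) else None
        if isinstance(per_model, Mapping):
            found = _positive_int(per_model.get("context_length"))
            if found is not None:
                return found
    return None


def resolve_context_length(
    config: Mapping[str, Any],
    model: str,
    provider_name: Optional[str] = None,
    base_url: Optional[str] = None,
    probe: Optional[Callable[[], Any]] = None,
) -> int:
    """Resolution order: top-level model.context_length, then the
    custom_providers override, then an optional probe, then the default."""
    top = config.get("model")
    top_level = _positive_int(top.get("context_length")) if isinstance(top, Mapping) else None
    if top_level is not None:
        return top_level
    override = custom_provider_context_length(config, model, provider_name, base_url)
    if override is not None:
        return override
    probed = _positive_int(probe()) if probe is not None else None
    return probed if probed is not None else DEFAULT_CONTEXT_LENGTH
```

If both the startup path and the /model switch path called one function like resolve_context_length, the confirmation message and the agent's internal window would come from the same lookup and could not drift apart. The probe parameter only stands in for whatever endpoint probing Hermes already performs.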
Are you willing to submit a PR for this?