fix(/model): respect per-model context_length from custom_providers config by pdscomp · Pull Request #11438 · NousResearch/hermes-agent

pdscomp · 2026-04-17T06:03:19Z

Summary

fix: resolve context_length from per-model custom_providers[].models[].context_length before falling back to the generic get_model_context_length probe chain
fix: AIAgent.switch_model() clears _config_context_length on provider change so per-model overrides are re-resolved on every /model switch
fix: ModelSwitchResult carries resolved context_length to all confirmation paths (CLI, gateway Telegram, gateway webhook) with priority over model_info.context_window
fix: _restore_modal_input_snapshot() discards dispatched slash-commands (prevents /model re-appearing in input bar after send)
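The first bullet describes a lookup order: the per-model config value first, the probe chain second. A minimal sketch of that order follows. It assumes that `custom_providers` is a list of dicts with a `name` key and a `models` list whose entries also carry `name` and `context_length`; the PR text only gives the path `custom_providers[].models[].context_length`, so the key names and both function names here are hypothetical, and the `probe` callable merely stands in for `get_model_context_length`.

```python
from typing import Any, Callable, Optional


def lookup_per_model_context_length(
    custom_providers: list[dict[str, Any]],
    provider_name: str,
    model_name: str,
) -> Optional[int]:
    """Return the configured context_length for one model, or None if unset."""
    for provider in custom_providers:
        if provider.get("name") != provider_name:  # "name" key is an assumption
            continue
        for model in provider.get("models", []):
            if model.get("name") != model_name:
                continue
            value = model.get("context_length")
            # Accept only positive integers (bool is excluded explicitly),
            # so anything else makes the caller fall back to the probe.
            if isinstance(value, int) and not isinstance(value, bool) and value > 0:
                return value
    return None


def resolve_context_length(
    custom_providers: list[dict[str, Any]],
    provider_name: str,
    model_name: str,
    probe: Callable[[], int],
) -> int:
    """Prefer the per-model config override; call the probe only without one."""
    override = lookup_per_model_context_length(custom_providers, provider_name, model_name)
    return override if override is not None else probe()
```

Passing the probe as a callable keeps the network call lazy: it only runs when the config has no usable value for the selected model.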

Root cause

When switching to a custom provider model (e.g. legion / Qwen3.6), context_length was resolved via get_model_context_length(), which probes the provider's OpenAI-compatible /models endpoint. The R523/R528 llama.cpp server at custom base URLs does not report the context window in its /models response, so the probe falls back to a default (128k). The config correctly specifies context_length under custom_providers[].models[], but that value was consulted only at startup, never during /model switching.

Additionally, AIAgent._config_context_length was cached on the agent object and never cleared on provider change, so even the startup lookup was stale when switching between custom providers.
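The stale-cache problem can be illustrated with a toy class. This is a sketch, not the real `AIAgent`: `ToyAgent`, its constructor, and the `resolver` callable are hypothetical, and only the attribute name `_config_context_length` comes from the PR. The PR describes clearing the cache on provider change; for simplicity the toy clears it whenever the provider or the model changes.

```python
from typing import Callable, Optional


class ToyAgent:
    """Hypothetical stand-in for an agent that caches a resolved context length."""

    def __init__(self, provider: str, model: str, resolver: Callable[[str, str], int]) -> None:
        self.provider = provider
        self.model = model
        self._resolver = resolver
        self._config_context_length: Optional[int] = None

    @property
    def context_length(self) -> int:
        # Resolve lazily and cache; without invalidation this value goes stale.
        if self._config_context_length is None:
            self._config_context_length = self._resolver(self.provider, self.model)
        return self._config_context_length

    def switch_model(self, provider: str, model: str) -> None:
        changed = (provider, model) != (self.provider, self.model)
        self.provider, self.model = provider, model
        if changed:
            # Drop the cached value so the next read re-resolves it.
            self._config_context_length = None
```

If `switch_model` skipped the reset, the second read in a session would keep returning the value resolved for the previous model, which is the behaviour the root-cause paragraph above describes.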

Files changed

hermes_cli/model_switch.py: Added context_length field to ModelSwitchResult; look up per-model context_length from custom_providers config before falling back
run_agent.py: Clear _config_context_length on provider change; pass per-model override to get_model_context_length
cli.py: Prefer result.context_length in CLI confirmation output
gateway/run.py: Prefer result.context_length in Telegram and webhook confirmation output

Validation

/model switch to legion+Qwen3.6 reports correct 256K context in Telegram confirmation
python -m py_compile on all 4 modified files passes
Gateway and CLI confirmations both show correct context length for per-model overrides

Copilot

Pull request overview

This PR fixes /model switching so per-model context_length overrides defined under custom_providers[].models[].context_length are respected (rather than falling back to probing /models, which can be incomplete for some OpenAI-compatible servers), and propagates the resolved context length into user-facing confirmations.

Changes:

Resolve per-model context_length from custom_providers during model switch and propagate it via ModelSwitchResult.context_length.
Reset cached context-length override behavior in AIAgent.switch_model() to re-resolve overrides on provider changes.
Prefer result.context_length in CLI and gateway confirmations; discard dispatched slash-commands when restoring modal input snapshot.
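The slash-command part of the last bullet can be sketched as a small pure function. The real `_restore_modal_input_snapshot()` is not shown on this page, so the function name, its parameters, and the exact matching rule below are assumptions made only to illustrate the idea of not putting an already-sent command back into the input bar.

```python
from typing import Optional


def restore_input_snapshot(snapshot: Optional[str], dispatched: Optional[str]) -> str:
    """Return the text to restore into the input bar after a modal closes."""
    if not snapshot:
        return ""
    # If the saved text is exactly the slash-command that was already
    # dispatched, restoring it would make the command reappear after send.
    if dispatched is not None and snapshot.strip() == dispatched.strip():
        return ""
    # Any other draft text is restored unchanged.
    return snapshot
```

For example, `restore_input_snapshot("/model legion", "/model legion")` yields an empty string, while an unrelated draft is returned as typed.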

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| run_agent.py | Clears cached context-length override on provider change; attempts per-model override lookup before probing context length. |
| hermes_cli/model_switch.py | Adds `context_length` to `ModelSwitchResult` and resolves it from `custom_providers` config during switch. |
| cli.py | Updates `/model` confirmation output to prefer `result.context_length`; avoids restoring dispatched slash-commands into the input buffer. |
| gateway/run.py | Updates Telegram/webhook confirmation output to prefer `result.context_length` before models.dev/probing fallback. |

…onfig

## Summary
- fix: resolve context_length from per-model custom_providers[].models[].context_length before falling back to the generic get_model_context_length probe chain
- fix: AIAgent.switch_model() clears _config_context_length on provider change so per-model overrides are re-resolved on every /model switch
- fix: ModelSwitchResult carries resolved context_length to all confirmation paths (CLI, gateway Telegram, gateway webhook) with priority over model_info.context_window
- fix: _restore_modal_input_snapshot() discards slash-commands that were dispatched (prevents /model re-appearing in input bar after send)
- fix: run_agent._restore_modal_input_snapshot no longer restores dispatched slash commands

## Root cause
When switching to a custom provider model (e.g. legion / Qwen3.6), the context_length was resolved via get_model_context_length() which probes the provider's OpenAI-compatible /models endpoint. The R523/R528 llama.cpp server at custom base URLs does not surface context window in its /models response, so the probe falls back to a default (128k). The config correctly specifies context_length under custom_providers[].models[], but that value was never consulted during /model switching, only at startup.

Additionally, AIAgent._config_context_length was cached on the agent object and never cleared on provider change, so even the startup lookup was stale when switching between custom providers.

## Validation
- /model switch to legion+Qwen3.6 reports correct 256K context in Telegram confirmation
- python -m py_compile on all 4 modified files passes
- Gateway and CLI confirmations both show correct context length for per-model overrides

Render max_output, cost, and capabilities even when ModelSwitchResult.context_length is present, using context_length only for the context line override in CLI and gateway confirmations.

Add regression tests for both CLI and gateway confirmation paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
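The rendering rule in this commit message can be sketched as follows: the resolved context length replaces only the context line, and the remaining model details are still printed. The dataclasses, field layout, labels, and the token formatter are hypothetical; only the names `context_length`, `context_window`, and `max_output` come from this page, and cost and capabilities are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelInfo:  # hypothetical stand-in for the model metadata record
    context_window: Optional[int] = None
    max_output: Optional[int] = None


@dataclass
class SwitchResult:  # hypothetical stand-in for ModelSwitchResult
    model: str
    context_length: Optional[int] = None
    model_info: Optional[ModelInfo] = None


def format_tokens(n: int) -> str:
    """Render exact multiples of 1024 as e.g. '256K', other values with commas."""
    return f"{n // 1024}K" if n >= 1024 and n % 1024 == 0 else f"{n:,}"


def confirmation_lines(result: SwitchResult) -> list[str]:
    lines = [f"Model: {result.model}"]
    info = result.model_info
    # The per-model override wins for the context line only; a missing or
    # zero override falls through to the metadata value.
    context = result.context_length or (info.context_window if info else None)
    if context:
        lines.append(f"Context: {format_tokens(context)}")
    # Other metadata is rendered whether or not an override is present.
    if info and info.max_output:
        lines.append(f"Max output: {format_tokens(info.max_output)}")
    return lines
```

Keeping the override decision to a single expression makes it harder for the other lines to be skipped by accident when an override exists, which appears to be the regression this commit guards against.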

Keep global model.context_length override intact during provider changes and resolve per-model custom provider context_length from switch_model(custom_providers=...) before any config reload fallback.

Add focused regressions for provider-change override persistence and custom_providers argument priority.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
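The "argument before config reload" priority from this commit message can be sketched in a few lines. The function name and the shape of the reloaded config are assumptions. How the global model.context_length override ranks against a per-model value is not stated on this page, so the sketch deliberately does not model it.

```python
from typing import Any, Callable, Optional

Providers = list[dict[str, Any]]


def select_custom_providers(
    passed: Optional[Providers],
    reload_config: Callable[[], dict[str, Any]],
) -> Providers:
    """Pick the provider list to search for a per-model context_length."""
    # An explicitly passed list (even an empty one) takes priority,
    # so the config is not reloaded at all in that case.
    if passed is not None:
        return passed
    # Fallback: re-read the config and use its custom_providers entry, if any.
    return reload_config().get("custom_providers") or []
```

Treating an empty list as an explicit choice is a design decision of this sketch; a caller that wants the reload fallback passes `None`.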

alt-glitch · 2026-04-24T23:08:28Z

Overlaps significantly with #13052 — both fix custom_providers per-model context_length being ignored during /model switch. Also related to #12316 and #12380.

teknium1 · 2026-04-27T04:34:15Z

Thanks for the thorough investigation and fix, @pdscomp! This has been superseded by a maintainer-authored fix that landed on main.

This is an automated hermes-sweeper review.

Why closing: PR #15844 (merged 2026-04-26 by @teknium1) resolves the same root cause — custom_providers per-model context_length ignored on /model switch — and covers every fix surface this PR addresses:

hermes_cli/config.py line 2245 — new get_custom_provider_context_length() helper (single source of truth for the per-model lookup)
run_agent.py line 1798 — switch_model() re-reads custom_providers from live config and passes overrides to get_model_context_length on every /model switch
hermes_cli/model_switch.py — resolve_display_context_length() gains a custom_providers= kwarg wired through the display path

Commit: 125de02056eab84362fc91f57bd7041a19860b22

Related: #15779 (upstream issue, now closed), #13052 (overlapping fix PR), #15787 (another fix PR for the same issue).

Copilot AI review requested due to automatic review settings April 17, 2026 06:03

Copilot started reviewing on behalf of pdscomp April 17, 2026 06:03

Copilot AI reviewed Apr 17, 2026

Outdated comment threads: cli.py (2), run_agent.py (1), hermes_cli/model_switch.py (2), gateway/run.py (1)

pdscomp force-pushed the fix/per-model-context-length branch from 9b95d14 to d9e68f6 Compare April 17, 2026 06:40

pdscomp added 2 commits April 17, 2026 02:45

alt-glitch mentioned this pull request Apr 22, 2026

fix: resolve context_length from custom_providers on model switch #13052

Closed

alt-glitch added the type/bug, P2, comp/cli, comp/gateway, and comp/agent labels Apr 24, 2026

This was referenced Apr 25, 2026

Bug: /model switch to named custom provider ignores custom_providers model context_length #15779

Closed

fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779) #15787

Closed

teknium1 closed this Apr 27, 2026

alt-glitch mentioned this pull request May 16, 2026

fix: clear stale global context length on model switch #26900

Open

pdscomp wants to merge 3 commits into NousResearch:main from pdscomp:fix/per-model-context-length

4 participants
