fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779)#15787
Conversation
… /model switch (NousResearch#15779) When a live session switched to a named custom provider via `/model`, Hermes ignored `custom_providers[].models.<model>.context_length` in two places: 1. **Display** — `resolve_display_context_length()` called `get_model_context_length()` without `config_context_length`, so the per-model override never reached the resolver and the confirmation message always reported the auto-detected default (128 K). 2. **Runtime** — `run_agent.switch_model()` forwarded `self._config_context_length`, which was computed at startup for the *original* model and never refreshed for the new target. Context compression after the switch used the wrong window. Fix: introduce `_lookup_custom_provider_context_length(model, base_url)` in `hermes_cli.model_switch` that reads `get_compatible_custom_providers()` and returns the configured value for the matching (model, base_url) pair. Both `resolve_display_context_length()` and `switch_model()` in `run_agent.py` now call this helper so the correct context window is used for display and compression after every model switch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes incorrect context window handling when switching to named custom providers via /model, ensuring both the confirmation display and runtime context compression honor custom_providers[].models.<model>.context_length.
Changes:
- Add
_lookup_custom_provider_context_length(model, base_url)to read per-model context length overrides fromcustom_providers. - Forward the per-model override into
resolve_display_context_length()viaconfig_context_length. - Refresh the agent’s config context length during
run_agent.switch_model()to keep compression aligned after a switch.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
hermes_cli/model_switch.py |
Adds custom-provider per-model context length lookup and forwards it into display context resolution. |
run_agent.py |
Updates model-switch path to refresh the context-length override used by the context compressor. |
tests/hermes_cli/test_model_switch_custom_provider_context_length.py |
Adds regression tests for the helper and display path behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for cp in providers: | ||
| if not isinstance(cp, dict): | ||
| continue | ||
| if (cp.get("base_url") or "").rstrip("/").lower() == norm_url: | ||
| cp_models = cp.get("models") or {} | ||
| if isinstance(cp_models, dict): | ||
| model_cfg = cp_models.get(model) or {} | ||
| if isinstance(model_cfg, dict): | ||
| ctx = model_cfg.get("context_length") | ||
| if ctx is not None: | ||
| try: | ||
| return int(ctx) | ||
| except (TypeError, ValueError): | ||
| return None | ||
| break | ||
| return None |
| try: | ||
| return int(ctx) | ||
| except (TypeError, ValueError): | ||
| return None |
| from unittest.mock import patch, MagicMock | ||
|
|
||
| import pytest | ||
|
|
| # Refresh config_context_length for the new model/provider: the | ||
| # startup value captured the original model's custom_providers entry | ||
| # and does not reflect the newly switched target. | ||
| _switch_config_ctx = getattr(self, "_config_context_length", None) | ||
| try: | ||
| from hermes_cli.model_switch import _lookup_custom_provider_context_length | ||
| _cp_ctx = _lookup_custom_provider_context_length(self.model, self.base_url) | ||
| if _cp_ctx is not None: | ||
| _switch_config_ctx = _cp_ctx | ||
| except Exception: | ||
| pass | ||
| self._config_context_length = _switch_config_ctx | ||
| new_context_length = get_model_context_length( | ||
| self.model, | ||
| base_url=self.base_url, | ||
| api_key=self.api_key, | ||
| provider=self.provider, | ||
| config_context_length=getattr(self, "_config_context_length", None), | ||
| config_context_length=_switch_config_ctx, | ||
| ) |
Four issues found by Copilot inline review: 1. `_lookup_custom_provider_context_length` had a `break` that exited the loop on first base_url match regardless of whether the target model was found. Multiple providers sharing a base_url would silently return None even when a later entry defined the model. Removed the break so all matching-URL providers are scanned. 2. `int(ctx)` silently accepted booleans (True→1) and floats-as-strings ("1.5"→1). Now explicitly rejects bools, requires positive value, and only accepts digit-only strings. 3. `switch_model` defaulted `_switch_config_ctx` to the *startup* `_config_context_length`, so switching away from a custom provider that had a large per-model limit would keep that stale cap on the new model. Now resets to None each switch and only sets if the lookup finds an entry. 4. Unused `MagicMock` and `pytest` imports in test file removed. Tests: added 6 new cases covering shared-URL providers, boolean rejection, negative ints, digit-only strings, and the stale-cap reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Thanks @copilot — all four findings are real. Addressed in `08e50549`. 1. Early-exit `break` blocks second provider with same base_url ✅ 2. `int(ctx)` silently accepts booleans and floats ✅ 3. Stale `_config_context_length` carried over on model switch ✅ 4. Unused `MagicMock` and `pytest` imports ✅ |
|
Re @alt-glitch's cross-reference to #15706 and #11438/#13052 — positioning:
#15706 also fixes the agent-init startup path (initial Happy to defer to whichever approach the maintainers prefer — they're solving the same root cause. |
|
All four issues were addressed in the follow-up commit
|
|
Closing — superseded by @teknium1's #15844 (commit 125de02), which landed first and is the broader fix. Both PRs target #15779 (custom_providers per-model
No remaining gap. Thanks @teknium1. |
Summary
_lookup_custom_provider_context_length(model, base_url)helper inhermes_cli.model_switchthat readscustom_providers[].models.<model>.context_lengthfor a given (model, base_url) pairresolve_display_context_length()now forwards the per-model override asconfig_context_lengthto the resolver, so/modelconfirmation shows the correct windowrun_agent.switch_model()refreshesself._config_context_lengthon each switch so context compression uses the correct window after the switchThe bug
When a live session switched to a named custom provider via
/model, Hermes ignored per-modelcontext_lengthconfigured undercustom_providers[].models.<model>.context_lengthin two independent code paths:Display (
resolve_display_context_length): calledget_model_context_length()withoutconfig_context_length, so the per-model override never reached the resolver and the confirmation message always reported the auto-detected default (128 K instead of the configured 1,050,000).Runtime (
run_agent.switch_model): forwardedself._config_context_length, which was computed at startup for the original model and never refreshed for the new target. Context compression after the switch operated on the wrong window size.The fix
A new
_lookup_custom_provider_context_length(model, base_url)helper inhermes_cli.model_switchlooks up the per-modelcontext_lengthfromget_compatible_custom_providers()by matchingbase_url(trailing-slash insensitive, lowercase). Both broken sites call this helper so the correct context window is used for display and compression after every model switch.Test plan
ImportError(function doesn't exist on stockmain) — proves tests are not tautologicalImportErroron import of_lookup_custom_provider_context_length; restored → 13/13 passtest_model_switch_context_display,test_model_switch_custom_providers,test_user_providers_model_switch) unchanged — 36 pass, 6 pre-existing baseline failures intest_custom_provider_model_switch.pyalso present on cleanorigin/mainRelated
🤖 Generated with Claude Code