fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779) by briandevans · Pull Request #15787 · NousResearch/hermes-agent

briandevans · 2026-04-25T21:13:03Z

Summary

Adds _lookup_custom_provider_context_length(model, base_url) helper in hermes_cli.model_switch that reads custom_providers[].models.<model>.context_length for a given (model, base_url) pair
resolve_display_context_length() now forwards the per-model override as config_context_length to the resolver, so /model confirmation shows the correct window
run_agent.switch_model() refreshes self._config_context_length on each switch so context compression uses the correct window after the switch

The bug

When a live session switched to a named custom provider via /model, Hermes ignored per-model context_length configured under custom_providers[].models.<model>.context_length in two independent code paths:

Display (resolve_display_context_length): called get_model_context_length() without config_context_length, so the per-model override never reached the resolver and the confirmation message always reported the auto-detected default (128 K instead of the configured 1,050,000).
Runtime (run_agent.switch_model): forwarded self._config_context_length, which was computed at startup for the original model and never refreshed for the new target. Context compression after the switch operated on the wrong window size.

The fix

A new _lookup_custom_provider_context_length(model, base_url) helper in hermes_cli.model_switch looks up the per-model context_length from get_compatible_custom_providers() by matching base_url (trailing-slash insensitive, lowercase). Both broken sites call this helper so the correct context window is used for display and compression after every model switch.

Test plan

Before: tests fail with ImportError (function doesn't exist on stock main) — proves tests are not tautological
After: 13/13 new tests pass covering positive, negative, edge, and idempotency cases
Regression guard: stashed fix → observed ImportError on import of _lookup_custom_provider_context_length; restored → 13/13 pass
Adjacent suites (test_model_switch_context_display, test_model_switch_custom_providers, test_user_providers_model_switch) unchanged — 36 pass, 6 pre-existing baseline failures in test_custom_provider_model_switch.py also present on clean origin/main

… /model switch (NousResearch#15779) When a live session switched to a named custom provider via `/model`, Hermes ignored `custom_providers[].models.<model>.context_length` in two places: 1. **Display** — `resolve_display_context_length()` called `get_model_context_length()` without `config_context_length`, so the per-model override never reached the resolver and the confirmation message always reported the auto-detected default (128 K). 2. **Runtime** — `run_agent.switch_model()` forwarded `self._config_context_length`, which was computed at startup for the *original* model and never refreshed for the new target. Context compression after the switch used the wrong window. Fix: introduce `_lookup_custom_provider_context_length(model, base_url)` in `hermes_cli.model_switch` that reads `get_compatible_custom_providers()` and returns the configured value for the matching (model, base_url) pair. Both `resolve_display_context_length()` and `switch_model()` in `run_agent.py` now call this helper so the correct context window is used for display and compression after every model switch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Fixes incorrect context window handling when switching to named custom providers via /model, ensuring both the confirmation display and runtime context compression honor custom_providers[].models.<model>.context_length.

Changes:

Add _lookup_custom_provider_context_length(model, base_url) to read per-model context length overrides from custom_providers.
Forward the per-model override into resolve_display_context_length() via config_context_length.
Refresh the agent’s config context length during run_agent.switch_model() to keep compression aligned after a switch.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
`hermes_cli/model_switch.py`	Adds custom-provider per-model context length lookup and forwards it into display context resolution.
`run_agent.py`	Updates model-switch path to refresh the context-length override used by the context compressor.
`tests/hermes_cli/test_model_switch_custom_provider_context_length.py`	Adds regression tests for the helper and display path behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    for cp in providers:
+        if not isinstance(cp, dict):
+            continue
+        if (cp.get("base_url") or "").rstrip("/").lower() == norm_url:
+            cp_models = cp.get("models") or {}
+            if isinstance(cp_models, dict):
+                model_cfg = cp_models.get(model) or {}
+                if isinstance(model_cfg, dict):
+                    ctx = model_cfg.get("context_length")
+                    if ctx is not None:
+                        try:
+                            return int(ctx)
+                        except (TypeError, ValueError):
+                            return None
+            break
+    return None


+                        try:
+                            return int(ctx)
+                        except (TypeError, ValueError):
+                            return None


+from unittest.mock import patch, MagicMock
+
+import pytest
+


+            # Refresh config_context_length for the new model/provider: the
+            # startup value captured the original model's custom_providers entry
+            # and does not reflect the newly switched target.
+            _switch_config_ctx = getattr(self, "_config_context_length", None)
+            try:
+                from hermes_cli.model_switch import _lookup_custom_provider_context_length
+                _cp_ctx = _lookup_custom_provider_context_length(self.model, self.base_url)
+                if _cp_ctx is not None:
+                    _switch_config_ctx = _cp_ctx
+            except Exception:
+                pass
+            self._config_context_length = _switch_config_ctx
            new_context_length = get_model_context_length(
                self.model,
                base_url=self.base_url,
                api_key=self.api_key,
                provider=self.provider,
-                config_context_length=getattr(self, "_config_context_length", None),
+                config_context_length=_switch_config_ctx,
            )


alt-glitch · 2026-04-25T21:21:02Z

Likely duplicate of #15706 — same root cause: custom_providers per-model context_length ignored on /model switch. Also overlaps with #11438 and #13052.

Four issues found by Copilot inline review: 1. `_lookup_custom_provider_context_length` had a `break` that exited the loop on first base_url match regardless of whether the target model was found. Multiple providers sharing a base_url would silently return None even when a later entry defined the model. Removed the break so all matching-URL providers are scanned. 2. `int(ctx)` silently accepted booleans (True→1) and floats-as-strings ("1.5"→1). Now explicitly rejects bools, requires positive value, and only accepts digit-only strings. 3. `switch_model` defaulted `_switch_config_ctx` to the *startup* `_config_context_length`, so switching away from a custom provider that had a large per-model limit would keep that stale cap on the new model. Now resets to None each switch and only sets if the lookup finds an entry. 4. Unused `MagicMock` and `pytest` imports in test file removed. Tests: added 6 new cases covering shared-URL providers, boolean rejection, negative ints, digit-only strings, and the stale-cap reset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

briandevans · 2026-04-25T22:12:00Z

Thanks @copilot — all four findings are real. Addressed in `08e50549`.

1. Early-exit `break` blocks second provider with same base_url ✅
Removed the `break` so the loop always scans all providers sharing a URL. Added `test_multiple_providers_same_base_url_second_has_model` to cover this case.

2. `int(ctx)` silently accepts booleans and floats ✅
Replaced `try: return int(ctx)` with explicit type-guarded validation: booleans are rejected first (since `bool` is a subclass of `int`), plain ints must be `> 0`, and only digit-only strings are parsed. Added `test_boolean_context_length_returns_none`, `test_negative_int_context_length_returns_none`, `test_digit_string_context_length_is_accepted`, and `test_float_string_context_length_returns_none`.

3. Stale `_config_context_length` carried over on model switch ✅
Changed `_switch_config_ctx` to default to `None` (not `self._config_context_length`) so switching away from a custom provider no longer inherits the old cap. Added `test_switch_model_resets_stale_custom_provider_limit` and updated the pre-existing `test_switch_model_preserves_config_context_length` (which was testing the now-fixed buggy behavior) to `test_switch_model_uses_custom_provider_context_length` — it now mocks the custom_providers lookup for the new model and asserts the correct value flows through.

4. Unused `MagicMock` and `pytest` imports ✅
Removed.

briandevans · 2026-04-25T22:12:14Z

Re @alt-glitch's cross-reference to #15706 and #11438/#13052 — positioning:

	This PR (#15787)	#15706 (kevin930321)
Display path	✅ `resolve_display_context_length` passes `config_context_length`	✅ same
Runtime path	✅ `switch_model` re-derives from `custom_providers`	✅ same
Stale-cap reset	✅ resets to `None` when new model has no entry (post-review fix)	unclear from diff
Multi-provider same URL	✅ scans all matching entries (post-review fix)	unclear
Input validation	✅ rejects booleans, negatives, float-strings	not visible
Test count added	6 new + 3 updated = 9	~4 new
Files changed	4	4 (includes `run_agent.py` agent-init path)

#15706 also fixes the agent-init startup path (initial _config_context_length assignment), which this PR currently does not touch. That's a legitimate scope difference and maintainers should weigh which is more complete.

Happy to defer to whichever approach the maintainers prefer — they're solving the same root cause.

briandevans · 2026-04-26T16:38:51Z

All four issues were addressed in the follow-up commit 08e50549:

Removed the premature break in _lookup_custom_provider_context_length so all base_url-matching providers are scanned.
Added isinstance(ctx, bool) guard before int(ctx) to reject boolean config values.
Removed the unused MagicMock and pytest imports from the test file.
Reset _switch_config_ctx to None when switching to a provider/model with no custom context override, preventing stale overrides from persisting.

briandevans · 2026-04-28T10:11:17Z

Closing — superseded by @teknium1's #15844 (commit 125de02), which landed first and is the broader fix.

Both PRs target #15779 (custom_providers per-model context_length ignored on /model switch). Coverage comparison:

Root cause: Both honor custom_providers[].models.<id>.context_length across resolution paths beyond agent startup.
Scope: fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K #15844 is broader — it adds a single helper get_custom_provider_context_length() as the source of truth, threads custom_providers= through agent.model_metadata.get_model_context_length(), and wires it through five call sites (run_agent startup, AIAgent.switch_model(), resolve_display_context_length(), gateway /model confirmation, _format_session_info). My PR only touched model_switch.py and the run_agent loop.
Probe tier bump: fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K #15844 also bumps CONTEXT_PROBE_TIERS[0] to 256K, which my PR did not address.

No remaining gap. Thanks @teknium1.

Copilot AI review requested due to automatic review settings April 25, 2026 21:13

Copilot started reviewing on behalf of briandevans April 25, 2026 21:13 View session

Copilot AI reviewed Apr 25, 2026

View reviewed changes

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/cli CLI entry point, hermes_cli/, setup wizard comp/agent Core agent loop, run_agent.py, prompt builder area/config Config system, migrations, profiles labels Apr 25, 2026

teknium1 mentioned this pull request Apr 27, 2026

fix(/model): respect per-model context_length from custom_providers config #11438

Closed

briandevans mentioned this pull request Apr 27, 2026

fix(gateway): propagate credential_pool through /model session overrides (#16678) #16701

Closed

4 tasks

briandevans closed this Apr 28, 2026

AgentArcLab mentioned this pull request May 9, 2026

fix(agent): clear stale config context_length on model switch #22387

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779)#15787

fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779)#15787
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/model-switch-custom-provider-context-length-15779

briandevans commented Apr 25, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

alt-glitch commented Apr 25, 2026

Uh oh!

briandevans commented Apr 25, 2026

Uh oh!

briandevans commented Apr 25, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

briandevans commented Apr 25, 2026

Summary

The bug

The fix

Test plan

Related

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

alt-glitch commented Apr 25, 2026

Uh oh!

briandevans commented Apr 25, 2026

Uh oh!

briandevans commented Apr 25, 2026

Uh oh!

briandevans commented Apr 26, 2026

Uh oh!

briandevans commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants