fix: resolve context_length from custom_providers on model switch #13052

Closed
TroyMitchell911 wants to merge 1 commit into NousResearch:main from TroyMitchell911:fix/custom-provider-context-length

@TroyMitchell911 (Contributor) commented Apr 20, 2026

Summary

Three related bugs caused custom_providers per-model context_length overrides to be ignored, particularly when switching models at runtime via /model. This resulted in models like gemini-3.1-pro-preview (1M context) being incorrectly reported as 128K when served through proxy endpoints (LiteLLM, NewAPI, etc.).

Root Cause

  1. __init__ assignment order (run_agent.py): self._config_context_length was assigned before the custom_providers lookup, so per-model overrides from custom_providers were never stored on the agent instance.

  2. switch_model stale value (run_agent.py): When switching models, the old model's _config_context_length was reused instead of re-resolving from custom_providers for the new (model, base_url) pair.

  3. Early return on unknown endpoints (model_metadata.py): When a custom endpoint's /models response omitted context_length, get_model_context_length() returned 128K immediately instead of falling through to models.dev / OpenRouter / hardcoded defaults.
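The first two root causes can be sketched as follows. This is a minimal illustration, not the code in run_agent.py: the `CONFIG` shape, the `resolve_config_context_length` helper, and the simplified `Agent` class are all assumptions made for the example, as is the precedence (per-model override first, then the top-level value).

```python
from typing import Optional

# Hypothetical config shape for illustration; the real config.yaml schema may differ.
CONFIG = {
    "model": {"context_length": None},
    "custom_providers": [
        {
            "base_url": "https://proxy.example/v1",
            "models": {"gemini-3.1-pro-preview": {"context_length": 1_000_000}},
        }
    ],
}


def resolve_config_context_length(config: dict, model: str, base_url: str) -> Optional[int]:
    """Look up a configured context length for one (model, base_url) pair."""
    for provider in config.get("custom_providers") or []:
        # Compare base URLs without trailing slashes so both spellings match.
        if (provider.get("base_url") or "").rstrip("/") != base_url.rstrip("/"):
            continue
        override = (provider.get("models") or {}).get(model, {}).get("context_length")
        if override:
            return int(override)
    # Assumed precedence: fall back to the top-level model.context_length.
    top_level = (config.get("model") or {}).get("context_length")
    return int(top_level) if top_level else None


class Agent:
    """Simplified stand-in for the real agent class."""

    def __init__(self, config: dict, model: str, base_url: str) -> None:
        self.config = config
        self.model = model
        self.base_url = base_url
        # Bug 1: store the value only after the custom_providers lookup has run.
        self._config_context_length = resolve_config_context_length(config, model, base_url)

    def switch_model(self, model: str, base_url: str) -> None:
        self.model = model
        self.base_url = base_url
        # Bug 2: re-resolve for the new pair instead of keeping the old value.
        self._config_context_length = resolve_config_context_length(
            self.config, model, base_url
        )
```

With this shape, an agent started on a model that has no override and then switched to gemini-3.1-pro-preview picks up the per-model value at switch time rather than carrying over whatever was resolved in `__init__`.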

Changes

  • run_agent.py: Move self._config_context_length assignment after the custom_providers lookup in __init__.
  • run_agent.py: In switch_model(), re-resolve config_context_length from config.yaml (both top-level model.context_length and custom_providers per-model overrides) for the new model.
  • agent/model_metadata.py: Remove the early return DEFAULT_FALLBACK_CONTEXT for unknown custom endpoints when /models lacks context_length — fall through to downstream registries instead.
  • Tests: Updated existing test_switch_model_context.py and added new test files for both fixes.
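The fallthrough described for agent/model_metadata.py can be illustrated with a small lookup chain. This is a sketch under stated assumptions rather than the actual get_model_context_length(): the `resolve_context_length` function, the `Lookup` type, the two example sources, and the numeric value chosen for "128K" are hypothetical.

```python
from typing import Callable, Optional, Sequence

# Assumed numeric value for the 128K default named in this PR.
DEFAULT_FALLBACK_CONTEXT = 128_000

# A lookup returns a context length, or None when the source has no answer.
Lookup = Callable[[str], Optional[int]]


def resolve_context_length(model: str, lookups: Sequence[Lookup]) -> int:
    """Try each source in order; use the default only if every source is silent."""
    for lookup in lookups:
        value = lookup(model)
        if value is not None:
            return value
        # A None here means "keep going", not "return the default now".
    return DEFAULT_FALLBACK_CONTEXT


def endpoint_models(model: str) -> Optional[int]:
    """Stand-in for a proxy whose /models response omits context_length."""
    return None


# Stand-in for a downstream registry such as models.dev or OpenRouter.
REGISTRY = {"gemini-3.1-pro-preview": 1_000_000}


def registry(model: str) -> Optional[int]:
    return REGISTRY.get(model)
```

The design point is that a missing value from one source is treated as "no answer" and the default is applied once, at the end of the chain.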

Test Plan

  • All 30 related tests pass (context switch, compression feasibility, invalid context warnings, custom endpoint fallthrough).
  • Manual verification: switching to gemini-3.1-pro-preview via a proxy endpoint now correctly reports 1M context.

Three related bugs caused custom_providers per-model context_length
overrides to be ignored, particularly when switching models at runtime:

1. __init__: self._config_context_length was assigned *before* the
   custom_providers lookup, so per-model overrides were never stored
   on the agent instance.  Moved the assignment after the lookup.

2. switch_model: re-resolve config_context_length from config.yaml
   for the new (model, base_url) pair instead of reusing the stale
   value from __init__.  Checks both top-level model.context_length
   and custom_providers per-model overrides.

3. get_model_context_length: when a custom endpoint's /models response
   omits context_length, fall through to models.dev / OpenRouter /
   hardcoded defaults instead of returning 128K immediately.  Proxy
   endpoints (LiteLLM, NewAPI, etc.) often omit context_length but
   serve well-known models whose metadata is available downstream.

Signed-off-by: Troy Mitchell <i@troy-y.org>

@trevorgordon981

Good work on this fix.

Key improvements:

  • Properly resolves context_length from custom_providers on model switch
  • Fixes three related bugs that caused per-model overrides to be ignored
  • Comprehensive test coverage (4 tests, all passing)

The changes address:

  1. Correct assignment order in init
  2. Fresh resolution of config_context_length in switch_model
  3. Proper fallthrough logic for unknown custom endpoints

This ensures models served through proxy endpoints correctly report their context lengths. Ready to merge.

@TroyMitchell911 force-pushed the fix/custom-provider-context-length branch 2 times, most recently from ca849be to bc0ce12, on April 20, 2026 14:27

@TroyMitchell911 (Contributor, Author)

The PR is ready to be merged. :)

@alt-glitch added labels on Apr 22, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/cli (CLI entry point, hermes_cli/, setup wizard), area/config (Config system, migrations, profiles)

@alt-glitch (Collaborator)

Related to #14008 (context length cache key normalization) and #11438 (same custom_providers context_length path). Multiple PRs target this area, so a coordinated merge may be needed.

@teknium1 (Contributor)

Thanks for the well-structured fix and the thorough test coverage, @TroyMitchell911!

This is an automated hermes-sweeper review. After inspecting current main, all three bugs described here have already been resolved by a separate fix that landed as PR #15844.

Evidence:

  • Commit 125de0205, "fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K", ships a get_custom_provider_context_length() helper in hermes_cli/config.py wired through five call sites (startup init, switch_model(), display resolver, and both gateway paths).
  • agent/model_metadata.py gains a new custom_providers= kwarg (step 0b) that resolves the per-model override before any probe, closing the early-return / 128K fallback gap.
  • The commit message of PR #15844 ("fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K") explicitly cites #15779, which @alt-glitch linked here as related, confirming the same root cause.

The fix on main is broader in scope than this PR (five call sites vs three), so closing this as superseded. The tests added here may still be worth cherry-picking if they cover paths not already covered by the new tests/hermes_cli/test_custom_provider_context_length.py introduced in #15844.
