fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K by teknium1 · Pull Request #15844 · NousResearch/hermes-agent

teknium1 · 2026-04-26T01:47:48Z

Summary

/model <model> --provider custom:<name> now reports the context window the user configured in custom_providers[].models.<id>.context_length instead of falling back to 128K. Closes #15779. Also bumps the top context-probe tier (and default fallback) from 128K to 256K.

Root cause: the per-model override was read only in AIAgent.__init__. switch_model(), resolve_display_context_length(), and _format_session_info() all ignored it and fell through to the probe-down fallback.

Changes

hermes_cli/config.py: new get_custom_provider_context_length() helper — single source of truth for the per-model override lookup, trailing-slash-insensitive
agent/model_metadata.py: get_model_context_length() gains custom_providers= kwarg (step 0b — after explicit config_context_length, before every probe); CONTEXT_PROBE_TIERS prefixed with 256K so DEFAULT_FALLBACK_CONTEXT = 256K; stale 128000 literal in OR metadata-miss path replaced with DEFAULT_FALLBACK_CONTEXT
run_agent.py: startup path refactored to use the helper (dedups inline loop, preserves invalid-value warning); AIAgent.switch_model() re-reads custom_providers from live config so mid-session switches resolve the override
hermes_cli/model_switch.py: resolve_display_context_length() accepts + forwards custom_providers
gateway/run.py: /model confirmation (picker callback + text path) and _format_session_info thread custom_providers through

Validation

	Before	After
`/model gpt-5.5 --provider custom:my-endpoint` with `context_length: 1050000`	Context: 128,000	Context: 1,050,000
Unknown-model default (no detection succeeds)	128,000	256,000
`CONTEXT_PROBE_TIERS[0]`	128,000	256,000

Targeted tests: tests/agent/test_model_metadata.py tests/hermes_cli/test_custom_provider_context_length.py tests/hermes_cli/test_model_switch_context_display.py tests/hermes_cli/test_model_switch_custom_providers.py tests/hermes_cli/test_custom_provider_model_switch.py tests/run_agent/test_invalid_context_length_warning.py tests/run_agent/test_switch_model_context.py tests/agent/test_model_metadata_local_ctx.py tests/gateway/test_session_info.py — 171/171 pass. Broader tests/gateway/ tests/run_agent/ delta: 47 failing on origin/main → 46 failing on branch (same pre-existing failures, none new).

E2E via execute_code: exact #15779 repro config + helper edge cases (trailing slash, wrong url/model, invalid value types, zero/negative, config_context_length precedence) all green.

New tests

tests/hermes_cli/test_custom_provider_context_length.py — 19 tests: helper unit + step 0b integration + CONTEXT_PROBE_TIERS invariants
tests/hermes_cli/test_model_switch_context_display.py — added regression tests for Bug: /model switch to named custom provider ignores custom_providers model context_length #15779 through the display resolver

Closes #15779.

…+ bump probe tier to 256K Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from #15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000". ## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier

The _check_compression_model_feasibility call to get_model_context_length was missing the custom_providers= kwarg, so the aux compression model fell back to 256K instead of reading the per-model context_length from custom_providers config. This caused a spurious "Auto-lowered threshold" warning when both main and aux models are on the same custom endpoint. Also removes dead _ctx_kwargs code left over from the merge. Fixes the same gap that cbabf6a addressed via agent_config=, now using the upstream custom_providers= pattern from PR NousResearch#15844.

…+ bump probe tier to 256K (NousResearch#15844) Fixes NousResearch#15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from NousResearch#15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000". ## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for NousResearch#15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier

teknium1 merged commit 125de02 into main Apr 26, 2026
10 of 11 checks passed

teknium1 deleted the hermes/hermes-c8b3b316 branch April 26, 2026 01:47

teknium1 mentioned this pull request Apr 26, 2026

Bug: /model switch to named custom provider ignores custom_providers model context_length #15779

Closed

1 task

sadlay mentioned this pull request Apr 26, 2026

Fix config-driven context length resolution #15948

Open

This was referenced Apr 27, 2026

Generic 400/disconnect errors misclassified as context_overflow in 1M-context sessions #16351

Closed

fix(error_classifier): avoid large-context false overflow heuristics #16352

Closed

chevyphillip mentioned this pull request Apr 27, 2026

Bug: Status banner fails to detect context_length from custom_providers #5089

Closed

alt-glitch mentioned this pull request Apr 27, 2026

fix(gateway): read context_length from custom_providers in session info header #16579

Closed

This was referenced Apr 28, 2026

fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779) #15787

Closed

fix(context-length): fall through to catalog when endpoint probe misses; pass config_context_length to display resolver #15732

Closed

This was referenced Apr 29, 2026

fix(model_metadata): auto-load global custom_providers in get_model_context_length (closes #12977) (closes #13807) #17460

Open

fix(agent): honor custom provider context in compression feasibility check #8721

Closed

This was referenced May 2, 2026

fix(context): pass config_context_length to all get_model_context_length() call sites #12630

Closed

refactor(gateway): use get_custom_provider_context_length() helper + pass custom_providers to @context path #18844

Open

alt-glitch mentioned this pull request May 3, 2026

fix(cli): honor custom context lengths in model switch display #19034

Closed

19 tasks

alt-glitch mentioned this pull request May 15, 2026

fix(fallback): forward custom_providers to fallback model context-length detection #26043

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K#15844

fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K#15844
teknium1 merged 1 commit into
mainfrom
hermes/hermes-c8b3b316

teknium1 commented Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

teknium1 commented Apr 26, 2026

Summary

Changes

Validation

New tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants