fix(agent): clear stale config context_length on model switch #22387

Closed
AgentArcLab wants to merge 1 commit into NousResearch:main from AgentArcLab:fix/clear-config-context-length-on-model-switch

@AgentArcLab commented May 9, 2026

Contributor

Problem

When switching models (via /model or fallback), AIAgent._config_context_length is never cleared, so the new model inherits the previous model's context window instead of auto-detecting the correct one via get_model_context_length().

Root Cause

AIAgent._config_context_length is set once during __init__ from model.context_length in config.yaml. This value is never cleared in either:

  1. switch_model() — the /model command handler
  2. _try_activate_fallback() — the failover path on primary model failure

Since get_model_context_length() checks this value at resolution step 0 and returns it immediately, the new/fallback model inherits the old override instead of going through the full resolution chain (custom_providers per-model, endpoint metadata, models.dev, etc.).
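The short-circuit described above can be illustrated with a minimal sketch. This is not the project's actual get_model_context_length(); the function name resolve_context_length, the lookup table, and the default value are hypothetical stand-ins, and the later resolution steps are collapsed into a single dictionary lookup.

```python
from typing import Optional

# Hypothetical stand-in for the later resolution steps
# (custom_providers per-model, endpoint metadata, models.dev, ...).
KNOWN_CONTEXT_WINDOWS = {"model-a": 1_048_576, "model-b": 200_000}
DEFAULT_CONTEXT_LENGTH = 128_000  # assumed default, not taken from the PR


def resolve_context_length(model: str, config_context_length: Optional[int] = None) -> int:
    """Simplified resolution chain with a step-0 override."""
    # Step 0: an explicit config override wins and returns immediately.
    if config_context_length is not None:
        return config_context_length
    # Remaining steps, collapsed into one lookup for illustration.
    return KNOWN_CONTEXT_WINDOWS.get(model, DEFAULT_CONTEXT_LENGTH)


# With the stale override still set, model B reports model A's window.
stale = resolve_context_length("model-b", config_context_length=1_048_576)
# With the override cleared, model B resolves its own window.
fresh = resolve_context_length("model-b", config_context_length=None)
print(stale, fresh)
```

As long as the override is not None, the model name is never consulted, which is why the value survives a model switch.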

Fix

Set self._config_context_length = None in both code paths, before the runtime field swap. This lets get_model_context_length() skip the stale step-0 override and resolve the context window for the newly selected model through the standard chain (step 0b: custom_providers per-model, then endpoint probe, models.dev, etc.).
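A rough sketch of the shape of the change, assuming a heavily simplified agent. Only the names _config_context_length, switch_model, and _try_activate_fallback come from this PR; SketchAgent, lookup_context_length, the fallback_model attribute, and the model table are hypothetical, and the real methods in run_agent.py swap many more runtime fields.

```python
from typing import Optional

# Hypothetical per-model metadata standing in for the real resolution chain.
KNOWN = {"model-a": 1_048_576, "model-b": 200_000, "model-c": 32_768}


def lookup_context_length(model: str, config_context_length: Optional[int] = None) -> int:
    # Step 0: the config override short-circuits everything else.
    if config_context_length is not None:
        return config_context_length
    return KNOWN[model]


class SketchAgent:
    def __init__(self, model: str, config_context_length: Optional[int] = None,
                 fallback_model: Optional[str] = None) -> None:
        self.model = model
        self.fallback_model = fallback_model
        # Set once at construction time, as described in the root cause.
        self._config_context_length = config_context_length

    @property
    def context_length(self) -> int:
        return lookup_context_length(self.model, self._config_context_length)

    def switch_model(self, new_model: str) -> None:
        # The fix: drop the override before swapping runtime fields.
        self._config_context_length = None
        self.model = new_model

    def _try_activate_fallback(self) -> bool:
        if self.fallback_model is None:
            return False
        # Same fix on the failover path.
        self._config_context_length = None
        self.model = self.fallback_model
        return True


agent = SketchAgent("model-a", config_context_length=1_048_576, fallback_model="model-c")
initial = agent.context_length
agent.switch_model("model-b")
after_switch = agent.context_length
activated = agent._try_activate_fallback()
after_fallback = agent.context_length
print(initial, after_switch, activated, after_fallback)
```

Clearing the field in both methods keeps the two paths symmetric, so a fallback triggered before any /model switch also re-resolves the window.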

Testing

  1. Configure model.context_length: 1048576 in config.yaml for model A (1M context)
  2. Start Hermes with model A — verify 1M context window
  3. Switch to model B (e.g. 200K context) via /model. Before fix: still shows 1M; after fix: correctly shows 200K
  4. Switch back to model A — correctly shows 1M again
  5. Trigger a fallback (e.g. rate-limit the primary model) — before fix: fallback model inherits primary's context window; after fix: fallback model resolves its own context window
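Steps 1 to 4 could also be captured as an automated check. The following is only a sketch against a fake agent, not the project's test suite: FakeAgent, run_scenario, the clear_on_switch flag, and the window table are all hypothetical. In this sketch model A reports 1M after switching back because its metadata says so, not because the config override is restored.

```python
from typing import List, Optional

# Hypothetical metadata: model A has a 1M window, model B a 200K window.
WINDOWS = {"model-a": 1_048_576, "model-b": 200_000}


class FakeAgent:
    def __init__(self, model: str, config_context_length: Optional[int],
                 clear_on_switch: bool) -> None:
        self.model = model
        self._config_context_length = config_context_length
        # Toggles between pre-fix (False) and post-fix (True) behaviour.
        self.clear_on_switch = clear_on_switch

    def context_length(self) -> int:
        if self._config_context_length is not None:
            return self._config_context_length
        return WINDOWS[self.model]

    def switch_model(self, new_model: str) -> None:
        if self.clear_on_switch:
            self._config_context_length = None
        self.model = new_model


def run_scenario(clear_on_switch: bool) -> List[int]:
    """Start on model A, switch to B, switch back to A; record each window."""
    agent = FakeAgent("model-a", 1_048_576, clear_on_switch)
    observed = [agent.context_length()]
    agent.switch_model("model-b")
    observed.append(agent.context_length())
    agent.switch_model("model-a")
    observed.append(agent.context_length())
    return observed


before_fix = run_scenario(clear_on_switch=False)
after_fix = run_scenario(clear_on_switch=True)
print(before_fix, after_fix)
```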

Closes #21509

When switching models via /model, AIAgent._config_context_length was
never cleared, so the new model inherited the previous model's context
window instead of auto-detecting the correct one via
get_model_context_length().

Clear _config_context_length to None before the runtime field swap so
the full resolution chain (custom_providers per-model, endpoint probe,
models.dev, etc.) is re-evaluated for the newly selected model.

Closes NousResearch#21509
@AgentArcLab force-pushed the fix/clear-config-context-length-on-model-switch branch from 738538b to e25ac0e on May 9, 2026 07:58
@alt-glitch added the following labels on May 9, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), area/config (Config system, migrations, profiles), duplicate (This issue or pull request already exists)
@alt-glitch

Collaborator

Duplicate of #11438 — same root cause (stale _config_context_length not cleared on model switch). #11438 also covers the custom_providers per-model resolution path. See also closed #21509 (identical fix, never merged).

@AgentArcLab

Contributor Author

Thanks for the pointer to #11438! You're right that the root cause is the same — stale _config_context_length not cleared on model switch.

A couple of notes on where this PR differs:

  1. Also covers the fallback path. This fix clears _config_context_length in both switch_model() (the /model handler) and _try_activate_fallback() (the failover path on primary model failure). #11438 ("fix(/model): respect per-model context_length from custom_providers config") and the other related PRs, #15787 ("fix(model-switch): honor custom_providers per-model context_length on /model switch (#15779)") and #13052 ("fix: resolve context_length from custom_providers on model switch"), only address the /model switch path. When the agent falls back to a different model after a rate limit or error, the same stale context_length bug applies.

  2. Minimal scope. This PR is intentionally narrow — only 2 insertion points in run_agent.py, no changes to model_switch.py, cli.py, or gateway/run.py. The confirmation display paths are a separate concern and can be addressed independently.

I noticed that #11438, #15787, and #13052 were all closed without merging, so the bug remains unfixed on main. Happy to adjust the approach if there's a preferred direction — just want to make sure this doesn't fall through the cracks again 🙂

@teknium1

Contributor

Merged via PR #24724 (cherry-picked onto current main with your authorship preserved). Thanks for the contribution!
