fix: preserve configured context lengths across runtime paths #14008

Open
tongguang2 wants to merge 1 commit into NousResearch:main from tongguang2:fix/context-length-cache-and-runtime-overrides

@tongguang2

Summary

This PR fixes a set of context-length inconsistencies that still affect custom-provider runtimes across different execution paths.

What it fixes

  • Normalize persistent context-length cache keys so /v1 and /v1/ resolve to the same provider endpoint
  • Preserve configured context-length overrides when switching models
  • Preserve configured context-length overrides when activating fallback models
  • Avoid incorrectly reusing a per-model custom_providers context override for a different target model
  • Add regression tests for cache normalization, switch-model handling, fallback handling, and configured override persistence
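
To illustrate the per-model override point above, here is a minimal sketch of a lookup that only returns a configured context length when both the endpoint and the model match the runtime target. The config shape (a list of `custom_providers` entries with `base_url`, `model`, and `context_length` keys) and the helper name `resolve_configured_context_length` are assumptions made for illustration, not the code in this PR.

```python
from typing import Any, Iterable, Mapping, Optional


def _normalize(url: str) -> str:
    # Treat ".../v1" and ".../v1/" as the same endpoint.
    return url.strip().rstrip("/")


def resolve_configured_context_length(
    custom_providers: Iterable[Mapping[str, Any]],
    base_url: str,
    model: str,
) -> Optional[int]:
    """Return the configured override for this exact runtime target, if any.

    Hypothetical helper: an entry that names a model applies only to that
    model, so switching to a different model never inherits its override.
    """
    target = _normalize(base_url)
    for entry in custom_providers:
        if _normalize(str(entry.get("base_url", ""))) != target:
            continue
        entry_model = entry.get("model")
        if entry_model is not None and entry_model != model:
            continue  # per-model override for a different model: skip it
        ctx = entry.get("context_length")
        if isinstance(ctx, int) and ctx > 0:
            return ctx
    return None
```

With a single entry configured for `model-a`, a lookup for `model-b` on the same endpoint returns `None`, so the caller falls through to whatever detection it would normally use.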

Root cause

There were two separate issues:

  1. Persistent context-length cache lookups treated .../v1 and .../v1/ as different keys, which caused cache misses for the same custom provider endpoint in different runtime paths.
  2. Runtime context refresh paths (switch_model() / fallback activation) did not consistently resolve the configured context-length override for the active target model and endpoint.

As a result, custom providers that were correctly configured in config.yaml could still fall back to an incorrectly detected context length after a model switch, after a fallback activation, or after a cache miss caused by the trailing-slash mismatch.
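
The first issue can be reproduced in a few lines. This is a sketch under assumptions: the `model@base_url` key format and the function names are hypothetical and chosen only to show why the two URL spellings miss each other until the base URL is normalized.

```python
def raw_cache_key(model: str, base_url: str) -> str:
    # Hypothetical pre-fix behaviour: the base URL is used verbatim.
    return f"{model}@{base_url}"


def normalize_base_url(base_url: str) -> str:
    # Strip surrounding whitespace and any trailing slashes.
    return base_url.strip().rstrip("/")


def normalized_cache_key(model: str, base_url: str) -> str:
    # Hypothetical post-fix behaviour: both spellings share one key.
    return f"{model}@{normalize_base_url(base_url)}"


a = "http://localhost:8000/v1"
b = "http://localhost:8000/v1/"

# Without normalization the two spellings produce different keys (cache miss).
keys_differ_before = raw_cache_key("my-model", a) != raw_cache_key("my-model", b)

# With normalization they collapse to the same key (cache hit).
keys_match_after = normalized_cache_key("my-model", a) == normalized_cache_key("my-model", b)
```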

Implementation details

  • Added base URL normalization for persistent context-length cache writes
  • Made cache reads backward-compatible with legacy slash variants
  • Added runtime-target-aware context-length resolution in AIAgent
  • Reused that runtime-target-aware resolution when refreshing model metadata during model switch and fallback activation
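
As a rough sketch of the first two points, the cache could write only normalized keys while still reading entries stored under the legacy trailing-slash form. The in-memory dict, the `model@base_url` key format, and the function names below are assumptions for illustration; the real persistent cache may be structured differently.

```python
from typing import Dict, Optional


def normalize_base_url(base_url: str) -> str:
    # Strip surrounding whitespace and any trailing slashes.
    return base_url.strip().rstrip("/")


def save_context_length(
    cache: Dict[str, int], model: str, base_url: str, length: int
) -> None:
    # Writes always use the normalized key.
    cache[f"{model}@{normalize_base_url(base_url)}"] = length


def get_cached_context_length(
    cache: Dict[str, int], model: str, base_url: str
) -> Optional[int]:
    # Reads try the normalized key first, then the legacy slash variant,
    # so entries written before normalization are still found.
    normalized = normalize_base_url(base_url)
    for candidate in (normalized, normalized + "/"):
        value = cache.get(f"{model}@{candidate}")
        if value is not None:
            return value
    return None
```

Reading the legacy variant rather than migrating keys in place keeps old cache files usable without a one-off rewrite step, at the cost of one extra dictionary lookup on a miss.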

Tests

Verified with:

  • tests/agent/test_model_metadata.py
  • tests/run_agent/test_switch_model_context.py
  • tests/run_agent/test_fallback_model.py
  • tests/run_agent/test_provider_fallback.py
  • tests/run_agent/test_invalid_context_length_warning.py
  • tests/run_agent/test_compression_feasibility.py

Local result:

  • 147 passed

Notes

This PR is intentionally scoped to the runtime-path inconsistency and cache-key normalization issue. It does not introduce broader changes to provider metadata discovery beyond what is needed to make configured context lengths behave consistently.

- normalize context cache keys so /v1 and /v1/ share the same entry
- resolve configured context overrides per runtime target during switch and fallback
- add regression coverage for cache normalization and per-model override handling
@alt-glitch added the labels type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) on Apr 22, 2026
@alt-glitch

Collaborator

Related to #11437 (context length display/runtime alignment), #8785 / #8786 (compression context overrides) — same family of context-length consistency bugs across different runtime paths.

@alt-glitch

Collaborator

Related to #11437, #8785, #8786.
