fix: unify context length overrides for compression runtimes #8786

Open
slicenferqin wants to merge 1 commit into NousResearch:main from slicenferqin:fix/compression-context-unify

@slicenferqin

Summary

  • unify context_length override resolution for compression-related runtimes
  • reuse custom_providers[].models[*].context_length across init, switch, fallback, and compression feasibility checks
  • add regression coverage for override refresh/clear paths

Closes #8785.

Problem

Custom provider context length overrides were not applied consistently.

The main runtime could honor custom_providers[].models[*].context_length, but other paths could still re-probe provider metadata and fall back to 128000 when /models did not expose a context field. This caused spurious compression feasibility warnings and inconsistent runtime state.
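
A minimal sketch of the mismatch described above, not the project's actual code. The `probe_context_length` helper, the shape of the `/models` payload, and the `262_144` override value are all assumptions made for illustration; only the `128000` fallback comes from the description.

```python
DEFAULT_CONTEXT_LENGTH = 128_000  # fallback value named in the description


def probe_context_length(models_response: dict, model: str) -> int:
    """Hypothetical probe: read a context field from a /models-style payload."""
    for entry in models_response.get("data", []):
        if entry.get("id") == model and "context_length" in entry:
            return int(entry["context_length"])
    # The endpoint listed no context field for this model, so use the default.
    return DEFAULT_CONTEXT_LENGTH


# Assumed payload: the model is listed, but without any context field.
models_response = {"data": [{"id": "my-model"}]}

# Assumed value a user configured under custom_providers for this model.
configured_override = 262_144

# One path honors the override; another re-probes and silently disagrees.
main_runtime_limit = configured_override
other_path_limit = probe_context_length(models_response, "my-model")

print(main_runtime_limit, other_path_limit)
```

When two paths disagree like this, any check that compares a prompt size against the limit can reach a different verdict depending on which path computed the limit.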

Changes

  • add a shared override resolver in AIAgent
  • apply it during initial compressor setup
  • apply it when switching runtimes
  • apply it during compression feasibility checks for auxiliary compression models
  • apply it when activating fallback runtimes and when restoring the primary runtime snapshot
  • add regression tests for:
    • init uses custom provider override
    • switch refreshes override for new runtime
    • switch clears stale override when the new runtime has none
    • compression feasibility checks use the same override
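
The changes listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `AgentSketch`, `resolve_context_override`, `switch_runtime`, and the config shape (a list of providers, each with a `name` and a `models` mapping keyed by model name) are hypothetical. The real resolver lives in `AIAgent` and may match providers differently.

```python
from typing import Optional

DEFAULT_CONTEXT_LENGTH = 128_000  # fallback named in the Problem section


def resolve_context_override(custom_providers, provider_name, model) -> Optional[int]:
    """Return the configured context_length for (provider, model), or None."""
    for provider in custom_providers or []:
        if provider.get("name") != provider_name:
            continue
        model_cfg = (provider.get("models") or {}).get(model) or {}
        value = model_cfg.get("context_length")
        # Accept only positive integers; bool is excluded because it subclasses int.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


class AgentSketch:
    """Hypothetical stand-in for the agent; not the real AIAgent class."""

    def __init__(self, custom_providers, provider_name, model):
        self.custom_providers = custom_providers
        self.switch_runtime(provider_name, model)

    def switch_runtime(self, provider_name, model):
        # Re-resolve on every switch so a stale override never carries over.
        self.context_override = resolve_context_override(
            self.custom_providers, provider_name, model
        )
        # The sketch uses the default where real code would probe metadata.
        self.context_length = self.context_override or DEFAULT_CONTEXT_LENGTH


custom_providers = [
    {"name": "local", "models": {"big-model": {"context_length": 262_144}}},
    {"name": "other", "models": {"plain-model": {}}},
]

agent = AgentSketch(custom_providers, "local", "big-model")
init_length = agent.context_length      # init uses the override
agent.switch_runtime("other", "plain-model")
switched_length = agent.context_length  # stale override is cleared
```

Routing init, switch, fallback, and feasibility checks through one function means a single place decides whether an override exists, which is the property the regression tests above pin down.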

Testing

  • uv run --extra dev pytest tests/run_agent/test_switch_model_context.py tests/run_agent/test_primary_runtime_restore.py tests/agent/test_model_metadata.py -q -n 0

@slicenferqin

Author

Correction to my previous comment: follow-up enhancement request filed in #8810 to add an explicit auxiliary.compression.context_length config field. This PR fixes the current inconsistency, while #8810 would make compression runtime context configuration first-class and less dependent on provider metadata for custom endpoints.
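
One way an explicit field could slot into resolution, sketched under assumptions: the precedence order below (explicit value, then custom provider override, then probed metadata, then the default) is a guess for illustration, and `compression_context_length` is a hypothetical helper. Neither is taken from this PR or from #8810.

```python
from typing import Optional

DEFAULT_CONTEXT_LENGTH = 128_000  # fallback value mentioned in the PR description


def compression_context_length(
    explicit: Optional[int],
    provider_override: Optional[int],
    probed: Optional[int],
) -> int:
    """Pick the first positive value in an assumed precedence order."""
    for candidate in (explicit, provider_override, probed):
        if candidate is not None and candidate > 0:
            return candidate
    # Nothing was configured or discovered, so use the default.
    return DEFAULT_CONTEXT_LENGTH
```

With an explicit value present, the result no longer depends on what the provider's metadata endpoint exposes.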

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

Bug: compression runtime context_length overrides are not applied consistently

2 participants