fix(context): pass custom_providers in aux compression and fallback paths #18877
Closed
rkt2spc wants to merge 1 commit into
…aths

get_model_context_length() honors per-model context_length overrides defined in custom_providers[].models[].context_length only when called with the custom_providers= keyword. Two of the four call sites in run_agent.py already do this; two do not: the auxiliary compression context probe (~:2655) and the fallback-model activation path (~:7635).

Without the fix, a user who pins context_length on a model exposed by a custom provider sees the main agent correctly use the override but the compression model fall back to the 256K default, producing the "Compression model ... context is 256,000 tokens" warning and an auto-lowered session threshold even though the model can in fact handle the configured window.

The fix mirrors the existing pattern from the switch_model() path at ~:2382: load_config() + get_compatible_custom_providers(), defensively wrapped in try/except, then pass via custom_providers=. No behavior change for users not using custom_providers.
Likely duplicate of #13540: same root cause.
This was referenced May 5, 2026
This appears to be implemented on current main. Automated hermes-sweeper review found both paths from this PR covered now, though the code has since been refactored out of Evidence:
Thanks for the clear writeup and repro details; they match the behavior now present on main.
fix(context): pass `custom_providers` in aux compression and fallback paths

Problem

Users who pin `context_length` on a model in `custom_providers` see the main agent honor that override but the auxiliary compression model fall back to the default 256K, even when it's literally the same model. This produces a confusing startup warning and silently lowers the compression threshold every session.

What goes wrong (user perspective)
Take a `custom_providers` entry with a per-model context override of 1M tokens (typical for the 1M-context Opus variant against a custom proxy) and `compression.threshold` set to 0.5.

Expected: compression triggers at 500K tokens (50% of 1M).

Actual: at session start, a warning of the form "Compression model ... context is 256,000 tokens" appears.

Both labels in that message are the same model, `claude-opus-4-7[1m]`, against the same custom provider. Yet Hermes resolves it to two different context windows: 1M for the main agent (correct), 256K for the compression model (wrong). The session quietly runs at half the configured threshold and the user has to either ignore the warning every session or manually drop `compression.threshold` to 0.25.

Root cause
`agent/model_metadata.py:1229`'s `get_model_context_length()` accepts a `custom_providers` keyword. When supplied, step 0b in its resolution order checks `custom_providers[].models[].context_length` for an explicit override. Without it, that step is skipped and resolution falls through (eventually) to the `DEFAULT_FALLBACK_CONTEXT` of 256K for unknown custom endpoints.

There are four call sites in `run_agent.py`. Two pass `custom_providers`; two don't. One of the sites that already passes it is the `/model` switch (closes #15779). The two that don't are site 2655 (the auxiliary compression context probe) and site 7635 (the fallback-model activation path).

Site 2655 is what produces the warning. Site 7635 has the same bug but is latent: it only triggers when a user has both a custom-provider main model AND a `fallback_model`; the symptom would be the fallback's compressor mis-sizing its threshold.

Fix
Mirror the existing pattern already used at `run_agent.py:2382-2389`: load `custom_providers` from the current config (defensively wrapped in `try/except` so a load failure doesn't break the path) and pass it via the existing `custom_providers=` keyword.

The same shape is applied at site 7635 for the fallback path.
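A minimal, self-contained sketch of that call-site pattern follows. It is not the code from `run_agent.py`: the bodies of `get_model_context_length()`, `load_config()` and `get_compatible_custom_providers()` are simplified stand-ins, the `name` key on each model entry is an assumed schema, and `resolve_compression_context()` is a hypothetical helper name used only for illustration.

```python
from typing import Any, Callable, Optional

DEFAULT_FALLBACK_CONTEXT = 256_000  # default for unknown custom endpoints, per the description


def get_model_context_length(
    model: str, custom_providers: Optional[list[dict[str, Any]]] = None
) -> int:
    """Stand-in resolver: honor a per-model override only if providers are passed."""
    for provider in custom_providers or []:
        for entry in provider.get("models", []):
            # "name" is an assumed key; the real schema may differ.
            if entry.get("name") == model and entry.get("context_length"):
                return int(entry["context_length"])
    return DEFAULT_FALLBACK_CONTEXT


def load_config() -> dict[str, Any]:
    """Stand-in config loader returning one custom provider with an override."""
    return {
        "custom_providers": [
            {"models": [{"name": "claude-opus-4-7[1m]", "context_length": 1_000_000}]}
        ]
    }


def get_compatible_custom_providers(config: dict[str, Any]) -> list[dict[str, Any]]:
    """Stand-in: return the provider list from the loaded config."""
    return config.get("custom_providers", [])


def resolve_compression_context(
    model: str, loader: Callable[[], dict[str, Any]] = load_config
) -> int:
    """Hypothetical call site: load providers defensively, then pass them through."""
    try:
        custom_providers = get_compatible_custom_providers(loader())
    except Exception:
        # A config load failure must not break the path; fall back to no overrides.
        custom_providers = None
    return get_model_context_length(model, custom_providers=custom_providers)


def broken_loader() -> dict[str, Any]:
    raise OSError("config unreadable")


with_override = resolve_compression_context("claude-opus-4-7[1m]")
without_keyword = get_model_context_length("claude-opus-4-7[1m]")
after_load_failure = resolve_compression_context("claude-opus-4-7[1m]", broken_loader)
```

In this sketch `with_override` picks up the pinned window, while both the call without the keyword and the call whose config load fails resolve to the default, which mirrors the unfixed behavior described above.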
No new code paths. No behavior change for users not using `custom_providers`.

Test plan
- `uv run pytest` (existing suite passes)
- `custom_providers[].models[].context_length: 1000000` and `compression.threshold: 0.5` against a custom Anthropic-compatible endpoint; start a session → warning does not appear, `context_compressor.threshold_tokens` is 500,000.
- `fallback_model` pointing at the same custom provider; force the primary to fail (e.g. wrong port); confirm fallback's `context_compressor.context_length` reflects the per-model override (not the 256K default).
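The numbers in the manual checks can be sanity-checked with a few lines of arithmetic. This sketch assumes the threshold is simply the context length multiplied by `compression.threshold`, which matches the "50% of 1M" figure in the description; `threshold_tokens()` here is a hypothetical helper, not the compressor's actual implementation.

```python
OVERRIDE_CONTEXT = 1_000_000        # per-model override from the test plan
DEFAULT_FALLBACK_CONTEXT = 256_000  # default resolved when the override is missed


def threshold_tokens(context_length: int, threshold: float) -> int:
    """Assumed formula: token count at which compression triggers."""
    return round(context_length * threshold)


# Configured behavior: 50% of the 1M window.
expected = threshold_tokens(OVERRIDE_CONTEXT, 0.5)

# The configured threshold does not fit inside the mis-resolved 256K window.
exceeds_default_window = expected > DEFAULT_FALLBACK_CONTEXT

# The manual workaround from the description: a 0.25 threshold stays under 256K.
workaround = threshold_tokens(OVERRIDE_CONTEXT, 0.25)
workaround_fits = workaround <= DEFAULT_FALLBACK_CONTEXT
```

Under that assumption, the expected value lines up with the `threshold_tokens` figure in the second check, and the 0.25 workaround is the largest round setting that still fits inside the default window.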