fix(context): pass config_context_length to all get_model_context_length() call sites#12630
quinnmacro wants to merge 1 commit into
4ca8f04 to
3ae67e2
PR Update: Rebased + Refactored
Rebased onto latest
Changes Summary
Before:
Key Decisions
All lint + syntax checks pass. Ready for review.
Gentle ping: this PR has been open for 7 days without review. The bug affects any user with
3ae67e2 to
672c715
… sites

Custom endpoints (like Infini-AI) don't return context_length from their /models API, causing get_model_context_length() to fall through to the 128K default before reaching hardcoded fallbacks. The AIAgent init path correctly resolves context_length from custom_providers, but several display and utility call sites never passed this value.

Changes:
- agent/model_metadata.py: add resolve_custom_providers_context_length() helper that replaces all inline custom_providers lookup loops
- gateway/run.py: use helper in @context expansion, /modelinfo display, and hygiene path (was 3× duplicated inline code, now 3× 2-line calls)
- run_agent.py: use helper in AIAgent.__init__ custom_providers check (preserves warn_on_invalid=True for user-facing diagnostics)
- cli.py: pass config_context_length to get_model_context_length() at @context expansion and /model display paths
- run_agent.py: pass config_context_length to fallback model switch

Note: upstream refactored /model switch display to use ModelInfo objects (removing the else fallback), so the gateway _on_model_selected inline block from the original PR is no longer needed.
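For readers unfamiliar with the lookup being deduplicated, here is a minimal sketch of what resolving a context length from custom_providers might look like. The function name lookup_configured_context_length, its signature, and the "name" key are hypothetical; only the custom_providers[].models.<model>.context_length shape comes from this PR, and the actual helper in agent/model_metadata.py may differ.

```python
from typing import Any, Optional


def lookup_configured_context_length(
    custom_providers: Optional[list[dict[str, Any]]],
    model: str,
) -> Optional[int]:
    """Return the configured context_length for `model`, or None if absent or invalid.

    Hypothetical helper: walks custom_providers[].models.<model>.context_length.
    """
    for provider in custom_providers or []:
        models = provider.get("models")
        if not isinstance(models, dict):
            continue
        entry = models.get(model)
        if not isinstance(entry, dict):
            continue
        value = entry.get("context_length")
        # Accept only positive integers; bool is excluded because it subclasses int.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Example config shape (the "name" key and model names are assumptions).
providers = [
    {"name": "infini-ai", "models": {"my-model": {"context_length": 200_000}}},
]
configured = lookup_configured_context_length(providers, "my-model")
missing = lookup_configured_context_length(providers, "other-model")
```

Returning None for a missing or invalid entry lets the caller pass the result straight through as config_context_length and keep whatever fallback behavior it already has.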
672c715 to
98f3cfc
Closing in favor of a new, leaner PR (#NEW).
Since this PR was opened, upstream merged #15844, which fixed the core bug (custom_providers context_length not passed to display/switch paths) using a different approach: adding a
This branch accumulated 1035 commits of divergence (dirty merge state) and included unrelated code reversions. Rather than rebase-rescue it, I'm opening a clean PR from the latest main that:
Net diff: +19/-21 lines. One real bug fix, one deduplication.
What does this PR do?
Passes config_context_length to all get_model_context_length() call sites so that custom OpenAI-compatible endpoints (e.g., self-hosted, Infini-AI) that don't expose context_length via their /models API show and use the correct context window from custom_providers[].models.<model>.context_length in config.yaml.
Problem
get_model_context_length() accepts a config_context_length parameter that lets callers pass an explicit context length from custom_providers[].models.<model>.context_length in config.yaml. However, several display and utility call sites never pass this value, causing the function to fall through to the 128K default when custom endpoints don't return context_length from their /models API.
The AIAgent.__init__ path in run_agent.py correctly resolves context_length from custom_providers, but these call sites miss it:
- gateway/run.py: @context reference expansion
- gateway/run.py: /modelinfo display
- gateway/run.py: /model switch display
- cli.py: /model display (first variant)
- cli.py: /model display (fallback variant)
- cli.py: @context reference expansion
- run_agent.py: fallback model switch
The remaining 6 call sites already pass config_context_length:
- gateway/run.py L4067: already passes _hyg_config_context_length
- run_agent.py L1513: already passes _config_context_length
- run_agent.py L1800: already passes getattr(self, "_config_context_length", None)
- run_agent.py L2014: already passes _aux_context_config
- agent/context_compressor.py L305: constructor parameter, already received
- hermes_cli/web_server.py L624: intentionally passes None (wants auto-detected value)
Duplicate Check
I searched existing upstream PRs for the exact code path and issue terms before opening:
- config_context_length
- get_model_context_length passthrough
- custom_providers context_length
- 128K default context
I did not find an open PR that fixes this specific passthrough gap.
Type of Change
Changes Made
- gateway/run.py: Added config_context_length=getattr(self, "_config_context_length", None) at L3777 (@context expansion path).
- gateway/run.py: Added inline custom_providers resolution at L4718 (/modelinfo display) and L5577 (/model switch display) since these paths don't have self._config_context_length cached. The resolution logic mirrors AIAgent.__init__.
- cli.py: Added config_context_length=getattr(self, "_config_context_length", None) at L4750, L4979, and L7934.
- run_agent.py: Added config_context_length=getattr(self, "_config_context_length", None) at L6437 (fallback model switch path).
Impact
@context reference expansion (gateway + CLI)
@context expansion uses the returned context length to set injection limits (hard_limit = context_length * 0.50, soft_limit = context_length * 0.25). With the 128K default, an @context reference on a 200K model hits the hard limit at 64K tokens, 36K tokens short of the correct 100K limit. File content beyond 64K tokens is silently refused for injection.
Fallback model switch (run_agent.py)
When the primary model fails and the agent falls back, get_model_context_length() at L6437 is called without config_context_length. If the fallback model is also a custom endpoint, the context compressor updates to an incorrect 128K threshold, causing premature compression.
Display-only (no functional impact)
/modelinfo and /model switch displays show 128K instead of the configured 200K: misleading but not functionally harmful.
How to Test
Configure a custom OpenAI-compatible endpoint that does not return context_length from its /models API:
Start Hermes with that model and run:
Before: shows 128K context.
After: shows 200K context (from config).
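The before/after difference above comes down to which value wins when the endpoint reports nothing. The sketch below is a deliberately simplified model of that precedence, not the real get_model_context_length(): the function name pick_context_length is hypothetical, the hardcoded per-model fallbacks are omitted, and 128_000 is an assumed numeric value for the "128K default".

```python
from typing import Optional

# Assumed numeric value of the "128K default" mentioned in this PR.
DEFAULT_CONTEXT_LENGTH = 128_000


def pick_context_length(
    config_value: Optional[int],
    endpoint_value: Optional[int],
) -> int:
    """Simplified precedence: explicit config, then endpoint metadata, then default."""
    if config_value:
        return config_value
    if endpoint_value:
        return endpoint_value
    return DEFAULT_CONTEXT_LENGTH


# Before the fix: the call site passes no config value and the endpoint reports none.
before = pick_context_length(None, None)
# After the fix: the configured 200K value is passed through and takes priority.
after = pick_context_length(200_000, None)
```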
Test @context reference expansion with a large file:
Before: hard limit = 64K tokens; large files are refused injection.
After: hard limit = 100K tokens; correct injection.
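The 64K and 100K figures follow directly from the limit formulas quoted in the Impact section. As a quick check of the arithmetic, here is a small sketch; the function name injection_limits is hypothetical, and 128_000 / 200_000 are assumed numeric values for "128K" and "200K".

```python
def injection_limits(context_length: int) -> tuple[int, int]:
    """Return (hard_limit, soft_limit) using the 0.50 and 0.25 factors from this PR."""
    hard_limit = int(context_length * 0.50)
    soft_limit = int(context_length * 0.25)
    return hard_limit, soft_limit


# Limits computed from the 128K default (config value not passed).
default_hard, default_soft = injection_limits(128_000)
# Limits computed from the configured 200K context window.
configured_hard, configured_soft = injection_limits(200_000)
# How many tokens of injection headroom the default costs a 200K model.
shortfall = configured_hard - default_hard
```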
Test /model switch display: switch to the custom model and verify that the context length shown is correct.
Testing
Verified with Infini-AI custom endpoint:
- Before: /modelinfo showed 128K, @context hard limit = 64K
- After: /modelinfo shows 200K (configured), @context hard limit = 100K
Full call-site audit: all 13 get_model_context_length() invocations across gateway/run.py, cli.py, run_agent.py, agent/context_compressor.py, and hermes_cli/web_server.py have been verified: 7 fixed, 6 already correct.
Checklist
Code
fix(scope):, feat(scope):, etc.)
Documentation & Housekeeping
cli-config.yaml.example if I added/changed config keys: N/A
CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows: N/A