fix(context): pass config_context_length to all get_model_context_length() call sites by quinnmacro · Pull Request #12630 · NousResearch/hermes-agent

quinnmacro · 2026-04-19T17:15:17Z

What does this PR do?

Passes config_context_length to all get_model_context_length() call sites so that custom OpenAI-compatible endpoints (e.g., self-hosted, Infini-AI) that don't expose context_length via their /models API show and use the correct context window from custom_providers[].models.<model>.context_length in config.yaml.

Problem

get_model_context_length() accepts a config_context_length parameter that lets callers pass an explicit context length from custom_providers[].models.<model>.context_length in config.yaml. However, several display and utility call sites never pass this value, causing the function to fall through to the 128K default when custom endpoints don't return context_length from their /models API.

The AIAgent.__init__ path in run_agent.py correctly resolves context_length from custom_providers, but these call sites miss it:

| File | Line | Call site | Status |
|---|---|---|---|
| `gateway/run.py` | L3777 | `@context` reference expansion | Fixed in commit 1 |
| `gateway/run.py` | L4718 | `/modelinfo` display | Fixed in commit 1 |
| `gateway/run.py` | L5577 | `/model` switch display | Fixed in commit 1 |
| `cli.py` | L4750 | `/model` display (first variant) | Fixed in commit 1 |
| `cli.py` | L4979 | `/model` display (fallback variant) | Fixed in commit 1 |
| `cli.py` | L7934 | `@context` reference expansion | Fixed in commit 1 |
| `run_agent.py` | L6437 | Fallback model switch | Fixed in commit 2 |

The remaining 6 call sites already handle config_context_length correctly and need no change:

gateway/run.py L4067: already passes _hyg_config_context_length
run_agent.py L1513: already passes _config_context_length
run_agent.py L1800: already passes getattr(self, "_config_context_length", None)
run_agent.py L2014: already passes _aux_context_config
agent/context_compressor.py L305: constructor parameter, already received
hermes_cli/web_server.py L624: intentionally passes None (wants auto-detected value)
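
The fall-through described in this section can be sketched roughly as follows. This is a simplified illustration, not the actual body of get_model_context_length(): the function name resolve_context_length, the endpoint_reported parameter, the DEFAULT_CONTEXT_LENGTH constant (assumed to be 128,000 from the "128K" figure), and the precedence order are all assumptions made for the example.

```python
from typing import Optional

# Assumed default, taken from the "128K" figure quoted in this PR.
DEFAULT_CONTEXT_LENGTH = 128_000


def resolve_context_length(
    endpoint_reported: Optional[int],
    config_context_length: Optional[int] = None,
) -> int:
    """Hypothetical stand-in for the lookup order described in this PR."""
    # An explicit value from config.yaml wins when the caller passes it.
    if config_context_length is not None and config_context_length > 0:
        return config_context_length
    # Otherwise use what the endpoint's /models API reported, if anything.
    if endpoint_reported is not None and endpoint_reported > 0:
        return endpoint_reported
    # Nothing is known, so fall through to the default.
    return DEFAULT_CONTEXT_LENGTH


# A call site that forgets the config value (the bug):
buggy = resolve_context_length(endpoint_reported=None)
# A call site that passes it through (the fix):
fixed = resolve_context_length(endpoint_reported=None, config_context_length=200_000)
print(buggy, fixed)
```

Under these assumptions, the only difference between the broken and fixed call sites is whether the configured value reaches the function at all.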

Duplicate Check

I searched existing upstream PRs for the exact code path and issue terms before opening:

config_context_length
get_model_context_length passthrough
custom_providers context_length
128K default context

I did not find an open PR that fixes this specific passthrough gap.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Security fix
Documentation update
Tests (adding or improving test coverage)
Refactor (no behavior change)

Changes Made

gateway/run.py: Added config_context_length=getattr(self, "_config_context_length", None) at L3777 (@context expansion path).
gateway/run.py: Added inline custom_providers resolution at L4718 (/modelinfo display) and L5577 (/model switch display) since these paths don't have self._config_context_length cached. The resolution logic mirrors AIAgent.__init__.
cli.py: Added config_context_length=getattr(self, "_config_context_length", None) at L4750, L4979, and L7934.
run_agent.py: Added config_context_length=getattr(self, "_config_context_length", None) at L6437 (fallback model switch path).

Impact

`@context` reference expansion (gateway + CLI)

@context expansion uses the returned context length to set injection limits (hard_limit = context_length * 0.50, soft_limit = context_length * 0.25). With the 128K default, a @context reference on a 200K model hits the hard limit at 64K tokens, 36K tokens short of the correct 100K limit. File content beyond 64K tokens is silently rejected instead of being injected.
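
As a quick check of the arithmetic above, here is a minimal sketch that applies the 0.50 and 0.25 ratios to both context lengths. The helper name injection_limits is hypothetical, and "128K" and "200K" are assumed to mean 128,000 and 200,000 tokens.

```python
from typing import Tuple


def injection_limits(context_length: int) -> Tuple[int, int]:
    """Return (hard_limit, soft_limit) using the 0.50 / 0.25 ratios quoted above."""
    hard_limit = int(context_length * 0.50)
    soft_limit = int(context_length * 0.25)
    return hard_limit, soft_limit


# Limits with the 128K default versus the configured 200K window.
default_hard, default_soft = injection_limits(128_000)
configured_hard, configured_soft = injection_limits(200_000)

# How many tokens of headroom the wrong default costs at the hard limit.
shortfall = configured_hard - default_hard
print(default_hard, configured_hard, shortfall)
```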

Fallback model switch (run_agent.py)

When the primary model fails and the agent falls back, get_model_context_length() at L6437 is called without config_context_length. If the fallback model is also a custom endpoint, the context compressor updates to an incorrect 128K threshold, causing premature compression.

Display-only (no functional impact)

/modelinfo and /model switch displays show 128K instead of the configured 200K — misleading but not functionally harmful.

How to Test

Configure a custom OpenAI-compatible endpoint that does not return context_length from its /models API:

```yaml
# config.yaml
custom_providers:
  - base_url: "https://your-custom-endpoint/v1"
    models:
      your-model:
        context_length: 200000
```

Start Hermes with that model and run:
```
/modelinfo
```
Before: shows 128K context.
After: shows 200K context (from config).
Test @context reference expansion with a large file:
```
@context Read the file at /path/to/large/file.txt
```
Before: hard limit = 64K tokens; large files are refused injection.
After: hard limit = 100K tokens; correct injection.
Test /model switch display — switch to the custom model and verify context length shown is correct.

Testing

Verified with Infini-AI custom endpoint:

Before: /modelinfo showed 128K, @context hard limit = 64K
After: /modelinfo shows 200K (configured), @context hard limit = 100K

Full call-site audit: all 13 get_model_context_length() invocations across gateway/run.py, cli.py, run_agent.py, agent/context_compressor.py, and hermes_cli/web_server.py have been verified — 7 fixed, 6 already correct.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix (no unrelated commits)
I've verified all call sites of the affected function are covered

Documentation & Housekeeping

I've updated relevant documentation — N/A (internal fix, no API change)
I've updated cli-config.yaml.example if I added/changed config keys — N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
I've considered cross-platform impact — no platform-specific behavior

quinnmacro · 2026-04-26T01:00:22Z

PR Update: Rebased + Refactored

Rebased onto latest main (resolved 3 conflicts from upstream ModelInfo refactor) and extracted duplicated inline custom_providers parsing into a helper function.

Changes Summary

Before: +69/-1 across 3 files (minimal passthrough fix)
After: +94/-56 across 4 files (passthrough fix + deduplication)

| File | Change |
|---|---|
| `agent/model_metadata.py` | New `resolve_custom_providers_context_length()` helper, the single source of truth for custom_providers context length resolution |
| `cli/cli.py` | Resolved conflict: old fallback branch removed (upstream `ModelInfo` replaces it) |
| `cli/run_agent.py` | 39-line inline → 2-line helper call (`warn_on_invalid=True` preserves diagnostic logging) |
| `gateway/run.py` | 2 conflict resolutions + 2 inline blocks → helper calls (67 lines removed) |

Key Decisions

Upstream's ModelInfo refactor made two of the original fix sites unnecessary (cli.py L5401, gateway L5727) — those fallback branches were deleted
Helper placed in agent/model_metadata.py (same module as get_model_context_length)
warn_on_invalid kw-only param preserves run_agent.py init-path warning while keeping gateway paths silent
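
To illustrate the keyword-only warn_on_invalid design, here is a rough sketch of what such a helper could look like. Only the helper's name and the warn_on_invalid keyword come from this PR; the positional parameters, the base_url matching rule, the validation logic, and the log message are assumptions, and the real helper in agent/model_metadata.py may differ.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def resolve_custom_providers_context_length(
    custom_providers: Any,
    base_url: str,
    model: str,
    *,
    warn_on_invalid: bool = False,
) -> Optional[int]:
    """Sketch: look up custom_providers[].models.<model>.context_length."""
    if not isinstance(custom_providers, list):
        return None
    for provider in custom_providers:
        if not isinstance(provider, dict):
            continue
        # Assumption: providers are matched by base_url, ignoring a trailing slash.
        if str(provider.get("base_url", "")).rstrip("/") != base_url.rstrip("/"):
            continue
        models = provider.get("models")
        entry = models.get(model) if isinstance(models, dict) else None
        if not isinstance(entry, dict) or "context_length" not in entry:
            continue
        value = entry["context_length"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
        # Only the caller that opts in gets a diagnostic; others stay silent.
        if warn_on_invalid:
            logger.warning("Ignoring invalid context_length %r for model %s", value, model)
    return None


providers = [
    {
        "base_url": "https://your-custom-endpoint/v1",
        "models": {"your-model": {"context_length": 200000}},
    }
]
found = resolve_custom_providers_context_length(
    providers, "https://your-custom-endpoint/v1/", "your-model"
)
missing = resolve_custom_providers_context_length(
    providers, "https://your-custom-endpoint/v1", "other-model"
)
print(found, missing)
```

Making the flag keyword-only means a call site has to spell out warn_on_invalid=True, so display paths cannot enable warnings by accident through a stray positional argument.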

All lint + syntax checks pass. Ready for review.

Gentle ping — this PR has been open for 7 days without review. The bug affects any user with custom_providers models (Infini-AI, local endpoints, etc.) where context_length gets silently truncated to 128K. Happy to make any adjustments the maintainers suggest.

@context

… sites

Custom endpoints (like Infini-AI) don't return context_length from their /models API, causing get_model_context_length() to fall through to the 128K default before reaching hardcoded fallbacks. The AIAgent init path correctly resolves context_length from custom_providers, but several display and utility call sites never passed this value.

Changes:
- agent/model_metadata.py: add resolve_custom_providers_context_length() helper that replaces all inline custom_providers lookup loops
- gateway/run.py: use helper in @context expansion, /modelinfo display, and hygiene path (was 3× duplicated inline code, now 3× 2-line calls)
- run_agent.py: use helper in AIAgent.__init__ custom_providers check (preserves warn_on_invalid=True for user-facing diagnostics)
- cli.py: pass config_context_length to get_model_context_length() at @context expansion and /model display paths
- run_agent.py: pass config_context_length to fallback model switch

Note: upstream refactored /model switch display to use ModelInfo objects (removing the else fallback), so the gateway _on_model_selected inline block from the original PR is no longer needed.

quinnmacro · 2026-05-02T13:59:00Z

Closing in favor of a new, leaner PR (#NEW).

Since this PR was opened, upstream merged #15844 which fixed the core bug (custom_providers context_length not passed to display/switch paths) using a different approach — adding a custom_providers kwarg to get_model_context_length() + a get_custom_provider_context_length() helper.

This branch accumulated 1035 commits of divergence (dirty merge state) and included unrelated code reversions. Rather than rebase-rescue it, I'm opening a clean PR from the latest main that:

Fixes a remaining bug: the @context expansion path still doesn't pass custom_providers to get_model_context_length(), so step 0b is skipped
Code cleanup: replaces 27-line inline custom_providers parsing in the hygiene run with the existing get_custom_provider_context_length() helper

Net diff: +19/-21 lines. One real bug fix, one deduplication.

alt-glitch added the type/bug, P2, comp/gateway, comp/cli, and comp/agent labels on Apr 23, 2026

quinnmacro force-pushed the fix/context-length-passthrough branch from 4ca8f04 to 3ae67e2 on April 26, 2026 01:00

quinnmacro force-pushed the fix/context-length-passthrough branch from 3ae67e2 to 672c715 on April 26, 2026 10:05

quinnmacro force-pushed the fix/context-length-passthrough branch from 672c715 to 98f3cfc on April 27, 2026 12:18

quinnmacro mentioned this pull request Apr 27, 2026

fix: prevent stale timestamp perception by injecting current time per-turn #15872

Open

alt-glitch mentioned this pull request May 2, 2026

fix: pass custom_providers to get_model_context_length() in all call sites #18777

Open

quinnmacro closed this May 2, 2026

quinnmacro mentioned this pull request May 2, 2026

refactor(gateway): use get_custom_provider_context_length() helper + pass custom_providers to @context path #18844

Open

quinnmacro wants to merge 1 commit into NousResearch:main from quinnmacro:fix/context-length-passthrough