
fix(gateway): read custom provider context length in status #8240

Open
anthhub wants to merge 1 commit into NousResearch:main from anthhub:fix/gateway-custom-provider-context-length

anthhub commented Apr 12, 2026

What does this PR do?

Fixes gateway session info / status context-length detection for models defined under custom_providers.

Previously _format_session_info() only respected model.context_length at the top level, so custom provider models with per-model context_length metadata fell back to the default 128K display/hint path.

This PR adds a shared lookup helper for custom_providers per-model context lengths and uses it in both session-info formatting and gateway session hygiene fallback logic.
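The precedence described above (top-level model.context_length first, then the custom-provider value, then the default) can be sketched roughly as follows. This is an illustrative sketch, not the code in gateway/run.py: the function name pick_context_length is hypothetical, and the default of 128_000 is an assumption based only on the "128K" wording in this description.

```python
from typing import Any, Mapping, Optional

# Assumption: this PR only says "128K"; the exact constant may differ.
DEFAULT_CONTEXT_LENGTH = 128_000


def pick_context_length(
    cfg: Mapping[str, Any],
    custom_provider_value: Optional[int],
) -> tuple[int, str]:
    """Return (context_length, source) using the precedence described above.

    Hypothetical helper for illustration only.
    """
    model_cfg = cfg.get("model") or {}
    top_level = (
        model_cfg.get("context_length") if isinstance(model_cfg, Mapping) else None
    )
    # 1. An explicit top-level model.context_length always wins.
    if isinstance(top_level, int) and top_level > 0:
        return top_level, "model.context_length"
    # 2. Otherwise use the per-model value found under custom_providers.
    if custom_provider_value is not None and custom_provider_value > 0:
        return custom_provider_value, "custom_providers"
    # 3. Otherwise fall back to the default display/hint path.
    return DEFAULT_CONTEXT_LENGTH, "default"
```

Returning the source alongside the value is a design choice of this sketch; it lets a caller decide whether to show the "set model.context_length in config to override" hint.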

Related Issue

Fixes #5089

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✅ Tests (adding or improving test coverage)

Changes Made

  • gateway/run.py
    • added a helper to resolve context_length from custom_providers
    • taught _format_session_info() to use that helper when top-level model.context_length is absent
    • reused the same helper in gateway hygiene fallback logic
    • supports both custom_providers[].models as a dict map and as a list of {name, context_length} entries
  • tests/gateway/test_session_info.py
    • added regression coverage for custom_providers list entries carrying context_length
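A minimal sketch of a lookup that accepts both shapes of custom_providers[].models listed above (a dict map and a list of {name, context_length} entries). The function name lookup_custom_context_length and its exact behaviour are assumptions for illustration; the helper actually added to gateway/run.py may differ in name, signature, and matching rules.

```python
from typing import Any, Mapping, Optional


def lookup_custom_context_length(
    cfg: Mapping[str, Any], model_name: str
) -> Optional[int]:
    """Find a per-model context_length under custom_providers (hypothetical helper)."""
    providers = cfg.get("custom_providers") or []
    if not isinstance(providers, list):
        return None
    for provider in providers:
        if not isinstance(provider, Mapping):
            continue
        models = provider.get("models")
        if isinstance(models, Mapping):
            # Dict-map shape: {"my-huge-model": {"context_length": 1100000}}
            entry = models.get(model_name)
        elif isinstance(models, list):
            # List shape: [{"name": "my-huge-model", "context_length": 1100000}]
            entry = next(
                (
                    m
                    for m in models
                    if isinstance(m, Mapping) and m.get("name") == model_name
                ),
                None,
            )
        else:
            entry = None
        if isinstance(entry, Mapping):
            value = entry.get("context_length")
            if isinstance(value, int) and value > 0:
                return value
    return None
```

With the configuration from "How to Test" loaded into a dict, this sketch would return 1100000 for my-huge-model and None for a model that no provider lists.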

How to Test

  1. Configure a custom provider like:
    model:
      default: my-huge-model
      provider: custom
      base_url: https://my-gateway.io/v1
    
    custom_providers:
      - name: my-custom
        base_url: https://my-gateway.io/v1
        models:
          - name: my-huge-model
            context_length: 1100000
  2. Open the gateway status/session info view.
  3. Verify the context display uses the configured custom-provider value instead of falling back to 128K.

Or run:

python -m pytest tests/gateway/test_session_info.py -q -n0

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature
  • I've run relevant pytest coverage locally
  • I've added tests for my changes
  • I've tested on my platform:
    • macOS

Documentation & Housekeeping

  • I've considered cross-platform impact
  • N/A — no docs/config schema changes
  • N/A — no tool schema changes

pinch-claw added a commit to pinch-claw/hermes-agent that referenced this pull request Apr 23, 2026
…nner

The session-reset / info banner in `_format_session_info` resolved the
context window only from the top-level `model.context_length` key. When
users configured context_length under the new `providers.<name>.models.<model>`
dict schema (or the legacy `custom_providers[].models.<model>` list
schema), the banner fell through to `get_model_context_length()`'s probe
chain. Remote OpenAI-compatible proxies frequently omit `context_length`
from `/models` responses, so the probe failed silently and the banner
displayed `128K tokens (default — set model.context_length in config to
override)` — even though the runtime `ContextCompressor` was already
budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`,
match the entry by `base_url` against the resolved runtime base_url, and
read `entry["models"][model]["context_length"]`. This mirrors the exact
resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the
banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):
- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-roll traversal of the legacy
  `custom_providers:` list only. They do not cover the newer
  `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps
  in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` — the existing
compat shim at `hermes_cli/config.py:2078` — gives a single lookup that
covers both schemas and stays consistent with every other runtime
caller, eliminating future drift between display and compressor.
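The base_url-matching lookup described in the referencing commit above can be sketched as follows. This is a sketch under stated assumptions: entries stands in for whatever get_compatible_custom_providers(cfg) returns (its real signature and return shape are not shown on this page), the function name context_length_for_runtime is hypothetical, and the trailing-slash normalization is an assumption of this example rather than something the commit message states.

```python
from typing import Any, Iterable, Mapping, Optional


def context_length_for_runtime(
    entries: Iterable[Mapping[str, Any]],
    runtime_base_url: str,
    model: str,
) -> Optional[int]:
    """Read entry["models"][model]["context_length"] for the matching base_url.

    Hypothetical helper; `entries` is assumed to be already normalized so that
    each entry's "models" value is a dict keyed by model name.
    """
    # Assumption: compare URLs ignoring a trailing slash.
    wanted = runtime_base_url.rstrip("/")
    for entry in entries:
        if str(entry.get("base_url", "")).rstrip("/") != wanted:
            continue
        models = entry.get("models")
        if not isinstance(models, Mapping):
            continue
        model_cfg = models.get(model)
        if isinstance(model_cfg, Mapping):
            value = model_cfg.get("context_length")
            if isinstance(value, int) and value > 0:
                return value
    return None
```

Matching on base_url rather than on provider name means the lookup follows the endpoint the runtime actually resolved, which is the property the commit message relies on to keep the banner and the compressor in agreement.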
alt-glitch added labels Apr 28, 2026: type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), comp/gateway (Gateway runner, session dispatch, delivery), area/config (Config system, migrations, profiles)

Development

Successfully merging this pull request may close these issues.

Bug: Status banner fails to detect context_length from custom_providers
