fix(model_metadata): fall through to defaults when endpoint has model but no context_length by LucidPaths · Pull Request #12059 · NousResearch/hermes-agent

LucidPaths · 2026-04-18T08:04:07Z

Problem

When any OpenAI-compatible custom endpoint (Ollama Cloud, Chutes, LM Studio, local vLLM, etc.) returns a model in its /v1/models response but without a context_length field, the context resolution chain in get_model_context_length() immediately returned DEFAULT_FALLBACK_CONTEXT (128K) at step 2, bypassing models.dev, OpenRouter, and hardcoded defaults entirely.

This is a provider-agnostic bug — any server that lists models without complete metadata hits this wall. The specific trigger was GLM-5.1 on Ollama Cloud resolving to 128K instead of the correct 202,752 window, but the same issue affects any model on any custom endpoint with incomplete /v1/models responses.

Root Cause

In agent/model_metadata.py, step 2 took one of two branches depending on the base URL:

If _is_known_provider_base_url() returned True, the code skipped the endpoint query entirely (correct, because Copilot/OpenAI report provider-limited context, not model context).
If _is_known_provider_base_url() returned False and is_local_endpoint() returned False, a matched model with no context_length in the response made the code return DEFAULT_FALLBACK_CONTEXT (128K) immediately, which was incorrect. It should have continued to steps 3-8, which check models.dev and hardcoded defaults.
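The difference between the old and new step 2 can be sketched roughly as follows. This is a simplified illustration, not the code in agent/model_metadata.py: `resolve_old`, `resolve_new`, `later_steps`, and `HARDCODED_DEFAULTS` are hypothetical names, steps 3-8 are collapsed into a single dictionary lookup, and 128K is assumed to mean 128,000.

```python
from typing import Optional

DEFAULT_FALLBACK_CONTEXT = 128_000  # assumption: "128K" taken as 128,000

# Hypothetical stand-in for the later steps (models.dev, hardcoded defaults, ...).
HARDCODED_DEFAULTS = {"glm-5.1": 202_752}


def later_steps(model: str) -> Optional[int]:
    """Collapsed stand-in for steps 3-8 of the chain."""
    return HARDCODED_DEFAULTS.get(model)


def resolve_old(model: str, endpoint_models: dict) -> int:
    """Pre-fix shape: a match without context_length ends the chain."""
    entry = endpoint_models.get(model)
    if entry is not None:
        ctx = entry.get("context_length")
        if ctx:
            return ctx
        return DEFAULT_FALLBACK_CONTEXT  # early return, later steps never run
    return later_steps(model) or DEFAULT_FALLBACK_CONTEXT


def resolve_new(model: str, endpoint_models: dict) -> int:
    """Post-fix shape: a match without context_length falls through."""
    entry = endpoint_models.get(model)
    if entry is not None:
        ctx = entry.get("context_length")
        if ctx:
            return ctx
        # No return here: fall through to the later steps.
    return later_steps(model) or DEFAULT_FALLBACK_CONTEXT
```

With an endpoint response that lists `glm-5.1` but omits `context_length`, `resolve_old` yields the 128K fallback while `resolve_new` reaches the later lookup.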

Changes

`agent/model_metadata.py`

Remove the early return of DEFAULT_FALLBACK_CONTEXT when an endpoint matches a model but context_length is missing. Instead, log a debug message and fall through to steps 3-8 (local probe, Anthropic API, OpenRouter, models.dev, hardcoded defaults).
Move local server query out of the _is_known_provider_base_url guard so it still fires for known-provider local endpoints. No behavioral change with current _URL_TO_PROVIDER entries (all remote), but eliminates an incorrect nesting dependency.
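The nesting change for the local server query might look like the sketch below. The helper bodies are simplified stand-ins that only share names with the real functions, and the `localhost` entry in `_URL_TO_PROVIDER` is hypothetical (the PR notes that all current entries are remote); it exists only to show the case where the old nesting would have suppressed the local query.

```python
from urllib.parse import urlparse

# Hypothetical table: the "localhost" entry is invented for illustration only.
_URL_TO_PROVIDER = {"api.openai.com": "openai", "localhost": "example-local"}


def _is_known_provider_base_url(base_url: str) -> bool:
    """Simplified stand-in: match on hostname only."""
    return urlparse(base_url).hostname in _URL_TO_PROVIDER


def is_local_endpoint(base_url: str) -> bool:
    """Simplified stand-in: treat loopback hostnames as local."""
    return urlparse(base_url).hostname in {"localhost", "127.0.0.1"}


def queries_local_server_old(base_url: str) -> bool:
    # Old shape: the local query sits inside the known-provider guard.
    if not _is_known_provider_base_url(base_url):
        if is_local_endpoint(base_url):
            return True
    return False


def queries_local_server_new(base_url: str) -> bool:
    # New shape: the local query depends only on the endpoint being local.
    return is_local_endpoint(base_url)
```

For URLs that are remote, or local but not in the table, both versions agree, which matches the PR's note that current behavior is unchanged.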

`agent/models_dev.py`

Add "ollama" → "ollama-cloud" mapping in PROVIDER_TO_MODELS_DEV. When a user configures provider: ollama (the natural name for Ollama Cloud), step 5's lookup_models_dev_context("ollama-cloud", model) can now resolve context windows via models.dev.
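A minimal sketch of how such a mapping can feed a lookup, under stated assumptions: `MODELS_DEV_SNAPSHOT` and `lookup_context` are hypothetical, the snapshot data is made up for illustration, and falling back to the raw provider name when no mapping exists is an assumption rather than confirmed behavior of lookup_models_dev_context().

```python
from typing import Optional

# Only the entry named in this PR; the real table has more providers.
PROVIDER_TO_MODELS_DEV = {"ollama": "ollama-cloud"}

# Hypothetical snapshot of models.dev data, keyed by models.dev provider id.
MODELS_DEV_SNAPSHOT = {"ollama-cloud": {"glm-5.1": 202_752}}


def lookup_context(provider: str, model: str) -> Optional[int]:
    """Translate the configured provider name, then look up the model."""
    # Assumption: unmapped providers are looked up under their own name.
    models_dev_id = PROVIDER_TO_MODELS_DEV.get(provider, provider)
    return MODELS_DEV_SNAPSHOT.get(models_dev_id, {}).get(model)
```

Without the mapping, a user who configures `provider: ollama` would be looked up under `ollama`, which has no entry in this snapshot, so the lookup would return nothing.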

`tests/agent/test_model_metadata.py`

Fix regression test to use https://llm.chutes.ai/v1 (unknown custom endpoint) instead of https://ollama.com/v1 (known provider). The old URL skipped the patched code path entirely — ollama.com is in _URL_TO_PROVIDER, so _is_known_provider_base_url() returned True, and the endpoint metadata block was never entered.
Remove misleading assert result == DEFAULT_FALLBACK_CONTEXT or result == 202752 — this would accept the old buggy 128K value.
Rename test from test_custom_endpoint_without_metadata_skips_name_based_default → test_custom_endpoint_empty_metadata_falls_through_to_defaults (empty endpoint response) and add test_custom_endpoint_match_no_context_length_falls_through (endpoint returns model without context_length).
Add explicit mock_endpoint_fetch.return_value = {} to the empty-metadata test instead of relying on MagicMock default behavior.
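Two of the test fixes above can be illustrated in isolation. The snippet is a standalone sketch, not an excerpt from tests/agent/test_model_metadata.py: `loose_check` and `strict_check` are hypothetical helpers, and 128K is assumed to mean 128,000.

```python
from unittest.mock import MagicMock

DEFAULT_FALLBACK_CONTEXT = 128_000  # assumption: "128K" taken as 128,000


def loose_check(result: int) -> bool:
    """Shape of the removed assertion: also passes on the buggy fallback."""
    return result == DEFAULT_FALLBACK_CONTEXT or result == 202752


def strict_check(result: int) -> bool:
    """Only the correct window passes."""
    return result == 202752


mock_endpoint_fetch = MagicMock()

# Without configuration, calling a MagicMock returns another MagicMock,
# not an empty dict, so the test would not model an empty response.
implicit = mock_endpoint_fetch()

# With an explicit return value, the mock models an empty endpoint response.
mock_endpoint_fetch.return_value = {}
explicit = mock_endpoint_fetch()
```

The loose check accepts 128,000, so it could not have caught the regression; the explicit `return_value` makes the empty-response scenario unambiguous.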

Testing

$ python -m pytest tests/agent/test_model_metadata.py -v
81 passed in 1.48s

Both new tests specifically exercise the patched code path:

test_custom_endpoint_empty_metadata_falls_through_to_defaults — empty endpoint response, model resolves via hardcoded defaults
test_custom_endpoint_match_no_context_length_falls_through — endpoint returns model name but no context_length, resolves via hardcoded defaults instead of 128K

Architecture Context

get_model_context_length() has a 10-step resolution chain. Steps 1-2 are fast local lookups (cache, endpoint metadata). Steps 3-8 are progressively more expensive (local probe, Anthropic API, OpenRouter, models.dev, hardcoded defaults). Step 9 is a final local probe. Step 10 is the 128K fallback.

The fix ensures that failing at step 2 doesn't short-circuit the entire chain. This aligns with the documented resolution order and the design intent of having multiple fallback layers.
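One way to picture the layered design is a list of resolvers tried in order, where a resolver that has no answer returns None and the loop moves on. This is a conceptual sketch only: `resolve_context_length`, `from_endpoint`, and `from_hardcoded` are hypothetical names, and the real get_model_context_length() is not necessarily structured as a loop.

```python
from typing import Callable, Optional, Sequence

DEFAULT_FALLBACK_CONTEXT = 128_000  # assumption: "128K" taken as 128,000

Resolver = Callable[[str], Optional[int]]


def resolve_context_length(model: str, resolvers: Sequence[Resolver]) -> int:
    """Try each resolver in order; fall back only after all have failed."""
    for resolver in resolvers:
        value = resolver(model)
        if value:
            return value
    return DEFAULT_FALLBACK_CONTEXT


def from_endpoint(model: str) -> Optional[int]:
    """Hypothetical step: endpoint lists the model without context_length."""
    return None


def from_hardcoded(model: str) -> Optional[int]:
    """Hypothetical step: small table of known context windows."""
    return {"glm-5.1": 202_752}.get(model)


chain = [from_endpoint, from_hardcoded]
```

In this picture, the bug corresponds to `from_endpoint` returning the fallback value itself instead of None, which would stop the loop before `from_hardcoded` ran.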

… but no context_length

When any custom endpoint (Ollama, Chutes, LM Studio, etc.) returns a model in its /v1/models response but without context_length metadata, the resolution chain previously returned DEFAULT_FALLBACK_CONTEXT (128K) immediately, bypassing models.dev, OpenRouter, and hardcoded defaults.

Any OpenAI-compatible server may list models without context_length; this is not provider-specific. The resolver should keep trying rather than giving up at step 2.

Changes:
- Remove early return from step 2 when endpoint matches a model but has no context_length. Fall through to steps 3-8 (local probe, Anthropic API, models.dev, hardcoded defaults).
- Move local server query out of the _is_known_provider_base_url guard so it still works for known provider URLs that need local probing.
- Add ollama->ollama-cloud mapping in PROVIDER_TO_MODELS_DEV so Ollama Cloud endpoints can resolve model context via models.dev.
- Fix regression test to use an unknown custom endpoint URL (not a known provider URL that skips the patched code path entirely).
- Remove misleading `or` assertion that would accept the old buggy 128K value.
- Rename test to clarify it tests empty vs no-context-length metadata.

alt-glitch added the type/bug, P2, and comp/agent labels · Apr 24, 2026

ztcshen mentioned this pull request Apr 24, 2026

[codex] Prevent long-context probe collapse #14499

Closed

LucidPaths wants to merge 1 commit into NousResearch:main from LucidPaths:fix/context-length-custom-endpoint-fallback
