fix(model_metadata): add gpt-5.x context lengths + guard against poisoned cache by Prithvi1994 · Pull Request #5179 · NousResearch/hermes-agent

Prithvi1994 · 2026-04-05T05:30:46Z

Fixes #5173 — gpt-5.4 shows 32k context in Hermes instead of 1,050,000

Root Cause

Two independent bugs conspire to produce the wrong context window:

Missing DEFAULT_CONTEXT_LENGTHS entries: gpt-5.4 (and other gpt-5.x variants) were absent from the fallback dict, so lookups fell through to the generic "gpt-5": 128000 catch-all and returned 128k instead of 1,050,000.
Cache poisoning: when connecting via the Codex endpoint, Hermes probes the API and may receive max_output_tokens (32k) where it expects context_length. That value gets written to context_length_cache.yaml. Because the persistent cache is consulted before the fallback dict, the bad 32k value keeps overriding the correct default on every later run; only an explicit model.context_length in config.yaml takes precedence over it.

Fix (two-part)

1. Add specific gpt-5.x entries to `DEFAULT_CONTEXT_LENGTHS`

New entries added before the generic "gpt-5": 128000 catch-all in agent/model_metadata.py:

| Model | Context Length |
|---|---|
| `gpt-5.4` | 1,050,000 |
| `gpt-5.4-mini` | 1,050,000 |
| `gpt-5.4-pro` | 1,050,000 |
| `gpt-5.4-nano` | 1,050,000 |
| `gpt-5.3-codex` | 1,048,576 |
| `gpt-5.2-codex` | 1,048,576 |
| `gpt-5.1-codex-max` | 1,048,576 |
| `gpt-5.1-codex-mini` | 1,048,576 |

The existing sorted() in get_model_context_length ensures longest-key-first matching, so specific variants correctly shadow the catch-all.
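As a rough illustration of why longest-key-first ordering matters, here is a minimal sketch. The function name `lookup_default`, the trimmed table, and the substring-match rule are assumptions made for this example; the actual matching logic in `get_model_context_length` may differ.

```python
from typing import Optional

# Trimmed, illustrative copy of the fallback table (values from the table above).
DEFAULT_CONTEXT_LENGTHS = {
    "gpt-5": 128_000,          # generic catch-all
    "gpt-5.4": 1_050_000,
    "gpt-5.3-codex": 1_048_576,
}


def lookup_default(model: str) -> Optional[int]:
    """Return the value of the longest table key contained in `model`.

    Hypothetical helper: assumes substring matching on a lowercased name.
    """
    name = model.lower()
    # Longest keys first, so "gpt-5.4" is tried before the shorter "gpt-5".
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in name:
            return DEFAULT_CONTEXT_LENGTHS[key]
    return None
```

Without the length-based sort, a name such as `gpt-5.4-mini` could hit the `"gpt-5"` key first and resolve to 128,000, which is the first bug described under Root Cause.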

2. Sanity guard in `save_context_length()`

Added a pre-write check: if the model name contains "gpt-5" and the value being cached is <= 128,000, the write is rejected and a warning is logged. This stops max_output_tokens (32k) from ever being written into context_length_cache.yaml for gpt-5 family models.

The guard does not affect non-gpt-5 models — e.g. llama-3 can still be cached at 32k normally.
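The guard described above could look roughly like the following sketch. It writes to an in-memory dict instead of context_length_cache.yaml, and the names `guarded_save` and `GPT5_MIN_PLAUSIBLE_CONTEXT` are hypothetical; the real `save_context_length()` has a different signature.

```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)

# Threshold taken from the description: values at or below this are rejected
# for gpt-5 family models.
GPT5_MIN_PLAUSIBLE_CONTEXT = 128_000


def guarded_save(cache: Dict[str, int], model: str, length: int) -> bool:
    """Store `length` for `model` unless it looks like a poisoned gpt-5 value.

    Hypothetical stand-in for the pre-write check; returns True if written.
    """
    if "gpt-5" in model.lower() and length <= GPT5_MIN_PLAUSIBLE_CONTEXT:
        logger.warning(
            "Refusing to cache context length %d for %s (likely max_output_tokens)",
            length,
            model,
        )
        return False
    cache[model] = length
    return True
```

Note that the comparison is inclusive, so under this rule a value of exactly 128,000 is also rejected for gpt-5 family models.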

Users who need to force a specific value can always set model.context_length in config.yaml, which is checked before the cache.
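To make the precedence concrete, here is a simplified sketch of the resolution order as this description states it: explicit config value, then persistent cache, then the fallback dict. The function `resolve_context_length` and its parameters are hypothetical, and the real resolver may include additional steps (such as the API probe) that are omitted here.

```python
from typing import Mapping, Optional


def resolve_context_length(
    model: str,
    config_override: Optional[int],
    cache: Mapping[str, int],
    defaults: Mapping[str, int],
) -> Optional[int]:
    """Hypothetical three-step resolver: config, then cache, then defaults."""
    # 1. An explicit model.context_length from config.yaml always wins.
    if config_override is not None:
        return config_override
    # 2. The persistent cache comes next, which is why a bad entry is sticky.
    if model in cache:
        return cache[model]
    # 3. Fall back to the built-in table, longest key first.
    for key in sorted(defaults, key=len, reverse=True):
        if key in model:
            return defaults[key]
    return None


defaults = {"gpt-5": 128_000, "gpt-5.4": 1_050_000}
poisoned_cache = {"gpt-5.4": 32_000}

# A poisoned cache entry shadows the correct default...
stuck = resolve_context_length("gpt-5.4", None, poisoned_cache, defaults)
# ...unless the user sets an explicit override.
forced = resolve_context_length("gpt-5.4", 1_050_000, poisoned_cache, defaults)
```

In this sketch `stuck` resolves to the cached 32,000 while `forced` resolves to 1,050,000, which mirrors the workaround mentioned above.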

Tests

Added TestGpt5ContextLengths in tests/agent/test_model_metadata.py:

gpt-5.4 -> 1,050,000 via DEFAULT_CONTEXT_LENGTHS
gpt-5.4-mini -> 1,050,000 via DEFAULT_CONTEXT_LENGTHS
save_context_length("gpt-5.4", ..., 32000) -> rejected with a logged warning, nothing written to the cache
save_context_length("gpt-5.4", ..., 1_050_000) -> cached successfully
Sanity guard does NOT block llama-3 at 32k

All 80 tests pass.

fix(model_metadata): add gpt-5.x context lengths + reject small cache…

a6379a5

This was referenced Apr 24, 2026

[codex] Prevent long-context probe collapse #14499

Closed

[codex] Guard untrusted context probe shrink #14858

Closed

alt-glitch added the type/bug, P2, comp/agent, and provider/openai labels on May 1, 2026

millerc79 mentioned this pull request May 11, 2026

provider: nous falls back to 32,768-token context, blocking boot with model.context_length workaround required #24000

Closed

Prithvi1994 wants to merge 1 commit into NousResearch:main from Prithvi1994:fix/gpt-5-context-length
