fix(compression): pass provider to context length resolver in feasibility check by kshitijk4poor · Pull Request #15661 · NousResearch/hermes-agent

kshitijk4poor · 2026-04-25T13:56:58Z

Problem

_check_compression_model_feasibility calls get_model_context_length(aux_model, base_url=..., api_key=...) without provider=. For Codex OAuth users:

_infer_provider_from_url("chatgpt.com") returns "openai" — not "openai-codex"
The if effective_provider == "openai-codex": branch in get_model_context_length never fires
Falls through to lookup_models_dev_context("openai", "gpt-5.5") → returns 1,050,000
Actual Codex OAuth limit is 272,000 (resolved correctly when provider="openai-codex" is passed)
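
The fall-through above can be reproduced with a small model of the resolver. This is a sketch, not the hermes-agent implementation: the lookup tables, the simplified function names, and the example base URL are assumptions made for illustration. Only the two context-length figures and the "openai" vs "openai-codex" distinction come from this PR.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-ins for the real lookups; the numbers are the ones quoted in this PR.
MODELS_DEV_CONTEXT = {("openai", "gpt-5.5"): 1_050_000}
CODEX_OAUTH_CONTEXT = 272_000
URL_HOST_TO_PROVIDER = {"chatgpt.com": "openai", "api.openai.com": "openai"}


def infer_provider_from_url(base_url: str) -> str:
    """Guess a provider from the host; chatgpt.com maps to "openai", not "openai-codex"."""
    host = urlparse(base_url).hostname or ""
    return URL_HOST_TO_PROVIDER.get(host, "")


def resolve_context_length(model: str, base_url: str = "", provider: str = "") -> Optional[int]:
    """Prefer an explicit provider; fall back to URL inference only when none is given."""
    effective_provider = provider or infer_provider_from_url(base_url)
    if effective_provider == "openai-codex":
        return CODEX_OAUTH_CONTEXT  # the Codex-specific branch
    return MODELS_DEV_CONTEXT.get((effective_provider, model))


# Illustrative URL (an assumption): any chatgpt.com host behaves the same way here.
codex_url = "https://chatgpt.com/backend-api/codex"
without_provider = resolve_context_length("gpt-5.5", base_url=codex_url)
with_provider = resolve_context_length("gpt-5.5", base_url=codex_url, provider="openai-codex")
```

Without the explicit provider the Codex branch is unreachable, because URL inference alone can never produce "openai-codex".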

User evidence (paste.rs/vsra3, GPT 5.5 via Codex OAuth Pro):

Session hygiene: 426 messages, ~213,995 tokens — auto-compressing (threshold: 85% of 1,050,000 = 892,500 tokens)

The 1,050,000 figure is wrong; it should be 272,000.

Consequence: the compression threshold is set at 892K instead of 231K (85% of 272,000 = 231,200), so conversations never trigger compression on their own. When gateway hygiene eventually forces it, the Codex endpoint drops the oversized streaming summarization request ("peer closed connection without sending complete message body"), producing 15+ consecutive "Context compression failed after 3 attempts" errors.
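
The threshold arithmetic can be checked directly. A minimal sketch, assuming the threshold is simply 85% of the resolved context length as the log line suggests; the helper name is hypothetical, and integer math is used to avoid floating-point rounding.

```python
def compression_threshold(context_length: int, percent: int = 85) -> int:
    """Token count at which compression should start (hypothetical helper)."""
    return context_length * percent // 100


wrong_threshold = compression_threshold(1_050_000)  # resolved via models.dev for "openai"
right_threshold = compression_threshold(272_000)    # Codex OAuth limit

# With the wrong threshold, a session can exceed the real 272,000-token limit
# long before the 892,500-token trigger is reached.
codex_limit = 272_000
limit_hit_before_trigger = codex_limit < wrong_threshold
```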

Fix

One line: forward self.provider to get_model_context_length:

provider=getattr(self, "provider", ""),

This ensures provider-specific resolution branches fire correctly:

Codex OAuth → 272K (via _resolve_codex_oauth_context_length)
Copilot → live /models API (max_prompt_tokens)
Nous → suffix-match via OpenRouter cache
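
In context, the change amounts to forwarding one extra keyword argument at the call site. The class and resolver below are hypothetical stand-ins, not the real agent code; only the `provider=getattr(self, "provider", "")` expression is taken from the PR. The `getattr` default keeps the call safe on objects that never set a `provider` attribute.

```python
class FeasibilityCheckSketch:
    """Hypothetical stand-in for the object that owns the feasibility check."""

    def __init__(self, resolver, provider=None):
        self._resolver = resolver
        if provider is not None:
            self.provider = provider  # deliberately left unset otherwise

    def aux_context_length(self, aux_model, base_url, api_key):
        return self._resolver(
            aux_model,
            base_url=base_url,
            api_key=api_key,
            provider=getattr(self, "provider", ""),  # the forwarded kwarg
        )


calls = []


def recording_resolver(model, base_url="", api_key="", provider=""):
    """Fake resolver that records the provider it was given."""
    calls.append(provider)
    return 272_000 if provider == "openai-codex" else 1_050_000


codex = FeasibilityCheckSketch(recording_resolver, provider="openai-codex")
legacy = FeasibilityCheckSketch(recording_resolver)  # no provider attribute at all
codex_length = codex.aux_context_length("gpt-5.5", "https://chatgpt.com", "key")
legacy_length = legacy.aux_context_length("gpt-5.5", "https://chatgpt.com", "key")
```

An empty provider string leaves the resolver free to fall back to URL inference, so callers that never had a provider behave as before.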

Test

Three existing tests were updated for the new provider= kwarg. All 18 feasibility tests pass.

…lity check

_check_compression_model_feasibility calls get_model_context_length without provider=, so Codex OAuth users get 1,050,000 (from models.dev for 'openai') instead of the actual 272,000 limit. This happens because _infer_provider_from_url maps chatgpt.com → 'openai' (not 'openai-codex'), skipping the Codex-specific resolution branch entirely.

Result: compression threshold set at 85% of 1.05M = 892K. Conversations never trigger compression, the context grows unbounded, and when gateway hygiene eventually forces compression, the Codex endpoint drops the oversized streaming request ('peer closed connection without sending complete message body').

Fix: forward self.provider to get_model_context_length so provider-specific resolution branches (Codex OAuth 272K, Copilot live /models, Nous suffix-match) fire correctly.

Reported by user on GPT 5.5 via Codex OAuth Pro (paste.rs/vsra3).

kshitijk4poor mentioned this pull request Apr 25, 2026

fix(flush_memories): always deduct headroom + resolve flush aux model + trim defence #15638

Closed

alt-glitch added the type/bug, P1, comp/agent, and provider/openai labels Apr 25, 2026

teknium1 merged commit d635e2d into main Apr 25, 2026
10 of 12 checks passed

teknium1 deleted the fix/feasibility-pass-provider branch April 25, 2026 14:09

alt-glitch mentioned this pull request May 1, 2026

Fix gpt-5.4 context length resolution for Codex #5174

Open

