
fix(compression): pass provider to context length resolver in feasibility check #15661

Merged: teknium1 merged 1 commit into main from fix/feasibility-pass-provider on Apr 25, 2026

@kshitijk4poor

Collaborator

Problem

_check_compression_model_feasibility calls get_model_context_length(aux_model, base_url=..., api_key=...) without provider=. For Codex OAuth users:

  1. _infer_provider_from_url("chatgpt.com") returns "openai" — not "openai-codex"
  2. The if effective_provider == "openai-codex": branch in get_model_context_length never fires
  3. Falls through to lookup_models_dev_context("openai", "gpt-5.5") → returns 1,050,000
  4. Actual Codex OAuth limit is 272,000 (resolved correctly when provider="openai-codex" is passed)
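The fall-through described above can be illustrated with a minimal sketch. The functions below are simplified, hypothetical stand-ins that only borrow the names used in this PR; the lookup table, the URL, and the function bodies are assumptions for illustration, not the project's actual code.

```python
# Hypothetical stand-ins; the real resolver has more branches than shown here.
MODELS_DEV = {("openai", "gpt-5.5"): 1_050_000}  # value quoted in this PR
CODEX_OAUTH_LIMIT = 272_000                      # value quoted in this PR


def infer_provider_from_url(base_url: str) -> str:
    # Simplified: per the PR, a chatgpt.com URL is inferred as "openai",
    # never as "openai-codex".
    return "openai" if "chatgpt.com" in base_url else ""


def get_model_context_length(model: str, base_url: str = "", provider: str = "") -> int:
    # An explicit provider wins; otherwise fall back to URL inference.
    effective_provider = provider or infer_provider_from_url(base_url)
    if effective_provider == "openai-codex":
        return CODEX_OAUTH_LIMIT
    # Generic lookup keyed by (provider, model); 0 means "unknown" in this sketch.
    return MODELS_DEV.get((effective_provider, model), 0)


url = "https://chatgpt.com"  # illustrative URL only
without_provider = get_model_context_length("gpt-5.5", base_url=url)
with_provider = get_model_context_length("gpt-5.5", base_url=url, provider="openai-codex")
print(without_provider, with_provider)
```

Without `provider=`, the sketch takes the generic lookup path and returns the larger figure; with `provider="openai-codex"` it takes the Codex branch.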

User evidence (paste.rs/vsra3, GPT 5.5 via Codex OAuth Pro):

Session hygiene: 426 messages, ~213,995 tokens — auto-compressing (threshold: 85% of 1,050,000 = 892,500 tokens)

The 1,050,000 is wrong — should be 272,000.

Consequence: The compression threshold is set at 892K instead of 231K (85% of 272,000), so conversations never trigger compression and the context grows unbounded. When gateway hygiene eventually forces compression, the Codex endpoint drops the oversized streaming summarization request (peer closed connection without sending complete message body), producing 15+ consecutive "Context compression failed after 3 attempts" errors.
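The threshold arithmetic can be checked with a short sketch. The 85% ratio and both context lengths come from this PR; the helper name `compression_threshold` is hypothetical, and integer math is used here only to avoid floating-point rounding.

```python
THRESHOLD_PERCENT = 85  # from the log line quoted in this PR


def compression_threshold(context_length: int, percent: int = THRESHOLD_PERCENT) -> int:
    # Hypothetical helper: token count at which compression would start.
    return context_length * percent // 100


wrong_threshold = compression_threshold(1_050_000)  # limit resolved without provider=
right_threshold = compression_threshold(272_000)    # Codex OAuth limit

# With the wrong threshold, a conversation reaches the real 272,000-token
# limit long before compression would ever be triggered.
limit_hit_before_compression = 272_000 < wrong_threshold

print(wrong_threshold, right_threshold, limit_hit_before_compression)
```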

Fix

One line: forward self.provider to get_model_context_length:

provider=getattr(self, "provider", ""),

This ensures provider-specific resolution branches fire correctly:

  • Codex OAuth → 272K (via _resolve_codex_oauth_context_length)
  • Copilot → live /models API (max_prompt_tokens)
  • Nous → suffix-match via OpenRouter cache
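As a rough sketch of the call-site change, the snippet below forwards the provider with the same `getattr` default used in the fix. The `Agent` class, its attributes, the placeholder values, and the stub resolver are hypothetical; only the `provider=getattr(self, "provider", "")` line comes from this PR.

```python
def get_model_context_length(model, base_url="", api_key="", provider=""):
    # Stub resolver: echoes the provider it received so the call can be inspected.
    return {"model": model, "provider": provider}


class Agent:
    # Hypothetical host class; the real class is not shown in this PR.
    def __init__(self, provider=None):
        self.base_url = "https://chatgpt.com"  # illustrative value
        self.api_key = "placeholder"           # illustrative value
        if provider is not None:
            self.provider = provider

    def check_feasibility(self, aux_model):
        return get_model_context_length(
            aux_model,
            base_url=self.base_url,
            api_key=self.api_key,
            # The line added by the fix: falls back to "" when the
            # attribute is missing, so URL inference still applies.
            provider=getattr(self, "provider", ""),
        )


with_attr = Agent("openai-codex").check_feasibility("gpt-5.5")
without_attr = Agent().check_feasibility("gpt-5.5")
print(with_attr["provider"], repr(without_attr["provider"]))
```

Using `getattr` with an empty-string default keeps the call safe for objects that never set `provider`, which is presumably why the fix is written that way.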

Test

Three existing tests were updated for the new provider= kwarg. All 18 feasibility tests pass.
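The updated tests are not shown in this PR. As an assumption about how such a check might look, the sketch below uses `unittest.mock` to assert that the resolver receives the `provider=` kwarg; `FakeAgent`, `check`, and all values are hypothetical.

```python
from unittest import mock


class FakeAgent:
    # Hypothetical stand-in for the object that owns the feasibility check.
    provider = "openai-codex"
    base_url = "https://chatgpt.com"  # illustrative value
    api_key = "placeholder"           # illustrative value

    def check(self, resolver, aux_model):
        # Mirrors the fixed call shape: provider is forwarded explicitly.
        return resolver(
            aux_model,
            base_url=self.base_url,
            api_key=self.api_key,
            provider=getattr(self, "provider", ""),
        )


resolver = mock.Mock(return_value=272_000)
result = FakeAgent().check(resolver, "gpt-5.5")

# Fails with AssertionError if provider= is missing or has a different value.
resolver.assert_called_once_with(
    "gpt-5.5",
    base_url="https://chatgpt.com",
    api_key="placeholder",
    provider="openai-codex",
)
```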

…lity check

_check_compression_model_feasibility calls get_model_context_length
without provider=, so Codex OAuth users get 1,050,000 (from models.dev
for 'openai') instead of the actual 272,000 limit. This happens because
_infer_provider_from_url maps chatgpt.com → 'openai' (not 'openai-codex'),
skipping the Codex-specific resolution branch entirely.

Result: compression threshold set at 85% of 1.05M = 892K — conversations
never trigger compression, the context grows unbounded, and when gateway
hygiene eventually forces compression, the Codex endpoint drops the
oversized streaming request ('peer closed connection without sending
complete message body').

Fix: forward self.provider to get_model_context_length so provider-
specific resolution branches (Codex OAuth 272K, Copilot live /models,
Nous suffix-match) fire correctly.

Reported by user on GPT 5.5 via Codex OAuth Pro (paste.rs/vsra3).
@alt-glitch added labels on Apr 25, 2026: type/bug (Something isn't working), P1 (High: major feature broken, no workaround), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/openai (OpenAI / Codex Responses API)
@teknium1 merged commit d635e2d into main on Apr 25, 2026
10 of 12 checks passed
@teknium1 deleted the fix/feasibility-pass-provider branch on April 25, 2026 14:09

3 participants