fix(compression): pass provider to context length resolver in feasibility check #15661
Merged
_check_compression_model_feasibility calls get_model_context_length
without provider=, so Codex OAuth users get 1,050,000 (from models.dev
for 'openai') instead of the actual 272,000 limit. This happens because
_infer_provider_from_url maps chatgpt.com → 'openai' (not 'openai-codex'),
skipping the Codex-specific resolution branch entirely.
Result: compression threshold set at 85% of 1.05M = 892K — conversations
never trigger compression, the context grows unbounded, and when gateway
hygiene eventually forces compression, the Codex endpoint drops the
oversized streaming request ('peer closed connection without sending
complete message body').
Fix: forward self.provider to get_model_context_length so provider-
specific resolution branches (Codex OAuth 272K, Copilot live /models,
Nous suffix-match) fire correctly.
Reported by user on GPT 5.5 via Codex OAuth Pro (paste.rs/vsra3).
Problem
_check_compression_model_feasibility calls get_model_context_length(aux_model, base_url=..., api_key=...) without a provider= argument. For Codex OAuth users:
1. _infer_provider_from_url("chatgpt.com") returns "openai", not "openai-codex".
2. The if effective_provider == "openai-codex": branch in get_model_context_length never fires.
3. lookup_models_dev_context("openai", "gpt-5.5") returns 1,050,000.
4. The Codex-specific limit is never resolved (that only happens when provider="openai-codex" is passed).
User evidence (paste.rs/vsra3, GPT 5.5 via Codex OAuth Pro): the 1,050,000 figure is wrong; it should be 272,000.
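The fall-through described above can be illustrated with a minimal, self-contained sketch. Everything in it is a simplified stand-in: the function bodies, the lookup table, and the bare chatgpt.com URL are assumptions for illustration, not the project's actual implementation. Only the two context lengths, the provider names, and the 85% threshold are taken from this PR.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-in for the models.dev lookup; the value is the one quoted in this PR.
MODELS_DEV_CONTEXT = {("openai", "gpt-5.5"): 1_050_000}
CODEX_OAUTH_CONTEXT = 272_000  # Codex OAuth limit quoted in this PR


def infer_provider_from_url(base_url: str) -> Optional[str]:
    """Simplified URL inference: chatgpt.com maps to 'openai', never 'openai-codex'."""
    host = urlparse(base_url).hostname or ""
    if host.endswith("chatgpt.com") or host.endswith("openai.com"):
        return "openai"
    return None


def get_model_context_length(model: str, base_url: str,
                             provider: Optional[str] = None) -> int:
    """Simplified resolver: an explicit provider takes precedence over URL inference."""
    effective_provider = provider or infer_provider_from_url(base_url)
    if effective_provider == "openai-codex":
        return CODEX_OAUTH_CONTEXT  # Codex-specific branch
    return MODELS_DEV_CONTEXT[(effective_provider, model)]


def compression_threshold(context_length: int) -> int:
    """85% of the context length, using integer arithmetic to avoid float rounding."""
    return context_length * 85 // 100


url = "https://chatgpt.com"  # assumed base URL, for illustration only
before = get_model_context_length("gpt-5.5", url)
after = get_model_context_length("gpt-5.5", url, provider="openai-codex")
print(before, compression_threshold(before))  # 1050000 892500
print(after, compression_threshold(after))    # 272000 231200
```

Without the explicit provider, the sketch resolves through the generic lookup and yields the 892K threshold; with provider="openai-codex" it takes the Codex branch and yields 231K.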
Consequence: Compression threshold set at 892K instead of 231K. Conversations never trigger compression. When gateway hygiene eventually forces it, the Codex endpoint drops the oversized streaming summarization request ('peer closed connection without sending complete message body'), producing 15+ 'Context compression failed after 3 attempts' errors in a row.
Fix
One line: forward self.provider to get_model_context_length. This ensures provider-specific resolution branches fire correctly:
- Codex OAuth: 272K (_resolve_codex_oauth_context_length)
- Copilot: live /models API (max_prompt_tokens)
- Nous: suffix-match
Test
3 existing tests updated for the new provider= kwarg. All 18 feasibility tests pass.
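A test for the forwarded kwarg might look roughly like the sketch below. The Agent class, its constructor, the method signature, and the injected resolver are hypothetical stand-ins, since the real class and test fixtures are not shown in this PR. The sketch only illustrates asserting that provider= reaches the resolver.

```python
from unittest.mock import MagicMock


class Agent:
    """Hypothetical minimal host class; not the project's real class."""

    def __init__(self, provider, resolver):
        self.provider = provider
        self._resolve = resolver  # stand-in for get_model_context_length

    def _check_compression_model_feasibility(self, aux_model, base_url, api_key):
        # The fix: forward self.provider so provider-specific branches can fire.
        return self._resolve(
            aux_model, base_url=base_url, api_key=api_key, provider=self.provider
        )


def test_feasibility_check_forwards_provider():
    resolver = MagicMock(return_value=272_000)
    agent = Agent(provider="openai-codex", resolver=resolver)

    length = agent._check_compression_model_feasibility(
        "gpt-5.5", base_url="https://chatgpt.com", api_key="test-key"
    )

    assert length == 272_000
    # Fails if the provider kwarg is dropped on the way to the resolver.
    resolver.assert_called_once_with(
        "gpt-5.5",
        base_url="https://chatgpt.com",
        api_key="test-key",
        provider="openai-codex",
    )


test_feasibility_check_forwards_provider()
```

Injecting the resolver keeps the sketch self-contained; a real test would more likely patch the resolver where the feasibility check imports it.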