fix(compression): deterministic fallback when summary LLM fails (salvages #31242) #34310
Merged
Summary
Salvages #31242 onto current main: when context compression's summary LLM fails (e.g. Codex 429 `usage_limit_reached`), the dropped-window placeholder is now a deterministic, locally extracted handoff instead of a content-free "N messages were removed" marker.

Cherry-picked 3 commits from @Hinotoi-agent's `fix/deterministic-compression-fallback` onto current main, authorship preserved.

What it does
Adds `_build_static_fallback_summary()` to `agent/context_compressor.py`. It only fires on the existing failure path (`abort_on_summary_failure=False` AND the aux LLM returned an empty result or errored). From the compacted window it extracts:
- user asks
- file paths (`path`/`workdir`/`file_path`/`output_path` keys in tool args)
- tool activity
- error snippets (matching `error|failed|exception|traceback|timeout`)

Output is capped (8KB total, 700 chars/turn), deduped, and run through `redact_sensitive_text` plus an extra gh-token-prefix scrub before insertion. Normal LLM-summarized compression is unchanged.

Why this matters
Reference incident: Discord user @damien hit Codex Plus `usage_limit_reached` mid-compression on a 145k-token session. The aux compressor 429'd → 142 messages dropped → the content-free placeholder gave the next turn nothing to work with → continuity collapsed. With this PR the same failure still drops the window, but the model gets back user asks, file paths, tool activity, and error snippets so it can keep going.

`compression.abort_on_summary_failure=true` remains the alternative: freeze the session instead of dropping the window. This PR only improves the drop path that runs when that flag is false (the default).

Scope
- `agent/context_compressor.py`: new `_build_static_fallback_summary()`, helper extractors, fallback-branch wiring
- `tests/agent/test_context_compressor.py`: coverage for dict/object tool calls, path extraction (POSIX + Windows), gh-token redaction, bounded output, protected-tail exclusion, dropped-turn snippet behaviour

No provider, gateway, loop, CLI, or storage changes.
Validation
- `agent/context_compressor.py` + tests compile
- `scripts/run_tests.sh tests/agent/test_context_compressor.py`
- `scripts/run_tests.sh tests/acp/test_server.py`

Notes on the ACP test failure on the original PR
Upstream PR #31242 had `test (3)` red on `tests/acp/test_server.py::test_model_switch_uses_requested_provider` with `'custom' == 'anthropic'`. That was an unrelated xdist-worker flake caused by another test in the same worker mutating `_KNOWN_PROVIDER_NAMES` / `_PROVIDER_ALIASES`. It is already pinned on current main by commits 3127a41 and 2b76853 (monkeypatch `parse_model_input`, drop the brittle tail-position assertion). Re-running here gives 78/78 green, which confirms it.
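The kind of flake described above can be reproduced in miniature. The sketch below is a hypothetical stand-in, not the project's code: `_PROVIDER_ALIASES` and `parse_model_input` here are toy versions that only borrow the names, and the isolation uses the standard library's `unittest.mock.patch.dict` rather than the monkeypatch of `parse_model_input` that the referenced commits apply.

```python
from unittest import mock

# Toy module-level registry standing in for shared provider state (assumption).
_PROVIDER_ALIASES = {"claude": "anthropic"}


def parse_model_input(raw):
    """Split 'prefix:model' and resolve the prefix; unknown prefixes become 'custom'."""
    prefix, _, model = raw.partition(":")
    return _PROVIDER_ALIASES.get(prefix, "custom"), model


def leaky_test():
    """Mutates the shared registry and never restores it."""
    _PROVIDER_ALIASES.clear()


def isolated_test():
    """Makes the same mutation, but patch.dict restores the registry on exit."""
    with mock.patch.dict(_PROVIDER_ALIASES, {}, clear=True):
        assert parse_model_input("claude:sonnet")[0] == "custom"
```

If `leaky_test` runs first in the same process, a later assertion that expects `"anthropic"` sees `"custom"` instead, which matches the shape of the `'custom' == 'anthropic'` failure; after `isolated_test` the registry is intact.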
Credits