Port from Kilo-Org/kilocode#9434: strip historical media after context compression#19951
Closed
teknium1 wants to merge 1 commit into
Closed
Port from Kilo-Org/kilocode#9434: strip historical media after context compression#19951teknium1 wants to merge 1 commit into
teknium1 wants to merge 1 commit into
Conversation
…ssion
After context compression, the protected tail messages retain their
original image parts. When those include multi-MB pasted screenshots,
every subsequent API request re-ships the same base-64 blobs forever —
which can push the request past provider body-size limits and wedge the
session even though compression 'succeeded'.
Add _strip_historical_media() to agent/context_compressor.py. After the
summary is built, find the newest user message that carries an image
part and replace image parts in every earlier message with a short
text placeholder ('[Attached image — stripped after compression]').
The newest image-bearing user turn keeps its media so the model can
still analyse what the user just sent.
Handles all three multimodal shapes:
- OpenAI chat.completions image_url
- OpenAI Responses API input_image
- Anthropic native {type: image, source: ...}
Includes 27 unit tests covering the helpers and the end-to-end
compress() integration, plus a manual E2E check confirming a ~4MB
two-image conversation shrinks to ~2MB after compression.
Contributor
Author
|
Closing in favor of #27189, which salvages this PR onto current main. Clean cherry-pick — 107 compressor tests pass, E2E confirms newest image preserved + older replaced with placeholder + input not mutated. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Conversations that include large pasted images survive context compression instead of wedging a few turns later.
Root cause
ContextCompressor.compress()summarises the middle of the conversation but leaves head + tail messages untouched. When a user pastes a multi-MB screenshot in the protected tail, every subsequent turn keeps re-shipping that base-64 blob, eventually breaching the provider's request-size limit even though compression looked successful.Changes
agent/context_compressor.py: add_strip_historical_media(messages)plus helpers (_is_image_part,_content_has_images,_strip_images_from_content). Finds the newest user message with an image part and replaces image parts in all earlier messages with{"type": "text", "text": "[Attached image — stripped after compression]"}. Called fromcompress()right after_sanitize_tool_pairs. Handles OpenAI chat (image_url), Responses API (input_image), and Anthropic native (image) shapes. Shallow copies only; inputs are never mutated.tests/agent/test_compressor_historical_media.py: 27 tests — helper unit tests, strip-logic edge cases (no images, only-first-message image, non-dict entries, idempotence, non-mutation), and one integration test through the realcompress()path.Validation
tests/agent/test_compressor_historical_media.pytests/agent/(full)test_bedrock_1m_context)Port notes
This is a port of Kilo-Org/kilocode#9434. Kilo's version works on their typed
MessageV2structure and gates on "has a completed summary" message. In hermes-agent the equivalent invariant is "compress() ran at least once on this conversation," which is automatically true the moment the helper is called from insidecompress()— so the gate is implicit rather than a separate boolean. We also anchor on "newest user with images" rather than Kilo's "newest non-synthetic user part" because hermes-agent's synthetic-user messages (todo snapshots, etc.) are text-only and can't accidentally become the anchor.Existing behaviour preserved
_try_shrink_image_parts_in_messages(post-hoc rescue on "image too large") still runs unchanged._prune_old_tool_resultsis unchanged._preprocess_anthropic_content/_prepare_messages_for_non_vision_model(non-vision stripping) are unchanged.Spotted via the weekly Kilo Code PR scout cron.