Skip to content

Port from Kilo-Org/kilocode#9434: strip historical media after context compression#19951

Closed
teknium1 wants to merge 1 commit into
mainfrom
kilocode-port/compact-strip-media
Closed

Port from Kilo-Org/kilocode#9434: strip historical media after context compression#19951
teknium1 wants to merge 1 commit into
mainfrom
kilocode-port/compact-strip-media

Conversation

@teknium1

@teknium1 teknium1 commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

Conversations that include large pasted images survive context compression instead of wedging a few turns later.

Root cause

ContextCompressor.compress() summarises the middle of the conversation but leaves head + tail messages untouched. When a user pastes a multi-MB screenshot in the protected tail, every subsequent turn keeps re-shipping that base-64 blob, eventually breaching the provider's request-size limit even though compression looked successful.

Changes

  • agent/context_compressor.py: add _strip_historical_media(messages) plus helpers (_is_image_part, _content_has_images, _strip_images_from_content). Finds the newest user message with an image part and replaces image parts in all earlier messages with {"type": "text", "text": "[Attached image — stripped after compression]"}. Called from compress() right after _sanitize_tool_pairs. Handles OpenAI chat (image_url), Responses API (input_image), and Anthropic native (image) shapes. Shallow copies only; inputs are never mutated.
  • tests/agent/test_compressor_historical_media.py: 27 tests — helper unit tests, strip-logic edge cases (no images, only-first-message image, non-dict entries, idempotence, non-mutation), and one integration test through the real compress() path.

Validation

Before After
4 MB two-image conversation, post-compression JSON 4,000,682 bytes 2,000,659 bytes (~50% reduction; newest image kept)
tests/agent/test_compressor_historical_media.py N/A 27/27 pass
tests/agent/ (full) 2447 pass, 1 pre-existing failure (test_bedrock_1m_context) same

Port notes

This is a port of Kilo-Org/kilocode#9434. Kilo's version works on their typed MessageV2 structure and gates on "has a completed summary" message. In hermes-agent the equivalent invariant is "compress() ran at least once on this conversation," which is automatically true the moment the helper is called from inside compress() — so the gate is implicit rather than a separate boolean. We also anchor on "newest user with images" rather than Kilo's "newest non-synthetic user part" because hermes-agent's synthetic-user messages (todo snapshots, etc.) are text-only and can't accidentally become the anchor.

Existing behaviour preserved

  • hermes-agent's separate _try_shrink_image_parts_in_messages (post-hoc rescue on "image too large") still runs unchanged.
  • _prune_old_tool_results is unchanged.
  • _preprocess_anthropic_content / _prepare_messages_for_non_vision_model (non-vision stripping) are unchanged.
  • Conversations that never trigger compression are unaffected.

Spotted via the weekly Kilo Code PR scout cron.

…ssion

After context compression, the protected tail messages retain their
original image parts. When those include multi-MB pasted screenshots,
every subsequent API request re-ships the same base-64 blobs forever —
which can push the request past provider body-size limits and wedge the
session even though compression 'succeeded'.

Add _strip_historical_media() to agent/context_compressor.py. After the
summary is built, find the newest user message that carries an image
part and replace image parts in every earlier message with a short
text placeholder ('[Attached image — stripped after compression]').
The newest image-bearing user turn keeps its media so the model can
still analyse what the user just sent.

Handles all three multimodal shapes:
  - OpenAI chat.completions image_url
  - OpenAI Responses API input_image
  - Anthropic native {type: image, source: ...}

Includes 27 unit tests covering the helpers and the end-to-end
compress() integration, plus a manual E2E check confirming a ~4MB
two-image conversation shrinks to ~2MB after compression.
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder labels May 5, 2026
@teknium1

Copy link
Copy Markdown
Contributor Author

Closing in favor of #27189, which salvages this PR onto current main. Clean cherry-pick — 107 compressor tests pass, E2E confirms newest image preserved + older replaced with placeholder + input not mutated.

@teknium1 teknium1 closed this May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants