Context compaction can misread preserved todo/tool state as current user intent and leak MEDIA directives

## Bug description

`context compression` can preserve cross-session/tool state in a way that looks like a fresh user request in the new session.

In the failure mode I hit, three things stack together:

1. the compaction summary carries forward an old `## Active Task`
2. the preserved todo list is injected as a normal `user` message
3. tool outputs such as `memory` / `session_search` are serialized verbatim into the summarizer input, including strings like `MEDIA:`

That can cause the resumed assistant to follow an old task instead of the latest real user message, and can also make `MEDIA:` directives leak back into normal assistant text.

## Why this matters

There are two separate bad outcomes here:

### 1) Wrong task resumption after compaction
The post-compaction todo injection currently looks like ordinary conversation text, so the model can treat it as the current user ask.

### 2) `MEDIA:` directive contamination
If `memory` / `session_search` / other tool results contain text like `MEDIA:/tmp/foo.png`, that text can be preserved in the compaction chain and later echoed by the model as plain content.

On gateway integrations that parse `MEDIA:` tags for file delivery, this can lead to bogus attachment attempts (for example trying to send a non-existent file path extracted from quoted prose or preference text).

## Minimal repro shape

A deterministic repro can be built with a compressed conversation containing:

- a compaction summary with an old `## Active Task`
- a preserved active todo snapshot
- a `memory` or `session_search` tool result containing `MEDIA:` text
- a latest real user message that should be the only active request

Observed behavior:

- the assistant may resume the old task / preserved todo state instead of the latest real user message
- `MEDIA:` text from tool state can survive into later assistant-visible context as if it were ordinary text

## Suspect locations

- `agent/context_compressor.py`
  - `_serialize_for_summary()` currently serializes tool result content and tool-call args directly into the summarizer input
- `tools/todo_tool.py`
  - `format_for_injection()` renders preserved todo state as natural-language text
- `run_agent.py`
  - `_compress_context()` injects the todo snapshot back into the compressed message list as a `user` message

## Why the existing gateway-side `MEDIA:` hardening is not enough

I know there was already work around stricter `MEDIA:` extraction in gateway parsing, but this bug happens earlier in the pipeline:

- summary contamination / stale task carry-over
- todo state being injected as if it were a user utterance
- tool-state text containing control directives being preserved and resurfaced

So even if gateway extraction is stricter, the conversation state can still get semantically polluted after compaction.

## Suggested fix directions

1. Treat `memory`, `session_search`, `todo` and similar tool state as **non-intent state**, not current user intent, when building summary input
2. Mask control directives like `MEDIA:` before tool outputs are fed into compaction summaries
3. Do not inject preserved todo state as natural-language text that looks like a fresh `user` message
4. Ensure preserved todo state does not outrank the latest real user message after compaction

## Regression coverage that would be useful

- summary input containing `memory` / `session_search` results with `MEDIA:` should not preserve raw `MEDIA:` tokens
- preserved todo state should be clearly machine-generated state, not look like a new user request
- after compaction, the latest real user message should remain the active request even when summary + preserved todo state are both present

If helpful, I can turn the local repro/fix into a PR next.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Context compaction can misread preserved todo/tool state as current user intent and leak MEDIA directives #14665

Bug description

Why this matters

1) Wrong task resumption after compaction

2) `MEDIA:` directive contamination

Minimal repro shape

Suspect locations

Why the existing gateway-side `MEDIA:` hardening is not enough

Suggested fix directions

Regression coverage that would be useful

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Context compaction can misread preserved todo/tool state as current user intent and leak MEDIA directives #14665

Description

Bug description

Why this matters

1) Wrong task resumption after compaction

2) MEDIA: directive contamination

Minimal repro shape

Suspect locations

Why the existing gateway-side MEDIA: hardening is not enough

Suggested fix directions

Regression coverage that would be useful

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

2) `MEDIA:` directive contamination

Why the existing gateway-side `MEDIA:` hardening is not enough