[Bug]: protect_first_n causes head message fossilization across compressions — old user messages become immortal

## Title

[Bug]: `protect_first_n` causes head message fossilization across compressions — old user messages become immortal

## Labels

bug, compression, P2

## Body

### Bug Description

`protect_first_n` (default 3) in `context_compressor.py` blindly preserves the first N messages of the session **unconditionally**, without checking whether their content is already covered by the compression summary or relevant to the current conversation.

Every time context compression fires:
1. First N messages are copied verbatim as the new "head"
2. Middle messages get summarized
3. Tail messages are preserved
4. A **new session** is created with: `[head] + [summary] + [tail]`

On the **next** compression, the same head messages are again preserved as the head of the new session — even if they were asked hours ago, already answered, and included in the summary.

This creates **"immortal messages"** that persist across an entire session lineage, surviving every compression pass. In long Telegram sessions with multiple ReadTimeout-triggered compressions, the model may see a user question from 6+ hours ago as its most recent instruction.

### Reproduction

1. Start a Telegram session with Hermes Agent
2. Send a message (e.g. "what's new in v2026.4.16?")
3. Continue the conversation until context compression triggers naturally (or trigger via ReadTimeout retry)
4. Observe: the first message is now part of the compressed head
5. Keep chatting → another compression fires → same first message is STILL the head
6. Repeat: the original message appears in every compressed session

### Evidence

Gateway log confirms the original message was received once at `07:18:28`:

```
2026-04-18 07:18:28,968 INFO gateway.run: inbound message: platform=telegram user=ryanchao chat=1092516733 msg='hermes agent v2026.4.16 版本更新與修正了什麼?'
```

SessionDB shows this same message appearing as the first user message across 6+ compressed sessions:
- `071828` (07:18) — original
- `111714` (11:17) — 1st compression
- `114823` (11:48) — 2nd compression
- `121142` (12:11) — 3rd compression
- `133124` (13:31) — 4th compression
- `135043` (13:50) — 5th compression

### Root Cause

In `agent/context_compressor.py`, the `compress()` method:

```python
# Line ~1040
compress_start = self.protect_first_n  # default 3
```

The head messages (`messages[:compress_start]`) are copied verbatim into the compressed output **without any deduplication check against the summary**. Since compression creates a new session, and the gateway reloads from the transcript on the next turn, these head messages become the permanent first messages of every future session.

### Why existing mitigations don't catch this

- **#8107** (`SUMMARY_PREFIX` rewrite) — fixes stale-question answering from the **summary** body, but the fossilized head messages are **outside** the summary entirely (they're real messages before the summary block)
- **#9631** (iterative topic bleed) — addresses topic drift in summary content, not raw message preservation
- **#2224** (file-read history injection) — different mechanism (synthetic user messages from file reads)

### Suggested Fix

Options (not mutually exclusive):

**A. Dedup head against summary content**
Before preserving head messages, check if their content or semantic meaning already exists in the generated summary (e.g. matching against "Resolved Questions"). Skip or truncate fossilized messages.

**B. Add staleness check**
Only preserve head messages if they were sent within a recent time window (e.g. within the last N minutes or within the current session's creation time). If the session has been compressed multiple times and the head predates the oldest summary, drop it.

**C. Make `protect_first_n = 0` safer**
Currently, setting `protect_first_n: 0` means the system prompt is the only preserved context from the beginning. This is actually reasonable since the summary covers everything else. Consider lowering the default or documenting this as a recommended workaround.

**D. Head content-aware preservation**
Instead of preserving the first N messages by position, preserve messages that are:
- System prompts (always)
- The most recent context-switch markers (e.g. `/new`, `/topic`)
- Messages not already covered by the summary

### Environment

- Hermes Agent: v2026.4.16 (v0.10.0)
- Platform: Telegram gateway
- Trigger: ReadTimeout retries → compression
- `protect_first_n`: 3 (default)
- `protect_last_n`: 4 (default)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: protect_first_n causes head message fossilization across compressions — old user messages become immortal #11996

Title

Labels

Body

Bug Description

Reproduction

Evidence

Root Cause

Why existing mitigations don't catch this

Suggested Fix

Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Bug]: protect_first_n causes head message fossilization across compressions — old user messages become immortal #11996

Description

Title

Labels

Body

Bug Description

Reproduction

Evidence

Root Cause

Why existing mitigations don't catch this

Suggested Fix

Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions