Problem
The compaction observation/summary step loses critical context that the user was actively working on, causing apparent amnesia. The bot appears to forget the topic of conversation, its own identity context, and which skills were relevant.
Root Cause
Compaction has three phases:
- Phase 1 — Tool result clearing (keeps last N messages)
- Phase 2 — Observation compression via LLM sidecar
- Phase 3 — Event recording (replaces history)
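To make the three phases concrete, here is a minimal, language-neutral sketch in Python (the real code is C#). It models only the keep-last-N window from Phase 1 and does not model tool result clearing. `Message`, `split_history`, `compact`, and the `summarize` callback are hypothetical names for illustration, not the actual API in `LlmSessionActor.cs`.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "user", "assistant", "tool", or "system"
    content: str


KEEP_RECENT_MESSAGES = 6  # value quoted in this issue


def split_history(history, keep_recent=KEEP_RECENT_MESSAGES):
    """Phase 1 (assumed shape): return (discarded, kept) message lists."""
    if keep_recent <= 0:
        return list(history), []
    return history[:-keep_recent], history[-keep_recent:]


def compact(history, summarize, keep_recent=KEEP_RECENT_MESSAGES):
    """Run the three phases; `summarize` stands in for the LLM sidecar."""
    discarded, kept = split_history(history, keep_recent)
    if not discarded:
        return list(history)  # nothing old enough to compact
    # Phase 2: the sidecar turns the discarded messages into a summary.
    summary = summarize(discarded)
    # Phase 3: a single summary event replaces the discarded history.
    event = Message("system", "Summary of earlier conversation:\n" + summary)
    return [event] + kept
```

Under these assumptions, anything that is outside the last `keep_recent` messages and also missing from the summary is gone after Phase 3, which is why summary quality matters so much.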
The observation prompt (`ObservationPromptBuilder`) asks the compaction LLM to summarize discarded messages into bullet points. Issues:
- No instruction to preserve identity/skill context — the summary doesn't know which skills were auto-loaded or what the bot's active identity context was
- `KeepRecentMessages = 6` keeps roughly 2 turns with tool calls, which may not capture the full conversational thread
- A cheaper/faster compaction model (configured via `CompactionModelId`) degrades summary quality
- No validation — there's no check that the summary actually preserved the key facts/intent
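One possible direction for the first and last issues, sketched in Python under stated assumptions: the prompt is handed the active identity and the auto-loaded skill names, and a cheap post-check flags summaries that dropped them. `build_observation_prompt` and `missing_facts` are hypothetical helpers, not the current `ObservationPromptBuilder` API, and substring matching is a deliberately crude stand-in for a real check.

```python
def build_observation_prompt(discarded, active_skills, identity):
    """Build a summary prompt that names what must survive compaction.

    `discarded` is a list of (role, content) pairs (assumed shape).
    """
    lines = [
        "Summarize the discarded messages below into bullet points.",
        "Always state the current topic and what the user is trying to do.",
        f"Active identity context (preserve verbatim): {identity}",
        "Auto-loaded skills (list each one in the summary): "
        + ", ".join(active_skills),
        "",
        "Messages:",
    ]
    lines += [f"[{role}] {content}" for role, content in discarded]
    return "\n".join(lines)


def missing_facts(summary, required):
    """Return the required strings that the summary does not mention."""
    lowered = summary.lower()
    return [fact for fact in required if fact.lower() not in lowered]
```

If `missing_facts` returns a non-empty list, the caller could retry the summary, fall back to a stronger model, or append the missing items to the recorded event. Which of these is appropriate is an open design choice.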
Impact
- After compaction, bot appears to have amnesia about the conversation topic
- When skills are cleared and the summary is poor, the bot loses its context completely
- User reports "split brain" behavior
Relevant Code
- `ObservationPromptBuilder.cs` — summary generation prompt
- `ExtractiveSessionReducer.cs` — Phase 1 message selection
- `LlmSessionActor.cs:721-817` — full compaction flow
Incidents
Session `D0AC6CKBK5K/1774021483_588239` (0.7.0) — mid-conversation amnesia, likely post-compaction.
All three sessions may have experienced compaction-related quality loss.
Eval Requirement
This is a prime candidate for the eval suite — we need to measure compaction summary quality against ground truth to detect regressions.
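A minimal sketch of what such an eval could look like, in Python. It assumes each case carries the discarded messages plus a hand-written list of ground-truth facts, and it scores a summary by key-fact recall using case-insensitive substring matching. The case format, the `0.8` threshold, and the function names are all assumptions for illustration; a real suite would likely need a more forgiving matcher or an LLM judge.

```python
def key_fact_recall(summary, ground_truth_facts):
    """Fraction of ground-truth facts mentioned in the summary (0.0 to 1.0)."""
    if not ground_truth_facts:
        return 1.0
    lowered = summary.lower()
    hits = sum(1 for fact in ground_truth_facts if fact.lower() in lowered)
    return hits / len(ground_truth_facts)


def find_regressions(cases, summarize, threshold=0.8):
    """Summarize every case and return (case_id, score) pairs below threshold.

    Each case is a dict with "id", "messages", and "facts" keys (assumed).
    `summarize` is the compaction summarizer under test.
    """
    failures = []
    for case in cases:
        score = key_fact_recall(summarize(case["messages"]), case["facts"])
        if score < threshold:
            failures.append((case["id"], score))
    return failures
```

Running `find_regressions` against the same cases for each candidate prompt or `CompactionModelId` would give a comparable number per configuration.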