Compaction safeguard silently loses messages when summarization fails #10099

@Drickon

Problem

When the default compaction mode (safeguard) activates and needs to prune history, it attempts to summarize the dropped messages. If the summarization LLM call fails (e.g., due to API issues, rate limiting, or provider outages), the catch block in compaction-safeguard.ts logs a warning but continues, so the user's messages are permanently replaced with a generic placeholder:

"Summary unavailable due to context limits. Older messages were truncated."

This results in silent, irrecoverable message loss.
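
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the actual code in compaction-safeguard.ts: the `Message` and `Summarizer` types, `pruneWithSummary`, and `failingSummarizer` are hypothetical names introduced for illustration. Only the placeholder string is taken from the report.

```typescript
// Hypothetical types and names; this is NOT the code in compaction-safeguard.ts.
type Message = { role: "user" | "assistant"; content: string };
type Summarizer = (messages: Message[]) => Promise<string>;

// Placeholder text quoted in this issue.
const FALLBACK_SUMMARY =
  "Summary unavailable due to context limits. Older messages were truncated.";

// Keeps the newest `keep` messages and tries to summarize everything older.
async function pruneWithSummary(
  history: Message[],
  keep: number,
  summarize: Summarizer,
): Promise<{ summary: string; kept: Message[] }> {
  const cut = Math.max(0, history.length - keep);
  const dropped = history.slice(0, cut);
  const kept = history.slice(cut);

  let summary: string;
  try {
    summary = await summarize(dropped);
  } catch (err) {
    // The failure is logged, but compaction proceeds anyway.
    console.warn("summarization failed:", err);
    // Nothing from `dropped` survives past this point.
    summary = FALLBACK_SUMMARY;
  }
  return { summary, kept };
}

// Simulated provider outage: every summarization call rejects.
const failingSummarizer: Summarizer = async () => {
  throw new Error("simulated provider outage");
};
```

Calling `pruneWithSummary(history, 1, failingSummarizer)` in this sketch returns the generic placeholder as the summary, and the content of every dropped message is gone from the result.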

Location

  • src/agents/pi-extensions/compaction-safeguard.ts
  • pruneHistoryForContextShare() — dropped message summarization catch block (~line 197-208)
  • Final fallback catch (~line 247) returns fallbackSummary with no content preservation

Reproduction

  1. Start a long session that approaches the compaction threshold
  2. Have the LLM provider return errors during the summarization call (simulate API outage)
  3. Compaction runs, drops older messages, attempts summarization, fails
  4. User's earlier messages are replaced with a generic placeholder, with no way to recover them

Suggested Fixes

  1. Preserve raw content: When summarization fails, keep a truncated version of the original messages rather than a generic placeholder
  2. Retry with fallback model: Try a different/smaller model before giving up on summarization
  3. Block compaction on failure: If summarization fails, skip compaction and let the context overflow error propagate (giving the user a clear error rather than silently losing context)
  4. Notify the user: When compaction drops messages without proper summarization, send a visible notification
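
As a rough illustration of how fixes 1, 2, and 4 could combine, the sketch below tries a list of summarizers in order and, if all of them fail, falls back to truncated excerpts of the original messages. Every name here (`Message`, `Summarizer`, `excerptDropped`, `summarizeWithFallbacks`, the `degraded` flag, the 200-character default) is an assumption made for this example, not part of the existing codebase.

```typescript
// Hypothetical types and helpers, for illustration only.
type Message = { role: "user" | "assistant"; content: string };
type Summarizer = (messages: Message[]) => Promise<string>;
type SummaryResult = { summary: string; degraded: boolean };

// Fix 1: keep a bounded excerpt of each dropped message
// instead of a generic placeholder.
function excerptDropped(dropped: Message[], maxCharsPerMessage = 200): string {
  const lines = dropped.map((m) => {
    const text =
      m.content.length > maxCharsPerMessage
        ? m.content.slice(0, maxCharsPerMessage) + "..."
        : m.content;
    return `[${m.role}] ${text}`;
  });
  return ["Summary unavailable; truncated originals follow:", ...lines].join("\n");
}

// Fix 2: try each summarizer in order (e.g. primary model, then a smaller one).
// Fix 4: `degraded` tells the caller to show a visible notification.
async function summarizeWithFallbacks(
  dropped: Message[],
  summarizers: Summarizer[],
): Promise<SummaryResult> {
  for (const summarize of summarizers) {
    try {
      return { summary: await summarize(dropped), degraded: false };
    } catch (err) {
      console.warn("summarizer failed, trying next:", err);
    }
  }
  // All summarizers failed: preserve truncated raw content.
  return { summary: excerptDropped(dropped), degraded: true };
}
```

Fix 3 would be a different policy on the same structure: instead of returning the excerpt when every summarizer fails, the caller could throw and skip compaction, so that the context overflow error reaches the user.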

Context

Discovered while investigating #6016. The direct cascade from embedding failures to compaction doesn't exist (sync errors are properly caught with .catch()), but the compaction summarization failure path is a genuine vulnerability that can cause data loss during provider outages.

Labels: bug
