Compaction timeout: add escape hatch to prevent session hang #44031

@flatmountain4U

Problem

When compaction times out, selectCompactionTimeoutSnapshot() falls back to an already-overflowed snapshot (e.g., 234k tokens in a 200k context window). The LLM call then hangs or fails repeatedly, blocking the entire lane. All subsequent messages to that agent queue up indefinitely — the bot appears "dead."

Expected behavior

On compaction timeout, there should be a forced recovery path that prevents the session from becoming permanently stuck. For example:

  • Truncate to system/bootstrap prompt + last N turns
  • Archive the overflowed transcript and start a fresh session
  • Skip the failed session and process other queued messages
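The first option could be sketched roughly as below. This is only an illustration: the `Message` shape and the `truncateToRecentTurns` helper are hypothetical names, not OpenClaw's actual types, and a "turn" is assumed to start at each user message.

```typescript
// Hypothetical message shape; the real transcript type is not shown in this issue.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

/**
 * Keep every system message plus the last `keepTurns` turns.
 * Assumption: a turn starts at a user message and runs until the next one.
 * System messages are moved to the front of the result.
 */
function truncateToRecentTurns(messages: Message[], keepTurns: number): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  if (keepTurns <= 0) {
    return system;
  }

  // Walk backwards to find the user message that opens the
  // keepTurns-th most recent turn. If there are fewer turns than
  // requested, `start` stays 0 and nothing is dropped.
  let turnsSeen = 0;
  let start = 0;
  for (let i = rest.length - 1; i >= 0; i--) {
    if (rest[i]?.role === "user") {
      turnsSeen++;
      if (turnsSeen === keepTurns) {
        start = i;
        break;
      }
    }
  }
  return [...system, ...rest.slice(start)];
}
```

Counting turns rather than tokens keeps the sketch simple; a real recovery path would probably also need to check that the truncated result actually fits the context window.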

Suggestion

Add a compaction.timeoutAction setting:

"compaction": {
  "timeoutAction": "truncate" // "reset" | "truncate" | "fallback"
}
  • "truncate" — keep system prompt + bootstrap + last N turns, discard the rest
  • "reset" — archive transcript, create a new empty session
  • "fallback" — current behavior (use timeout snapshot as-is)
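A minimal sketch of how the setting might be validated and dispatched. Only the three action names come from the proposal above; `resolveTimeoutAction`, `describeTimeoutAction`, and the choice of "fallback" as the default when the key is missing are assumptions for illustration.

```typescript
// The three values come from the proposal; everything else is hypothetical.
type TimeoutAction = "truncate" | "reset" | "fallback";

const TIMEOUT_ACTIONS: readonly TimeoutAction[] = ["truncate", "reset", "fallback"];

interface CompactionConfig {
  timeoutAction?: string;
}

/**
 * Validate the raw setting.
 * Assumption: a missing key defaults to "fallback" so that existing
 * configs keep the current behavior.
 */
function resolveTimeoutAction(config: CompactionConfig): TimeoutAction {
  const raw = config.timeoutAction;
  if (raw === undefined) {
    return "fallback";
  }
  const match = TIMEOUT_ACTIONS.find((a) => a === raw);
  if (match === undefined) {
    throw new Error(`Unknown compaction.timeoutAction: ${raw}`);
  }
  return match;
}

/** Describe what each action would do; a real handler would perform it. */
function describeTimeoutAction(action: TimeoutAction): string {
  switch (action) {
    case "truncate":
      return "keep system prompt, bootstrap, and last N turns";
    case "reset":
      return "archive transcript and start an empty session";
    case "fallback":
      return "use the timeout snapshot as-is";
  }
}
```

Rejecting unknown values at load time means a typo in the config surfaces immediately instead of silently falling back to the behavior that causes the hang.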

Additional: Lane isolation

A single session's compaction failure should not block the entire lane. Other sessions/messages should continue to be processed. Consider per-session error isolation so one stuck session doesn't take down the agent.
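One way per-session isolation could look, as a sketch only. `SessionJob`, `drainLane`, and `withTimeout` are hypothetical names; how OpenClaw actually schedules work inside a lane is not shown in this issue.

```typescript
// Hypothetical shapes: "lane" and "session" are used as in this issue,
// but these types are invented for the example.
interface SessionJob {
  sessionId: string;
  run: () => Promise<string>;
}

interface JobResult {
  sessionId: string;
  ok: boolean;
  detail: string;
}

/** Reject if `work` does not settle within `ms` milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

/**
 * Run jobs in order. A failing or hanging job is recorded and skipped
 * instead of blocking the jobs queued behind it.
 */
async function drainLane(jobs: SessionJob[], perJobTimeoutMs: number): Promise<JobResult[]> {
  const results: JobResult[] = [];
  for (const job of jobs) {
    try {
      const detail = await withTimeout(job.run(), perJobTimeoutMs);
      results.push({ sessionId: job.sessionId, ok: true, detail });
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      results.push({ sessionId: job.sessionId, ok: false, detail });
    }
  }
  return results;
}
```

Note that the timed-out promise is only abandoned here, not cancelled. A real implementation would likely also need to abort the underlying LLM call, for example with an `AbortController`.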

Current workaround

  • Aggressive compaction settings (maxHistoryShare: 0.4, recentTurnsPreserve: 3, early memory flush)
  • External session_overflow_guard.sh that scans sessions.json for sessions above 90% token usage and archives/removes them; it is called from self_heal.sh, which cron runs every 5 minutes
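The check that the guard script performs amounts to something like the following. It is written in TypeScript for illustration, not as the actual shell script; the `SessionUsage` fields are hypothetical because the real sessions.json schema is not shown here.

```typescript
// Hypothetical record shape; the real sessions.json schema is not shown in this issue.
interface SessionUsage {
  id: string;
  totalTokens: number;
  contextWindow: number;
}

/**
 * Return the ids of sessions whose token usage exceeds `threshold`,
 * expressed as a fraction of the context window (0.9 = 90%).
 */
function findOverflowSessions(sessions: SessionUsage[], threshold = 0.9): string[] {
  return sessions
    .filter((s) => s.contextWindow > 0 && s.totalTokens / s.contextWindow > threshold)
    .map((s) => s.id);
}

const flagged = findOverflowSessions([
  { id: "a", totalTokens: 234000, contextWindow: 200000 }, // 117% of the window
  { id: "b", totalTokens: 120000, contextWindow: 200000 }, // 60% of the window
]);
console.log(flagged); // only "a" is flagged
```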

Environment

  • OpenClaw 2026.3.7
  • Model: gpt-5.4 (200k context window)
  • Observed at 234k/200k tokens (117% of the context window, i.e. 17% over)
