Feature: Auto-resume unanswered sessions after gateway restart #51917

## Problem

After a gateway restart (SIGUSR1, config change, update, or manual restart), all active agent sessions are interrupted. If an agent was mid-conversation in a Signal group (or any channel), the session dies and **the agent never follows up**. The user has to re-send their message or poke the agent with "?" to get a response.

This is especially painful when:
- A config change triggers an automatic restart
- The gateway restarts during an active conversation
- Multiple agents across multiple Signal groups are affected simultaneously

The user experience is terrible: messages appear "read" (Signal read receipts) but never get a response, so it looks like the agent is ignoring you.

## Current Workarounds

### 1. `historyLimit` (partial fix)
Setting `channels.signal.historyLimit: 15` means agents see recent group messages when a new session starts. But this only helps **if someone sends a new message** — the agent still sits idle until poked.

### 2. `BOOT.md` + scan script (workaround)
We built a `BOOT.md` that runs on `gateway:startup` via the `boot-md` hook. It scans all agent session transcripts for Signal groups where the last message came from a user (unanswered), then nudges those agents via `sessions_send`.

This works but is fragile:
- It costs a full agent turn on the main agent every restart
- The scan script reads raw JSONL transcripts (implementation detail that could change)
- It cannot detect messages lost during the drain window
- It is capped at 5 nudges per boot to avoid token storms, so any further unanswered sessions stay idle
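
For reference, the core of the scan logic can be sketched roughly as follows. This is a minimal illustration, not the actual script: the `TranscriptEntry` shape (a `role` and `text` per JSONL line) and the helper names are assumptions, since the real transcript schema is an internal detail.

```typescript
// Hypothetical transcript entry shape; the real JSONL schema is not documented here.
interface TranscriptEntry {
  role: string; // assumed values: "user", "assistant", "system"
  text: string;
}

// Parse JSONL content: one JSON object per non-empty line.
// Lines that fail to parse (e.g. a write cut off by the restart) are skipped.
function parseTranscript(jsonl: string): TranscriptEntry[] {
  const entries: TranscriptEntry[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed = JSON.parse(trimmed);
      if (parsed && typeof parsed.role === "string") {
        entries.push(parsed as TranscriptEntry);
      }
    } catch (err) {
      // Ignore malformed lines rather than failing the whole scan.
    }
  }
  return entries;
}

// A session is "unanswered" when its most recent user/assistant entry is from the user.
function isUnanswered(entries: TranscriptEntry[]): boolean {
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (!entry) continue;
    if (entry.role === "user") return true;
    if (entry.role === "assistant") return false;
  }
  return false; // no user or assistant entries at all
}

// Cap from the workaround described above.
const MAX_NUDGES_PER_BOOT = 5;

// Return the session keys to nudge, in input order, stopping at the cap.
function pickSessionsToNudge(
  transcripts: { sessionKey: string; jsonl: string }[],
): string[] {
  const picked: string[] = [];
  for (const t of transcripts) {
    if (picked.length >= MAX_NUDGES_PER_BOOT) break;
    if (isUnanswered(parseTranscript(t.jsonl))) picked.push(t.sessionKey);
  }
  return picked;
}
```

The sketch also shows why the approach is brittle: any change to the transcript format silently breaks `parseTranscript`, and sessions beyond the cap are never nudged.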

## Proposed Solution

### Native session resumption after restart

When the gateway comes back up after a SIGUSR1 restart:

1. **Detect interrupted sessions** — sessions that had an active turn aborted by drain, or sessions where the last transcript entry is a user message with no assistant response
2. **Auto-resume those sessions** — inject a system event like: *"The gateway restarted. Review conversation context and respond to any unanswered messages."* or simply re-process the last user message
3. **Scope it to channel sessions only** — skip heartbeat, subagent, and boot sessions
4. **Rate limit** — cap at N concurrent resumptions to avoid API storms
5. **Configurable** — add a config key like `session.resumeAfterRestart: true/false` (default: true)
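
The steps above could fit together roughly as in the sketch below. Every name here (`SessionState`, `ResumeConfig`, `selectInterrupted`, `resumeAll`, the `inject` callback) is hypothetical and not taken from the OpenClaw codebase; only the `resumeAfterRestart` flag and the system-event wording come from this proposal.

```typescript
// Hypothetical types; none of these names exist in OpenClaw as far as this issue knows.
type SessionKind = "channel" | "heartbeat" | "subagent" | "boot";

interface SessionState {
  key: string;
  kind: SessionKind;
  turnAbortedByDrain: boolean; // an active turn was cut off by the drain
  lastEntryRole: "user" | "assistant" | null; // role of the last transcript entry
}

interface ResumeConfig {
  resumeAfterRestart: boolean; // mirrors the proposed session.resumeAfterRestart key
  maxConcurrent: number; // the "N" from step 4
}

const RESUME_PROMPT =
  "The gateway restarted. Review conversation context and respond to any unanswered messages.";

// Steps 1, 3, and 5: pick interrupted channel sessions, or nothing if disabled.
function selectInterrupted(
  sessions: SessionState[],
  config: ResumeConfig,
): SessionState[] {
  if (!config.resumeAfterRestart) return [];
  return sessions.filter(
    (s) =>
      s.kind === "channel" &&
      (s.turnAbortedByDrain || s.lastEntryRole === "user"),
  );
}

// Steps 2 and 4: inject the resume prompt with at most maxConcurrent in flight.
async function resumeAll(
  sessions: SessionState[],
  config: ResumeConfig,
  inject: (sessionKey: string, prompt: string) => Promise<void>,
): Promise<string[]> {
  const queue = selectInterrupted(sessions, config);
  const resumed: string[] = [];
  let next = 0;

  // Each worker pulls the next session until the queue is exhausted.
  async function worker(): Promise<void> {
    while (next < queue.length) {
      const session = queue[next++];
      if (!session) continue;
      await inject(session.key, RESUME_PROMPT);
      resumed.push(session.key);
    }
  }

  const workerCount = Math.max(1, Math.min(config.maxConcurrent, queue.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return resumed;
}
```

A fixed pool of workers is one simple way to bound concurrency. A real implementation might prefer a token bucket or per-provider limits, and would need to decide what happens when `inject` fails for one session.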

### Bonus: Drain-aware message queuing
Instead of rejecting messages with `GatewayDrainingError`, the gateway should queue them silently (the code already has `resetAllLanes()` for this, but it does not always work). Messages received during drain should be replayed after restart, not rejected.
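
One possible shape for this, as a rough sketch: buffer inbound messages while draining, persist the buffer before exit, and replay it in order after startup. The `DrainAwareInbox` class and its methods are invented for illustration and say nothing about how `GatewayDrainingError` or `resetAllLanes()` are actually implemented.

```typescript
// Hypothetical message shape and class; not part of the existing gateway code.
interface InboundMessage {
  sessionKey: string;
  text: string;
  receivedAt: number; // epoch milliseconds
}

class DrainAwareInbox {
  private draining = false;
  private pending: InboundMessage[] = [];
  private deliver: (msg: InboundMessage) => void;

  constructor(deliver: (msg: InboundMessage) => void) {
    this.deliver = deliver;
  }

  // Called when the gateway starts shutting down.
  beginDrain(): void {
    this.draining = true;
  }

  // While draining, queue the message instead of rejecting it.
  receive(msg: InboundMessage): void {
    if (this.draining) {
      this.pending.push(msg);
      return;
    }
    this.deliver(msg);
  }

  // Serialize queued messages so they can be persisted before the process exits.
  snapshot(): string {
    return JSON.stringify(this.pending);
  }

  // After restart: deliver the persisted messages in their original order.
  // Returns the number of messages replayed.
  replay(snapshot: string): number {
    const restored = JSON.parse(snapshot) as InboundMessage[];
    for (const msg of restored) this.deliver(msg);
    return restored.length;
  }
}
```

Where the snapshot is stored (disk, the session store, or elsewhere) is left open here; the point is only that messages arriving during drain survive the restart.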

## Environment
- OpenClaw 2026.3.13
- Signal channel with ~27 bound agents across Signal groups
- Frequent restarts due to config changes, updates, and development

## Impact
This affects every multi-agent Signal setup. Any restart = broken conversations across all active groups. The user has to manually re-engage every agent that was mid-conversation.
