Feature Request: Self-healing gateway with restart countdown, config backup & crash recovery #31480

## Problem

When the OpenClaw gateway crashes or enters a restart loop due to a bad config change, there is currently no built-in recovery mechanism. The user has to:
- Notice the gateway is down
- Manually diagnose the cause
- Restore a previous config by hand
- Restart manually

This is especially painful on headless servers where the user is interacting via mobile (Telegram/WhatsApp).

## Proposed solution

Three small, composable features:

### 1. Restart countdown notifications
Before any gateway restart, send a channel notification with a countdown:
```
🔄 Gateway restart — T-60s
🔄 Gateway restart — T-30s
🔄 Gateway restart — T-10s
🔄 Gateway restart — T-0 🚀
✅ Gateway up — agents: Research · CRM · Site Seller · System · General
```
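The countdown timing can be sketched as below. This is a minimal illustration, not OpenClaw code: `run_countdown`, the `notify` callback, and the plain-text message format are hypothetical names chosen for the example, and the real channel delivery is left to the caller.

```python
import time
from typing import Callable, Iterable, List, Tuple


def countdown_messages(marks: Iterable[int] = (60, 30, 10, 0)) -> List[Tuple[int, str]]:
    """Return (seconds_before_restart, message) pairs, largest mark first."""
    pairs = []
    for t in sorted(set(marks), reverse=True):
        label = f"T-{t}s" if t else "T-0"
        pairs.append((t, f"Gateway restart {label}"))
    return pairs


def run_countdown(
    notify: Callable[[str], None],
    marks: Iterable[int] = (60, 30, 10, 0),
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send each countdown message, sleeping for the gap between marks.

    `notify` is a hypothetical callback that delivers one message to the
    user's channel; `sleep` is injectable so the timing can be tested.
    """
    previous = None
    for t, message in countdown_messages(marks):
        if previous is not None:
            sleep(previous - t)  # wait until the next mark is due
        notify(message)
        previous = t
```

Calling `run_countdown(print)` would print the four messages over 60 seconds; the actual restart and the final "gateway up" message would follow once the countdown returns.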

### 2. Automatic config backup
Before every restart, snapshot `openclaw.json` to a rotating backup directory, keeping the last N snapshots. Name each snapshot `good-config-<timestamp>.json`. On crash-loop detection, automatically roll back to the most recent known-good snapshot.
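A rotating snapshot could look roughly like this. It is a sketch under assumptions: the backup directory location, the `keep` default, and the UTC timestamp format are illustrative choices, and `snapshot_config` and `latest_snapshot` are hypothetical helper names.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def snapshot_config(config: Path, backup_dir: Path, keep: int = 5,
                    stamp: Optional[str] = None) -> Path:
    """Copy `config` into `backup_dir` and prune all but the newest `keep` snapshots."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Assumed timestamp format: fixed-width UTC, so names sort chronologically.
    stamp = stamp or time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    target = backup_dir / f"good-config-{stamp}.json"
    shutil.copy2(config, target)
    # Sort by file name, not mtime: copy2 preserves the source file's mtime.
    snapshots = sorted(backup_dir.glob("good-config-*.json"))
    for old in snapshots[:-keep]:
        old.unlink()
    return target


def latest_snapshot(backup_dir: Path) -> Optional[Path]:
    """Return the newest snapshot, or None if no snapshot exists yet."""
    snapshots = sorted(backup_dir.glob("good-config-*.json"))
    return snapshots[-1] if snapshots else None
```

Sorting by name keeps rotation independent of file metadata, which matters when the config itself has not been modified between two restarts.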

### 3. Crash-loop watchdog
A short-lived watchdog runs for about 90s after each restart, then exits. If 3 or more restarts occur within that window, it should:
- Save the bad config for diagnostics
- Restore last known good config automatically
- Restart the gateway
- Notify the user what happened and which config was restored
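The detection and rollback steps above might be sketched as follows. How restart timestamps are recorded is not specified in this request, so the sketch simply takes them as a list; `is_crash_loop`, `roll_back`, and the `bad-config-<stamp>.json` name are hypothetical, and restarting the gateway and notifying the user are left to the caller.

```python
import shutil
from pathlib import Path
from typing import Sequence


def is_crash_loop(restart_times: Sequence[float], now: float,
                  window_s: float = 90.0, threshold: int = 3) -> bool:
    """True if at least `threshold` restarts fall within the last `window_s` seconds."""
    recent = [t for t in restart_times if 0 <= now - t <= window_s]
    return len(recent) >= threshold


def roll_back(config: Path, good: Path, diagnostics_dir: Path, stamp: str) -> Path:
    """Keep the failing config for diagnostics, then restore the known-good one.

    Returns the path of the saved bad config so it can be named in the
    notification sent to the user.
    """
    diagnostics_dir.mkdir(parents=True, exist_ok=True)
    bad_copy = diagnostics_dir / f"bad-config-{stamp}.json"
    shutil.copy2(config, bad_copy)  # save the bad config first
    shutil.copy2(good, config)      # then overwrite it with the good snapshot
    return bad_copy
```

Saving the bad config before overwriting it means a failed rollback never loses the evidence needed to diagnose the crash.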

## Reference implementation

We built this as shell scripts, which run well in production on a headless Hetzner VPS (Ubuntu 24.04). Two design points turned out to matter. First, the watchdog should be short-lived rather than a daemon. Second, the restart should be scheduled with a small delay (via `at`) so that the exec session can return before the kill happens; otherwise the restart kills the very session that issued the restart command.
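To illustrate the delayed-restart idea, here is a small sketch that hands the restart command to `at` instead of running it inline. It is not the shell reference implementation: `build_at_job` and `schedule_restart` are hypothetical helpers, the restart command is whatever the deployment uses, and the sketch assumes `at` is installed with its daemon running.

```python
import subprocess
from typing import List, Tuple


def build_at_job(restart_cmd: str, delay_minutes: int = 1) -> Tuple[List[str], str]:
    """Return the `at` argv and the job text to feed it on stdin."""
    if delay_minutes < 1:
        raise ValueError("delay_minutes must be at least 1")
    unit = "minute" if delay_minutes == 1 else "minutes"
    argv = ["at", "now", "+", str(delay_minutes), unit]
    return argv, restart_cmd + "\n"


def schedule_restart(restart_cmd: str, delay_minutes: int = 1) -> None:
    """Queue the restart and return immediately.

    The job runs later under the at daemon, outside the calling session,
    so the caller can finish before the gateway is killed.
    """
    argv, job = build_at_job(restart_cmd, delay_minutes)
    subprocess.run(argv, input=job, text=True, check=True)
```

A call would look like `schedule_restart("<your gateway restart command>")`. Since `at` time specifications work in whole minutes, the shortest delay in this sketch is one minute.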

Happy to share the scripts as a starting point for a proper implementation.

## Why it matters

For always-on, mobile-first setups (the core OpenClaw use case) this is table stakes. The gateway should be self-healing — not something the user has to babysit from their phone.
