[Feature] Automatic config rollback on gateway failure #65814

## Problem

When `config.apply` or `config.patch` writes a configuration change that breaks the gateway (invalid field combination, bad secret reference, incompatible plugin config), the gateway fails to start and stays down. There is no automatic recovery: the user must manually identify and fix the config, or restore from a backup they may not have.

In `hybrid` reload mode, the gateway auto-restarts for critical config changes. If that restart fails because of a bad config, there is no fallback, and `KeepAlive` just crash-loops on the same broken config.

## Evidence

- Multiple incidents where config changes broke the gateway, requiring manual terminal intervention.
- No backup is created before config writes. If the user doesn't have a manual backup, recovery requires guessing what changed.

## Workaround

I built `oc-config-safe`, a ~120-line bash script that enforces the following sequence for every config change:
1. Create timestamped backup of current config
2. Stop gateway
3. Validate new config (JSON parse + schema check)
4. Apply change
5. Start gateway
6. Verify health within timeout
7. If health check fails → automatically restore backup and restart

This has prevented data loss on multiple occasions.
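
For readers who want the shape of that flow without the bash script, here is a minimal Python sketch of the same idea. It is not `oc-config-safe` itself. The function name `safe_apply` and the `restart_gateway` and `is_healthy` callables are hypothetical stand-ins for whatever stops, starts, and probes the gateway. It also differs from the script in two ways: it validates before touching anything, and it only does a JSON parse (no schema check).

```python
import json
import shutil
import time
from pathlib import Path
from typing import Callable


def safe_apply(
    config_path: Path,
    new_config_text: str,
    restart_gateway: Callable[[], None],  # hypothetical: stop + start the gateway
    is_healthy: Callable[[], bool],       # hypothetical: health probe
) -> bool:
    """Write new_config_text to config_path, restoring the backup on failure.

    Returns True if the new config was kept, False if it was rolled back.
    """
    # Validate first so a malformed file never reaches disk.
    # json.loads raises json.JSONDecodeError (a ValueError) on bad input.
    json.loads(new_config_text)

    # Timestamped backup of the current config.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = config_path.with_name(f"{config_path.name}.{stamp}.bak")
    shutil.copy2(config_path, backup_path)

    # Apply the change and restart.
    config_path.write_text(new_config_text)
    restart_gateway()

    # Keep the change only if the gateway comes back healthy.
    if is_healthy():
        return True

    # Otherwise restore the backup and restart on the old config.
    shutil.copy2(backup_path, config_path)
    restart_gateway()
    return False
```

Passing the restart and health probe in as callables keeps the sketch independent of how the gateway is actually managed on a given machine.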

## Proposed Solution

Add a backup-and-rollback mechanism to the native config pipeline:

1. **Before any config write** (`config.apply`, `config.patch`, `doctor --fix`), save the current config as `openclaw.json.last-good`
2. **After writing**, verify gateway health within a configurable timeout
3. **If health fails**, automatically restore `openclaw.json.last-good` and restart
4. **Expose** `config.rollback` as a gateway tool action for manual recovery

```json
{
  "gateway": {
    "configSafety": {
      "backupBeforeWrite": true,
      "rollbackOnFailure": true,
      "healthCheckTimeoutMs": 30000
    }
  }
}
```
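
To make the intended semantics of these settings concrete, the following is a rough Python sketch of how they might be read and how `healthCheckTimeoutMs` could bound the health check. None of this is existing OpenClaw code. `ConfigSafety`, `load_config_safety`, and `wait_for_health` are hypothetical names, the polling interval is invented, and treating the example values above as defaults for missing keys is an assumption.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConfigSafety:
    # Defaults mirror the example values in this issue (an assumption).
    backup_before_write: bool = True
    rollback_on_failure: bool = True
    health_check_timeout_ms: int = 30000


def load_config_safety(config_text: str) -> ConfigSafety:
    """Read gateway.configSafety, falling back to defaults for missing keys."""
    raw = json.loads(config_text).get("gateway", {}).get("configSafety", {})
    return ConfigSafety(
        backup_before_write=bool(raw.get("backupBeforeWrite", True)),
        rollback_on_failure=bool(raw.get("rollbackOnFailure", True)),
        health_check_timeout_ms=int(raw.get("healthCheckTimeoutMs", 30000)),
    )


def wait_for_health(
    probe: Callable[[], bool],  # hypothetical health probe
    timeout_ms: int,
    interval_s: float = 0.5,    # polling interval, not part of the proposal
) -> bool:
    """Poll probe until it returns True or timeout_ms has elapsed."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if probe():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval_s, remaining))
```

With `rollbackOnFailure` enabled, a `False` result from `wait_for_health` would be the trigger for restoring `openclaw.json.last-good` and restarting.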

## Impact

Medium-high. Config changes are a common failure point, especially for users customizing agent configs, adding plugins, or adjusting auth profiles. Automatic rollback would prevent extended downtime from bad configs.

## Environment

- OpenClaw 2026.4.10 (npm, macOS)

