Auto-update is not atomic: config/plugin version mismatch causes repeated crash loops #58041

@allanjeng

Summary

Auto-update consistently causes gateway crash loops because it is not atomic: config or plugin manifests are updated to require a newer version while the binary remains at the old version (or vice versa). This has caused 3 separate overnight outages in 6 days on our setup.

Environment

  • macOS 15.x (arm64, Mac mini)
  • Node: v22.22.0
  • OpenClaw: experienced across v2026.3.13 → v2026.3.23-2 → v2026.3.24 → v2026.3.28
  • LaunchAgent plist with KeepAlive: true

Crash Pattern

Every auto-update follows the same failure mode:

  1. Auto-update triggers during stableDelayHours window
  2. npm i -g openclaw@latest runs, updates packages
  3. Gateway receives SIGTERM for restart
  4. On restart, version mismatch between config/plugins and binary
  5. Gateway fails to start → crash loop → launchd eventually sends SIGKILL with reason "inefficient"
  6. Manual openclaw update required to recover

Incident 1 (Mar 25): Entrypoint change

  • v2026.3.13 → v2026.3.23-2
  • New version changed entrypoint from dist/index.js → dist/entry.js
  • LaunchAgent plist not updated by auto-update
  • Gateway crash-looped ~10 hours overnight

Incident 2 (Mar 26): Same entrypoint issue

  • v2026.3.23-2 → v2026.3.24
  • Same plist/entrypoint mismatch
  • openclaw doctor flagged it but nobody was around to act
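
A stale entrypoint like the one in Incidents 1 and 2 could in principle be caught before restart by checking that every absolute path in the plist's ProgramArguments still exists. The following is only a minimal sketch of that idea using Python's standard plistlib; the function name missing_program_paths is hypothetical, and this is not how openclaw doctor is known to work.

```python
import plistlib
from pathlib import Path


def missing_program_paths(plist_bytes: bytes) -> list[str]:
    """Return absolute paths listed in ProgramArguments that do not exist on disk.

    Hypothetical helper for illustration: it only inspects the standard
    launchd ProgramArguments key and ignores non-path arguments such as flags.
    """
    plist = plistlib.loads(plist_bytes)
    args = plist.get("ProgramArguments", [])
    return [
        arg
        for arg in args
        if Path(arg).is_absolute() and not Path(arg).exists()
    ]
```

If the returned list is non-empty after an update (for example because it still names dist/index.js), the updater could refuse to restart the gateway instead of handing launchd a broken entrypoint.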

Incident 3 (Mar 31): Plugin version requirements

  • v2026.3.24 → v2026.3.28
  • Auto-update wrote config compatible with v2026.3.28
  • Binary stayed at v2026.3.24
  • Every plugin (including discord, imessage) refused to load:
    plugins.entries.discord: plugin requires OpenClaw >=2026.3.28, but this host is 2026.3.24; skipping load
    
    (19 plugins failed with same error)
  • Gateway could not start at all
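
The mismatch in Incident 3 is a plain version comparison, so it could be checked before the gateway is restarted. Below is a rough sketch under stated assumptions: versions look like 2026.3.24 or 2026.3.23-2, and the trailing -N is treated as a numeric revision that sorts after the base version (an assumption, not confirmed OpenClaw behavior). The names parse_version and unmet_requirements are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse 'v2026.3.23-2' into (2026, 3, 23, 2); a missing revision counts as 0.

    Assumption: the '-N' suffix is a numeric revision, not a pre-release tag.
    """
    base, _, revision = text.strip().lstrip("v").partition("-")
    parts = [int(p) for p in base.split(".")]
    parts.append(int(revision) if revision else 0)
    return tuple(parts)


def unmet_requirements(host: str, plugin_minimums: dict[str, str]) -> dict[str, str]:
    """Return the plugins whose minimum required version is newer than the host."""
    host_version = parse_version(host)
    return {
        name: minimum
        for name, minimum in plugin_minimums.items()
        if parse_version(minimum) > host_version
    }
```

With the values from the log line above, a host of 2026.3.24 and a discord minimum of 2026.3.28 would be reported as unmet, which is the condition under which a restart should be held back.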

Root Cause

Auto-update is not atomic. It updates some components but not others:

  • npm packages are updated (including plugin manifests with new version requirements)
  • The config file is updated (via doctor/migration)
  • The plist entrypoint is NOT updated
  • Some updates apply only partially, leaving the binary at the old version while config/plugins expect the new version

Expected Behavior

Auto-update should do at least one of the following:

  1. Be fully atomic — update binary, config, plist, and plugins in one transaction, or roll back on failure
  2. Run openclaw doctor --fix as part of the update before restarting
  3. Validate config against the current binary version before applying config changes
  4. Not modify config to require a version that is not yet running
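
To make option 1 concrete, here is a minimal sketch of an update expressed as a list of steps, each paired with an undo action, followed by a validation check. Everything in it is hypothetical (run_transaction, the step tuples, the validate callback); it illustrates the apply/validate/roll-back shape and is not a description of how OpenClaw's updater is implemented.

```python
from typing import Callable

# Each step is (name, apply, undo). Undo actions are assumed to be idempotent,
# so they are safe to run even if the matching apply failed partway through.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_transaction(steps: list[Step], validate: Callable[[], bool]) -> bool:
    """Apply all steps, then validate; undo everything in reverse on any failure."""
    undo_stack: list[Callable[[], None]] = []
    try:
        for _name, apply, undo in steps:
            # Register the undo first so a partially applied step is also reverted.
            undo_stack.append(undo)
            apply()
        if not validate():
            raise RuntimeError("post-update validation failed")
    except Exception:
        for undo in reversed(undo_stack):
            undo()
        return False
    return True
```

In this shape, the binary, config, plist, and plugin updates would each be one step, and the validation callback would be the place for checks such as a doctor run or a version comparison. The gateway would only be restarted when run_transaction returns True.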

Workaround

openclaw update (manual) works correctly every time. Only auto-update fails.

Related Issues

All three issues stem from the same fundamental problem: auto-update is not atomic and has no rollback mechanism.
