SIGUSR1 restart causes infinite LaunchAgent respawn loop (KeepAlive conflict) #21685

@Jimmysnielsen

Bug Description

When the gateway is managed by a macOS LaunchAgent with KeepAlive: true, a restart triggered by SIGUSR1 causes an infinite respawn loop.

Steps to Reproduce

  1. Install OpenClaw as a LaunchAgent service (openclaw gateway start creates a plist with KeepAlive: true)
  2. Trigger any restart via SIGUSR1 (e.g. /restart, config change, gateway restart tool)
  3. Observe a loop of "Gateway failed to start: gateway already running" errors, repeating roughly every 10 seconds

What Happens

  1. The SIGUSR1 handler performs a full process restart (logged as "gateway restart mode: full process restart (spawned pid XXXX)"): it spawns a new child process, and the original PID exits
  2. LaunchAgent sees the original PID exit → thinks the service crashed → spawns another gateway process
  3. New spawn finds port already in use (child from step 1 owns it) → fails → exits
  4. LaunchAgent respawns again → repeat every ~10 seconds indefinitely
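
The sequence above can be modeled as a toy simulation. This is a sketch only, not OpenClaw or launchd code: the `Lock` class, `run_loop`, and the PID values are hypothetical. The model assumes the supervisor tracks only the PID it launched, and that a single lock (or port) guards the gateway.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lock:
    """Stands in for the gateway lock / listening port (hypothetical model)."""
    owner: Optional[int] = None

    def acquire(self, pid: int) -> bool:
        if self.owner is None:
            self.owner = pid
            return True
        return False


def run_loop(respawns: int) -> List[str]:
    """Model a fork+exit restart under a supervisor that tracks one PID."""
    lock = Lock()
    events: List[str] = []

    supervised_pid, child_pid = 100, 101  # illustrative PIDs
    lock.acquire(supervised_pid)

    # Step 1: SIGUSR1 -> spawn child, original exits, child takes over the lock.
    lock.owner = child_pid
    events.append(f"pid {supervised_pid} exited; child {child_pid} holds lock")

    # Steps 2-4: the supervisor sees its PID gone and keeps respawning.
    next_pid = 102
    for _ in range(respawns):
        if lock.acquire(next_pid):
            events.append(f"pid {next_pid} started")
            break
        events.append(f"pid {next_pid} failed: already running (pid {lock.owner})")
        next_pid += 1
    return events
```

In this model every respawn fails, because the lock is never released while the untracked child stays alive.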

Impact

  • Webchat clients experience repeated disconnect/reconnect cycles
  • Log fills with "Port 18789 is already in use" / "gateway already running" errors at ~10s intervals
  • The loop persists until it is stopped manually or the webchat client re-establishes its connection and stabilizes

Environment

  • OpenClaw 2026.2.15
  • macOS 26.2 (arm64), M4 Pro Mac mini
  • LaunchAgent plist at ~/Library/LaunchAgents/ai.openclaw.gateway.plist
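
To confirm that a given plist enables respawning, something like the following could be used. It is a minimal sketch using Python's standard `plistlib`. `keepalive_enabled` is a hypothetical helper, and the sample plist is illustrative; the file OpenClaw actually generates may contain other keys.

```python
import plistlib


def keepalive_enabled(plist_bytes: bytes) -> bool:
    """Return True if the LaunchAgent plist sets KeepAlive to a truthy value."""
    data = plistlib.loads(plist_bytes)
    # KeepAlive may be a boolean or a non-empty dictionary of conditions.
    return bool(data.get("KeepAlive", False))


# Illustrative plist content (assumption, not copied from a real install).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>ai.openclaw.gateway</string>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
"""

if __name__ == "__main__":
    print(keepalive_enabled(SAMPLE))
```

To check the real file, pass it the bytes read from ~/Library/LaunchAgents/ai.openclaw.gateway.plist.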

Log Evidence

07:19:57.255Z gateway received SIGUSR1; restarting
07:19:57.405Z gateway restart mode: full process restart (spawned pid 57144)
07:20:07.309Z Gateway failed to start: gateway already running (pid 57144); lock timeout after 5000ms
07:20:18.352Z Gateway failed to start: gateway already running (pid 57144); lock timeout after 5000ms
... (repeats every ~10s)
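
The spacing between failures can be measured directly from the log. The sketch below parses the lines quoted above; `failure_intervals` is a hypothetical helper, and the timestamp format is assumed from these four lines only.

```python
from datetime import datetime
from typing import List

LOG = """\
07:19:57.255Z gateway received SIGUSR1; restarting
07:19:57.405Z gateway restart mode: full process restart (spawned pid 57144)
07:20:07.309Z Gateway failed to start: gateway already running (pid 57144); lock timeout after 5000ms
07:20:18.352Z Gateway failed to start: gateway already running (pid 57144); lock timeout after 5000ms
"""


def failure_intervals(log: str) -> List[float]:
    """Seconds between consecutive 'Gateway failed to start' lines."""
    times = [
        datetime.strptime(line.split()[0], "%H:%M:%S.%fZ")
        for line in log.splitlines()
        if "Gateway failed to start" in line
    ]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]
```

For the excerpt above this gives a single interval of 11.043 seconds. That is consistent with launchd's default ThrottleInterval of 10 seconds, though the issue does not confirm that the throttle is what sets the pace.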

Suggested Fix

When running under a LaunchAgent (or any process supervisor), the SIGUSR1 handler should do one of the following:

  1. Restart in-place (exec into new process, preserving PID) instead of fork+exit
  2. Detect the supervisor and use launchctl kickstart to do a clean service restart
  3. Not exit the parent process — keep the original PID alive as a thin wrapper
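
Option 2 might look roughly like this. It is a sketch, not the project's implementation: the label `ai.openclaw.gateway` is an assumption inferred from the plist filename, `kickstart_command` and `restart_via_launchd` are hypothetical names, and supervisor detection is left out.

```python
import os
import subprocess
from typing import List


def kickstart_command(label: str, uid: int) -> List[str]:
    """Build a `launchctl kickstart -k` command for a per-user LaunchAgent."""
    # -k kills the running instance before launchd starts the service again.
    return ["launchctl", "kickstart", "-k", f"gui/{uid}/{label}"]


def restart_via_launchd(label: str = "ai.openclaw.gateway") -> None:
    """Ask launchd to restart the service instead of spawning a child."""
    # The default label is an assumption inferred from the plist filename.
    subprocess.run(kickstart_command(label, os.getuid()), check=True)
```

Because launchd performs the restart itself, the PID it tracks is always the PID of the running gateway.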

Option 1 (exec-style restart) would be the most compatible with any process supervisor (systemd, launchd, etc.).
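
For illustration, an exec-style restart can be sketched in Python as below. OpenClaw's gateway is not written in Python, so this only demonstrates the technique: `os.execv` replaces the process image while keeping the same PID. `exec_target`, `restart_in_place`, and `install_restart_handler` are hypothetical names, and the sketch assumes the program was started as a plain script.

```python
import os
import signal
import sys
from typing import List, Optional, Tuple


def exec_target(executable: Optional[str] = None,
                argv: Optional[List[str]] = None) -> Tuple[str, List[str]]:
    """Return (path, args) that re-run the current program with the same arguments."""
    path = executable or sys.executable
    args = list(sys.argv if argv is None else argv)
    return path, [path, *args]


def restart_in_place(signum, frame) -> None:
    """SIGUSR1 handler: replace the process image without changing the PID."""
    path, args = exec_target()
    sys.stdout.flush()
    sys.stderr.flush()
    # os.execv does not return on success; the supervisor keeps seeing the same PID.
    os.execv(path, args)


def install_restart_handler() -> None:
    """Register the exec-style restart for SIGUSR1."""
    signal.signal(signal.SIGUSR1, restart_in_place)
```

Since the PID never changes, the supervisor does not see an exit and has no reason to respawn. A real implementation would also need to release the lock and listening socket before or across the exec.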
