Gateway SIGUSR1 self-restart orphans process under launchd #27605

@carl-jeffrolc

Summary

When the gateway detects a config change requiring a full restart (e.g., adding gateway.auth.rateLimit), it sends itself SIGUSR1 and performs a "full process restart" by forking a new child process. Under launchd supervision (macOS LaunchAgent with KeepAlive: true), the new child ends up orphaned, and every instance launchd subsequently starts fails on the held lock and port, producing a crash loop.

Steps to Reproduce

  1. Run gateway via LaunchAgent with KeepAlive: true
  2. Edit openclaw.json to add/change a config key that requires full restart (e.g., gateway.auth.rateLimit)
  3. Config watcher detects the change and triggers SIGUSR1

What Happens

[reload] config change requires gateway restart (gateway.auth.rateLimit)
[gateway] received SIGUSR1; restarting
[gateway] restart mode: full process restart (spawned pid 44022)

  • Gateway forks a new child (pid 44022)
  • Original process (launchd's child) exits
  • New child is reparented to pid 1 (orphan)
  • Launchd sees its child died → spawns new instance → port taken → fails → retry loop:

Gateway failed to start: gateway already running (pid 44022); lock timeout after 5000ms
Port 18789 is already in use.

This repeats indefinitely until the orphan is manually killed.

Expected Behavior

Under a process supervisor, "full process restart" should not fork a new child. Instead, either:

  1. exec() in-place — replace the current process image (same PID, supervisor retains ownership)
  2. Exit cleanly (code 0) — let the supervisor (launchd, systemd, etc.) handle the restart via its KeepAlive/Restart=always policy
  3. Detect supervised mode — if OPENCLAW_LAUNCHD_LABEL or INVOCATION_ID (systemd) env var is set, use option 1 or 2 instead of fork

Option 2 is the simplest and most portable. The stock plist already sets OPENCLAW_LAUNCHD_LABEL in the gateway's environment, so detection is trivial.
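
To make options 2 and 3 concrete, here is a minimal sketch of the detection logic. It is written in Python purely for illustration; the function names (is_supervised, choose_restart_strategy, restart) are hypothetical and are not the gateway's actual code. The only details taken from this report are the two environment variable names.

```python
import os
import sys
from typing import Mapping

# Supervisor markers named in this report: the stock plist sets the first,
# systemd sets the second for service processes.
SUPERVISOR_ENV_VARS = ("OPENCLAW_LAUNCHD_LABEL", "INVOCATION_ID")


def is_supervised(env: Mapping[str, str]) -> bool:
    """Return True if any supervisor marker is set to a non-empty value."""
    return any(env.get(name) for name in SUPERVISOR_ENV_VARS)


def choose_restart_strategy(env: Mapping[str, str]) -> str:
    """Return 'exit' under a supervisor, otherwise 'spawn' (today's behavior)."""
    return "exit" if is_supervised(env) else "spawn"


def restart(env: Mapping[str, str] = os.environ) -> None:
    """Hypothetical restart entry point, shown only to illustrate the branch."""
    if choose_restart_strategy(env) == "exit":
        # Option 2: exit cleanly and let the supervisor start a fresh,
        # properly owned instance.
        sys.exit(0)
    # Unsupervised: the existing spawn-a-child path would run here.
    raise NotImplementedError("spawn path omitted in this sketch")
```

With KeepAlive: true, launchd relaunches the job after the clean exit, so the replacement stays under launchd's ownership and no second process competes for the lock or port.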

Environment

  • OpenClaw v2026.2.25
  • macOS 15.4 (Darwin 25.3.0)
  • LaunchAgent: stock ai.openclaw.gateway.plist with KeepAlive: true
  • Config change that triggered it: adding gateway.auth.rateLimit block

Workaround

Kill the orphan process manually, then launchd starts a properly supervised instance:

kill <orphan_pid>
# or: launchctl kickstart -k gui/$UID/ai.openclaw.gateway
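
The orphan's pid appears in the startup failure message quoted above, so it can be extracted from the log rather than hunted for by hand. A small sketch, assuming the message format shown in this report stays stable (the helper name orphan_pid_from_log is hypothetical):

```python
import re
from typing import Optional

# Matches the failure line quoted in this report, e.g.
# "Gateway failed to start: gateway already running (pid 44022); lock timeout after 5000ms"
_ALREADY_RUNNING = re.compile(r"gateway already running \(pid (\d+)\)")


def orphan_pid_from_log(line: str) -> Optional[int]:
    """Return the pid named in an 'already running' failure line, or None."""
    match = _ALREADY_RUNNING.search(line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = ("Gateway failed to start: gateway already running (pid 44022); "
              "lock timeout after 5000ms")
    print(orphan_pid_from_log(sample))
```

The returned value is the pid to pass to kill.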
