[Bug]: macOS LaunchAgent state flaps (service not found + WS 1006) causing gateway unreachable despite intermittent 18789 listener #29883

@hejiajiudeeyu

Summary

On macOS 26.3 with OpenClaw 2026.2.26, gateway control enters an inconsistent state: the UI and CLI report the gateway as unreachable (1006, ECONNREFUSED) while the launchd service state and the port 18789 listener contradict each other. Retry/Recheck does not reliably recover.

Steps to reproduce

  1. Start OpenClaw on macOS (Mac app install, local gateway mode, port 18789).
  2. Open the gateway panel and wait until the failure banner appears (Gateway daemon command failed / cannot reach localhost:18789).
  3. Click Retry multiple times, then click Recheck.
  4. In a terminal, run openclaw gateway status and openclaw status during and after the retries.
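
The status commands in step 4 can be polled on a timer so that their output lines up with the Retry/Recheck clicks. This is a minimal sketch, assuming Python 3 is available and openclaw is on PATH. The `run` and `capture` helpers, the one-second interval, and the log format are illustrative and not part of OpenClaw.

```python
import subprocess
import time
from datetime import datetime

# Commands taken from step 4 of the report; no extra flags are assumed.
COMMANDS = [
    ["openclaw", "gateway", "status"],
    ["openclaw", "status"],
]


def run(cmd, timeout=15):
    """Run one command and return (exit_code, combined stdout+stderr)."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
        return proc.returncode, (proc.stdout + proc.stderr).strip()
    except FileNotFoundError:
        return 127, f"{cmd[0]}: command not found"
    except subprocess.TimeoutExpired:
        return 124, "timed out"


def capture(rounds, interval=1.0, runner=run, sleep=time.sleep, out=print):
    """Poll every command `rounds` times and emit timestamped records."""
    for i in range(rounds):
        for cmd in COMMANDS:
            code, output = runner(cmd)
            stamp = datetime.now().isoformat(timespec="seconds")
            out(f"{stamp} $ {' '.join(cmd)} -> exit {code}")
            if output:
                out(output)
        if i < rounds - 1:
            sleep(interval)
```

Calling `capture(rounds=60)` in a terminal while clicking Retry and Recheck produces a timeline comparable to the events.log in the attached bundle, though the bundle itself may have been collected differently.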

Expected behavior

Retry/Recheck should deterministically restore a healthy gateway state (service loaded/running, 18789 listening, RPC probe OK), with no contradictory status between launchd, socket, and CLI/UI.

Actual behavior

  • Repeated gateway closed (1006 abnormal closure (no close frame)) and ECONNREFUSED during failure window.
  • openclaw gateway status intermittently reports: Could not find service "ai.openclaw.gateway" in domain for user gui: 501.
  • State contradictions observed in the same window:
      • service reported missing/not loaded,
      • while logs also report Port 18789 is already in use / gateway already running,
        and at some points a gateway PID is listening on 18789.
  • UI Retry/Recheck did not provide stable recovery by itself.
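
The contradiction described above can be checked independently of the OpenClaw UI by comparing what launchd reports with what the socket layer reports. This is a rough sketch, not OpenClaw's own health check: it assumes the service lives in the gui/<uid> domain (as the error message suggests) and that the gateway listens on 127.0.0.1. The `classify` labels are made up for illustration, and the RPC probe is not covered.

```python
import os
import socket
import subprocess

LABEL = "ai.openclaw.gateway"  # service label quoted in the error message
PORT = 18789                   # gateway port from the report


def service_loaded(label=LABEL, uid=None):
    """True if launchd knows the service in the user's GUI domain (macOS only)."""
    uid = os.getuid() if uid is None else uid
    try:
        proc = subprocess.run(
            ["launchctl", "print", f"gui/{uid}/{label}"],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return False  # launchctl is not available on this system
    return proc.returncode == 0


def port_listening(port=PORT, host="127.0.0.1", timeout=1.0):
    """True if a TCP connect to host:port succeeds (host is an assumption)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(loaded, listening):
    """Map the two signals to a label; the wording is illustrative only."""
    if loaded and listening:
        return "consistent: service loaded and port listening"
    if listening:
        return "contradiction: port in use but launchd has no such service"
    if loaded:
        return "degraded: service loaded but nothing listening"
    return "down: service not loaded and nothing listening"
```

Running `print(classify(service_loaded(), port_listening()))` during the failure window would be expected to print the contradiction label at the moments when a gateway PID holds 18789 while launchd reports the service as missing.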

OpenClaw version

2026.2.26

Operating system

macOS 26.3

Install method

OpenClaw Mac app + CLI (gateway)

Logs, screenshots, and evidence

Captured bundle: openclaw-repro-20260228-234541.tar.gz

Includes:

events.log (repro timeline)
snapshot-start.txt (failure-state snapshot)
experiments.log (A/B/C recovery experiments)
snapshot-stop.txt (post-recovery snapshot)
gateway-tail.log and timeline.log (raw evidence)

Key timestamps (CST):

23:46:20 app opened
23:46:56 retry
23:47:17 retry2
23:47:30 recheck
23:49:49 recovery experiments started

Key evidence excerpts:

Could not find service "ai.openclaw.gateway" in domain for user gui: 501
gateway closed (1006 abnormal closure (no close frame))
Port 18789 is already in use / gateway already running (both appearing alongside service-not-found states)

[openclaw-repro-20260228-234541.tar.gz](https://github.com/user-attachments/files/25622334/openclaw-repro-20260228-234541.tar.gz)

Impact and severity

  • Affected users/systems/channels: macOS desktop users running local gateway via LaunchAgent (UI + CLI control paths).
  • Severity: High (blocks normal gateway use; UI indicates active but effective connectivity is broken).
  • Frequency: Intermittent but reproducible under repeated Retry/Recheck during unstable state.
  • Consequence: workflow interruption, failed health checks, repeated manual intervention via launchctl/CLI restart commands.

Additional information

Recovery experiment results from the same capture:

A) openclaw gateway restart: recovered (RPC probe: ok)
B) openclaw gateway stop && openclaw gateway start: failed (service not loaded / unit not found)
C) launchctl bootout/bootstrap/kickstart: eventually recovered, but transient instability remained before settling
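
Experiment C can be approximated as a fixed launchctl sequence followed by a wait that only reports success after several consecutive healthy probes, which accounts for the transient instability noted above. This is a sketch under assumptions: the report does not give the exact launchctl arguments or the plist location, so the plist path is supplied by the caller, and the `recovery_commands` and `wait_until_stable` names and the streak threshold are hypothetical.

```python
import time


def recovery_commands(uid, label, plist):
    """Build the bootout/bootstrap/kickstart sequence for a GUI-domain agent.

    `plist` is the path to the LaunchAgent plist; its real location is not
    stated in the report and must be supplied by the caller.
    """
    target = f"gui/{uid}/{label}"
    return [
        ["launchctl", "bootout", target],
        ["launchctl", "bootstrap", f"gui/{uid}", plist],
        ["launchctl", "kickstart", target],
    ]


def wait_until_stable(probe, needed=3, attempts=30, interval=1.0,
                      sleep=time.sleep):
    """Return True once `probe()` succeeds `needed` times in a row.

    A single failed probe resets the streak, so a listener that flaps
    is not reported as recovered.
    """
    streak = 0
    for _ in range(attempts):
        streak = streak + 1 if probe() else 0
        if streak >= needed:
            return True
        sleep(interval)
    return False
```

Each command from `recovery_commands(501, "ai.openclaw.gateway", plist_path)` would be passed to `subprocess.run`, and `wait_until_stable` would then be called with any boolean probe, for example a TCP connect to port 18789.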


Labels: bug, stale
