Description
The gateway process spontaneously crashes with Uncaught exception: Error: write EPIPE and stops working. Once crashed, macOS launchd applies exponential backoff on restarts — resulting in the gateway being down for hours with no automatic recovery.
The user has to manually run clawdbot doctor --fix every time to bring the bot back to life.
Steps to Reproduce
- Install Clawdbot with LaunchAgent (default setup)
- Use the bot normally
- At some point, the gateway crashes with EPIPE and stops responding
- It does not come back on its own — launchd throttles restart attempts after repeated crashes
- Only clawdbot doctor --fix (or a manual restart) brings it back
This happens repeatedly and unpredictably. In our case, the gateway died at 22:43 and did not come back until 09:06 the next day — a 10-hour outage with KeepAlive: true set in the LaunchAgent.
Root Cause
The gateway writes to stdout/stderr after the pipe has been closed (during process shutdown or launchd restart). This triggers an uncaught EPIPE exception → crash → exit code 1. Repeated crashes cause macOS launchd to throttle KeepAlive restarts exponentially (the service had runs = 11 and last exit code = 1).
Additionally, every restart produces unhandled AbortError from pending fetch requests being cancelled — these may contribute to the crash loop.
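To illustrate the second failure mode: when an in-flight fetch is cancelled through an AbortController, its promise rejects with an error whose name is AbortError, and if nothing catches it the rejection is reported as unhandled. Below is a minimal sketch of treating such rejections as expected while the process is shutting down. The names isAbortError and createShutdownRejectionHandler and the shuttingDown flag are hypothetical and are not taken from the Clawdbot source; how the gateway actually tracks shutdown is an assumption.

```javascript
"use strict";

// Hypothetical helper: true when `err` looks like the rejection produced by
// an aborted request (an error object whose name is "AbortError").
function isAbortError(err) {
  return Boolean(err) && typeof err === "object" && err.name === "AbortError";
}

// Hypothetical factory for an "unhandledRejection" listener.
// `state.shuttingDown` is an assumed flag the gateway would set when it
// starts shutting down. `onOther` receives every rejection that is not an
// expected abort, so real failures stay visible.
function createShutdownRejectionHandler(state, onOther) {
  return function handleRejection(reason) {
    if (state.shuttingDown && isAbortError(reason)) {
      return "ignored"; // pending requests cancelled during shutdown
    }
    onOther(reason);
    return "forwarded";
  };
}

// Example wiring (assumption: the process owns a single shutdown flag).
const shutdownState = { shuttingDown: false };
process.on(
  "unhandledRejection",
  createShutdownRejectionHandler(shutdownState, (reason) => {
    console.error("Unhandled promise rejection:", reason);
  })
);
process.once("SIGTERM", () => {
  shutdownState.shuttingDown = true;
});
```

Filtering only while the flag is set keeps an AbortError that occurs during normal operation visible in the logs. Note that registering a SIGTERM listener replaces Node's default exit on that signal, so a real implementation would also need to finish its shutdown and exit.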
Error Logs
[clawdbot] Uncaught exception: Error: write EPIPE
at afterWriteDispatched (node:internal/stream_base_commons:159:15)
at writeGeneric (node:internal/stream_base_commons:150:3)
at Socket._writeGeneric (node:net:971:11)
at Socket._write (node:net:983:8)
at writeOrBuffer (node:internal/streams/writable:570:12)
at _write (node:internal/streams/writable:499:10)
at Writable.write (node:internal/streams/writable:508:10)
at console.value (node:internal/console/constructor:298:16)
at console.log (node:internal/console/constructor:384:26)
[clawdbot] Unhandled promise rejection: AbortError: This operation was aborted
at node:internal/deps/undici/undici:13502:13
Gateway restart timeline showing exponential throttle:
22:43:08 — listening (PID 66511)
[10+ hour gap — gateway dead, launchd throttled]
09:06:33 — listening (PID 7341) ← only after manual doctor --fix
09:08:13 — restart
09:08:29 — restart
09:08:53 — restart
[3+ hour gap — throttled again]
12:24:20 — listening ← after another doctor --fix
Workaround
Add ThrottleInterval: 5 to the LaunchAgent plist to cap restart delay at 5 seconds:
<key>ThrottleInterval</key>
<integer>5</integer>
Suggested Fix
- Handle EPIPE on stdout/stderr: process.stdout.on("error", () => {}) or equivalent in the logging layer
- Catch AbortError on pending fetch requests during shutdown
- Add ThrottleInterval to the generated LaunchAgent plist by default so crashes don't cause hours of downtime
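The EPIPE item above could look roughly like the following. This is a sketch only: guardAgainstEpipe is a hypothetical helper name, and the choice to swallow EPIPE while still surfacing every other stream error is an assumption about the desired behavior, not something confirmed by the Clawdbot source.

```javascript
"use strict";

// Hypothetical helper: attach an "error" listener so that an EPIPE emitted
// on a stream is swallowed instead of surfacing as an uncaught exception.
// Any other error is handed to `onOther` so real failures are not hidden.
function guardAgainstEpipe(stream, onOther) {
  stream.on("error", (err) => {
    if (err && err.code === "EPIPE") {
      return; // the reading end of the pipe is gone; nothing left to do
    }
    onOther(err);
  });
  return stream;
}

// Example wiring for the process-level streams.
for (const stream of [process.stdout, process.stderr]) {
  guardAgainstEpipe(stream, (err) => {
    // Assumption: non-EPIPE stream errors should still be fatal.
    throw err;
  });
}
```

Compared with the empty handler process.stdout.on("error", () => {}), checking err.code keeps unrelated stream errors from being silently discarded.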
Environment
- Clawdbot: 2026.1.24-3
- macOS: 15.6.1 (arm64)
- Node: 23.7.0
- LaunchAgent: com.clawdbot.gateway with KeepAlive: true