Bug type
Behavior bug (incorrect output/state without crash)
Summary
The channel health monitor uses a fixed 30-minute staleEventThresholdMs to detect dead sockets and force-restarts the provider when that threshold is exceeded. This threshold suits high-traffic channels (Slack/Discord), but an iMessage channel can be legitimately quiet for hours. Result: 59 unnecessary provider restarts in 2 days, causing delivery gaps during active conversations. The restarts also cascade: each restart resets the "last event" timer, so if no message arrives quickly after a restart, the next 5-minute health check can flag the provider as stale again almost immediately.
Steps to reproduce
- Configure OpenClaw with an iMessage channel on macOS.
- Leave the gateway running for any period with no inbound/outbound iMessage activity (30+ minutes).
- Check gateway.log — the iMessage provider will be restarted every ~30 minutes.
- Send a message during or immediately after a restart — delivery may be delayed.
Expected behavior
The health monitor should not restart a provider that is functioning correctly just because the channel has been quiet. iMessage can go hours between messages without any socket problem. A per-channel configurable staleEventThresholdMs or provider-level keepalive pings would prevent false positives.
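A minimal sketch of what a per-channel threshold lookup could look like. The config shape (`channels.<id>.health.staleEventThresholdMs`) and the function name `resolveStaleEventThresholdMs` are assumptions made for illustration; openclaw.json does not currently expose such a key, and the real monitor may be structured differently.

```typescript
// Hypothetical config shape: these keys are NOT read by OpenClaw today.
interface ChannelHealthConfig {
  staleEventThresholdMs?: number;
}

interface GatewayConfig {
  channels?: Record<string, { health?: ChannelHealthConfig }>;
}

// Same value as the default quoted in this report (30 minutes).
const DEFAULT_STALE_EVENT_THRESHOLD_MS = 30 * 60_000;

// Returns the per-channel override when it is a positive finite number,
// otherwise falls back to the global default.
function resolveStaleEventThresholdMs(config: GatewayConfig, channelId: string): number {
  const override = config.channels?.[channelId]?.health?.staleEventThresholdMs;
  if (typeof override === "number" && Number.isFinite(override) && override > 0) {
    return override;
  }
  return DEFAULT_STALE_EVENT_THRESHOLD_MS;
}

// Example: give iMessage a 6-hour threshold, leave other channels on the default.
const exampleConfig: GatewayConfig = {
  channels: { imessage: { health: { staleEventThresholdMs: 6 * 60 * 60_000 } } },
};
```

With this shape, quiet channels could opt into a longer window while Slack/Discord keep the existing 30-minute behavior.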
Actual behavior
The iMessage provider is killed and restarted every ~30 minutes continuously, 24/7, regardless of actual socket health. During active conversations, these restarts cause transient delivery failures — outbound messages queue behind the restart cycle. Observed 59 stale-socket restarts in a 48-hour period. Restarts cascade: each restart resets the last-event timer, so subsequent 5-minute health checks can trigger another restart almost immediately.
OpenClaw version
v2026.3.2
Operating system
macOS 15.4 (Apple Silicon M4)
Install method
npm global
Logs, screenshots, and evidence
Sample from gateway.log showing repeated stale-socket restarts (the three entries below are 25 and 5 minutes apart):
2026-03-04T22:04:10.274Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
2026-03-04T22:29:10.287Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
2026-03-04T22:34:10.294Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
Source: DEFAULT_STALE_EVENT_THRESHOLD_MS = 30 * 6e4 (1,800,000 ms, i.e. 30 minutes) in channel-health-monitor.ts
No per-channel config path exists: deps.timing?.staleEventThresholdMs is checked but never populated from openclaw.json.
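To quantify the restart spacing in a longer log, the gaps between consecutive restart lines can be computed with a small helper. This is only a sketch: the regular expression is written against the three sample lines above, and `restartGapsMs` is a hypothetical name, not part of OpenClaw.

```typescript
// Matches lines shaped like the gateway.log sample in this report.
const RESTART_LINE =
  /^(\S+) \[health-monitor\] \[[^\]]+\] health-monitor: restarting \(reason: stale-socket\)$/;

// Returns the time in milliseconds between consecutive stale-socket restarts.
function restartGapsMs(lines: string[]): number[] {
  const times: number[] = [];
  for (const line of lines) {
    const match = RESTART_LINE.exec(line.trim());
    if (!match) continue; // skip unrelated log lines
    const t = Date.parse(match[1]);
    if (!Number.isNaN(t)) times.push(t);
  }
  // Pair each timestamp with the one before it.
  return times.slice(1).map((t, i) => t - times[i]);
}

const sampleLines = [
  "2026-03-04T22:04:10.274Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
  "2026-03-04T22:29:10.287Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
  "2026-03-04T22:34:10.294Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
];
```

For the sample above, `restartGapsMs(sampleLines)` yields gaps of roughly 25 minutes and 5 minutes, the latter matching the 5-minute health-check interval.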
Impact and severity
Affected: All iMessage channel users on always-on gateway deployments
Severity: High — blocks message delivery during active conversations
Frequency: 100% repro — happens every 30 minutes on any idle iMessage channel
Consequence: Missed/delayed message delivery, restart cascades during conversations, noisy logs (59 restarts in 48 hours obscure real issues)
Additional information
Proposed fix: (A) Make staleEventThresholdMs configurable per channel in openclaw.json, (B) Add iMessage-specific keepalive pings so idle sockets don't appear dead, or (C) both. This issue compounds with auto-compaction delivery blocking — when compaction runs during a stale-socket restart window, the user can wait 7+ minutes for a response that was already generated.
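Option (B) could be sketched roughly as follows: a successful keepalive probe counts as activity, so an idle but healthy socket never crosses the stale threshold. `LivenessTracker` and its methods are hypothetical names for illustration only. How a probe would actually be performed against the iMessage provider is not shown and is not known from this report.

```typescript
// Hypothetical liveness tracker; not the actual channel-health-monitor.ts code.
class LivenessTracker {
  private readonly thresholdMs: number;
  private lastActivityAt: number;

  constructor(thresholdMs: number, now: number = Date.now()) {
    this.thresholdMs = thresholdMs;
    this.lastActivityAt = now;
  }

  // Call on real inbound/outbound channel events.
  recordEvent(now: number = Date.now()): void {
    this.lastActivityAt = now;
  }

  // Call after each keepalive probe; only a successful probe counts as activity.
  recordKeepalive(ok: boolean, now: number = Date.now()): void {
    if (ok) this.lastActivityAt = now;
  }

  // True when neither an event nor a successful probe was seen within the threshold.
  isStale(now: number = Date.now()): boolean {
    return now - this.lastActivityAt > this.thresholdMs;
  }
}
```

A failed probe leaves the timer untouched, so a genuinely dead socket would still be flagged once the threshold elapses.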