[Bug]: iMessage channel health monitor restarts provider every 30 minutes during normal idle periods #35072

@mira-lgtm

Bug type

Behavior bug (incorrect output/state without crash)

Summary

The channel health monitor uses a fixed 30-minute staleEventThresholdMs to detect dead sockets, then force-restarts the provider. This threshold works for high-traffic channels (Slack/Discord) but iMessage can be legitimately quiet for hours. Result: 59 unnecessary provider restarts in 2 days, causing delivery gaps during active conversations. The restart cascade compounds — each restart resets the "last event" timer, so if no message arrives quickly after restart, the next 5-minute health check can flag it stale again almost immediately.

Steps to reproduce

  1. Configure OpenClaw with an iMessage channel on macOS.
  2. Leave the gateway running for 30+ minutes with no inbound or outbound iMessage activity.
  3. Check gateway.log — the iMessage provider will be restarted every ~30 minutes.
  4. Send a message during or immediately after a restart — delivery may be delayed.

Expected behavior

The health monitor should not restart a provider that is functioning correctly just because the channel has been quiet. iMessage can go hours between messages without any socket problem. A per-channel configurable staleEventThresholdMs or provider-level keepalive pings would prevent false positives.

Actual behavior

The iMessage provider is killed and restarted every ~30 minutes continuously, 24/7, regardless of actual socket health. During active conversations, these restarts cause transient delivery failures — outbound messages queue behind the restart cycle. Observed 59 stale-socket restarts in a 48-hour period. Restarts cascade: each restart resets the last-event timer, so subsequent 5-minute health checks can trigger another restart almost immediately.

OpenClaw version

v2026.3.2

Operating system

macOS 15.4 (Apple Silicon M4)

Install method

npm global

Logs, screenshots, and evidence

Sample from gateway.log showing repeated stale-socket restarts (the last two entries are only 5 minutes apart, matching the health-check interval):


2026-03-04T22:04:10.274Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
2026-03-04T22:29:10.287Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
2026-03-04T22:34:10.294Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)
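
To make the restart spacing easier to see, here is a minimal sketch that pulls the timestamps out of lines in the format above and reports the gap between consecutive restarts. The helper name `restartGapsMinutes` and the regular expression are assumptions for illustration, not part of OpenClaw.

```typescript
// Hypothetical helper: matches only the stale-socket restart lines shown above.
const RESTART_LINE =
  /^(\S+) \[health-monitor\] \[([^\]]+)\] health-monitor: restarting \(reason: stale-socket\)$/;

// Returns the gap, rounded to whole minutes, between consecutive restarts of one channel.
function restartGapsMinutes(lines: string[], channel: string): number[] {
  const times: number[] = [];
  for (const line of lines) {
    const m = RESTART_LINE.exec(line.trim());
    if (m !== null && m[2] === channel) {
      const t = Date.parse(m[1] ?? "");
      if (!Number.isNaN(t)) times.push(t);
    }
  }
  const gaps: number[] = [];
  let prev: number | undefined;
  for (const t of times) {
    if (prev !== undefined) gaps.push(Math.round((t - prev) / 60_000));
    prev = t;
  }
  return gaps;
}

const sample = [
  "2026-03-04T22:04:10.274Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
  "2026-03-04T22:29:10.287Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
  "2026-03-04T22:34:10.294Z [health-monitor] [imessage:default] health-monitor: restarting (reason: stale-socket)",
];

console.log(restartGapsMinutes(sample, "imessage:default")); // [25, 5]
```

Run against a full gateway.log, the same function would show how many restarts fall at the 5-minute check interval rather than the 30-minute threshold.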


Source: DEFAULT_STALE_EVENT_THRESHOLD_MS = 30 * 6e4 in channel-health-monitor.ts
No per-channel config path exists — deps.timing?.staleEventThresholdMs is checked but never populated from openclaw.json.
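
A rough sketch of what a per-channel lookup could look like, assuming a hypothetical config shape keyed by channel name. `HealthConfig`, `resolveStaleThresholdMs`, and `isStale` are illustrative names, and the strict `>` comparison is an assumption; only `DEFAULT_STALE_EVENT_THRESHOLD_MS` and `staleEventThresholdMs` come from the source.

```typescript
// Same value as the constant quoted above: 30 * 6e4 ms = 1,800,000 ms = 30 minutes.
const DEFAULT_STALE_EVENT_THRESHOLD_MS = 30 * 6e4;

// Hypothetical config shape; openclaw.json does not expose this today.
interface ChannelHealthConfig {
  staleEventThresholdMs?: number;
}
type HealthConfig = Record<string, ChannelHealthConfig | undefined>;

// Use the channel's override when it is a positive finite number, else the default.
function resolveStaleThresholdMs(channelId: string, config: HealthConfig): number {
  const override = config[channelId]?.staleEventThresholdMs;
  return typeof override === "number" && Number.isFinite(override) && override > 0
    ? override
    : DEFAULT_STALE_EVENT_THRESHOLD_MS;
}

// Assumed staleness rule: stale once the idle time exceeds the threshold.
function isStale(lastEventAtMs: number, nowMs: number, thresholdMs: number): boolean {
  return nowMs - lastEventAtMs > thresholdMs;
}

// Example: iMessage tolerates 6 hours of silence, everything else keeps the default.
const config: HealthConfig = { imessage: { staleEventThresholdMs: 6 * 60 * 6e4 } };
const idleMs = 45 * 6e4; // 45 minutes with no events

console.log(isStale(0, idleMs, resolveStaleThresholdMs("slack", config))); // true
console.log(isStale(0, idleMs, resolveStaleThresholdMs("imessage", config))); // false
```

Falling back to the default on a missing or invalid override keeps existing deployments unchanged unless a channel opts in.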

Impact and severity

Affected: All iMessage channel users on always-on gateway deployments
Severity: High — blocks message delivery during active conversations
Frequency: 100% repro — happens every 30 minutes on any idle iMessage channel
Consequence: Missed/delayed message delivery, restart cascades during conversations, noisy logs (59 restarts in 48 hours obscure real issues)

Additional information

Proposed fix: (A) make staleEventThresholdMs configurable per channel in openclaw.json, (B) add iMessage-specific keepalive pings so idle sockets don't appear dead, or (C) both.
Related: this issue compounds with auto-compaction delivery blocking. When compaction runs during a stale-socket restart window, the user can wait 7+ minutes for a response that was already generated.
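
For option (B), one possible shape is sketched below: a successful liveness probe counts as activity, so a quiet but healthy channel never looks stale. Everything here is hypothetical. `ActivityTracker`, `keepaliveTick`, and the `Probe` type are illustrative names, and the probe stands in for whatever cheap check the iMessage provider could actually offer.

```typescript
// Hypothetical probe: resolves true when the provider connection responds.
type Probe = () => Promise<boolean>;

// Tracks the time of the last observed activity using an injectable clock.
class ActivityTracker {
  private readonly now: () => number;
  private lastActivityMs: number;

  constructor(now: () => number) {
    this.now = now;
    this.lastActivityMs = now();
  }

  recordActivity(): void {
    this.lastActivityMs = this.now();
  }

  idleMs(): number {
    return this.now() - this.lastActivityMs;
  }
}

// One keepalive cycle: a successful probe refreshes the activity timestamp,
// while a failed or throwing probe leaves it untouched so staleness can still trigger.
async function keepaliveTick(tracker: ActivityTracker, probe: Probe): Promise<boolean> {
  let alive = false;
  try {
    alive = await probe();
  } catch {
    alive = false;
  }
  if (alive) tracker.recordActivity();
  return alive;
}
```

Calling `keepaliveTick` on an interval shorter than the stale threshold would keep a healthy idle channel fresh, while a genuinely dead socket would still age out and be restarted.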

Metadata

    Labels: bug (Something isn't working), bug:behavior (Incorrect behavior without a crash)
