Problem
When the OpenClaw gateway is down, wedged, or restarting, inbound BlueBubbles messages delivered during the outage window are permanently lost. The underlying iMessages are intact (they remain in Messages.app and in BB Server's DB), but the agent never sees them, never replies, and no recovery happens when the gateway comes back up. This is the BlueBubbles analog of #50093 (WhatsApp) and is partially related to #38307 (stale-socket restarts).
Validated by a controlled experiment (2026-04-14)
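I stopped the gateway cleanly, sent three distinct test iMessages to a monitored handle, waited, then started the gateway, instrumenting both ~/Library/Logs/bluebubbles-server/main.log and ~/.openclaw/logs/gateway.log.
Timeline:
Gateway stop: openclaw gateway stop, pgrep-clean, healthz refused
Dispatch 1: connect ECONNREFUSED 127.0.0.1:18789, no retry logged
Dispatch 2: ECONNREFUSED, no retry
Dispatch 3: ECONNREFUSED, no retry
BB-server log (edited, redacted):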
[2026-04-14 11:08:35] [WebhookService] Failed to dispatch "new-message" event → connect ECONNREFUSED 127.0.0.1:18789
[2026-04-14 11:08:53] [WebhookService] Failed to dispatch "new-message" event → connect ECONNREFUSED 127.0.0.1:18789
[2026-04-14 11:09:15] [WebhookService] Failed to dispatch "new-message" event → connect ECONNREFUSED 127.0.0.1:18789
# ... nothing more about msgs 1/2/3 — BB never re-dispatches
Findings:
BB Server's WebhookService is fire-and-forget on failure. Every Dispatching event with a failed POST is logged once and never retried, regardless of whether the failure is ECONNRESET (gateway wedged) or ECONNREFUSED (gateway stopped).
BB Server's MessagePoller does NOT replay missed webhooks on webhook-receiver reconnection. After the gateway came back up and registered its webhook target, there were zero fresh Dispatching lines for msgs 1/2/3; the only new dispatches were for the replies the agent eventually sent. The ~1-week MessagePoller lookback that the design of #66230 (fix(bluebubbles): dedupe inbound webhooks across restarts (#19176, #12053)) relies on is driven by BB's own reconnection events (to Messages.app / APNs), not by webhook-target HTTP reachability.
Without external recovery, all three messages would have been permanently lost from the agent's perspective.
Proof the proposed fix works: bb-catchup.sh (Lobster workspace)
My Lobster install has been running a workspace script (openclaw-agents/lobster/scripts/bb-catchup.sh) that implements exactly this proposal. It's been in production for ~4 weeks and recovered all three messages in the experiment above. Its design:
Cursor: ~/.openclaw/bb-last-seen-ms (epoch ms), updated after every successful replay pass.
Query: POST /api/v1/message/query?password=... with body {"limit":50,"sort":"ASC","after":<cursor>,"with":["chat","chat.participants","attachment"]}.
Filter: drop isFromMe, drop own-handle senders, drop pre-cursor messages (defense in depth).
Replay: wrap each message in {"type":"new-message","data":<message>} and POST to the gateway's BB webhook endpoint, the same path BB itself uses, so processMessage() handles it identically.
Bounds: 2-hour max lookback, 50-message cap, 0.5s between POSTs.
Invocation: run from BOOT.md as boot task #1 on gateway startup.
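The query, filter, and replay steps above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the script's actual code: the BbMessage shape and the helper names buildQueryBody, filterInbound, and toWebhookEnvelope are hypothetical, and only the request body and envelope layouts are taken from the description above.

```typescript
// Hypothetical message shape: only the fields the filter needs.
interface BbMessage {
  guid: string;
  dateCreated: number; // epoch ms (assumed field name)
  isFromMe: boolean;
  handle?: { address: string } | null;
}

interface QueryBody {
  limit: number;
  sort: "ASC";
  after: number;
  with: string[];
}

// Request body for POST /api/v1/message/query, as described above.
function buildQueryBody(cursorMs: number, limit = 50): QueryBody {
  return {
    limit,
    sort: "ASC",
    after: cursorMs,
    with: ["chat", "chat.participants", "attachment"],
  };
}

// Drop own sends, own-handle senders, and anything at or before the cursor.
// Assumes ownHandles contains lowercased addresses.
function filterInbound(
  messages: BbMessage[],
  cursorMs: number,
  ownHandles: ReadonlySet<string>,
): BbMessage[] {
  return messages.filter((m) => {
    if (m.isFromMe) return false;
    const sender = m.handle?.address.toLowerCase();
    if (sender !== undefined && ownHandles.has(sender)) return false;
    return m.dateCreated > cursorMs; // defense in depth against pre-cursor rows
  });
}

// Wrap a message in the same envelope shape a BB webhook POST carries.
function toWebhookEnvelope(message: BbMessage): { type: "new-message"; data: BbMessage } {
  return { type: "new-message", data: message };
}
```

Keeping the pre-cursor check on the client side, even though the server already filters by after, mirrors the "defense in depth" note in the filter step.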
Experimental result from that script in the run above:
bb-catchup: found 3 missed message(s)
replayed: [<chat>] from=<handle> text=dive test 1
replayed: [<chat>] from=<handle> text=dive test 2
replayed: [<chat>] from=<handle> text=dive test 3
bb-catchup: replayed=3 failed=0
The agent then produced inbound-session entries for all three, matching a clean webhook delivery. Proof the pattern is sound.
What I want to land upstream
Port the bb-catchup pattern into the BlueBubbles channel itself so every OpenClaw install gets message recovery for free, and the workspace script can be retired.
The BB extension already has all the primitives:
fetchBlueBubblesHistory(chatGuid, limit, opts) in extensions/bluebubbles/src/history.ts already speaks /api/v1/chat/{guid}/messages. For catchup we want the flat /api/v1/message/query?after=<ts> endpoint that bb-catchup.sh uses (cross-chat in a single call, server-side cursor filter) — this needs a small new helper fetchBlueBubblesMessagesSince(sinceMs, limit, opts) next to it.
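processMessage in monitor-processing.ts is already the canonical inbound handler. The catchup path can call it directly with the normalized payload, with no need for the HTTP re-POST hop bb-catchup.sh does (the re-POST only exists because the workspace script can't reach into the gateway process).
monitor-reply-cache.ts plus the persistent inbound dedupe from #66230 already protect against double-processing if a BB webhook and a catchup replay of the same GUID both arrive.
Implementation plan
New files
extensions/bluebubbles/src/catchup.ts (~150 LoC)
fetchBlueBubblesMessagesSince(sinceMs, limit, opts): POST /api/v1/message/query with after: sinceMs, sort: "ASC", with: ["chat","chat.participants","attachment"], bounded by limit; resilient to the same URL-variant fallbacks as fetchBlueBubblesHistory.
loadCursor(accountId) / saveCursor(accountId, ms): file-backed state at ~/.openclaw/bluebubbles/catchup-cursor/<accountId>.json (matches the layout #66230 introduces for persistent dedupe). Atomic write via tmp+rename.
runBlueBubblesCatchup(account, deps): orchestrator that loads the cursor (falling back to now - 30min on first run), clamps lookback to MAX_AGE_MS (default 2h), calls the query helper, filters isFromMe and self-handles, normalizes each row through the same path webhook POSTs use (normalizeWebhookMessage etc.), and invokes processMessage(...) for each. Updates the cursor on success.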
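The lookback rule in runBlueBubblesCatchup (first-run fallback plus the MAX_AGE_MS clamp) could look something like the sketch below. The function name resolveWindowStart and the options type are hypothetical; the 30-minute and 2-hour values come from the plan above, and pulling a future cursor back to "now" is an added assumption to cope with clock skew.

```typescript
const MINUTE_MS = 60_000;

// Hypothetical options; defaults mirror the values named in the plan.
interface CatchupWindowOptions {
  maxAgeMs: number; // MAX_AGE_MS, default 2h
  firstRunLookbackMs: number; // default 30min
}

const DEFAULT_WINDOW: CatchupWindowOptions = {
  maxAgeMs: 120 * MINUTE_MS,
  firstRunLookbackMs: 30 * MINUTE_MS,
};

// Returns the epoch-ms timestamp to pass as `after` to the query helper.
function resolveWindowStart(
  cursorMs: number | null,
  nowMs: number,
  opts: CatchupWindowOptions = DEFAULT_WINDOW,
): number {
  const earliest = nowMs - opts.maxAgeMs;
  if (cursorMs === null) {
    // First run: no cursor yet, so look back a short fixed window.
    return Math.max(nowMs - opts.firstRunLookbackMs, earliest);
  }
  // Clamp stale cursors to the max lookback, and (assumption) pull a
  // cursor that lies in the future back to the current time.
  return Math.min(Math.max(cursorMs, earliest), nowMs);
}
```

With the defaults, a gateway that was down for a day would only replay the last two hours, which is the bounded behavior the plan asks for.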
extensions/bluebubbles/src/catchup.test.ts (~200 LoC)
Test cases:
isFromMe, pre-cursor timestamp, self-handle address match.
MAX_AGE_MS boundary, identical timestamps, monotonic-clock skew.
processMessage, assert call count and argument shape.
processMessage called but early-exits.
Modified files
extensions/bluebubbles/src/monitor.ts: in registerBlueBubblesWebhookTarget, after successful route registration, fire-and-forget runBlueBubblesCatchup(account, deps) on a microtask. Log one INFO summary line per account: bluebubbles catchup: account=<id> replayed=N skipped=M window_ms=.... Errors are caught, logged at WARN, and never block the target registration.
extensions/bluebubbles/src/monitor-processing.ts: thread a new optional origin: "webhook" | "catchup" through processMessage so telemetry can distinguish replays. Default "webhook" preserves existing callers.
extensions/bluebubbles/src/config-schema.ts: add optional catchup block under the BB channel entry:
catchup?: {
  enabled?: boolean; // default true
  maxAgeMinutes?: number; // default 120, hard cap 720
  perRunLimit?: number; // default 50, hard cap 500
  firstRunLookbackMinutes?: number; // default 30
}
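One way the defaults and hard caps in this block might be applied at load time is sketched here. resolveCatchupConfig and clampInt are hypothetical helpers, not existing functions in the extension; capping firstRunLookbackMinutes at maxAgeMinutes is an assumption, since the block above gives it no explicit cap.

```typescript
interface CatchupConfig {
  enabled?: boolean;
  maxAgeMinutes?: number;
  perRunLimit?: number;
  firstRunLookbackMinutes?: number;
}

type ResolvedCatchupConfig = Required<CatchupConfig>;

// Fall back to the default for missing or non-positive values, then cap.
function clampInt(value: number | undefined, fallback: number, max: number): number {
  if (value === undefined || !Number.isFinite(value) || value < 1) return fallback;
  return Math.min(Math.floor(value), max);
}

function resolveCatchupConfig(cfg: CatchupConfig = {}): ResolvedCatchupConfig {
  const maxAgeMinutes = clampInt(cfg.maxAgeMinutes, 120, 720);
  return {
    enabled: cfg.enabled ?? true,
    maxAgeMinutes,
    perRunLimit: clampInt(cfg.perRunLimit, 50, 500),
    // Assumption: the first-run lookback never exceeds the max age.
    firstRunLookbackMinutes: clampInt(cfg.firstRunLookbackMinutes, 30, maxAgeMinutes),
  };
}
```

Resolving once at startup keeps the caps in a single place, so the orchestrator never sees an out-of-range value.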
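CHANGELOG.md: ## Unreleased > ### Fixes bullet: "BlueBubbles: replay missed webhook messages after gateway restart via a persistent cursor and /api/v1/message/query?after=<ts> pass (fixes #66721)."
Safety / invariants
Default on, bounded. enabled: true out of the box because the downside of no-recovery is loud and user-visible; maxAgeMinutes and perRunLimit clamp the blast radius.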
Never processes isFromMe — agent's own sends cannot be mistaken for inbound.
Cursor is persisted only on success. A failed run leaves the cursor at its previous value so the next run retries; the clamp prevents unbounded growth.
No new network surface. Only existing BB REST endpoints (same as fetchBlueBubblesHistory and bb-catchup.sh).
No new inbound code path. Catchup goes through processMessage — the exact same handler webhooks already use.
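The "cursor is persisted only on success" invariant can be illustrated with a small file-backed sketch using a tmp+rename write. Everything here is an assumption for illustration: the JSON layout ({ lastSeenMs }), the stateDir parameter, and the helper names (runWithCursor and clearCursor in particular) are hypothetical, and the replay callback stands in for the real query-and-process pass.

```typescript
import { mkdirSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

function cursorPath(stateDir: string, accountId: string): string {
  return join(stateDir, `${encodeURIComponent(accountId)}.json`);
}

// Returns null when the file is missing, unreadable, or malformed.
function loadCursor(stateDir: string, accountId: string): number | null {
  try {
    const parsed: unknown = JSON.parse(readFileSync(cursorPath(stateDir, accountId), "utf8"));
    const ms = (parsed as { lastSeenMs?: unknown } | null)?.lastSeenMs;
    return typeof ms === "number" && Number.isFinite(ms) ? ms : null;
  } catch {
    return null;
  }
}

// Write to a temp file, then rename over the target so readers never
// observe a half-written cursor.
function saveCursor(stateDir: string, accountId: string, ms: number): void {
  const target = cursorPath(stateDir, accountId);
  mkdirSync(dirname(target), { recursive: true });
  const tmp = `${target}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify({ lastSeenMs: ms }));
  renameSync(tmp, target);
}

function clearCursor(stateDir: string, accountId: string): void {
  rmSync(cursorPath(stateDir, accountId), { force: true });
}

// Runs one replay pass; the cursor only advances if the pass resolves.
async function runWithCursor(
  stateDir: string,
  accountId: string,
  replay: (sinceMs: number | null) => Promise<number>,
): Promise<boolean> {
  const since = loadCursor(stateDir, accountId);
  try {
    const newestSeenMs = await replay(since);
    saveCursor(stateDir, accountId, newestSeenMs);
    return true;
  } catch {
    return false; // cursor left untouched, so the next run retries the same window
  }
}
```

Because a failed pass leaves the old cursor in place, repeated failures would widen the window over time; the lookback clamp described above is what keeps that bounded.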
Test plan
Unit tests in catchup.test.ts as listed above (pass pnpm test extensions/bluebubbles/src/catchup.test.ts).
Full BB suite passes (pnpm test extensions/bluebubbles/).
pnpm check green.
Live repro on macOS using the same protocol as the 2026-04-14 experiment: stop gateway, send N messages, start gateway, assert: (a) processMessage called N times with origin: "catchup", (b) cursor file updated, (c) inbound dedupe file contains N new GUIDs, (d) re-running catchup is a no-op.
processMessage early-exits on the dedupe hit: no double reply.
Order of operations with #66230
#66230 (persistent inbound dedupe) is a prerequisite for catchup to be safe to turn on by default. Recommend landing #66230 first, then this issue's fix.
Retirement of workspace script
Once this ships in a released OpenClaw, openclaw-agents/lobster/scripts/bb-catchup.sh and its BOOT.md invocation should be removed. Keeping both would double-process during the one-turn window where catchup runs at gateway startup, which #66230's dedupe handles correctly but introduces unnecessary overhead.