fix(bluebubbles): dedupe inbound webhooks across restarts (#19176, #12053) #66816
omarshahine merged 2 commits into main from fix/bb-inbound-dedupe
Conversation
🔒 Aisle Security Analysis
We found 3 potential security issues in this PR:
1. 🟡 Replay can re-run tool side effects when reply delivery fails (dedupe claim released)
Description: When reply delivery fails, the inbound deduplication wrapper releases its claim, so a replayed webhook re-runs the full processing path. Any tools/actions with side effects that already ran the first time are executed again.
Vulnerable code:
if (signal.deliveryFailed) {
  ...
  claim.release();
} else {
  await claim.finalize();
}
and:
if (info.kind === "final") {
  dedupeSignal.deliveryFailed = true;
}
Recommendation: Treat inbound processing as at-most-once for side effects, even if reply delivery fails. Options (choose based on product requirements):
Illustrative approach (finalize processing, retry delivery):
// After successful tool run / response generation
await claim.finalize();
try {
  await deliverFinalReply(...);
} catch (e) {
  // schedule retry of delivery only; do NOT release inbound dedupe
  await enqueueDeliveryRetry({ dedupeKey, responseId });
}
2. 🟡 Disk/CPU DoS via file-backed inbound dedupe store rewriting large JSON map
Description: The new BlueBubbles inbound GUID dedupe persists attacker-influenced GUIDs to a per-account JSON file for 7 days (TTL) with a hard cap of 50,000 entries. Each successful message processing calls into the store and triggers a full-file rewrite.
Because inbound GUIDs originate from remote webhook/poller events, a remote party can send many unique messages (unique GUIDs) to drive the store to its maximum size and then keep it near the cap. This can cause sustained high disk I/O and CPU usage due to repeated full-file read/parse/sort/write cycles, potentially degrading the gateway or exhausting disk throughput.
Recommendation: Reduce the ability for untrusted inbound traffic to force large persistent state and full-file rewrites. Suggested mitigations (pick a combination):
import { createHash } from "node:crypto";
function normalizeGuidForStore(guid: string): string {
return createHash("sha256").update(guid, "utf8").digest("hex");
}
3. 🟡 Symlink/hardlink file clobber risk in Windows fallback path for atomic JSON writes
Description: On Windows, the atomic JSON write helper falls back from an atomic rename to copying the temp file onto the destination path. If that destination has been swapped for a symlink or hardlink, the copy writes through the link and clobbers its target. This is relevant to the new BlueBubbles inbound dedupe feature because it writes its per-account store through this helper.
Vulnerable code:
await fs.copyFile(tempPath, filePath);
Recommendation: Avoid overwriting an attacker-controlled link destination. Options:
Example (sketch):
import fs from "node:fs/promises";
async function safeReplaceFile(tempPath: string, filePath: string) {
try {
// Best: attempt atomic rename
await fs.rename(tempPath, filePath);
return;
} catch (e: any) {
if (process.platform !== "win32" || (e?.code !== "EPERM" && e?.code !== "EEXIST")) {
throw e;
}
}
// Windows fallback: refuse to write through links
try {
const st = await fs.lstat(filePath);
if (st.isSymbolicLink()) {
throw new Error(`Refusing to overwrite symlink: ${filePath}`);
}
// Optionally also ensure st.isFile() and that filePath is within an expected base dir.
await fs.rm(filePath, { force: true });
} catch (e: any) {
if (e?.code !== "ENOENT") throw e;
}
await fs.rename(tempPath, filePath);
}
Analyzed PR: #66816 | Last updated on: 2026-04-14T22:40:55Z
Greptile Summary
Adds a persistent file-backed GUID dedupe for inbound BlueBubbles messages. All four claim outcomes are handled correctly.
Confidence Score: 5/5
Safe to merge — all remaining findings are minor style/clarity concerns that do not affect correctness. The core dedupe logic is correct across all four claim outcomes. The finalize-on-success / release-on-failure lifecycle is sound. Disk persistence is properly guarded against errors. Tests cover the key behavioral contracts. The two P2 findings (a misleading comment in a catch block whose release() call is a no-op, and an unnecessary export on an internal type) have no runtime impact. No files require special attention.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/bluebubbles/src/monitor-processing.ts
Line: 602
Comment:
**`InboundDedupeDeliverySignal` exported but only used within this file**
The type is defined and consumed entirely within `monitor-processing.ts` — `processMessageAfterDedupe` (unexported) takes it as a parameter, and `processMessage` (exported) creates and owns it. Exporting the type leaks an internal implementation detail of the dedupe wrapper into the module's public surface. Consider removing the `export` keyword unless downstream consumers (e.g., tests) need to reference it explicitly.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: extensions/bluebubbles/src/monitor-processing.ts
Line: 662-673
Comment:
**`claim.release()` in the `catch (finalizeError)` block is a no-op**
When `finalize()` throws, it means `impl.commit` threw. `createClaimableDedupe.commit` always calls `inflight.delete(scopedKey)` in its `finally` block before re-throwing, so by the time `claim.release()` runs here, the inflight slot is already gone — `release` calls `inflight.get` on an empty map and returns immediately.
The GUID won't get "stuck" for 7 days, but not because `release()` unsticks it: `commit` failed before persisting anything (disk errors inside `checkAndRecordInner` are caught and fall back to memory, so `commit` itself rarely throws at all). The comment overstates what `release()` accomplishes and could mislead future maintainers into thinking the call is load-bearing when it's actually redundant in this path.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "fix(bluebubbles): dedupe inbound webhook..." | Re-trigger Greptile
 * dispatcher's `onError` callback rather than as thrown exceptions, so a
 * plain try/catch cannot detect them — see review thread `rwF8` on #66230.
 */
export type InboundDedupeDeliverySignal = { deliveryFailed: boolean };
InboundDedupeDeliverySignal exported but only used within this file
The type is defined and consumed entirely within monitor-processing.ts — processMessageAfterDedupe (unexported) takes it as a parameter, and processMessage (exported) creates and owns it. Exporting the type leaks an internal implementation detail of the dedupe wrapper into the module's public surface. Consider removing the export keyword unless downstream consumers (e.g., tests) need to reference it explicitly.
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/bluebubbles/src/monitor-processing.ts
Line: 602
Comment:
**`InboundDedupeDeliverySignal` exported but only used within this file**
The type is defined and consumed entirely within `monitor-processing.ts` — `processMessageAfterDedupe` (unexported) takes it as a parameter, and `processMessage` (exported) creates and owns it. Exporting the type leaks an internal implementation detail of the dedupe wrapper into the module's public surface. Consider removing the `export` keyword unless downstream consumers (e.g., tests) need to reference it explicitly.
How can I resolve this? If you propose a fix, please make it concise.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Fixed — removed the export keyword. The type is now file-local.
  await claim.finalize();
} catch (finalizeError) {
  // Release the in-flight claim so the GUID doesn't get stuck for the
  // 7-day TTL if persistence fails (disk full, permissions, etc.).
  claim.release();
  logVerbose(
    core,
    runtime,
    `inbound-dedupe: finalize failed for key=${sanitizeForLog(dedupeKey ?? "")}: ${sanitizeForLog(finalizeError)}`,
  );
}
}
claim.release() in the catch (finalizeError) block is a no-op
When finalize() throws, it means impl.commit threw. createClaimableDedupe.commit always calls inflight.delete(scopedKey) in its finally block before re-throwing, so by the time claim.release() runs here, the inflight slot is already gone — release calls inflight.get on an empty map and returns immediately.
The GUID won't get "stuck" for 7 days, but not because release() unsticks it: commit failed before persisting anything (disk errors inside checkAndRecordInner are caught and fall back to memory, so commit itself rarely throws at all). The comment overstates what release() accomplishes and could mislead future maintainers into thinking the call is load-bearing when it's actually redundant in this path.
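The no-op can be seen in a reduced model of the behavior described above (illustrative, not the SDK source): commit clears the inflight slot in its finally block even when persistence throws, so the caller's later release() finds nothing to release:

```typescript
// Reduced model of the commit/release interaction described above;
// not the SDK source.
const inflight = new Map<string, true>();

function commit(key: string, persist: () => void): void {
  try {
    persist(); // may throw (disk full, permissions, ...)
  } finally {
    // Cleared before the caller's catch block ever runs.
    inflight.delete(key);
  }
}

function release(key: string): boolean {
  // Returns false when the slot is already gone, i.e. a no-op.
  return inflight.delete(key);
}

inflight.set("guid", true);
try {
  commit("guid", () => {
    throw new Error("disk full");
  });
} catch {
  // By the time a caller could invoke release() here, the slot is gone.
}
```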
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/bluebubbles/src/monitor-processing.ts
Line: 662-673
Comment:
**`claim.release()` in the `catch (finalizeError)` block is a no-op**
When `finalize()` throws, it means `impl.commit` threw. `createClaimableDedupe.commit` always calls `inflight.delete(scopedKey)` in its `finally` block before re-throwing, so by the time `claim.release()` runs here, the inflight slot is already gone — `release` calls `inflight.get` on an empty map and returns immediately.
The GUID won't get "stuck" for 7 days, but not because `release()` unsticks it: `commit` failed before persisting anything (disk errors inside `checkAndRecordInner` are caught and fall back to memory, so `commit` itself rarely throws at all). The comment overstates what `release()` accomplishes and could mislead future maintainers into thinking the call is load-bearing when it's actually redundant in this path.
How can I resolve this? If you propose a fix, please make it concise.
Fixed — removed the redundant claim.release() call since commit() already clears inflight state in its finally block. Now just logs the error.
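The finalize/release lifecycle discussed in this thread can be modeled with a small in-memory sketch (the real createClaimableDedupe is file-backed; this stand-in only shows the claim semantics):

```typescript
// In-memory stand-in for the claim lifecycle: finalize commits the key so
// future replays are dropped; release abandons the claim so a later replay
// can retry. Illustrative only — the real primitive persists to disk.
type Claim =
  | { kind: "duplicate" }
  | { kind: "inflight" }
  | { kind: "fresh"; finalize: () => void; release: () => void };

function createClaimableDedupe() {
  const committed = new Set<string>();
  const inflight = new Set<string>();
  return {
    claim(key: string): Claim {
      if (committed.has(key)) return { kind: "duplicate" };
      if (inflight.has(key)) return { kind: "inflight" };
      inflight.add(key);
      return {
        kind: "fresh",
        finalize: () => {
          committed.add(key);
          inflight.delete(key);
        },
        release: () => {
          inflight.delete(key);
        },
      };
    },
  };
}
```

Finalized GUIDs come back as "duplicate"; released GUIDs come back as "fresh" and can be retried.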
Force-pushed 82bce26 to 64020da (Compare)
Aisle findings response
#1 (High) Symlink-following file overwrite: This is a pre-existing concern that predates this PR.
#2 (Medium) PII in verbose logs: Consistent with existing BB channel behavior.
#3 (Medium) Attacker-controlled dedupe key: If an attacker can forge BB webhooks (requires the webhook password), they can already inject arbitrary messages, not just suppress them. The dedupe key derivation mirrors the existing debouncer key logic. The webhook password is the trust boundary here.
#4 (Info) Plaintext GUIDs on disk: All BB state (sessions, reply cache, history) is already stored as plaintext JSON under the same state directory with the same permissions. Consistent with existing behavior.
None of these are blockers or require changes in this PR scope.
…66816 The inbound-dedupe PR was reopened again as #66816 (closed-without-merge trail: #66230 → #66810 → #66816). The branch was force-pushed and the new PR uses the parallel `fix/bb-inbound-dedupe` branch. Updating code comments and the catchup CHANGELOG entry to point at the live PR. Stacking on top of the dedupe branch will be addressed in a follow-up rebase. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aisle findings response (commit d27603b)
#1 (High) Replay re-executes tool side effects when final delivery fails:
#2 (Medium) Unbounded fields in verbose logs:
#3 (Medium) Symlink-following in atomic JSON writer:
Force-pushed c772ce8 to be43c31 (Compare)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: be43c315aa
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
  onDiskError: (error) =>
    logVerbose(core, runtime, `inbound-dedupe disk error: ${sanitizeForLog(error)}`),
});
if (claim.kind === "duplicate" || claim.kind === "inflight") {
Skip dedupe for from-me webhook messages
processMessage now claims/drops dedupe entries before the message.fromMe branch runs, so after a gateway restart replayed from-me events are treated as duplicate and never reach cacheInboundMessage() in processMessageAfterDedupe. That regresses post-restart cache hydration for BlueBubbles reply context/short IDs (the cache is in-memory), so replies to pre-restart assistant messages can lose context resolution even though the replay webhooks still arrived. Consider bypassing duplicate-drop for fromMe events (or running their cache-update path before dedupe return).
Useful? React with 👍 / 👎.
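One way to act on this suggestion is to run the from-me cache path before the dedupe gate, since cache hydration is idempotent. All names here (`processMessage` signature, `cacheInboundMessage`, the claim result strings) are illustrative stand-ins for the code under review:

```typescript
// Sketch: hydrate the reply-context cache for from-me events before the
// dedupe gate can drop them as duplicates. Names are illustrative.
type Message = { guid: string; isFromMe: boolean; text: string };

function processMessage(
  message: Message,
  claimGuid: (guid: string) => "fresh" | "duplicate" | "inflight",
  cacheInboundMessage: (m: Message) => void,
): "cached" | "dropped" | "processed" {
  if (message.isFromMe) {
    // Cache hydration is idempotent, so replayed from-me webhooks are
    // harmless here and the in-memory cache is rebuilt after a restart.
    cacheInboundMessage(message);
    return "cached";
  }
  const claim = claimGuid(message.guid);
  if (claim !== "fresh") return "dropped";
  // ...normal inbound handling for fresh messages...
  return "processed";
}
```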
…2053) BlueBubbles MessagePoller replays its ~1-week lookback window as new-message webhooks after BB Server restart or reconnect. Without persistent dedup, the gateway re-replies to messages it already handled before the restart.

Add a persistent file-backed GUID dedupe (TTL=7d, matching BB's lookback window) at the top of processMessage, using the same createClaimableDedupe SDK primitive as Feishu. The on-disk store at ~/.openclaw/bluebubbles/inbound-dedupe/<account>.json survives gateway restarts. Claim/finalize/release semantics ensure transient delivery failures release the GUID so a later replay can retry, while successful deliveries are committed and block future replays.

Fixes #19176, #12053.
Force-pushed be43c31 to cbeb3ca (Compare)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cbeb3ca130
if (info.kind === "final") {
  dedupeSignal.deliveryFailed = true;
}
Release dedupe claim on block-stream delivery failures
When BlueBubbles blockStreaming is enabled, replies can be delivered as block payloads without a final payload, but this handler only marks dedupeSignal.deliveryFailed for info.kind === "final". If a block send fails (transient network/API error), onError logs it but leaves deliveryFailed false, so processMessage commits the dedupe claim and future replay webhooks for that message are dropped as duplicates for 7 days instead of retrying delivery. This regresses reliability specifically for block-streaming accounts with block-only output.
Useful? React with 👍 / 👎.
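A minimal fix in the spirit of this suggestion would mark the signal on any failed delivery kind, not only `final`. The types below (`DeliveryInfo`, `DedupeSignal`) are assumed stand-ins for the handler's real types:

```typescript
// Sketch: treat a failed block send the same as a failed final send for
// dedupe purposes, so the claim is released and a replay can retry.
type DeliveryInfo = { kind: "block" | "final" };
type DedupeSignal = { deliveryFailed: boolean };

function onDeliveryError(info: DeliveryInfo, signal: DedupeSignal): void {
  // Any failed outbound payload means the user may not have seen a reply;
  // releasing the claim is safer than dropping replays for 7 days.
  if (info.kind === "final" || info.kind === "block") {
    signal.deliveryFailed = true;
  }
}
```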
…rt (#66721) Adds an in-process startup catchup pass to the BlueBubbles channel that queries BB Server for messages delivered since a persisted per-account cursor and re-feeds each through the existing processMessage pipeline.

Fixes the missed-message hole documented in #66721: BB's WebhookService is fire-and-forget on POST failure (no retries), and BB's MessagePoller only re-fires webhooks on BB-side reconnection events (Messages.app / APNs), not on webhook-receiver recovery. So inbound messages delivered while the gateway was down, restarting, or wedged were permanently lost.

Design
------
New extensions/bluebubbles/src/catchup.ts:
- fetchBlueBubblesMessagesSince(sinceMs, limit, opts) calls /api/v1/message/query with {after, sort:"ASC", with:[chat, chat.participants, attachment]} so replays carry the same shape normalizeWebhookMessage already handles on live dispatch.
- loadBlueBubblesCatchupCursor / saveBlueBubblesCatchupCursor persist {lastSeenMs, updatedAt} per account under <stateDir>/bluebubbles/catchup/<accountId>__<hash>.json using the plugin-sdk's atomic JSON helpers. File layout mirrors the inbound-dedupe store from #66816, and the resolver is the canonical openclaw/plugin-sdk/state-paths.resolveStateDir (same helper dedupe uses) so the two stores share a single root.
- runBlueBubblesCatchup(target) orchestrates: clamp config, fetch, filter isFromMe and pre-cursor records, dispatch to processMessage, advance cursor.

Modified extensions/bluebubbles/src/monitor.ts: after the webhook target registers, fire catchup as a background task; errors are logged but never block the channel-ready signal.
Modified extensions/bluebubbles/src/config-schema.ts: new optional `catchup` block (enabled, maxAgeMinutes, perRunLimit, firstRunLookbackMinutes); defaults are on with 2h lookback / 50 msg cap / 30-min first-run lookback.
Modified extensions/bluebubbles/src/accounts.ts: adds `catchup` to the account-merge nestedObjectKeys list so per-account overrides deep-merge on top of channel-level defaults, mirroring the existing `network` precedent.

Safety
------
- Goes through the same processMessage path webhooks use, so auth, allowlist, pairing, and downstream agent dispatch all apply unchanged.
- Dedupes against #66816's persistent inbound GUID cache: a webhook delivery that already succeeded cannot be reprocessed by catchup.
- Never dispatches isFromMe records (double-checked before and after normalization) so the agent's own sends cannot enter the inbound path.
- Catchup runs once per gateway startup and does NOT skip on rapid restarts. Skipping would permanently lose any messages that arrived during the brief downtime between the two startups; the bounded query (perRunLimit, maxAge) and inbound-dedupe cache cap the cost of running every restart.
- Cursor only advances to nowMs on fully-successful runs. On processMessage failure, the cursor is held just before the earliest failure timestamp so the next run retries from there. On truncation (fetchedCount === perRunLimit), the cursor advances only to the last fetched timestamp so the next gateway startup picks up the unfetched tail.
- A future-dated cursor (NTP rollback, manual clock adjust) is treated as unusable and falls through to the firstRunLookback path; the cursor is repaired at the end of the run.
- First-run lookback clamped to the maxAge ceiling so a config with maxAgeMinutes:5, firstRunLookbackMinutes:30 cannot exceed the operator's stated cap.
- Hard ceilings: 12h max lookback, 500 messages per run.
- Loud WARNING emitted when fetchedCount hits perRunLimit so operators know a single startup didn't drain the full backlog.

Why this approach
The fix mirrors a workspace-level shell script that's been running on a real OpenClaw install for ~4 weeks (~100 LoC of bash + python doing the same query/filter/POST flow). Porting it into the BB channel itself means every install gets recovery for free, calls processMessage directly (no re-POST hop), and benefits from #66816's persistent dedupe automatically.

Validation
- New scoped tests in extensions/bluebubbles/src/catchup.test.ts (21 cases): cursor round-trip, per-account scoping, FS-unsafe account IDs, firstRunLookback default, maxAge clamp on both existing-cursor and first-run branches, enabled:false, rapid-restart-still-runs, isFromMe filter (pre- and post-normalize), query-failure-preserves-cursor, per-message failure isolation, held-cursor-on-retryable-failure, clamp-to-prior-cursor, future-cursor recovery, pre-cursor defense-in-depth, perRunLimit warn / no-warn, and truncation-cursor advances only to page boundary.
- Full BlueBubbles suite: 410/410.
- pnpm check green.
- Live E2E on macOS 26.3 / BB Server 1.9.x: stop gateway, send 3 messages (verified 3x ECONNREFUSED in BB log), start gateway; catchup replayed all 3 through processMessage, cursor file appeared at ~/.openclaw/bluebubbles/catchup/<accountId>__<hash>.json, subsequent restart was a no-op.

Closes #66721.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
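The cursor-advancement rules described in this commit message can be sketched as a pure function (illustrative, not the catchup.ts source; the failure case uses the min(earliestFailureTs - 1, previousCursor) formulation, and all type/field names are assumptions):

```typescript
// Sketch of the cursor-advancement rules; illustrative, not catchup.ts.
type RunResult = {
  fetchedCount: number;
  perRunLimit: number;
  latestFetchedTs: number;          // ts of the last fetched record
  earliestFailureTs: number | null; // null when every dispatch succeeded
};

function nextCursor(prev: number, nowMs: number, run: RunResult): number {
  if (run.earliestFailureTs !== null) {
    // Hold the cursor so the next run retries the failed records.
    return Math.min(run.earliestFailureTs - 1, prev);
  }
  if (run.fetchedCount === run.perRunLimit) {
    // Truncated page: advance only to the page boundary so the next
    // startup picks up the unfetched tail.
    return run.latestFetchedTs;
  }
  return nowMs; // fully-successful run: advance to now
}
```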
…rt (#66721) Adds an in-process startup catchup pass to the BlueBubbles channel that queries BB Server for messages delivered since a persisted per-account cursor and re-feeds each through the existing processMessage pipeline.

Fixes the missed-message hole documented in #66721: BB's WebhookService is fire-and-forget on POST failure (no retries), and BB's MessagePoller only re-fires webhooks on BB-side reconnection events (Messages.app / APNs), not on webhook-receiver recovery. So inbound messages delivered while the gateway was down, restarting, or wedged were permanently lost.

Design
- New extensions/bluebubbles/src/catchup.ts with fetchBlueBubblesMessagesSince (POSTs /api/v1/message/query with {after, sort:"ASC", with:[chat, chat.participants, attachment]}), load/saveBlueBubblesCatchupCursor (file-backed {lastSeenMs, updatedAt} per account under <stateDir>/bluebubbles/catchup/<accountId>__<hash>.json using the plugin-sdk's atomic JSON helpers, same state-dir root as inbound-dedupe via the canonical SDK resolver, and resolvePreferredOpenClawTmpDir for test isolation to satisfy the messaging-tmpdir and temp-path-guard lints), and the runBlueBubblesCatchup orchestrator.
- monitor.ts: fire catchup as a background task after the webhook target registers; errors are logged but never block the channel-ready signal.
- config-schema.ts: new optional `catchup` block (enabled, maxAgeMinutes, perRunLimit, firstRunLookbackMinutes); defaults on with 2h lookback / 50 msg cap / 30-min first-run lookback.
- accounts.ts: adds `catchup` to nestedObjectKeys so per-account overrides deep-merge on top of channel-level defaults (mirroring the existing `network` precedent).

Safety
- Goes through the same processMessage path webhooks use, so auth, allowlist, pairing, and downstream agent dispatch apply unchanged.
- Dedupes against #66816's persistent inbound GUID cache.
- Never dispatches isFromMe records (checked before and after normalization).
- Runs once per gateway startup and does NOT skip on rapid restarts; skipping would permanently lose any messages that arrived during the brief downtime between two startups.
- Cursor advances to nowMs on full success, is held at min(earliestFailureTs - 1, previousCursor) on any processMessage failure so retries pick up exactly the failed records, or advances to latestFetchedTs on truncation (fetchedCount === perRunLimit) so the next gateway startup picks up the unfetched tail.
- Future-dated cursor (NTP rollback, manual clock adjust) treated as unusable and recovered via firstRunLookback; cursor is repaired at end of run.
- First-run lookback clamped to the maxAge ceiling.
- Hard ceilings: 12h max lookback, 500 messages per run.
- Loud WARNING on perRunLimit truncation pointing at the config knob to raise.

Why this approach
The fix mirrors a workspace-level shell script that's been running on a real OpenClaw install for ~4 weeks (~100 LoC of bash + python doing the same query/filter/POST flow). Porting it into the BB channel itself means every install gets recovery for free, calls processMessage directly (no re-POST hop), and benefits from #66816's persistent dedupe automatically.

Validation
- 21 scoped tests in extensions/bluebubbles/src/catchup.test.ts.
- Full BB suite 410/410.
- pnpm check green.
- src/security/temp-path-guard.test.ts and lint:tmp:no-random-messaging both pass (use resolvePreferredOpenClawTmpDir + string concatenation instead of os.tmpdir + template literal).
- Live E2E on macOS 26.3 / BB Server 1.9.x: 3/3 messages replayed.

Closes #66721.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rt (#66857) Adds an in-process startup catchup pass to the BlueBubbles channel that queries BB Server for messages delivered since a persisted per-account cursor and re-feeds each through the existing processMessage pipeline.

Fixes the missed-message hole documented in #66721: BB's WebhookService is fire-and-forget on POST failure, and MessagePoller only re-fires webhooks on BB-side reconnection events, not on webhook-receiver recovery.

- New extensions/bluebubbles/src/catchup.ts with singleflight per accountId, cursor persistence via the canonical state-paths resolver, bounded query (perRunLimit + maxAgeMinutes), failure-held cursor, truncation-aware page-boundary advancement, future-cursor recovery, isFromMe filter (pre- and post-normalization).
- monitor.ts fires catchup as a background task after the webhook target registers.
- config-schema.ts adds optional catchup block; accounts.ts adds catchup to nestedObjectKeys for deep-merge per-account overrides.
- Dedupes against #66816's persistent inbound GUID cache.
- 22 scoped tests; full BB suite 411/411; pnpm check green; live E2E on macOS 26.3 / BB Server 1.9.x recovered 3/3 missed messages.

Closes #66721.
Co-authored-by: Omar Shahine <omar@shahine.com>
…9176, openclaw#12053) (openclaw#66816) BlueBubbles MessagePoller replays its ~1-week lookback window as new-message webhooks after BB Server restart or reconnect. Add a persistent file-backed GUID dedupe (TTL=7d) at the top of processMessage using createClaimableDedupe from the Plugin SDK. Claim/finalize/release semantics ensure transient delivery failures release the GUID so a later replay can retry. Fixes openclaw#19176, openclaw#12053. Co-authored-by: Omar Shahine <omar@shahine.com>
Summary
BlueBubbles `MessagePoller` keeps a ~1-week lookback and re-fires `new-message` webhooks after BB Server restart or reconnection. Because the BB webhook protocol carries no sequence number or ack, the gateway previously had no way to recognize replays and would re-reply to messages it had already handled before the restart — producing duplicated outbound messages and confusing replies to stale inbound messages the user had already moved on from (see #19176, #12053).
This PR adds a persistent, file-backed inbound dedupe keyed by message GUID, modeled after the same `createPersistentDedupe` pattern used by the Feishu plugin.
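The claim/finalize/release semantics can be modeled with a small state machine. The sketch below is an illustrative in-memory stand-in, not the actual SDK helper: the real store is file-backed with a 7-day TTL, and the exact SDK signature is an assumption here.

```typescript
// Illustrative model of claimable dedupe semantics (in-memory only; the
// real implementation persists to disk with a TTL).
type ClaimState = "claimed" | "finalized";

class ClaimableDedupe {
  private entries = new Map<string, ClaimState>();

  // Returns null if the key is already claimed or finalized (a replay).
  claim(key: string): { finalize: () => void; release: () => void } | null {
    if (this.entries.has(key)) return null;
    this.entries.set(key, "claimed");
    return {
      // Delivery succeeded: remember the GUID so later replays are dropped.
      finalize: () => this.entries.set(key, "finalized"),
      // Transient delivery failure: forget the GUID so a replay can retry.
      release: () => this.entries.delete(key),
    };
  }
}

// Usage at the top of processMessage: claim the inbound GUID before any work.
const dedupe = new ClaimableDedupe();

function handleInbound(guid: string, deliver: () => boolean): string {
  const claim = dedupe.claim(guid);
  if (!claim) return "duplicate-dropped";
  if (deliver()) {
    claim.finalize();
    return "delivered";
  }
  claim.release(); // allow a future replay to retry
  return "released-for-retry";
}
```

Note the trade-off the security analysis above calls out: releasing on delivery failure favors at-least-once delivery, at the cost of re-running processing for that message on a later replay.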
Why this approach
Other channels with monotonic sequence IDs (Telegram's `update_id`, Matrix's sync token, Discord's gateway sequence) can dedupe natively via protocol. BlueBubbles does not expose anything like that, so an identity-based persistent dedupe at the message layer is the closest equivalent that fits how BB actually delivers webhooks.
Interaction with edit events (`updated-message`)
PR #52277 raised a related concern: if dedupe keys are GUID-only, a legitimate `updated-message` event would share a GUID with its original `new-message` and get dropped as a duplicate.
In the current codebase this cannot happen: `monitor.ts` routes `updated-message` payloads differently — without a reaction they are dropped at the webhook layer ("ignored without reaction"), and with a reaction they flow through `processReaction`, not `processMessage`. Our dedupe sits inside `processMessage`, so only `new-message` events are gated. Edits can't collide today.
If the separate work in #52277 lands and begins routing text-edit bodies into `processMessage`, the dedupe key will need to expand to include the event type and edit metadata (e.g. `guid + eventType + (dateEdited||"")`) so an edit is treated as a distinct key rather than dropped as a replay of the original. That is a straightforward forward-compatible change when it's needed.
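A composite key could be built along these lines. This is a sketch under the assumptions above; the payload field names (`eventType`, `dateEdited`) are illustrative, not guaranteed by the current BB payload shape:

```typescript
// Hypothetical inbound event shape; only `guid` is part of today's scheme.
interface InboundEvent {
  guid: string;
  eventType: "new-message" | "updated-message";
  dateEdited?: string; // present only on edit events
}

// GUID-only today; expanding to guid + eventType + dateEdited keeps an edit
// of a message from colliding with its original new-message event.
function dedupeKey(ev: InboundEvent): string {
  return [ev.guid, ev.eventType, ev.dateEdited ?? ""].join("|");
}
```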
Validation
Credits
Re-creates and improves on the focused fix from #31159 by @dashhuang — same behavioral goal (drop stale BB webhook replays), implemented as a persistent on-disk dedupe so it actually survives the gateway-restart case that drives the bug, and without the module-global mutable state that made the original patch need test-reset plumbing.
Fixes #19176, #12053.
Test plan