
fix(imessage): resume watch from catchup cursor#79085

Open
CaptainTimon wants to merge 1 commit into openclaw:main from CaptainTimon:fix/imessage-bounded-catchup

CaptainTimon (Contributor) commented May 7, 2026

Summary

  • Problem: the iMessage monitor subscribed live-only after Gateway restart, so messages written to chat.db while OpenClaw was down were not replayed.
  • Why it matters: operators could miss inbound iMessage/SMS requests after sleep, restarts, or offline windows.
  • What changed: OpenClaw now stores the last successfully processed imsg rowid per iMessage account and passes a bounded since_rowid + start cursor to watch.subscribe on restart.
  • What did NOT change (scope boundary): first-run subscriptions remain live-only; access control, echo-dedupe, mention gating, and reply dispatch still use the existing inbound pipeline.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: iMessage restart catchup now resumes imsg rpc watch.subscribe from a persisted, exclusive rowid cursor instead of launching live-only after every Gateway restart.
  • Real environment tested: local OpenClaw repo checkout on Linux, running the actual changed cursor builder module with Node/tsx; dependency contract verified against current imsg JSON-RPC docs for watch.subscribe params (since_rowid, start, attachments).
  • Exact steps or command run after this patch: node --import tsx - <<'EOF' ... buildIMessageWatchSubscribeParams(...) ... EOF
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied console output from the real changed OpenClaw module:
$ node --import tsx - <<'EOF'
import { buildIMessageWatchSubscribeParams } from './extensions/imessage/src/monitor/catchup-cursor.ts';
const firstRun = buildIMessageWatchSubscribeParams({ attachments: false, cursor: null });
const restart = buildIMessageWatchSubscribeParams({
  attachments: false,
  cursor: { lastSeenRowid: 9000, updatedAt: '2026-05-07T00:00:00.000Z' },
  catchup: { maxAgeMinutes: 30 },
  now: new Date('2026-05-07T12:00:00.000Z'),
});
console.log(JSON.stringify({ firstRun, restart }, null, 2));
EOF
{
  "firstRun": {
    "attachments": false
  },
  "restart": {
    "attachments": false,
    "since_rowid": 9000,
    "start": "2026-05-07T11:30:00.000Z"
  }
}
  • Observed result after fix: first-run watch params stay live-only, while restart params include since_rowid: 9000 plus a 30-minute bounded ISO start, which is the documented imsg rpc replay cursor contract.
  • What was not tested: live macOS Messages.app replay against a real chat.db was not run from this Linux workspace.
  • Before evidence (optional but encouraged): existing code always called watch.subscribe with only attachments, so imsg started at the newest message and skipped rows written before launch.
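
For readers without the branch checked out, the restart params shown in the console output can be reproduced with a minimal sketch like the one below. It is not the PR's `buildIMessageWatchSubscribeParams`; the function name `sketchWatchSubscribeParams`, the type names, and the handling of `enabled` are assumptions made for illustration. Only the input shape, the 24-hour default, and the output fields (`attachments`, `since_rowid`, `start`) come from this PR description.

```typescript
// Hypothetical sketch: types and names are illustrative, not copied from the PR.
interface CatchupCursor {
  lastSeenRowid: number;
  updatedAt: string;
}

interface WatchSubscribeParams {
  attachments: boolean;
  since_rowid?: number;
  start?: string;
}

function sketchWatchSubscribeParams(opts: {
  attachments: boolean;
  cursor: CatchupCursor | null;
  catchup?: { enabled?: boolean; maxAgeMinutes?: number };
  now?: Date;
}): WatchSubscribeParams {
  // Assumption: catchup is on unless explicitly disabled.
  const enabled = opts.catchup?.enabled ?? true;

  // First run (no cursor) or disabled catchup stays live-only.
  if (!enabled || opts.cursor === null) {
    return { attachments: opts.attachments };
  }

  // Bound the replay window: start = now - maxAgeMinutes (default 24 hours).
  const maxAgeMinutes = opts.catchup?.maxAgeMinutes ?? 24 * 60;
  const now = opts.now ?? new Date();
  const start = new Date(now.getTime() - maxAgeMinutes * 60_000).toISOString();

  return {
    attachments: opts.attachments,
    since_rowid: opts.cursor.lastSeenRowid,
    start,
  };
}

const firstRun = sketchWatchSubscribeParams({ attachments: false, cursor: null });
const restart = sketchWatchSubscribeParams({
  attachments: false,
  cursor: { lastSeenRowid: 9000, updatedAt: "2026-05-07T00:00:00.000Z" },
  catchup: { maxAgeMinutes: 30 },
  now: new Date("2026-05-07T12:00:00.000Z"),
});
console.log(JSON.stringify({ firstRun, restart }, null, 2));
```

With the same inputs as the captured run, the sketch yields `since_rowid: 9000` and a `start` 30 minutes before `now`, matching the output above.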

Root Cause (if applicable)

  • Root cause: OpenClaw treated imsg watch.subscribe as a live-only stream and did not persist or pass the last processed message rowid across monitor restarts.
  • Missing detection / guardrail: no monitor test asserted restart subscribe params when a previous rowid exists.
  • Contributing context (if known): imsg starts at newest message when since_rowid is omitted, by design.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/imessage/src/monitor/catchup-cursor.test.ts, extensions/imessage/src/monitor.watch-subscribe-retry.test.ts, extensions/imessage/src/config-schema.test.ts
  • Scenario the test should lock in: persist the largest processed rowid per account; build bounded resume params; pass the persisted cursor to watch.subscribe; validate catchup.maxAgeMinutes schema bounds.
  • Why this is the smallest reliable guardrail: it proves OpenClaw uses the documented imsg RPC cursor contract without needing a real macOS Messages database in unit CI.
  • Existing test that already covers this (if any): none.
  • If no new test is added, why not: N/A.
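
The "largest processed rowid per account" scenario can be illustrated with a small pure function. This is a sketch under assumptions: `advanceCursor` and `CursorStore` are hypothetical names, and the store shape is inferred from the `lastSeenRowid` / `updatedAt` fields shown in the proof section rather than taken from the PR's code.

```typescript
// Hypothetical in-memory shape of the per-account cursor store.
type CursorStore = Record<string, { lastSeenRowid: number; updatedAt: string }>;

// Returns a store where accountId's cursor only ever moves forward.
function advanceCursor(
  store: CursorStore,
  accountId: string,
  rowid: number,
  now: Date,
): CursorStore {
  const current = store[accountId];
  // A lower or equal rowid must not move the cursor backward.
  if (current !== undefined && current.lastSeenRowid >= rowid) {
    return store;
  }
  return {
    ...store,
    [accountId]: { lastSeenRowid: rowid, updatedAt: now.toISOString() },
  };
}

const now = new Date("2026-05-07T12:00:00.000Z");
let store: CursorStore = {};
store = advanceCursor(store, "main", 9000, now);
store = advanceCursor(store, "main", 8990, now); // ignored: lower rowid
store = advanceCursor(store, "work", 120, now); // independent account
console.log(store);
```

Keeping the update pure makes the monotonic rule easy to assert in unit tests without touching the filesystem.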

User-visible / Behavior Changes

  • iMessage accounts now create $OPENCLAW_STATE_DIR/imessage/catchup-cursors.json after processing live messages.
  • After a cursor exists, Gateway restart replays missed iMessage rows through the existing pipeline for a bounded window.
  • New optional config: channels.imessage.catchup.enabled and channels.imessage.catchup.maxAgeMinutes (default enabled, 24 hours).
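
As a rough illustration of the new config defaults, the sketch below resolves an optional catchup block into concrete values. `resolveCatchupConfig` and its types are hypothetical and do not mirror the PR's zod schema. The defaults (enabled, 24 hours) and the rejection of `maxAgeMinutes: 0` come from this description; treating every non-positive or non-integer value as invalid is an assumption, and any upper bound the real schema enforces is not modelled here.

```typescript
// Hypothetical types for the optional channels.imessage.catchup block.
interface CatchupConfigInput {
  enabled?: boolean;
  maxAgeMinutes?: number;
}

interface ResolvedCatchupConfig {
  enabled: boolean;
  maxAgeMinutes: number;
}

// 24 hours, as stated in the PR description.
const DEFAULT_MAX_AGE_MINUTES = 24 * 60;

function resolveCatchupConfig(input: CatchupConfigInput = {}): ResolvedCatchupConfig {
  const maxAgeMinutes = input.maxAgeMinutes ?? DEFAULT_MAX_AGE_MINUTES;
  // Assumption: only positive integers are valid (the PR confirms 0 is rejected).
  if (!Number.isInteger(maxAgeMinutes) || maxAgeMinutes < 1) {
    throw new RangeError(
      `catchup.maxAgeMinutes must be a positive integer, got ${maxAgeMinutes}`,
    );
  }
  return { enabled: input.enabled ?? true, maxAgeMinutes };
}

console.log(resolveCatchupConfig()); // defaults: enabled, 1440 minutes
console.log(resolveCatchupConfig({ maxAgeMinutes: 30 }));
```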

Diagram (if applicable)

Before:
Gateway restart -> watch.subscribe({ attachments }) -> imsg starts at newest row -> missed offline rows stay missed

After:
processed row -> persist rowid -> Gateway restart -> watch.subscribe({ since_rowid, start, attachments }) -> missed bounded rows replay

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

  • OS: Linux validation workspace
  • Runtime/container: Node via repo pnpm scripts
  • Model/provider: N/A
  • Integration/channel (if any): iMessage / imsg rpc
  • Relevant config (redacted): channels.imessage.catchup.maxAgeMinutes=30 in unit coverage

Steps

  1. Seed an iMessage catchup cursor for account main with rowid 9000.
  2. Start the iMessage monitor with mocked imsg rpc client.
  3. Inspect the watch.subscribe params.

Expected

  • Subscribe includes since_rowid: 9000, a bounded ISO start, and the existing attachments flag.

Actual

  • Matches expected in extensions/imessage/src/monitor.watch-subscribe-retry.test.ts and the real module console output above.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Validation run:

pnpm test extensions/imessage/src/monitor/catchup-cursor.test.ts extensions/imessage/src/monitor.watch-subscribe-retry.test.ts extensions/imessage/src/config-schema.test.ts
# Test Files 3 passed (3); Tests 19 passed (19)

pnpm config:docs:check
# OK docs/.generated/config-baseline.sha256

pnpm config:channels:check
# passed

pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/channels/imessage.md extensions/imessage/src/accounts.ts extensions/imessage/src/config-schema.test.ts extensions/imessage/src/monitor.watch-subscribe-retry.test.ts extensions/imessage/src/monitor/catchup-cursor.ts extensions/imessage/src/monitor/catchup-cursor.test.ts extensions/imessage/src/monitor/monitor-provider.ts src/config/types.imessage.ts src/config/zod-schema.providers-core.ts src/config/bundled-channel-config-metadata.generated.ts
# All matched files use the correct format.

git diff --check
# passed

pnpm lint:extensions -- extensions/imessage/src/monitor/catchup-cursor.ts extensions/imessage/src/monitor/monitor-provider.ts extensions/imessage/src/accounts.ts extensions/imessage/src/monitor/catchup-cursor.test.ts extensions/imessage/src/monitor.watch-subscribe-retry.test.ts extensions/imessage/src/config-schema.test.ts
# Found 0 warnings and 0 errors.

pnpm tsgo:core && pnpm tsgo:extensions && pnpm tsgo:extensions:test
# passed

Additional note:

pnpm tsgo:core:test
# still fails on pre-existing OpenAI fixture type errors in:
# src/agents/openai-transport-stream.test.ts
# src/agents/pi-embedded-runner/openai-stream-wrappers.test.ts

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: first-run live-only params; persisted cursor resume params; per-account cursor max rowid; catchup config validation; generated config metadata/docs baseline.
  • Edge cases checked: disabled/missing cursor does not replay; invalid maxAgeMinutes: 0 is rejected; lower rowid does not move the cursor backward.
  • What you did not verify: live macOS Messages.app replay with a real chat.db from this Linux workspace.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (Yes)
  • Migration needed? (No)
  • If yes, exact upgrade steps: no required migration; optional tuning via channels.imessage.catchup.enabled / maxAgeMinutes.

Risks and Mitigations

  • Risk: replay after restart could process stale old rows if a gateway was down for a long period.
    • Mitigation: replay is bounded by catchup.maxAgeMinutes and defaults to 24 hours; first run stays live-only until OpenClaw has processed and persisted a cursor.

openclaw-barnacle Bot added the docs, channel: imessage, size: M, and triage: needs-real-behavior-proof labels May 7, 2026
clawsweeper Bot (Contributor) commented May 7, 2026

Codex review: needs real behavior proof before merge.

Summary
The branch adds per-account iMessage catchup cursor persistence, restart watch.subscribe resume params, catchup config/schema/docs/generated metadata, a changelog entry, and targeted tests.

Reproducibility: no. No high-confidence live reproduction path was established. Source inspection does show current main subscribes live-only with no persisted cursor path, and upstream imsg docs confirm the exclusive since_rowid resume contract.

Real behavior proof
Needs stronger real behavior proof before merge: The PR body includes terminal output from the changed helper, but it does not show the Gateway/monitor subscribing to a real imsg process or replaying a real Messages chat.db after restart; contributors should post redacted terminal/log/video proof and update the PR body to trigger re-review.

Next step before merge
Contributor and maintainer follow-up is needed because the PR lacks live macOS/imsg restart proof, still has a concrete cursor-store race, and changes catchup defaults/config semantics.

Security
Cleared: No concrete security or supply-chain regression was found; the diff adds no dependencies, CI changes, secret handling, or new external execution path, and the new state file is written under the OpenClaw state dir with restrictive modes.

Review findings

  • [P2] Serialize cursor updates before writing the shared store — extensions/imessage/src/monitor/catchup-cursor.ts:97-107
Review details

Best possible solution:

Land an iMessage-owned bounded catchup implementation after cursor writes are serialized and covered for concurrent updates, with live macOS/imsg restart replay proof and maintainer-approved defaults.

Do we have a high-confidence way to reproduce the issue?

No high-confidence live reproduction path was established. Source inspection does show current main subscribes live-only with no persisted cursor path, and upstream imsg docs confirm the exclusive since_rowid resume contract.

Is this the best way to solve the issue?

No, not merge-ready as proposed. The owner boundary and dependency contract are reasonable, but the cursor store needs concurrency-safe updates and the external PR needs real macOS/imsg restart replay proof.

Full review comments:

  • [P2] Serialize cursor updates before writing the shared store — extensions/imessage/src/monitor/catchup-cursor.ts:97-107
    recordIMessageCatchupCursor reads catchup-cursors.json, mutates one account entry, and rewrites the whole file without per-file serialization. Different iMessage debounce keys can flush independently and multi-account monitors share this store, so concurrent records can overwrite another account's cursor or regress the same account to an older rowid. Queue writes per store path or re-read/merge under a lock, then add concurrent account/lower-rowid coverage.
    Confidence: 0.86
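
One way to read the "queue writes per store path" suggestion is a promise chain keyed by path, so each read-modify-write starts only after the previous one settles. The sketch below is illustrative only: `enqueueStoreWrite`, `recordCursorSerialized`, `StoreIO`, and the in-memory `createMemoryIO` are hypothetical names, and the store is simulated in memory rather than backed by catchup-cursors.json. It serializes writers within a single process; it does not protect against multiple processes sharing the file.

```typescript
// Hypothetical sketch: none of these names come from the PR.
type CursorStore = Record<string, { lastSeenRowid: number; updatedAt: string }>;

interface StoreIO {
  read(path: string): Promise<CursorStore>;
  write(path: string, store: CursorStore): Promise<void>;
}

// One promise chain per store path.
const writeQueues = new Map<string, Promise<unknown>>();

function enqueueStoreWrite<T>(storePath: string, task: () => Promise<T>): Promise<T> {
  const previous = writeQueues.get(storePath) ?? Promise.resolve();
  // Run the task after the previous one settles, whether it succeeded or failed.
  const next = previous.then(task, task);
  // Swallow rejections on the stored chain so one failure does not block later writes.
  writeQueues.set(storePath, next.catch(() => undefined));
  return next;
}

async function recordCursorSerialized(
  io: StoreIO,
  storePath: string,
  accountId: string,
  rowid: number,
  now: Date = new Date(),
): Promise<void> {
  await enqueueStoreWrite(storePath, async () => {
    // Re-read inside the queue so the merge sees the latest state.
    const store = await io.read(storePath);
    const current = store[accountId];
    if (current !== undefined && current.lastSeenRowid >= rowid) return;
    await io.write(storePath, {
      ...store,
      [accountId]: { lastSeenRowid: rowid, updatedAt: now.toISOString() },
    });
  });
}

// In-memory stand-in for the JSON file; each call yields once to allow interleaving.
function createMemoryIO(): StoreIO & { snapshot(): CursorStore } {
  let data: CursorStore = {};
  const tick = (): Promise<void> => Promise.resolve();
  return {
    async read() {
      await tick();
      return { ...data };
    },
    async write(_path, store) {
      await tick();
      data = { ...store };
    },
    snapshot: () => ({ ...data }),
  };
}
```

Calling `recordCursorSerialized` concurrently for accounts `main` and `work` against the same path then leaves both entries intact, and a late lower rowid for `main` is ignored. Without the queue, the three reads would all observe an empty store and the last write would win.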

Overall correctness: patch is incorrect
Overall confidence: 0.78

What I checked:

  • Current main live-only subscribe path: Current main calls watch.subscribe with only the attachments parameter and has no iMessage cursor read/write path in monitor startup. (extensions/imessage/src/monitor/monitor-provider.ts:611, 56fe64e8e369)
  • PR shared cursor store: The PR records each cursor by reading the whole shared catchup-cursors.json store, mutating one account entry, then rewriting and renaming the file. (extensions/imessage/src/monitor/catchup-cursor.ts:97, 9e86b02b40f7)
  • Independent inbound flushes can overlap: The shared inbound debouncer serializes work per key, not globally, so different conversations/accounts can flush and call the shared cursor writer concurrently. (src/auto-reply/inbound-debounce.ts:84, 56fe64e8e369)
  • Dependency cursor contract: The current upstream imsg JSON-RPC docs list watch.subscribe params including exclusive since_rowid, ISO start/end, and attachments; the README also says imsg watch starts at newest by default and uses --since-rowid to resume.
  • Linked issue context: The linked issue remains open and asks for bounded per-account iMessage catchup after Gateway downtime, with replay bounds and echo-dedupe called out as behavior constraints.
  • Proof review: The PR body now includes copied terminal output from the pure cursor-param builder, but explicitly says live macOS Messages.app replay against a real chat.db was not run. (9e86b02b40f7)

Likely related people:

  • steipete: Local blame for the current live watch.subscribe monitor path points to Peter Steinberger's recent iMessage monitor rewrite, and GitHub path history shows repeated recent iMessage monitor/config refactors by the same maintainer. (role: recent maintainer and central-path owner; confidence: high; commits: f2bf925a387f, 05eda57b3c72, ffe67e9cdc9e; files: extensions/imessage/src/monitor/monitor-provider.ts, src/config/types.imessage.ts, src/config/zod-schema.providers-core.ts)
  • vincentkoc: Recent iMessage docs and account type seam history point to Vincent Koc around the supported imsg setup and account/config surfaces touched by this PR. (role: recent adjacent maintainer; confidence: medium; commits: 0fca66549794, 91ed1604b011, 6784cc692c1f; files: docs/channels/imessage.md, extensions/imessage/src/accounts.ts)
  • Takhoffman: Recent path history shows Tak Hoffman on the iMessage default runtime account behavior adjacent to the account resolver and runtimeOnlyDefault surface touched by this PR. (role: adjacent account-resolution maintainer; confidence: low; commits: 4f5f1fa724a1; files: extensions/imessage/src/accounts.ts)

Remaining risk / open question:

  • The shared cursor store read-modify-write race can drop another account's cursor or overwrite a newer rowid with an older one, causing missed restart catchup later.
  • The external PR still lacks after-fix proof from a real macOS Messages/imsg restart-catchup run, so the actual outage replay behavior is unverified.
  • The new default-enabled 24-hour catchup config is a product/defaults choice tied to the maintainer-labeled linked issue.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 56fe64e8e369.

openclaw-barnacle Bot added the proof: supplied label and removed the triage: needs-real-behavior-proof label May 7, 2026

Labels

  • channel: imessage (Channel integration: imessage)
  • docs (Improvements or additions to documentation)
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

iMessage: catchup missed inbound messages received while gateway was down

1 participant