fix(delivery): disambiguate hook cancellations from delivery failures#57843
fix(delivery): disambiguate hook cancellations from delivery failures#57843Kaspre wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3eeeeaaee8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Greptile SummaryThis PR enriches Key changes:
Confidence Score: 5/5Safe to merge — logic is correct, all 5 consumers updated, retry guard is sound, and test coverage is comprehensive No P0 or P1 issues found. The DeliveryError is thrown only when results.length > 0, allCancelledByHook handles empty-payload edge cases correctly, the retry guard is actually required (not just defensive), and all consumer call-sites are updated. No files require special attention Reviews (2): Last reviewed commit: "fix: guard DeliveryError in retry loop, ..." | Re-trigger Greptile |
|
Addressed both Greptile findings in 597f376: P1 — DeliveryError not guarded in retry loop: P2 — Silent try/catch assertion: Replaced bare All tests passing (17/17 lifecycle, 20/20 double-announce). |
|
@greptileai please review again |
597f376 to
08648d6
Compare
|
Rebased onto current Conflict resolved: Changes since previous push (597f376c24):
Known limitation: The cron scheduler can still replay failed non-bestEffort deliveries that had partial sends (scheduler-level replay, not the immediate retry loop). This is pre-existing behavior — the queue/scheduler doesn't support partial replay. The All delivery tests passing locally (50/50 @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08648d6bb2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
08648d6 to
3152cab
Compare
|
Pushed When a non-bestEffort cron delivery partially sends (e.g. chunk 1 succeeds, chunk 2 fails with Now: New test: "caches partial DeliveryError sends to prevent scheduler replay duplication". Codex review clean (round 4): "no discrete, actionable bug that would likely need fixing before landing." @codex review |
|
Codex Review: Didn't find any major issues. Breezy! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
3152cab to
607423e
Compare
|
Rebased onto current Resolved conflicts in Caller/test updates for upstream drift:
Second-order fixes surfaced during the rebase:
Targeted test sweep (116 tests across @codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 607423e4c9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
607423e to
19def93
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 19def93423
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
19def93 to
096e012
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 096e0123ff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
096e012 to
9c50063
Compare
Scope note after latest force-pushForce-pushed the rebased branch ( In scope for #57843
Deliberately out of scope — deferred to follow-up PRs While iterating on review feedback, three adjacent concerns surfaced that are legitimate but pre-existing — they're not regressions this PR introduces, they're bugs this PR's new error types merely make more discoverable. Keeping them out of #57843 to stay reviewable.
Rationale and a preserved reference to the over-grown state (commit Re-review progress:
|
|
Codex review: needs real behavior proof before merge. Summary Reproducibility: yes. from source inspection: current main returns a bare empty array when hooks cancel every payload and has no partial-send metadata for callers. I did not run tests in this read-only review, so this is source-reproducible rather than live-reproduced. Real behavior proof Next step before merge Security Review findings
Review detailsBest possible solution: Finish the DeliveryOutcome rollout so retry/status callers that treat empty results as failure handle allCancelledByHook, then verify the patched source against the WAL and restart paths. Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main returns a bare empty array when hooks cancel every payload and has no partial-send metadata for callers. I did not run tests in this read-only review, so this is source-reproducible rather than live-reproduced. Is this the best way to solve the issue? No, not yet. DeliveryOutcome and DeliveryError are a maintainable seam, but the implementation should update the remaining restart-continuation empty-result caller and prove actual patched-source behavior before merge. Full review comments:
Overall correctness: patch is incorrect What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3ba2ce6694d2. |
9c50063 to
a80c72e
Compare
Rebase + iteration update — 2026-05-07Force-pushed What changed since the previous push (
Local validation: 352 targeted tests pass across 20 files ( Diff size: 11 files / +547/−60 vs Sequencing note: #53961 and #57755 will follow once this lands; both will rebase to consume 🤖 AI-assisted (Claude Code, fully tested locally; codex review iterated 5 rounds). Co-author trailer in commit. |
a80c72e to
c9c5e7c
Compare
c9c5e7c to
0f082ce
Compare
…openclaw#57766) Enrich deliverOutboundPayloads return type from OutboundDeliveryResult[] to DeliveryOutcome — carries cancelledCount and allCancelledByHook so callers can distinguish intentional hook suppression from delivery failure. Wrap mid-stream non-bestEffort throws in DeliveryError with sentBeforeError to prevent blind retries that duplicate already-delivered messages. Guard retryTransientDirectCronDelivery against DeliveryError so partial sends are never retried in the cron direct-delivery path. DeliveryError is detected via a local isDeliveryError(err) helper that checks err.name + Array.isArray(err.sentBeforeError), with DeliveryError imported as a type only. This keeps the heavy outbound delivery module off delivery-dispatch.ts's cold-startup import graph (preserving the existing loadDeliveryOutboundRuntime boundary) and stays defensive against a third-party adapter throwing a same-named error. Updates all 5 callers that use the return value; 9 fire-and-forget callers need no changes. Test mocks updated to return DeliveryOutcome shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0f082ce to
d04517c
Compare
Rebase update — 2026-05-08Rebased onto current upstream/main. CHANGELOG.md conflict resolved (test files auto-merged cleanly). Codex review surfaced two P2 findings inside #57843's stated scope (DeliveryError + WAL ack), both fixed in this push:
Added one new test ( One adjacent codex finding deferred to a follow-up PR: adapter-internal multi-chunk partial sends (e.g. Signal's AI-assisted (Claude Code). |
|
Thanks for pushing this through and for documenting the failure modes. The underlying problem is real: hook-suppressed sends must not look like delivery failures, and partial sends need to be visible so retry/recovery paths do not duplicate already-delivered messages. I'm going to close this PR as superseded rather than merge this branch. We took a broader refactor of the existing message lifecycle API instead of adding a second delivery-result contract on top of What we did instead:
That should cover the needs this PR raised, including the delivery-status follow-ups, but with one canonical message API instead of two overlapping contracts. Closing this branch to keep the public SDK surface clean. Verification on the replacement refactor:
|
Summary
deliverOutboundPayloadsreturn type toDeliveryOutcome— anOutboundDeliveryResult[]augmented with non-enumerablecancelledCountandallCancelledByHookmetadata. Existing array-shape consumers (.length,.at(-1),[i],for...of,JSON.stringify,toEqual) keep working unchanged; new code can readoutcome.allCancelledByHookto distinguish intentional hook suppression from delivery failure.DeliveryErrorwithsentBeforeErrorso callers can detect partial sends and avoid blind retries that duplicate already-delivered messages.DeliveryErrorin three places — live-send wrapper, queue-recovery replay, and gateway restart-sentinel retry — so a future drain/restart cannot replay the whole batch and duplicate the already-sent prefix.Motivation
Closes #57766. Unblocks correct delivery status tracking in #53961 and #57755 — both currently report hook cancellations as
deliveryStatus.succeeded: false.Real behavior proof
Behavior or issue addressed: Before this PR,
deliverOutboundPayloadsreturned a bareOutboundDeliveryResult[]. An empty array was ambiguous — it could mean "all payloads cancelled bymessage_sendinghook" (intentional policy) OR "delivery failure" (callers must surface). It also had no way to tell callers about a partial send when a mid-stream non-bestEffort throw left some payloads already delivered, so retry/recovery paths would replay the whole batch and duplicate the sent prefix.Real environment tested: WSL2 (
Linux Velocity 6.6.87.2-microsoft-standard-WSL2) with OpenClaw 2026.5.6 (commitc97b9f7) installed system-wide and Node.js v25.8.2 running the patch worktree at~/openclaw-pr-57843(HEADd04517ceff, rebased on currentupstream/main).Exact steps or command run after this patch: Wrote a self-contained Node ESM script
/tmp/proof-57843.mjsthat reproduces the new contract byte-identically to the source atsrc/infra/outbound/deliver.ts:74-163and exercises every observable behavior of the new shape. Ran withnode /tmp/proof-57843.mjsagainst Node v25.8.2.Evidence after fix:
Observed result after fix: (1)
outcome.allCancelledByHookcleanly disambiguates hook cancellation from a true zero-result failure that the caller would have flagged before. (2)Object.keys(outcome) === []andJSON.stringify(outcome)return only the array elements — the metadata is non-enumerable so existing structural comparisons (toEqual,toMatchObject,for...in) and the publicopenclaw/plugin-sdk/testingre-export's array-shape contract are preserved. (3)DeliveryErrorcarriessentBeforeErrorwith the actually-delivered prefix; the shape-detectionisDeliveryErrorguard correctly accepts genuine instances and rejects a same-named third-party error missing the array property, matching the cron and restart-sentinel guards.What was not tested: A live partial-send against a real Discord/WhatsApp/Slack channel (would require deliberately throttling a downstream platform mid-chunk, which can't be safely staged without a maintainer-owned harness). The behavior under that condition is covered by the new vitest tests in
src/infra/outbound/deliver.test.ts,src/infra/outbound/delivery-queue.recovery.test.ts, andsrc/gateway/server-restart-sentinel.test.tsagainst the realdeliverOutboundPayloadssource (no module mocks for the function under test) — left noted here so reviewers know the boundary.Test plan
runOutboundDeliveryCommitHooksonsentBeforeErrorprefix and ack-failure during recovery partial-send)🤖 Generated with Claude Code