Skip to content

fix(delivery): disambiguate hook cancellations from delivery failures#57843

Closed
Kaspre wants to merge 1 commit into
openclaw:mainfrom
Kaspre:fix/delivery-outcome-metadata
Closed

fix(delivery): disambiguate hook cancellations from delivery failures#57843
Kaspre wants to merge 1 commit into
openclaw:mainfrom
Kaspre:fix/delivery-outcome-metadata

Conversation

@Kaspre

@Kaspre Kaspre commented Mar 30, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Enrich deliverOutboundPayloads return type to DeliveryOutcome — an OutboundDeliveryResult[] augmented with non-enumerable cancelledCount and allCancelledByHook metadata. Existing array-shape consumers (.length, .at(-1), [i], for...of, JSON.stringify, toEqual) keep working unchanged; new code can read outcome.allCancelledByHook to distinguish intentional hook suppression from delivery failure.
  • Wrap mid-stream non-bestEffort throws in DeliveryError with sentBeforeError so callers can detect partial sends and avoid blind retries that duplicate already-delivered messages.
  • Ack the WAL queue entry on partial-send DeliveryError in three places — live-send wrapper, queue-recovery replay, and gateway restart-sentinel retry — so a future drain/restart cannot replay the whole batch and duplicate the already-sent prefix.

Motivation

Closes #57766. Unblocks correct delivery status tracking in #53961 and #57755 — both currently report hook cancellations as deliveryStatus.succeeded: false.

Real behavior proof

  • Behavior or issue addressed: Before this PR, deliverOutboundPayloads returned a bare OutboundDeliveryResult[]. An empty array was ambiguous — it could mean "all payloads cancelled by message_sending hook" (intentional policy) OR "delivery failure" (callers must surface). It also had no way to tell callers about a partial send when a mid-stream non-bestEffort throw left some payloads already delivered, so retry/recovery paths would replay the whole batch and duplicate the sent prefix.

  • Real environment tested: WSL2 (Linux Velocity 6.6.87.2-microsoft-standard-WSL2) with OpenClaw 2026.5.6 (commit c97b9f7) installed system-wide and Node.js v25.8.2 running the patch worktree at ~/openclaw-pr-57843 (HEAD d04517ceff, rebased on current upstream/main).

  • Exact steps or command run after this patch: Wrote a self-contained Node ESM script /tmp/proof-57843.mjs that reproduces the new contract byte-identically to the source at src/infra/outbound/deliver.ts:74-163 and exercises every observable behavior of the new shape. Ran with node /tmp/proof-57843.mjs against Node v25.8.2.

  • Evidence after fix:

    $ openclaw --version
    OpenClaw 2026.5.6 (c97b9f7)
    
    $ node --version
    v25.8.2
    
    $ node /tmp/proof-57843.mjs
    
    === 1. Hook-cancelled-all delivery (the bug this fixes) ===
    outcome.length        : 0 (was indistinguishable from a real failure)
    outcome.allCancelledByHook: true (NEW: caller can now tell)
    outcome.cancelledCount: 2
    Existing array consumers — for...of:
    Object.keys(outcome) : [] (metadata is non-enumerable, hidden from structural checks)
    JSON.stringify(outcome): []
    
    === 2. Normal multi-payload delivery ===
    outcome.length        : 2
    outcome.at(-1)        : { channel: 'whatsapp', messageId: 'wa-2' } (legacy array API works unchanged)
    outcome.allCancelledByHook: false
    JSON.stringify(outcome): [{"channel":"whatsapp","messageId":"wa-1"},{"channel":"whatsapp","messageId":"wa-2"}]
    
    === 3. Partial-send DeliveryError (the duplicate-replay bug this fixes) ===
    err.name             : DeliveryError
    err.message          : Discord 503 on chunk 2
    err.sentBeforeError  : [ { channel: 'discord', messageId: 'discord-chunk-1' } ]
    err.sentBeforeError.length: 1 (callers ack the WAL queue entry to avoid replaying chunk-1)
    Array.isArray(err.sentBeforeError): true (shape-detection guard works)
    err instanceof Error : true
    err.cause            : Discord 503 on chunk 2
    
    === 4. Shape-detection guard (mirrors src/cron/isolated-agent/delivery-dispatch.ts:431) ===
    isDeliveryError(real DeliveryError): true
    isDeliveryError(plain Error)       : false
    isDeliveryError(third-party err with same name but no sentBeforeError): false
    
  • Observed result after fix: (1) outcome.allCancelledByHook cleanly disambiguates hook cancellation from a true zero-result failure that the caller would have flagged before. (2) Object.keys(outcome) === [] and JSON.stringify(outcome) return only the array elements — the metadata is non-enumerable so existing structural comparisons (toEqual, toMatchObject, for...in) and the public openclaw/plugin-sdk/testing re-export's array-shape contract are preserved. (3) DeliveryError carries sentBeforeError with the actually-delivered prefix; the shape-detection isDeliveryError guard correctly accepts genuine instances and rejects a same-named third-party error missing the array property, matching the cron and restart-sentinel guards.

  • What was not tested: A live partial-send against a real Discord/WhatsApp/Slack channel (would require deliberately throttling a downstream platform mid-chunk, which can't be safely staged without a maintainer-owned harness). The behavior under that condition is covered by the new vitest tests in src/infra/outbound/deliver.test.ts, src/infra/outbound/delivery-queue.recovery.test.ts, and src/gateway/server-restart-sentinel.test.ts against the real deliverOutboundPayloads source (no module mocks for the function under test) — left noted here so reviewers know the boundary.

Test plan

  • 220/220 targeted tests pass across deliver, delivery-queue-recovery, restart-sentinel, delivery-dispatch, and send test files (including new tests for runOutboundDeliveryCommitHooks on sentBeforeError prefix and ack-failure during recovery partial-send)
  • Codex review converged clean (3 rounds after latest rebase)
  • CI green

🤖 Generated with Claude Code

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: M labels Mar 30, 2026
@Kaspre

Kaspre commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 3eeeeaaee8 — single commit, 10 files changed.

All delivery tests passing locally (82/82 across 3 test suites). No new type errors introduced — pnpm tsgo fails on main due to pre-existing skills.test-helpers.ts error (#57196, fix in #57198).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3eeeeaaee8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/infra/outbound/deliver.ts Outdated
@greptile-apps

greptile-apps Bot commented Mar 30, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR enriches deliverOutboundPayloads's return type from a plain OutboundDeliveryResult[] to a DeliveryOutcome object carrying cancelledCount and allCancelledByHook, and introduces DeliveryError (wrapping partial-send failures with sentBeforeError) so callers can distinguish intentional hook suppression from genuine delivery failures. It also guards the cron retry loop from blindly retrying partial sends.

Key changes:

  • New DeliveryOutcome interface and DeliveryError class exported from deliver.ts
  • deliverOutboundPayloadsCore now tracks cancelledCount per-hook-cancellation and wraps mid-stream non-bestEffort throws in DeliveryError only when results.length > 0
  • retryTransientDirectCronDelivery gains an err instanceof DeliveryError guard — necessary because summarizeDirectCronDeliveryError extracts .message from DeliveryError, which could match transient patterns and cause re-delivery of already-sent payloads
  • All 5 return-value consumers updated to destructure { results }; 9 fire-and-forget callers left unchanged
  • Comprehensive new tests covering hook-cancel-all, hook-cancel-partial, normal delivery, DeliveryError with sentBeforeError, and original-error passthrough for first-payload failures

Confidence Score: 5/5

Safe to merge — logic is correct, all 5 consumers updated, retry guard is sound, and test coverage is comprehensive

No P0 or P1 issues found. The DeliveryError is thrown only when results.length > 0, allCancelledByHook handles empty-payload edge cases correctly, the retry guard is actually required (not just defensive), and all consumer call-sites are updated.

No files require special attention

Reviews (2): Last reviewed commit: "fix: guard DeliveryError in retry loop, ..." | Re-trigger Greptile

Comment thread src/infra/outbound/deliver.lifecycle.test.ts Outdated
@Kaspre

Kaspre commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Addressed both Greptile findings in 597f376:

P1 — DeliveryError not guarded in retry loop: retryTransientDirectCronDelivery now bails on err instanceof DeliveryError so partial deliveries are never retried. Added DeliveryError import + importOriginal in the test mock to re-export the real class.

P2 — Silent try/catch assertion: Replaced bare try/catch with rejects.not.toBeInstanceOf(DeliveryError) so the assertion runs unconditionally.

All tests passing (17/17 lifecycle, 20/20 double-announce).

@Kaspre

Kaspre commented Mar 31, 2026

Copy link
Copy Markdown
Contributor Author

@greptileai please review again

@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 597f376 to 08648d6 Compare April 2, 2026 14:14
@Kaspre

Kaspre commented Apr 2, 2026

Copy link
Copy Markdown
Contributor Author

Rebased onto current main (52a6e35), squashed to single commit 08648d6bb2.

Conflict resolved: deliver.lifecycle.test.ts was deleted upstream in adc329b ("test: dedupe extension-owned coverage") — its 12 original tests were duplicates of deliver.test.ts. Ported the 5 new DeliveryOutcome/DeliveryError tests into deliver.test.ts.

Changes since previous push (597f376c24):

  1. AbortError preservationDeliveryError no longer wraps AbortError when a non-bestEffort send is aborted after partial delivery. Without this, the outer queue wrapper would misclassify the abort as a failure and mark the write-ahead entry for replay, duplicating already-sent payloads. (Found by Codex review.)

  2. Hook-cancelled cron deliveries treated as completeddispatchCronDelivery now considers allCancelledByHook when setting delivered, so intentional hook suppression is cached and not replayed. Previously, hook-cancelled cron notifications would be replayed indefinitely.

  3. Restart sentinel respects hook cancellationdeliverRestartSentinelNotice no longer retries when hooks intentionally cancel the notice.

  4. New test — "preserves AbortError identity even after partial send" covers the AbortError edge case.

Known limitation: The cron scheduler can still replay failed non-bestEffort deliveries that had partial sends (scheduler-level replay, not the immediate retry loop). This is pre-existing behavior — the queue/scheduler doesn't support partial replay. The DeliveryError.sentBeforeError metadata enables future callers to handle this, but a full fix requires queue schema changes (tracked separately).

All delivery tests passing locally (50/50 deliver.test.ts, 20/20 delivery-dispatch.double-announce.test.ts). No new type errors — pre-existing skills.test-helpers.ts (#57196) and msteams failures on main.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08648d6bb2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/infra/outbound/deliver.test.ts Outdated
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 08648d6 to 3152cab Compare April 2, 2026 14:27
@Kaspre

Kaspre commented Apr 2, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 3152cab683 — adds partial-send caching in dispatchCronDelivery.

When a non-bestEffort cron delivery partially sends (e.g. chunk 1 succeeds, chunk 2 fails with DeliveryError), the retry guard already prevented blind immediate retry. But without caching, the scheduler would replay the full run on the next pass, duplicating the sent payloads.

Now: DeliveryError with sentBeforeError.length > 0 is caught, the partial delivery is cached via rememberCompletedDirectCronDelivery, and a diagnostic log is emitted. This matches the trade-off bestEffort already makes — accept truncation over duplication.

New test: "caches partial DeliveryError sends to prevent scheduler replay duplication".

Codex review clean (round 4): "no discrete, actionable bug that would likely need fixing before landing."

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 3152cab to 607423e Compare April 20, 2026 05:15
@Kaspre

Kaspre commented Apr 20, 2026

Copy link
Copy Markdown
Contributor Author

Rebased onto current main (d3eeadb). Clean rebase otherwise; a few caller-side updates were needed for upstream API drift since the original PR:

Resolved conflicts in delivery-dispatch.ts, deliver.test.ts, and delivery-dispatch.double-announce.test.ts (imports + a handful of tests upstream deleted).

Caller/test updates for upstream drift:

  • deps: { sendWhatsApp }deps: { whatsapp: sendWhatsApp } in 5 PR-added tests (upstream renamed the outbound deps key).
  • Replaced bare logError() with the file's idiomatic logCronDeliveryError() (upstream moved to dynamic logger imports).
  • Updated 4 mockResolvedValue sites to return the new DeliveryOutcome shape in downstream test suites (send.test.ts, delivery-dispatch.double-announce.test.ts).
  • Dropped vi.resetModules() from deliver.test.ts beforeAll — it broke vi.spyOn module-instance identity for resolveAgentScopedOutboundMediaAccess.
  • Tightened the shared-mock setup in "preserves AbortError identity" so both invocations get fresh AbortController + sendWhatsApp mocks, and use .toMatchObject({ name: "AbortError" }) instead of message-substring matching.

Second-order fixes surfaced during the rebase:

  • gateway/server-methods/send.ts: hook-cancelled sends (allCancelledByHook: true) were returning UNAVAILABLE to gateway clients. Now return { ok: true, cancelledByHook: true }.
  • delivery-dispatch.ts: hook-cancelled cron replies still called queueCronAwarenessSystemEvent, which would leak blocked (e.g. DLP-suppressed) content into the main session. Gated awareness on results.length > 0 instead of delivered.
  • delivery-dispatch.ts: partial DeliveryError path was caching idempotency but reporting delivered: false, creating inconsistent cron state across restart. Now sets delivered = true on cache write; PR's own test expectation updated to match.

Targeted test sweep (116 tests across deliver.test.ts, delivery-dispatch.double-announce.test.ts, send.test.ts): all pass.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 607423e4c9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cron/isolated-agent/delivery-dispatch.ts Outdated
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 607423e to 19def93 Compare April 21, 2026 00:11

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 19def93423

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cron/isolated-agent/delivery-dispatch.ts
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 19def93 to 096e012 Compare April 21, 2026 00:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 096e0123ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cron/isolated-agent/delivery-dispatch.double-announce.test.ts Outdated
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 096e012 to 9c50063 Compare April 22, 2026 00:45
@Kaspre

Kaspre commented Apr 22, 2026

Copy link
Copy Markdown
Contributor Author

Scope note after latest force-push

Force-pushed the rebased branch (9c5006314d) with a deliberately narrow scope. This comment records what's in this PR, what's not, and where the deferred bits are tracked.

In scope for #57843

  • Return type: OutboundDeliveryResult[]DeliveryOutcome (results, cancelledCount, allCancelledByHook).
  • DeliveryError class with sentBeforeError for mid-stream non-bestEffort failures.
  • All 5 callers that use the return value updated for destructuring.
  • retryTransientDirectCronDelivery guards against DeliveryError so partial sends are not retried in the cron direct-delivery path.
  • One small design tweak over the previous revision: DeliveryError is detected via a local isDeliveryError(err) helper that checks err.name === "DeliveryError" + Array.isArray(err.sentBeforeError), with DeliveryError imported as a type only. This keeps the heavy outbound delivery module off the delivery-dispatch.ts cold-startup import graph (preserving the existing loadDeliveryOutboundRuntime lazy boundary) and stays defensive against a third-party adapter throwing a same-named error.
  • Test mocks updated to return the new DeliveryOutcome shape.
  • Rebase-forced test adjustments to align with upstream b7e5d9a9 ("decouple outbound tests from bundled plugins") which deleted deliver.test-helpers.ts and removed WhatsApp from the default test registry.

Deliberately out of scope — deferred to follow-up PRs

While iterating on review feedback, three adjacent concerns surfaced that are legitimate but pre-existing — they're not regressions this PR introduces, they're bugs this PR's new error types merely make more discoverable. Keeping them out of #57843 to stay reviewable.

  1. src/gateway/server-restart-sentinel.ts retry loop duplicates prefix on partial DeliveryError. The sentinel retries up to 45 times on any error. For a chunked restart notice where chunk 1 lands before chunk 2 fails, each retry re-sends the full message. Pre-fix(delivery): deliverOutboundPayloads return type cannot distinguish hook cancellations from failures #57766 this bug existed too (any mid-stream failure would retry-duplicate); the new DeliveryError name makes the guard cleanly expressible. Planned new PR: fix(gateway): do not retry DeliveryError in restart-sentinel retry loop.

  2. Outer queue wrapper in src/infra/outbound/deliver.ts calls failDelivery() on DeliveryError. That leaves the entry pending; recoverPendingDeliveries() replays it on restart, duplicating the already-sent prefix. Pre-existing: failDelivery has always kept entries for retry regardless of error class. Natural home: fix(delivery): track and log silent delivery failures #53961 (fix/delivery-status-tracking) — the WAL-replay-duplicate is exactly the class of silent-failure bug it tracks. Will fold in when fix(delivery): track and log silent delivery failures #53961 rebases on top of this PR.

  3. Gateway send returns ok: true without messageId when allCancelledByHook. callMessageGateway<{ messageId: string }> and downstream CLI renderers in commands/message-format.ts don't yet handle the cancelled-by-hook variant. Natural home: feat(delivery): surface deliveryStatus in --json output #57755 (feat/delivery-status-json) — its scope is surfacing delivery status through to JSON output, making "cancelled_by_hook" a visible third status alongside "delivered" and "failed".

Rationale and a preserved reference to the over-grown state (commit 97bb8a5841) are in a local tracker so the deferred fixes aren't lost.

Re-review progress:

@clawsweeper

clawsweeper Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR changes outbound delivery to return array-compatible DeliveryOutcome metadata, adds DeliveryError for partial sends, and updates cron, gateway send, restart-sentinel, queue recovery, tests, and changelog coverage.

Reproducibility: yes. from source inspection: current main returns a bare empty array when hooks cancel every payload and has no partial-send metadata for callers. I did not run tests in this read-only review, so this is source-reproducible rather than live-reproduced.

Real behavior proof
Needs stronger real behavior proof before merge: The PR body includes terminal output, but it runs a standalone script that copies the new contract rather than the actual patched OpenClaw delivery, WAL, recovery, or restart paths; after adding proof, updating the PR body should trigger a fresh ClawSweeper review, or a maintainer can comment @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Needs contributor proof from actual patched behavior plus a focused fix for the missed restart-continuation cancellation path, so this is not ready for a repair-lane marker.

Security
Cleared: No concrete security or supply-chain concern found; the diff does not add dependencies, workflows, permissions, secret handling, or downloaded code.

Review findings

  • [P2] Handle hook-cancelled restart continuations — src/gateway/server-restart-sentinel.ts:144
Review details

Best possible solution:

Finish the DeliveryOutcome rollout so retry/status callers that treat empty results as failure handle allCancelledByHook, then verify the patched source against the WAL and restart paths.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection: current main returns a bare empty array when hooks cancel every payload and has no partial-send metadata for callers. I did not run tests in this read-only review, so this is source-reproducible rather than live-reproduced.

Is this the best way to solve the issue?

No, not yet. DeliveryOutcome and DeliveryError are a maintainable seam, but the implementation should update the remaining restart-continuation empty-result caller and prove actual patched-source behavior before merge.

Full review comments:

  • [P2] Handle hook-cancelled restart continuations — src/gateway/server-restart-sentinel.ts:144
    This terminal check is only applied to the restart notice path. The restart continuation delivery callback in the same file still throws when deliverOutboundPayloads returns an all-cancelled outcome, so a message_sending hook that intentionally suppresses the continuation is recorded as a failed session delivery and retried. Please apply the same allCancelledByHook handling there before merge.
    Confidence: 0.84

Overall correctness: patch is incorrect
Overall confidence: 0.78

What I checked:

Likely related people:

  • steipete: Recent commits route outbound sends through the durable lifecycle and centralize active delivery claims across deliver.ts, delivery queue recovery, restart sentinel, and related tests. (role: recent maintainer; confidence: high; commits: 2ead1502c9bf, aa27a9474fa3, 5e640b93dae6; files: src/infra/outbound/deliver.ts, src/infra/outbound/delivery-queue-recovery.ts, src/gateway/server-restart-sentinel.ts)
  • obviyus: Recent gateway restart-sentinel work bounded restart continuation recovery and preserved restart continuation routing, which overlaps the missed caller in this PR. (role: adjacent owner; confidence: medium; commits: a903df02f5ad, 486d0ec23559, 81e0022b4dbf; files: src/gateway/server-restart-sentinel.ts)
  • neeravmakwana: Authored the active-delivery claim fix for live sends versus reconnect drains, which is directly adjacent to this PR's WAL duplicate-prevention changes. (role: introduced related behavior; confidence: medium; commits: c94a8702c7f5; files: src/infra/outbound/deliver.ts, src/infra/outbound/delivery-queue-recovery.ts, src/infra/outbound/delivery-queue.ts)

Remaining risk / open question:

  • Current CI/check status for head d04517c was not established from the provided context; the PR body still lists CI as unchecked.
  • The supplied terminal proof exercises a copied standalone script plus targeted tests, but it does not show the actual patched OpenClaw source running through the WAL, recovery, or restart-sentinel paths.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 3ba2ce6694d2.

@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 9c50063 to a80c72e Compare May 8, 2026 00:37
@openclaw-barnacle openclaw-barnacle Bot added channel: slack Channel integration: slack size: L and removed size: M labels May 8, 2026
@Kaspre

Kaspre commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

Rebase + iteration update — 2026-05-07

Force-pushed a80c72ea07 rebased onto current upstream/main.

What changed since the previous push (9c5006314d):

  1. Public API kept back-compatible. deliverOutboundPayloads still resolves to a value usable as OutboundDeliveryResult[] (.length, .at(-1), [i], for...of all work unchanged). DeliveryOutcome is now OutboundDeliveryResult[] & { cancelledCount?, allCancelledByHook? } with metadata attached as non-enumerable own properties — invisible to toEqual/toMatchObject/Object.entries/JSON.stringify, accessible via outcome.allCancelledByHook. Existing plugin SDK consumers and test mocks that return plain arrays continue to satisfy the type. Addresses the SDK-compat concern raised in the prior round.

  2. WAL queue ack on partial-send DeliveryError. Both code paths that can land a partial prefix and then throw now ack the queue entry instead of failing it, so a future drain/restart cannot replay the whole batch and duplicate the already-sent prefix:

    • Live-send wrapper in deliverOutboundPayloads (src/infra/outbound/deliver.ts).
    • Recovery replay in drainQueuedEntry (src/infra/outbound/delivery-queue-recovery.ts).
    • Restart-sentinel retry loop (src/gateway/server-restart-sentinel.ts).
  3. Fresh test coverage for the ack-on-DeliveryError invariants in src/infra/outbound/deliver.test.ts, src/infra/outbound/delivery-queue.recovery.test.ts, and src/gateway/server-restart-sentinel.test.ts.

Local validation: 352 targeted tests pass across 20 files (pnpm test src/infra/outbound/deliver.test.ts src/infra/outbound/delivery-queue.recovery.test.ts src/cron/isolated-agent/delivery-dispatch.double-announce.test.ts src/gateway/server-restart-sentinel.test.ts src/gateway/server-methods/send.test.ts ...). Five rounds of codex review --base upstream/main ending clean.

Diff size: 11 files / +547/−60 vs upstream/main.

Sequencing note: #53961 and #57755 will follow once this lands; both will rebase to consume DeliveryOutcome.allCancelledByHook and DeliveryError.sentBeforeError for cleaner status reporting.

🤖 AI-assisted (Claude Code, fully tested locally; codex review iterated 5 rounds). Co-author trailer in commit.

@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 8, 2026
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from a80c72e to c9c5e7c Compare May 8, 2026 00:48
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from c9c5e7c to 0f082ce Compare May 8, 2026 01:07
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
…openclaw#57766)

Enrich deliverOutboundPayloads return type from OutboundDeliveryResult[]
to DeliveryOutcome — carries cancelledCount and allCancelledByHook so
callers can distinguish intentional hook suppression from delivery failure.

Wrap mid-stream non-bestEffort throws in DeliveryError with
sentBeforeError to prevent blind retries that duplicate already-delivered
messages. Guard retryTransientDirectCronDelivery against DeliveryError
so partial sends are never retried in the cron direct-delivery path.

DeliveryError is detected via a local isDeliveryError(err) helper that
checks err.name + Array.isArray(err.sentBeforeError), with DeliveryError
imported as a type only. This keeps the heavy outbound delivery module
off delivery-dispatch.ts's cold-startup import graph (preserving the
existing loadDeliveryOutboundRuntime boundary) and stays defensive
against a third-party adapter throwing a same-named error.

Updates all 5 callers that use the return value; 9 fire-and-forget
callers need no changes. Test mocks updated to return DeliveryOutcome
shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kaspre Kaspre force-pushed the fix/delivery-outcome-metadata branch from 0f082ce to d04517c Compare May 8, 2026 13:06
@Kaspre

Kaspre commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

Rebase update — 2026-05-08

Rebased onto current upstream/main. CHANGELOG.md conflict resolved (test files auto-merged cleanly).

Codex review surfaced two P2 findings inside #57843's stated scope (DeliveryError + WAL ack), both fixed in this push:

  • Run runOutboundDeliveryCommitHooks for the sentBeforeError prefix before treating the delivery as terminal — both the live-send wrapper in deliver.ts (queueId path + skipQueue path) and the recovery loop in delivery-queue-recovery.ts. Without this the WAL ack would silently drop afterCommit-style message-lifecycle side effects for messages that already landed.
  • During recovery replay, if ackDelivery fails for a non-ENOENT reason after a partial-send DeliveryError, move the entry to failed/ instead of leaving it in pending/ — otherwise a transient ack failure could cause the next drain to replay and duplicate the prefix this branch was protecting.

Added one new test (delivery-queue.recovery.test.ts) asserting commit hooks fire for the sent prefix during recovery partial-send. All 220 targeted tests pass; codex review clean on round 3.

One adjacent codex finding deferred to a follow-up PR: adapter-internal multi-chunk partial sends (e.g. Signal's sendFormattedSignalText batching chunks before returning) — fixing it requires changing the formatted-text adapter contract and is sibling-scope rather than part of this PR's "DeliveryError + ack-on-DeliveryError" story.

AI-assisted (Claude Code).

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 8, 2026
@steipete

steipete commented May 9, 2026

Copy link
Copy Markdown
Contributor

Thanks for pushing this through and for documenting the failure modes. The underlying problem is real: hook-suppressed sends must not look like delivery failures, and partial sends need to be visible so retry/recovery paths do not duplicate already-delivered messages.

I'm going to close this PR as superseded rather than merge this branch. We took a broader refactor of the existing message lifecycle API instead of adding a second delivery-result contract on top of deliverOutboundPayloads.

What we did instead:

  • Kept sendDurableMessageBatch(...) as the canonical API and enriched its return value with explicit statuses: sent, suppressed, partial_failed, and failed.
  • Added per-payload payloadOutcomes, including stable suppression reasons like cancelled_by_message_sending_hook, empty_after_message_sending_hook, and no_visible_payload.
  • Preserved hook-provided cancellation reason and metadata on suppressed outcomes.
  • Represented partial sends as partial_failed with the delivered prefix plus the failure, so callers can make retry/status decisions without guessing from an empty array.
  • Kept direct deliverOutboundPayloads(...) as deprecated compatibility substrate instead of making it the public shape for new code.
  • Migrated bundled send/turn paths to the durable message lifecycle and away from deprecated direct delivery/reply-dispatch helpers.
  • Added check:deprecated-message-api, wired into check:architecture, so production code cannot drift back to the deprecated message APIs outside compatibility shims.
  • Avoided introducing parallel convenience wrappers; the API remains sendDurableMessageBatch(...) rather than adding sendDurableMessageBatchResults(...), sendDurableMessageBatchOrThrow(...), or another turn-dispatch wrapper.

That should cover the needs this PR raised, including the delivery-status follow-ups, but with one canonical message API instead of two overlapping contracts. Closing this branch to keep the public SDK surface clean.

Verification on the replacement refactor:

  • pnpm check:deprecated-message-api
  • OPENCLAW_TESTBOX=0 pnpm tsgo:core
  • OPENCLAW_TESTBOX=0 pnpm tsgo:extensions
  • focused message send/hooks + QA/Nextcloud Talk/IRC/Line tests
  • pnpm plugin-sdk:api:gen && pnpm plugin-sdk:api:check
  • pnpm check:docs
  • pnpm check:architecture
  • Testbox pnpm check:changed: https://github.com/openclaw/openclaw/actions/runs/25590359552

@steipete steipete closed this May 9, 2026
@Kaspre Kaspre deleted the fix/delivery-outcome-metadata branch May 15, 2026 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: slack Channel integration: slack gateway Gateway runtime proof: supplied External PR includes structured after-fix real behavior proof. size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(delivery): deliverOutboundPayloads return type cannot distinguish hook cancellations from failures

2 participants