fix(outbound): preserve retries for budget-deferred deliveries #91241
Codex review: passed. Reviewed June 7, 2026, 9:41 PM ET / 01:41 UTC.

Summary
PR surface: Source -11, Tests -1. Total -12 across 2 files.
Reproducibility: yes, at source level: current main reaches
Review metrics: none identified.

Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Review details
Best possible solution: Land this narrow outbound queue accounting fix and keep the linked Feishu transport-readiness issue open for the remaining startup/readiness behavior.
Do we have a high-confidence way to reproduce the issue? Yes, at source level: current main reaches
Is this the best way to solve the issue? Yes. Removing the no-attempt
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 8b03fd1f5f83.
@clawsweeper automerge
🦞✅ The automerge loop is complete.
f3fa2a0 to aff2b9d
…law#91241)

Summary:
- The branch removes the budget-deferred `failDelivery` path from outbound recovery and updates the `maxRecoveryMs` regression expectation so unattempted deliveries keep retry counts at zero.
- PR surface: Source -11, Tests -1. Total -12 across 2 files.
- Reproducibility: yes. at source level: current main reaches `failDelivery` from the exhausted recovery-budge ... in this read-only review, but the PR body also supplies terminal output showing the after-fix queue state.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(outbound): preserve retries for budget-deferred deliveries

Validation:
- ClawSweeper review passed for head aff2b9d.
- Required merge gates passed before the squash merge.

Prepared head SHA: aff2b9d
Review: openclaw#91241 (comment)

Co-authored-by: 宇宙熊Yzx <53250620+849261680@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary
`failDelivery`.
`maxRecoveryMs` exhaustion does not consume retry budget.

Addresses the budget-deferral retry-accounting part of #91212. The broader Feishu transport-readiness restart behavior remains intentionally open for maintainer follow-up.
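For readers without the source open, the accounting rule described in this summary can be sketched roughly as follows. This is a simplified, synchronous in-memory model written only for illustration: `QueuedDelivery`, `RecoveryResult`, `recoverPending`, and the `deferredByBudget` counter are hypothetical names, not the actual types or functions in `delivery-queue-recovery.ts`. The one point it is meant to show is that an entry skipped because the wall-clock budget ran out is left untouched, while an entry whose delivery was actually attempted and failed has its retry metadata updated.

```typescript
// Hypothetical model for illustration; not the real OpenClaw queue types.
interface QueuedDelivery {
  id: string;
  retryCount: number;
  lastAttemptAt: number | null;
}

interface RecoveryResult {
  recovered: number;
  failed: number;
  deferredByBudget: number; // illustrative counter, assumed for this sketch
}

// A delivery callback that throws when the attempt fails.
type Deliver = (entry: QueuedDelivery) => void;

function recoverPending(
  queue: QueuedDelivery[],
  deliver: Deliver,
  opts: { maxRecoveryMs: number; now?: () => number },
): RecoveryResult {
  const now = opts.now ?? Date.now;
  const deadline = now() + opts.maxRecoveryMs;
  const result: RecoveryResult = { recovered: 0, failed: 0, deferredByBudget: 0 };

  // Iterate over a copy so delivered entries can be removed from the queue.
  for (const entry of [...queue]) {
    if (now() >= deadline) {
      // Budget exhausted: no attempt ran, so retryCount and lastAttemptAt
      // are deliberately left unchanged and the entry stays pending.
      result.deferredByBudget += 1;
      continue;
    }
    try {
      deliver(entry);
      queue.splice(queue.indexOf(entry), 1);
      result.recovered += 1;
    } catch {
      // A real attempt failed: this is the only path that consumes a retry.
      entry.retryCount += 1;
      entry.lastAttemptAt = now();
      result.failed += 1;
    }
  }
  return result;
}
```

In this model, calling `recoverPending(queue, deliver, { maxRecoveryMs: 0 })` never invokes `deliver` and leaves every entry's retry metadata as it was, which mirrors the behavior the PR describes for budget-deferred deliveries.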
Verification
- `node scripts/run-vitest.mjs src/infra/outbound/delivery-queue.recovery.test.ts`: 1 file passed, 22 tests passed.
- `pnpm format:check src/infra/outbound/delivery-queue-recovery.ts src/infra/outbound/delivery-queue.recovery.test.ts`: passed.
- `node scripts/run-oxlint.mjs src/infra/outbound/delivery-queue-recovery.ts src/infra/outbound/delivery-queue.recovery.test.ts`: passed.
- `.agents/skills/autoreview/scripts/autoreview --mode local`: clean, no accepted/actionable findings.

Scope note: this intentionally fixes only the source-proven retry-accounting bug for unattempted budget deferrals. It does not add a fixed startup sleep, reset retry counts across restarts, or change Feishu transport-readiness policy.
Real behavior proof
Behavior addressed: pending outbound deliveries that are deferred because the startup recovery wall-clock budget is exhausted no longer have `retryCount` or `lastAttemptAt` changed when no delivery attempt ran.

Real environment tested: Local OpenClaw source checkout using the production delivery queue APIs with a temporary state directory and real queue persistence.
Exact steps or command run after this patch: a `node --import tsx` script that enqueued two Feishu outbound deliveries, called `recoverPendingDeliveries({ maxRecoveryMs: 0 })`, and printed the pending queue state before and after recovery.

Evidence after fix: Captured terminal output:
{
  "result": { "recovered": 0, "failed": 0, "skippedMaxRetries": 0, "deferredBackoff": 0 },
  "before": [
    { "id": "f8481ba9-9b34-4ee6-94ec-84f2dc2d791a", "retryCount": 0, "lastAttemptAt": null },
    { "id": "1f442292-c726-4591-8387-540b440e863e", "retryCount": 0, "lastAttemptAt": null }
  ],
  "after": [
    { "id": "f8481ba9-9b34-4ee6-94ec-84f2dc2d791a", "retryCount": 0, "lastAttemptAt": null },
    { "id": "1f442292-c726-4591-8387-540b440e863e", "retryCount": 0, "lastAttemptAt": null }
  ],
  "deliverCallCount": 0,
  "warning": "Recovery time budget exceeded — remaining entries deferred to next startup"
}

Observed result after fix: no delivery function was called, both queued entries remained pending, and their retry metadata stayed unchanged.
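The claim that retry metadata stayed unchanged can also be checked mechanically against the captured output. The helper below is a small sketch, not part of the PR: `QueueSnapshot` and `changedRetryMetadata` are hypothetical names, and the type of `lastAttemptAt` when it is set is an assumption, since the captured output only shows `null`.

```typescript
// Hypothetical snapshot shape matching the fields printed in the captured output.
// Assumption: lastAttemptAt is a number or string when set; only null appears above.
interface QueueSnapshot {
  id: string;
  retryCount: number;
  lastAttemptAt: number | string | null;
}

// Returns the ids of entries whose retry metadata differs between the two
// snapshots, or which are missing from the "after" snapshot entirely.
function changedRetryMetadata(before: QueueSnapshot[], after: QueueSnapshot[]): string[] {
  const afterById = new Map<string, QueueSnapshot>();
  for (const entry of after) {
    afterById.set(entry.id, entry);
  }

  const changed: string[] = [];
  for (const b of before) {
    const a = afterById.get(b.id);
    if (
      a === undefined ||
      a.retryCount !== b.retryCount ||
      a.lastAttemptAt !== b.lastAttemptAt
    ) {
      changed.push(b.id);
    }
  }
  return changed;
}
```

Fed the `before` and `after` arrays from the captured output, this returns an empty array, which is the same conclusion as the observed result stated above.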
What was not tested: a live Feishu WebSocket restart scenario; this PR avoids transport readiness policy and fixes the generic queue retry-accounting path that is source-reproducible.