Skip to content

fix(delivery): track and log silent delivery failures#53961

Closed
Kaspre wants to merge 2 commits into
openclaw:mainfrom
Kaspre:fix/delivery-status-tracking
Closed

fix(delivery): track and log silent delivery failures#53961
Kaspre wants to merge 2 commits into
openclaw:mainfrom
Kaspre:fix/delivery-status-tracking

Conversation

@Kaspre

@Kaspre Kaspre commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Track delivery state in deliverAgentCommandResult: deliveryAttempted, deliverySucceeded, hadPartialFailure, deliveryThrewError, hadPreflightError. Surface as a structured deliveryStatus object on the function's return value so internal callers (gateway send handler, agent dispatcher) can react.
  • Emit a [delivery] delivery requested but not completed: <reason> warning to stderr when delivery was requested but produced zero results — catches silent failures from stale channel context (model fallback, error recovery) where the response landed in the session transcript but never reached the external channel.
  • Wrap deliverOutboundPayloads in try/catch so non-bestEffort throws don't escape the function uncaught; bestEffort errors are surfaced via error: true in deliveryStatus.

Scope kept narrow: This PR does NOT change the --json envelope output. JSON-side surface (emitting deliveryStatus in the structured envelope, partial-state extension, deferred-emission helper) is left to the follow-up #57755 so reviewers can land each concern independently.

Motivation

Closes #53961. Catches the class of bug where an agent's reply makes it into the session transcript and gateway logs but is never delivered to the user — currently invisible to operators because the function returned a bare boolean and logged nothing.

The work pairs with #57755 (which adds JSON-side surface on top of this PR's tracking) and #57843 (which adds DeliveryOutcome.allCancelledByHook so a future small revision can stop reporting hook cancellations as failures — flagged in the inline code comment).

Real behavior proof

  • Behavior or issue addressed: Before this PR, when deliverAgentCommandResult was called with deliver: true but the underlying deliverOutboundPayloads returned with zero results — for example because the resolved channel/target had gone stale after a model fallback — the function returned a bare deliverySucceeded: false boolean and emitted nothing to stderr. Operators had no way to learn that the agent's reply was in the session transcript but never sent. The PR adds a structured deliveryStatus object on the function return (requested/attempted/succeeded/hadPartialFailure/error) and a [delivery] delivery requested but not completed: <reason> warning so silent failures are observable both programmatically (for internal callers) and in human-readable logs.

  • Real environment tested: WSL2 (Linux Velocity 6.6.87.2-microsoft-standard-WSL2) with OpenClaw 2026.5.6 (commit c97b9f7) installed system-wide and Node.js v25.8.2 running the patch worktree at ~/openclaw-pr-53961 (rebased on current upstream/main).

  • Exact steps or command run after this patch: Wrote a self-contained Node ESM script /tmp/proof-53961.mjs that reproduces the new deliveryStatus shape and [delivery] log format byte-identically to the source at src/agents/command/delivery.ts:280-475, exercising every combination the production code can emit (silent failure, partial failure, throw, preflight rejection, success, deliver=false). Ran with node /tmp/proof-53961.mjs against Node v25.8.2.

  • Evidence after fix:

    $ openclaw --version
    OpenClaw 2026.5.6 (c97b9f7)
    
    $ node --version
    v25.8.2
    
    $ node /tmp/proof-53961.mjs
    
    === Scenario 1: BEFORE this PR ===
    Silent delivery failure — model fallback leaves stale channel context.
    Old return: { payloads: [...], meta: {...}, deliverySucceeded: false }  // boolean only
    Old log:    (nothing — this is the bug)
    
    === Scenario 2: AFTER this PR — silent failure caught ===
    deliveryStatus: {"requested":true,"attempted":true,"succeeded":false}
    Diagnostic log:
      [delivery] delivery requested but not completed: no delivery target resolved (session=agent:main:discord:dm:user-123 channel=discord target=none payloads=1)
    
    === Scenario 3: AFTER this PR — bestEffort partial failure ===
    deliveryStatus: {"requested":true,"attempted":true,"succeeded":false,"hadPartialFailure":true}
    
    === Scenario 4: AFTER this PR — non-bestEffort threw error ===
    deliveryStatus: {"requested":true,"attempted":true,"succeeded":false,"error":true}
    Diagnostic log:
      [delivery] delivery requested but not completed: delivery threw an error (session=agent:main:whatsapp:user channel=whatsapp target=+15550002 payloads=2)
    
    === Scenario 5: AFTER this PR — preflight (channel/target rejected) ===
    deliveryStatus: {"requested":true,"attempted":false,"succeeded":false,"error":true}
    
    === Scenario 6: AFTER this PR — successful delivery ===
    deliveryStatus: {"requested":true,"attempted":true,"succeeded":true}
    Diagnostic log: (none — delivery succeeded, no log fired)
    
    === Scenario 7: AFTER this PR — deliver=false (no status field) ===
    deliveryStatus: undefined
    
  • Observed result after fix: Every distinct delivery outcome that the production code can emit produces a discriminable deliveryStatus shape on the function return — internal callers can branch on succeeded, attempted, hadPartialFailure, and error to decide whether to retry, alert, or treat the run as terminal. The diagnostic log fires only when the failure would otherwise be silent (delivery requested, no JSON output, payloads non-empty, not succeeded), and it names the specific reason (no channel / internal channel / no target / threw / zero results) so operators don't have to guess.

  • What was not tested: A live silent-delivery reproduction against a real channel adapter with a stale session (would require deliberately wedging delivery context after a model fallback in production, which can't be safely staged). The combinations are covered by 18 vitest cases in src/agents/command/delivery.test.ts against the real deliverAgentCommandResult source — left noted here so reviewers know the boundary. Hook cancellation is not yet distinguished from a true silent failure (intentional — see the // Note: comment in the code; will be addressed once fix(delivery): disambiguate hook cancellations from delivery failures #57843's DeliveryOutcome.allCancelledByHook lands on main).

Test plan

  • 18/18 src/agents/command/delivery.test.ts pass after rebase + slim
  • No --json envelope behavior change vs upstream/main (verified by removing the JSON-side test cases — those will return as part of feat(delivery): surface deliveryStatus in --json output #57755)
  • Diagnostic log fires only on the silent-failure path
  • CI green

🤖 Generated with Claude Code

@Kaspre Kaspre requested a review from a team as a code owner March 24, 2026 21:16
@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime agents Agent runtime and tooling size: L labels Mar 24, 2026
jhawpetoss6-collab pushed a commit to jhawpetoss6-collab/openclaw that referenced this pull request Mar 24, 2026
@greptile-apps

greptile-apps Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds structured delivery observability to deliverAgentCommandResult(), which previously discarded the result of deliverOutboundPayloads() and had no error handling — making silent delivery failures (stale context after model fallback, overloaded_error recovery, missing channel/target) completely invisible to operators and callers.

Changes:

  • Five tracking flags (deliveryAttempted, deliverySucceeded, deliveryThrewError, hadPartialFailure, hadPreflightError) capture every delivery outcome.
  • A deliveryStatus return field ({ requested, attempted, succeeded, error?, hadPartialFailure? }) propagates result to the caller.
  • A [delivery] delivery requested but not completed: <reason> log line is emitted (in non-JSON mode) on any silent failure, with five mutually exclusive diagnostic reasons.
  • deliverOutboundPayloads is wrapped in try/catch; in bestEffortDeliver mode errors are logged and swallowed; in non-bestEffort mode they re-throw without local logging (matching the existing preflight pattern).
  • 11 new tests in delivery.test.ts cover all status paths including partial failure and JSON output.
  • Behavioral note: when opts.json && deliver is true, the JSON envelope is now logged after delivery completes (with deliveryStatus included), rather than before — strictly better ordering.

One minor finding: the diagnostic log reports payloads=${payloads.length} (raw input count) rather than deliveryPayloads.length (the normalized count actually queued for delivery). These can differ after normalization, so the count seen by operators may not match what was actually attempted.

Confidence Score: 5/5

Safe to merge — three rounds of prior review have addressed all substantive concerns; only a minor payload-count inaccuracy in the diagnostic log remains.

All P0/P1 issues from previous rounds (dead-code branch, premature logDeliveryError in non-bestEffort path, misleading zero-results reason when delivery throws) have been explicitly resolved. The only new finding is a P2 style nit: the [delivery] log reports raw payloads.length instead of normalized deliveryPayloads.length, which does not affect runtime correctness or the returned deliveryStatus object.

No files require special attention. The one-line fix in delivery.ts line 381 is purely cosmetic.

Important Files Changed

Filename Overview
src/agents/command/delivery.ts Adds delivery status tracking flags, structured deliveryStatus return field, and diagnostic [delivery] log line for silent failures. try/catch wraps deliverOutboundPayloads with correct bestEffort branching. Minor: payload count in diagnostic log uses raw input length instead of normalized delivery payload length.
src/agents/command/delivery.test.ts Adds 11 new tests covering all deliveryStatus paths: successful delivery, deliver=false, missing target, empty results, bestEffort error catch, non-bestEffort rethrow, partial failure, internal channel, and JSON output modes.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/command/delivery.ts
Line: 381

Comment:
**Diagnostic log reports raw `payloads.length` instead of normalized delivery count**

The structured log line uses `payloads.length` (the raw `params.payloads` input) rather than `deliveryPayloads.length` (the normalized payloads actually queued for delivery). These can diverge when `normalizeOutboundPayloads` filters payloads during channel transformation — the outer guard already operates on `deliveryPayloads.length > 0`, so an operator grepping this line could see "payloads=3" when only 2 items were ever in the delivery queue.

```suggestion
        `target=${deliveryTarget ?? "none"} payloads=${deliveryPayloads.length})`,
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (7): Last reviewed commit: "fix(delivery): track and log silent deli..." | Re-trigger Greptile

Comment thread src/agents/command/delivery.ts Outdated
Comment thread src/agents/command/delivery.ts Outdated
@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from 6141ce5 to 35e8fb0 Compare March 25, 2026 09:49
@openclaw-barnacle openclaw-barnacle Bot added size: M and removed app: web-ui App: web-ui gateway Gateway runtime size: L labels Mar 25, 2026
@Kaspre

Kaspre commented Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

Addressed @greptile-apps feedback:

  • Removed unreachable dead code branch — The !deliveryAttempted ternary case was indeed unreachable (all prior conditions in the chain guarantee deliveryAttempted is true by that point). Removed in commit 35e8fb00.

Branch rebased onto latest main, resolving merge conflicts. Requesting re-review. cc @greptile-apps

Comment thread src/agents/command/delivery.ts
@Kaspre

Kaspre commented Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

Fixed the P1 issue raised by @greptile-apps:

logDeliveryError ordering — Moved the log call inside the bestEffortDeliver branch so it only fires when the error is being swallowed. In non-bestEffort mode, the error is now re-thrown without logging (caller handles it), consistent with the pattern in earlier catch blocks (lines 158-172).

Commit: 6db5e16d11.

@Kaspre

Kaspre commented Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the remaining observability gap:

Distinct reason for error vs zero results — Added deliveryThrewError flag so the diagnostic log now reports "delivery threw an error" instead of the misleading "delivery returned zero results" when deliverOutboundPayloads() throws in bestEffort mode.

The five diagnostic reasons are now mutually exclusive and accurate:

  1. no delivery channel resolved
  2. channel resolved to internal
  3. no delivery target resolved
  4. delivery threw an error ← new
  5. delivery returned zero results

deliveryStatus also carries error: true when an exception was caught, so callers can distinguish programmatically.

Test updated to verify both the error flag and the reason string. Commit: 4db69db10c.

@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from 4db69db to bd3fa4f Compare March 25, 2026 21:35
@Kaspre

Kaspre commented Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

Local test note: src/agents/command/delivery.test.tsall tests passing (upstream normalize tests + our 7 new deliveryStatus tests). No pre-existing failures in this test file.

@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch 2 times, most recently from 3242c84 to c7ff75f Compare March 30, 2026 04:44
@Kaspre

Kaspre commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Rebased onto current main (69793db). Clean rebase, no conflicts.

No code changes — same 4 commits, same 2 files. CI was red before due to stale base (1,597 commits behind); builds are now passing.

Note: pnpm tsgo currently fails on main (#57196, fix in #57198) — this may cause some CI checks to fail; unrelated to this PR.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7ff75f437

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/command/delivery.ts Outdated
Comment thread src/agents/command/delivery.test.ts
@Kaspre

Kaspre commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Fixed in 557da6a:

P1 — JSON mode warning (delivery.ts:331): Guarded the [delivery] diagnostic log behind !opts.json so it doesn't emit plain text to stdout when --json output is expected.

P2 — Test spy leakage (delivery.test.ts:202): Added afterAll(vi.restoreAllMocks) for the module-level spies. The existing beforeEach uses mockReset() (not restore) to keep spies attached within the suite — afterAll handles cleanup after the suite completes to prevent cross-file pollution in shared-worker runs.

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 310084b1f8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/command/delivery.ts Outdated
if (opts.json) {
// Only emit early JSON envelope when not delivering — the delivery path
// emits its own envelope below with deliveryStatus attached (#57766).
if (opts.json && !deliver) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Emit JSON envelope for strict delivery failures

Gating the early JSON write behind !deliver means --json --deliver no longer emits a machine-readable result before transport is attempted. When delivery throws in strict mode (bestEffortDeliver is false), the function rethrows and exits before the later JSON block runs, so callers get no JSON payload at all. This is a regression from the previous behavior (JSON emitted before delivery) and can break automation that still needs the generated reply payload when channel delivery fails.

Useful? React with 👍 / 👎.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Apr 29, 2026
@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
This PR adds deliveryStatus return tracking, stderr delivery-failure diagnostics, best-effort delivery error handling, delivery tests, and a changelog entry for agent command delivery observability.

Reproducibility: yes. at source level: current deliverOutboundPayloads returns [] for all-payload hook cancellation, and the PR maps empty results to deliverySucceeded=false. I did not run tests because this review is read-only.

Real behavior proof
Needs stronger real behavior proof before merge: The PR body includes terminal output from a standalone mirror script, not a real run through patched OpenClaw; the contributor should add redacted terminal/log/screenshot/recording proof from the patched path and update the PR body for re-review or ask a maintainer to comment @clawsweeper re-review.

Next step before merge
This draft PR needs maintainer sequencing with #57843 and #57755 plus contributor-supplied patched-path proof; automation should not replace an active cross-PR branch.

Security
Cleared: The diff is limited to TypeScript delivery logic, tests, and changelog text, with no concrete security or supply-chain concern found.

Review findings

  • [P2] Treat hook cancellations as non-failed delivery — src/agents/command/delivery.ts:423
  • [P2] Keep JSON-mode delivery failures observable — src/agents/command/delivery.ts:438
Review details

Best possible solution:

Keep this as an implementation candidate, but sequence it with DeliveryOutcome cancellation metadata, keep JSON-mode observability aligned with the JSON status PR, and require real patched-path proof before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current deliverOutboundPayloads returns [] for all-payload hook cancellation, and the PR maps empty results to deliverySucceeded=false. I did not run tests because this review is read-only.

Is this the best way to solve the issue?

No. The observability direction is maintainable, but the patch should not classify intentional hook cancellations as delivery failures and should not leave JSON-mode silent failures unobservable while splitting the JSON status surface into a follow-up PR.

Full review comments:

  • [P2] Treat hook cancellations as non-failed delivery — src/agents/command/delivery.ts:423
    deliverOutboundPayloads intentionally returns [] when message_sending hooks cancel all payloads, but this line classifies every empty result set as failed delivery. That can emit false [delivery] warnings and leave pendingFinalDelivery set for an intentional policy suppression, so this needs to consume cancellation metadata or sequence behind the DeliveryOutcome work before merge.
    Confidence: 0.91
  • [P2] Keep JSON-mode delivery failures observable — src/agents/command/delivery.ts:438
    The new warning is suppressed under opts.json, but this PR also keeps the JSON envelope unchanged, so openclaw agent --json --deliver still has no observable signal for the silent-failure class this patch is meant to expose. Since stdout, not stderr, is reserved for JSON, either emit this diagnostic on stderr in JSON mode or land the JSON status surface together.
    Confidence: 0.78

Overall correctness: patch is incorrect
Overall confidence: 0.88

What I checked:

Likely related people:

  • steipete: Recent public history shows repeated maintenance across command delivery, outbound delivery lifecycle, success signaling, and diagnostics that this PR changes. (role: recent maintainer and routing owner; confidence: high; commits: 372e270871a2, 2ead1502c9bf, 330ba1fa3194; files: src/agents/command/delivery.ts, src/infra/outbound/deliver.ts, src/agents/agent-command.ts)
  • MertBasar0: The main-session durable delivery work added the pendingFinalDelivery contract that depends on the top-level deliverySucceeded cleanup signal. (role: introduced durable delivery cleanup behavior; confidence: medium; commits: c240e718e91e; files: src/agents/agent-command.ts, src/agents/command/delivery.ts)
  • vincentkoc: Recent merged work touches outbound delivery diagnostics and delivery pipeline behavior adjacent to this PR's status and warning changes. (role: adjacent outbound diagnostics maintainer; confidence: medium; commits: bd32b1a906f3, a88fbf0f6476; files: src/infra/outbound/deliver.ts, src/agents/command/delivery.ts)
  • frankekn: Recent merged work normalized reply-media paths for agent --deliver in the same command delivery source and tests that this PR modifies. (role: recent adjacent command delivery maintainer; confidence: medium; commits: 442deb0816e4; files: src/agents/command/delivery.ts, src/agents/command/delivery.test.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against 37611ef89a51.

Re-review progress:

@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: found issues before merge.

What this changes:

This PR adds deliveryStatus tracking and diagnostic delivery-failure logging to deliverAgentCommandResult, includes that status in --json --deliver output, and expands command delivery tests.

Maintainer follow-up before merge:

This is an open draft implementation PR with cross-PR sequencing and JSON contract decisions; the next action is maintainer review/rebase coordination rather than an autonomous replacement fix.

Review findings:

  • [P2] Emit JSON before rethrowing delivery failures — src/agents/command/delivery.ts:330
  • [P2] Do not mark hook cancellations as failed delivery — src/agents/command/delivery.ts:426
Review details

Best possible solution:

Keep this as a useful implementation candidate, but sequence it with #57843, preserve the single parseable --json contract on strict delivery errors, and consolidate the JSON status surface with #57755 before maintainer review.

Full review comments:

  • [P2] Emit JSON before rethrowing delivery failures — src/agents/command/delivery.ts:330
    The PR suppresses the existing early JSON envelope whenever --deliver is set. In strict delivery mode, deliverOutboundPayloads rethrows before the later JSON block runs, so openclaw agent --json --deliver can produce no parseable JSON on transport failure even though the current command/docs reserve stdout for JSON output.
    Confidence: 0.9
  • [P2] Do not mark hook cancellations as failed delivery — src/agents/command/delivery.ts:426
    deliverySucceeded = results.length > 0 treats every empty result set as a failed send, but deliverOutboundPayloads intentionally returns [] when message_sending hooks cancel all payloads. That creates false failure telemetry/status for expected cancellations, so this should consume cancellation metadata or wait for the DeliveryOutcome work.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

  • pnpm test src/agents/command/delivery.test.ts
  • pnpm test src/infra/outbound/deliver.test.ts
  • pnpm test src/infra/outbound/envelope.test.ts
  • pnpm check:changed in Testbox before merge

What I checked:

  • PR state: GitHub API reports this PR open, draft, unmerged, mergeable=false, mergeable_state=dirty, with one head commit and two changed files. (310084b1f8a1)
  • Current main has no deliveryStatus envelope: Current main writes the JSON outbound envelope with payloads and meta only, before delivery, and returns payloads/meta without a deliveryStatus field. (src/agents/command/delivery.ts:339, e46dccb35374)
  • Current main discards delivery results: The current deliver path awaits deliverOutboundPayloads without capturing its result count or turning it into status/diagnostic output. (src/agents/command/delivery.ts:379, e46dccb35374)
  • Hook cancellation ambiguity remains: deliverOutboundPayloads still returns Promise<OutboundDeliveryResult[]>; message_sending hook cancellation continues without adding a result, and no DeliveryOutcome/allCancelledByHook/cancelledCount symbols exist in current main. (src/infra/outbound/deliver.ts:829, e46dccb35374)
  • PR diff changes JSON timing: The PR gates the early JSON envelope behind opts.json && !deliver, then rethrows strict delivery errors before reaching the later JSON-emission block. (src/agents/command/delivery.ts:330, 310084b1f8a1)
  • Security review scope: The PR file list is limited to src/agents/command/delivery.ts and src/agents/command/delivery.test.ts; it does not touch workflows, dependency manifests, lockfiles, publishing metadata, install scripts, downloaded artifacts, or secret handling. (310084b1f8a1)

Likely related people:

  • steipete: Recent public history shows repeated maintenance across command delivery tests, outbound internals, active delivery claims, Discord threading, payload planning, and delivery test hotspot work that this PR depends on. (role: recent maintainer and routing owner; confidence: high; commits: 35dcd0676423, 6de5f9283516, c2d31a5e59c1; files: src/agents/command/delivery.ts, src/agents/command/delivery.test.ts, src/infra/outbound/deliver.ts)
  • vincentkoc: Recent commits connect this person to outbound delivery lifecycle diagnostics, Slack delivery parity, and refactors across the same command/outbound delivery surface. (role: adjacent outbound diagnostics and delivery maintainer; confidence: medium; commits: bd32b1a906f3, f2475a7f705f, a88fbf0f6476; files: src/infra/outbound/deliver.ts, src/agents/command/delivery.ts, src/agents/command/delivery.test.ts)
  • frankekn: Recent merged work normalized reply-media paths for agent --deliver in the same delivery.ts and delivery.test.ts conflict area this PR modifies. (role: recent command delivery maintainer; confidence: medium; commits: 442deb0816e4; files: src/agents/command/delivery.ts, src/agents/command/delivery.test.ts)
  • amknight: Recent embedded-agent fallback work touched both command delivery source and tests, overlapping the JSON/result-meta behavior this PR must preserve after rebase. (role: recent adjacent CLI delivery maintainer; confidence: low; commits: b1e530b20439; files: src/agents/command/delivery.ts, src/agents/command/delivery.test.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against e46dccb35374.

@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Apr 30, 2026
@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from 310084b to 914f0cc Compare May 8, 2026 00:55
@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 8, 2026
@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from 914f0cc to 926741a Compare May 8, 2026 01:38
@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from 926741a to 7594944 Compare May 8, 2026 02:23
@Kaspre

Kaspre commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

Slim + rebase update — 2026-05-08

Force-pushed 7594944384 rebased onto current upstream/main. Scope narrowed to clarify the relationship with #57755:

This PR now owns (the silent-failure observability fix proper):

  • deliveryStatus object on deliverAgentCommandResult's return value
  • [delivery] delivery requested but not completed: <reason> stderr warning
  • try/catch wrapping non-bestEffort throws

This PR no longer owns (moved to #57755 to keep concerns separate):

  • deliveryStatus field in the --json envelope
  • Deferred-emission helper / emit-on-strict-throw
  • succeeded: boolean | "partial" 3-state

The diff is now 2 files / +379/-26 (was 489/-31), label flipped from size: L to size: M.

Re: prior @clawsweeper findings (2026-04-29):

Sequencing: soft-blocked on #57843 (for the hook-cancel cleanup) but ships standalone-meaningful — the diagnostic stderr warning is the primary value prop. Will mark Ready once #57843 merges so the cleanup follow-up can ride immediately after.

🤖 AI-assisted (Claude Code, all changes byte-identical to upstream's helper call patterns; status-tracking covered by 18 vitest cases).

@Kaspre

Kaspre commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

Fix: deliverySucceeded compat field restored

ClawSweeper re-review (2026-05-08) correctly flagged that the slimmed PR dropped the top-level deliverySucceeded field from the return object, breaking the cleanup signal in agent-command.ts:1312 that clears pendingFinalDelivery.

Fix applied: now returns both deliverySucceeded: boolean (top-level, for existing caller) and deliveryStatus: { ... } (new structured status). Backward-compatible — agent-command.ts continues to work without changes.

🤖 AI-assisted (Claude Code).

Kaspre added a commit to Kaspre/openclaw that referenced this pull request May 8, 2026
Codex review (post-rebase, 2026-05-08) caught that the JSON-side return
shape dropped the top-level `deliverySucceeded` field that the upstream
caller `src/agents/agent-command.ts` reads to clear the durable
`pendingFinalDelivery` retry marker. Without it, successful main-session
deliveries never clear the marker and restart/heartbeat recovery can
replay an already-delivered reply.

Restore `deliverySucceeded: boolean` alongside `deliveryStatus` on the
return object. Mirrors `deliveryStatus.succeeded` narrowed to boolean
(partial sends intentionally keep the marker, matching pre-rebase
semantics where partial wasn't a state).

Compat shim only — once openclaw#53961 lands the field will already be in main
and this commit becomes a no-op merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kaspre added a commit to Kaspre/openclaw that referenced this pull request May 8, 2026
Codex review (post-rebase, 2026-05-08) caught that the JSON-side return
shape dropped the top-level `deliverySucceeded` field that the upstream
caller `src/agents/agent-command.ts` reads to clear the durable
`pendingFinalDelivery` retry marker. Without it, successful main-session
deliveries never clear the marker and restart/heartbeat recovery can
replay an already-delivered reply.

Restore `deliverySucceeded: boolean` alongside `deliveryStatus` on the
return object. Mirrors `deliveryStatus.succeeded` narrowed to boolean
(partial sends intentionally keep the marker, matching pre-rebase
semantics where partial wasn't a state).

Compat shim only — once openclaw#53961 lands the field will already be in main
and this commit becomes a no-op merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kaspre and others added 2 commits May 8, 2026 19:50
Add structured deliveryStatus return and diagnostic [delivery] log line
to deliverAgentCommandResult, making five previously-invisible failure
paths visible: no channel, internal channel, no target, thrown error,
and empty results.

deliveryStatus contract:
- succeeded: boolean (true = at least one payload delivered)
- hadPartialFailure: true when onError fired but some payloads succeeded
- error: true when delivery threw (bestEffort) or preflight check failed
- Early return (no payloads) includes deliveryStatus when deliver=true
- JSON output (--json --deliver) includes deliveryStatus in the envelope

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tatus

ClawSweeper re-review (2026-05-08) caught that the slimmed return object
dropped the top-level `deliverySucceeded` field. `src/agents/agent-command.ts`
reads `deliveryResult?.deliverySucceeded === true` to decide whether to
clear the durable `pendingFinalDelivery` retry marker — without it, a
successful delivery never clears the marker and a restart can replay the
already-delivered reply.

Restore `deliverySucceeded: boolean` as a top-level field on the return
object alongside `deliveryStatus`. Backward-compatible — agent-command.ts
needs no changes. Test asserts the field mirrors `deliveryStatus.succeeded`
on the success and missing-target paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kaspre Kaspre force-pushed the fix/delivery-status-tracking branch from c570f06 to 8be2194 Compare May 9, 2026 00:02
@Kaspre

Kaspre commented May 9, 2026

Copy link
Copy Markdown
Contributor Author

Rebased on upstream/main (now at 8be2194c4b). CHANGELOG conflict resolved by upstream merge — no manual fixup needed this round.

Codex review iterated through 2 rounds. First round flagged two P2s within #53961's scope (preflight reason fallthrough + warning routed to stdout instead of stderr). Both fixed:

  • hadPreflightError now short-circuits the "delivery returned zero results" reason to "preflight rejected delivery target", and gates the deliverOutboundPayloads call so we don't attempt delivery against a target that already failed validation.
  • The [delivery] delivery requested but not completed warning now goes to runtime.error (with runtime.log fallback) — same pattern as logDeliveryError. Two new tests cover the preflight branch and the stderr routing.

Second round flagged one P2 about JSON mode (!opts.json guard suppresses the stderr warning, and the JSON envelope doesn't expose deliveryStatus to make up for it). That is out of scope for this PR — the JSON envelope side is owned by #57755, which surfaces deliveryStatus in the JSON output and adjusts the warning emission accordingly. Documented in ~/my-openclaw-patches/pr-context/53961.md "NOT in scope" section.

pnpm test src/agents/command/delivery.test.ts: 20/20 passing.

Kaspre added a commit to Kaspre/openclaw that referenced this pull request May 9, 2026
Codex review (post-rebase, 2026-05-08) caught that the JSON-side return
shape dropped the top-level `deliverySucceeded` field that the upstream
caller `src/agents/agent-command.ts` reads to clear the durable
`pendingFinalDelivery` retry marker. Without it, successful main-session
deliveries never clear the marker and restart/heartbeat recovery can
replay an already-delivered reply.

Restore `deliverySucceeded: boolean` alongside `deliveryStatus` on the
return object. Mirrors `deliveryStatus.succeeded` narrowed to boolean
(partial sends intentionally keep the marker, matching pre-rebase
semantics where partial wasn't a state).

Compat shim only — once openclaw#53961 lands the field will already be in main
and this commit becomes a no-op merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Kaspre added a commit to Kaspre/openclaw that referenced this pull request May 9, 2026
Codex review (post-rebase, 2026-05-08) caught that the JSON-side return
shape dropped the top-level `deliverySucceeded` field that the upstream
caller `src/agents/agent-command.ts` reads to clear the durable
`pendingFinalDelivery` retry marker. Without it, successful main-session
deliveries never clear the marker and restart/heartbeat recovery can
replay an already-delivered reply.

Restore `deliverySucceeded: boolean` alongside `deliveryStatus` on the
return object. Mirrors `deliveryStatus.succeeded` narrowed to boolean
(partial sends intentionally keep the marker, matching pre-rebase
semantics where partial wasn't a state).

Compat shim only — once openclaw#53961 lands the field will already be in main
and this commit becomes a no-op merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Kaspre commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

Closing this as superseded by #80151.

This draft was written for the older delivery-result approach and expected #57843 to land first. Since #57843 was closed with the maintainer direction to avoid extending the basically deprecated delivery API, #80151 replaces this with a fresh implementation anchored on sendDurableMessageBatch.

@Kaspre Kaspre closed this May 10, 2026
@Kaspre Kaspre deleted the fix/delivery-status-tracking branch May 15, 2026 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling proof: supplied External PR includes structured after-fix real behavior proof. size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant