fix(cli): surface durable delivery status#80151

Merged
steipete merged 7 commits into openclaw:main from Kaspre:fix/durable-command-delivery-status
May 10, 2026

@Kaspre Kaspre commented May 10, 2026

Contributor

Summary

The new optional deliveryStatus object distinguishes sent, suppressed, partial_failed, and failed delivery outcomes, including per-payload outcomes when durable delivery provides them. deliverySucceeded remains the legacy retry-marker compatibility field: sent and suppressed are terminal no-retry results, while partial_failed and failed do not clear retry state. One deliberate tightening: the compatibility boolean now follows the final durable send status rather than incidental onError callbacks, so non-fatal delivery hiccups do not incorrectly mark a final sent/suppressed outcome as retryable.
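
The retry semantics described above can be sketched roughly as follows. This is an illustration only: the names `DeliveryOutcome` and `legacyDeliverySucceeded` are hypothetical and do not come from the PR; only the four status strings and the terminal/non-terminal split are taken from the summary.

```typescript
// Hypothetical names; only the four status strings come from the PR summary.
type DeliveryOutcome = "sent" | "suppressed" | "partial_failed" | "failed";

// "sent" and "suppressed" are terminal no-retry results.
// "partial_failed" and "failed" leave retry state in place.
function legacyDeliverySucceeded(status: DeliveryOutcome): boolean {
  return status === "sent" || status === "suppressed";
}

const outcomes: DeliveryOutcome[] = ["sent", "suppressed", "partial_failed", "failed"];
for (const status of outcomes) {
  console.log(`${status}: deliverySucceeded=${legacyDeliverySucceeded(status)}`);
}
```

The point of the sketch is that the boolean is derived from the final status alone, so an incidental error callback cannot flip a terminal outcome back to retryable.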

Value proposition

Automation can now tell what happened after an agent produced payloads. Scripts and supervisors using openclaw agent --json --deliver can distinguish successful sends, intentional hook suppressions, partial sends, preflight failures, and hard delivery failures without parsing human stderr or guessing from a coarse boolean.

Anticipated consumers

  • CLI automation using openclaw agent --json --deliver.
  • Gateway/agent callers that need a structured retry/no-retry decision.
  • Session-final-delivery cleanup logic that currently relies on deliverySucceeded.
  • Hook authors and channel maintainers who need hook cancellations surfaced as intentional suppressions rather than ambiguous delivery failures.
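
As a rough illustration of how a consumer from the list above might use the new field, the sketch below parses captured `--json --deliver` output and derives a coarse decision. The envelope interface, the function name `decideFromAgentJson`, and the decision labels are assumptions made for this example; only the top-level `deliveryStatus.status` field and its four values come from this PR.

```typescript
// Assumed envelope shape: only deliveryStatus.status is taken from the PR.
interface AgentJsonEnvelopeSketch {
  deliveryStatus?: {
    status: "sent" | "suppressed" | "partial_failed" | "failed";
  };
}

// Hypothetical decision labels for a supervising script.
type Decision = "no_retry" | "retry_candidate" | "unknown";

function decideFromAgentJson(stdout: string): Decision {
  const envelope = JSON.parse(stdout) as AgentJsonEnvelopeSketch;
  const status = envelope.deliveryStatus?.status;
  if (status === "sent" || status === "suppressed") return "no_retry";
  if (status === "partial_failed" || status === "failed") return "retry_candidate";
  // deliveryStatus is optional, so its absence is reported rather than guessed at.
  return "unknown";
}

console.log(decideFromAgentJson('{"deliveryStatus":{"status":"suppressed"}}'));
```

A partial failure is labeled `retry_candidate` rather than `retry` because a supervisor may want to inspect per-payload outcomes before resending anything.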

Why maintainers should consider this

This follows Peter's guidance from #57843 by anchoring the public status projection on sendDurableMessageBatch, not by extending deprecated deliverOutboundPayloads semantics. The change is additive, documented, and covered by focused tests for sent, suppressed, no-payload, preflight, partial failure, strict JSON failure, and best-effort failure paths.

Verification

  • timeout 600s claude --print --tools "" --no-session-persistence --output-format text < /tmp/claude-review-current.txt (external local review; no blocking findings)
  • pnpm test src/agents/command/delivery.test.ts (16 tests passed; covers sent delivery, hook suppression, no-payload suppression, preflight failure with reason and no errorMessage, partial failure with errorMessage, strict delivery failure with errorMessage, strict preflight failure, and best-effort failure paths)
  • pnpm exec oxfmt --check --threads=1 docs/cli/agent.md
  • git diff --check
  • pnpm check:deprecated-api-usage
  • pnpm tsgo:core
  • pnpm build
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/cli/agent.md src/agents/command/delivery.ts src/agents/command/delivery.test.ts
  • pnpm exec oxlint --tsconfig config/tsconfig/oxlint.core.json src/agents/command/delivery.ts src/agents/command/delivery.test.ts
  • pnpm docs:check-links

pnpm check:docs passed docs formatting, markdown lint, and MDX checks, then failed in docs:check-i18n-glossary on broad existing zh-CN glossary drift across many docs. The failure was not specific to this new deliveryStatus section.

Real behavior proof

  • Behavior or issue addressed: openclaw agent --json --deliver and deliverAgentCommandResult now expose durable delivery status from sendDurableMessageBatch, including sent, suppressed, partial-failed, and failed outcomes, without changing the legacy deliverySucceeded compatibility field shape.
  • Real environment tested: Local OpenClaw source build with OPENCLAW_BUILD_PRIVATE_QA=1, real QA gateway child, qa-channel plugin, Pi runtime, and mock-openai/gpt-5.5 provider. The run used a synthetic QA channel only; no external chat service was contacted.
  • Exact steps or command run after this patch: Gateway RPC equivalent of node dist/index.js agent --agent qa --message 'Finish with exactly delivery-proof-ok.' --json --deliver --reply-channel qa-channel --reply-to dm:delivery-proof --model mock-openai/gpt-5.5 --timeout 120.
  • Evidence after fix: Terminal output from the local QA gateway RPC run:
{
  "waitResult": { "status": "ok" },
  "mockProvider": { "requestCount": 1, "lastRequest": { "model": "gpt-5.5", "providerVariant": "openai" } },
  "payloads": [{ "text": "delivery-proof-ok", "mediaUrl": null }],
  "deliveryStatus": {
    "requested": true,
    "attempted": true,
    "status": "sent",
    "succeeded": true,
    "resultCount": 1,
    "payloadOutcomes": [{ "index": 0, "status": "sent", "resultCount": 1 }]
  },
  "qaChannel": {
    "outboundCount": 1,
    "outbound": [{ "conversation": { "id": "delivery-proof", "kind": "direct" }, "text": "delivery-proof-ok" }]
  }
}
  • Observed result after fix: The gateway turn completed with waitResult.status: "ok", produced the payload text delivery-proof-ok, delivered one QA-channel outbound message, and returned deliveryStatus.status: "sent" with per-payload status: "sent" and resultCount: 1.
  • What was not tested: The public CLI proof command through the same QA gateway still times out in the CLI client's expectFinal/embedded-fallback path even though the equivalent gateway RPC completes and delivers. I am treating that as a separate CLI QA harness/client issue rather than part of this delivery-status projection.
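
For readers who want to consume the evidence output above, the following is a minimal type sketch inferred only from the fields visible in that JSON. The interface and function names are hypothetical, optionality is a guess, and fields mentioned elsewhere in this PR (such as `error`, `reason`, and `errorMessage`) are deliberately left out because their types are not shown here.

```typescript
// Inferred from the evidence JSON only; names and optionality are assumptions.
interface PayloadOutcomeSketch {
  index: number;
  status: string;
  resultCount?: number;
}

interface DeliveryStatusSketch {
  requested: boolean;
  attempted: boolean;
  status: "sent" | "suppressed" | "partial_failed" | "failed";
  succeeded: boolean;
  resultCount?: number;
  payloadOutcomes?: PayloadOutcomeSketch[];
}

// Produce a one-line, human-readable summary of a delivery status object.
function summarizeDelivery(ds: DeliveryStatusSketch): string {
  const perPayload = (ds.payloadOutcomes ?? [])
    .map((outcome) => `#${outcome.index}=${outcome.status}`)
    .join(", ");
  return `${ds.status} (results: ${ds.resultCount ?? 0}; payloads: ${perPayload || "none"})`;
}

// The deliveryStatus object copied from the evidence output above.
const evidence: DeliveryStatusSketch = {
  requested: true,
  attempted: true,
  status: "sent",
  succeeded: true,
  resultCount: 1,
  payloadOutcomes: [{ index: 0, status: "sent", resultCount: 1 }],
};

console.log(summarizeDelivery(evidence)); // prints "sent (results: 1; payloads: #0=sent)"
```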

This PR supersedes draft PRs #53961 and #57755 after #57843 was closed because that approach tried to extend delivery-result semantics on an API that maintainers consider basically deprecated. Instead, this projects the newer sendDurableMessageBatch result model into deliverAgentCommandResult and openclaw agent --json --deliver output.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation agents Agent runtime and tooling size: L triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 10, 2026
clawsweeper Bot commented May 10, 2026

Contributor

Codex review: needs maintainer review before merge.

Summary
Adds optional durable deliveryStatus output to agent delivery results and CLI/Gateway JSON, plus docs, changelog, and focused delivery, gateway, and subagent tests.

Reproducibility: not applicable for the additive JSON status surface. The prior no-visible cleanup concern is source-checkable and now covered by the PR's NO_REPLY/no-visible tests returning deliverySucceeded: true.

Real behavior proof
Sufficient (live_output): The PR body includes copied after-fix live output from a local QA gateway child showing a delivered QA-channel message with deliveryStatus.status: "sent".

Next step before merge
No ClawSweeper repair lane is needed because the current PR head has no discrete actionable review finding; remaining handling is normal PR validation and merge judgment.

Security
Cleared: The diff is limited to agent delivery logic, CLI/Gateway projection, docs, changelog, generated baseline hash, and tests with no dependency, workflow, permission, install, or secret-handling changes.

Review details

Best possible solution:

Keep the durable-send-based projection, preserve deliverySucceeded only for terminal sent/suppressed outcomes, and land after required validation is complete.

Do we have a high-confidence way to reproduce the issue?

Not applicable for the additive JSON status surface. The prior no-visible cleanup concern is source-checkable and now covered by the PR's NO_REPLY/no-visible tests returning deliverySucceeded: true.

Is this the best way to solve the issue?

Yes. Projecting sendDurableMessageBatch status at the agent delivery and CLI/Gateway JSON boundary is the narrow maintainable solution, and it avoids extending the older coarse delivery boolean beyond compatibility cleanup semantics.

What I checked:

  • PR head projects durable send outcomes: deliveryStatusFromDurableSend maps sent, suppressed, partial_failed, and failed, and deliverAgentCommandResult emits the JSON envelope after delivery so deliveryStatus is available to --json --deliver callers. (src/agents/command/delivery.ts:158, c515d17f62d1)
  • Prior cleanup blocker is fixed: The empty outbound-plan branch now treats terminal suppressed no-visible delivery as deliverySucceeded: true, which preserves the cleanup signal expected by the current main pending-final-delivery gate. (src/agents/command/delivery.ts:507, c515d17f62d1)
  • Current main cleanup contract: Current main persists pending final delivery before send and clears it only when deliveryResult.deliverySucceeded === true, so the PR's preserved success bit is the right compatibility hook. (src/agents/agent-command.ts:1316, fcc042559f96)
  • Durable delivery contract supports the PR status model: Current main's durable message result type already distinguishes sent, suppressed, partial_failed, and failed, including payload outcomes and suppression reasons. (src/channels/message/send.ts:72, fcc042559f96)
  • Focused regression coverage: The PR tests cover sent JSON status, hook suppression, partial failure, no-payload suppression, and NO_REPLY/no-visible normalization returning terminal success. (src/agents/command/delivery.test.ts:440, c515d17f62d1)
  • Gateway JSON projection is covered: The PR promotes result.deliveryStatus to top-level gateway CLI JSON and adds a focused test for the promoted envelope shape. (src/commands/agent-via-gateway.ts:132, c515d17f62d1)
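
The cleanup contract described in the checks above (persist pending final delivery before the send, clear it only when `deliverySucceeded === true`) can be pictured with a small sketch. Everything here is a hypothetical stand-in: the in-memory `Set`, the `deliverWithCleanupGate` function, and the synchronous `send` callback are not the code in `src/agents/agent-command.ts`.

```typescript
// Hypothetical in-memory stand-in for the pending-final-delivery store.
const pendingFinalDelivery = new Set<string>();

interface DeliveryResultSketch {
  deliverySucceeded?: boolean;
}

// Record the pending delivery first, then clear it only on an explicit success bit.
function deliverWithCleanupGate(
  sessionId: string,
  send: () => DeliveryResultSketch,
): DeliveryResultSketch {
  pendingFinalDelivery.add(sessionId);
  const result = send();
  if (result.deliverySucceeded === true) {
    pendingFinalDelivery.delete(sessionId);
  }
  return result;
}

deliverWithCleanupGate("session-a", () => ({ deliverySucceeded: true }));
deliverWithCleanupGate("session-b", () => ({ deliverySucceeded: false }));
console.log([...pendingFinalDelivery]); // only "session-b" is still pending
```

Under this model, a terminal suppressed delivery that reported `deliverySucceeded: false` would stay pending indefinitely, which is why the empty outbound-plan branch returning `true` matters.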

Likely related people:

  • steipete: Current-main blame for deliverAgentCommandResult, the pending-final-delivery cleanup gate, and the durable message result contract points to Peter Steinberger; Peter also authored the latest PR-branch follow-ups that fixed the prior review blocker. (role: recent delivery API and agent delivery area contributor; confidence: high; commits: 957ed7050161, 0451a9fb312c, 5a61eb6ffe21; files: src/agents/command/delivery.ts, src/agents/agent-command.ts, src/channels/message/send.ts)

Remaining risk / open question:

  • Two Critical Quality check runs were still in progress in the check-runs snapshot; merge should wait for required checks to finish.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fcc042559f96.

Kaspre commented May 10, 2026

Contributor Author

@clawsweeper re-review

Follow-up pushed in 48a90187a6972e208e8872516fc1580cbfc630ec:

  • Fixed the docs finding by splitting error and errorMessage; preflight failures now document error + reason without errorMessage.
  • Re-ran pnpm test src/agents/command/delivery.test.ts (16 passed), pnpm exec oxfmt --check --threads=1 docs/cli/agent.md, and git diff --check.
  • Added real behavior proof to the PR body. The proof uses the real QA gateway child + qa-channel + Pi runtime + mock-openai/gpt-5.5, and observed deliveryStatus.status: "sent" plus one outbound synthetic QA channel message.

The field-gating behavior is covered by the focused delivery tests; the live QA gateway proof covers end-to-end success-path durable delivery surfacing.

@Kaspre Kaspre force-pushed the fix/durable-command-delivery-status branch from 48a9018 to 6377864 Compare May 10, 2026 08:50

Kaspre commented May 10, 2026

Contributor Author

Rebased onto current main to resolve the CHANGELOG.md conflict. New head is 6377864dba9ce3be87016e5e806a8506f830b541.

Post-rebase local checks:

  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/cli/agent.md
  • git diff --check origin/main...HEAD
  • pnpm test src/agents/command/delivery.test.ts (16 passed)

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 10, 2026
@Kaspre Kaspre force-pushed the fix/durable-command-delivery-status branch from dda5f73 to 4fc66cb Compare May 10, 2026 09:12

Kaspre commented May 10, 2026

Contributor Author

Rebased onto current main (bb1ca7502a) and force-pushed with lease. New head is 4fc66cb1114697b7ed44d5bbe4779d2962437b97.

Follow-ups handled:

  • Fixed the CI check-lint failure by adding an exhaustive never return after deliveryStatusFromDurableSend's durable-send status switch.
  • Updated the PR body's Real behavior proof section to the repository's required field format (behavior, environment, steps, evidence, observed result, not tested).
  • Re-ran the required local Claude CLI review before pushing; no blocking findings.
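
The lint fix mentioned in the first bullet above follows a common TypeScript pattern, sketched here in isolation. The type and function names are hypothetical and the returned strings are placeholders; this is not the body of `deliveryStatusFromDurableSend`.

```typescript
// Hypothetical names; the status strings are the four listed in this PR.
type DurableSendStatus = "sent" | "suppressed" | "partial_failed" | "failed";

function describeStatus(status: DurableSendStatus): string {
  switch (status) {
    case "sent":
      return "delivered";
    case "suppressed":
      return "intentionally not delivered";
    case "partial_failed":
      return "some payloads failed";
    case "failed":
      return "delivery failed";
  }
  // Exhaustiveness guard: if a new status is added to the union without a case,
  // `status` is no longer `never` here and this assignment fails to compile.
  const unhandled: never = status;
  return unhandled;
}

console.log(describeStatus("partial_failed"));
```

The trailing `never` return gives the function an explicit final return for lint rules that require one, while keeping the compiler responsible for catching unhandled statuses.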

Local verification after the final rebase:

  • git diff --check origin/main...HEAD
  • pnpm exec oxlint --tsconfig config/tsconfig/oxlint.core.json src/agents/command/delivery.ts src/agents/command/delivery.test.ts
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/cli/agent.md src/agents/command/delivery.ts src/agents/command/delivery.test.ts
  • pnpm test src/agents/command/delivery.test.ts

GitHub checks on the new head are now green, including Real behavior proof, check-lint, aggregate check, and the Critical Quality shards.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@steipete steipete force-pushed the fix/durable-command-delivery-status branch from 4fc66cb to 3133ebf Compare May 10, 2026 10:28
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime commands Command implementations and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 10, 2026
@steipete steipete force-pushed the fix/durable-command-delivery-status branch from 5d89d4c to e7f9493 Compare May 10, 2026 12:37
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@steipete steipete merged commit 5e0c149 into openclaw:main May 10, 2026
115 of 116 checks passed
@steipete

Contributor

Landed via rebase onto main.

  • Gate: GitHub checks green on c515d17f62d1ac90ff12b7490c59d788c300c964; local pnpm test src/agents/command/delivery.test.ts src/agents/agent-command.live-model-switch.test.ts, env -u OPENCLAW_TESTBOX -u OPENCLAW_TESTBOX_ID pnpm check:changed, pnpm plugin-sdk:api:check, and pnpm docs:check-links passed during fixup.
  • Source head: c515d17
  • Landed commit: 5e0c149

Thanks @Kaspre!

Labels

agents Agent runtime and tooling commands Command implementations docs Improvements or additions to documentation gateway Gateway runtime proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. size: L

2 participants