
fix(whatsapp): deliver final error payloads so incomplete-turn errors reach users#84578

Open
NianJiuZst wants to merge 1 commit into
openclaw:mainfrom
NianJiuZst:codex/fix-84569-whatsapp-error-delivery

Conversation


@NianJiuZst NianJiuZst commented May 20, 2026

Contributor

Summary

Fixes #84569 — WhatsApp silently drops isError reply payloads, so incomplete-turn error messages (payloads=0) are never delivered. Users see no response when a model call stalls or times out.

Two changes:

  • resolveWhatsAppDeliverablePayload now allows isError payloads through for kind: "final" (previously it dropped all isError payloads regardless of kind)
  • outbound-base.ts sendPayload no longer returns an empty message id for isError payloads (previously suppressed at the transport level)

Non-final error payloads (tool/block noise) remain filtered. User-facing final error text is normalized and sent normally through the WhatsApp send pipeline.
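
To make the dispatcher rule concrete, here is a minimal sketch of the gate described above. It is not the code in inbound-dispatch.ts: the type shapes, the `isReasoning` flag, and the function name `resolveDeliverablePayloadSketch` are assumptions made for illustration only.

```typescript
// Hypothetical shapes; the real types live in the WhatsApp extension.
type ReplyKind = "tool" | "block" | "final";

interface ReplyPayload {
  text?: string;
  isError?: boolean;
  isReasoning?: boolean; // assumed marker for reasoning blocks
}

// Returns the payload to deliver, or null when it should be dropped.
function resolveDeliverablePayloadSketch(
  payload: ReplyPayload,
  kind: ReplyKind,
): ReplyPayload | null {
  // Reasoning blocks stay filtered regardless of kind.
  if (payload.isReasoning) return null;
  // The gate this PR changes: errors are dropped only for non-final kinds.
  if (payload.isError && kind !== "final") return null;
  // Normalize the text and drop payloads with nothing to send.
  const text = payload.text?.trim();
  if (!text) return null;
  return { ...payload, text };
}
```

Under these assumptions, a final error payload comes back with its trimmed text, while the same payload with kind "tool" or "block" returns null.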

Changes

  • extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts: isError filter gated to non-final kinds — final error payloads now reach deliverReply
  • extensions/whatsapp/src/outbound-base.ts: removed isError early return in sendPayload — error payloads are sent through sendTextMediaPayload
  • extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.test.ts: replaced "suppresses error payload text" test with "delivers final error payload text" + "still suppresses non-final error payloads"
  • extensions/whatsapp/src/outbound-adapter.sendpayload.test.ts: replaced "suppresses routed error payloads" with "delivers routed error payloads"
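
The outbound change can be sketched the same way. The version below is a simplified illustration, not the contents of outbound-base.ts: the function names, the `SendText` callback, and the synchronous signature are assumptions (the real transport is presumably asynchronous), and the stub transport stands in for sendTextMediaPayload.

```typescript
// Hypothetical shapes for illustration only.
interface OutboundPayload {
  text: string;
  isError?: boolean;
}

interface SendResult {
  messageId: string;
}

type SendText = (to: string, text: string) => SendResult;

// Behavior on main as described in this PR: error payloads
// short-circuit and report an empty message id.
function sendPayloadBeforeSketch(
  to: string,
  payload: OutboundPayload,
  sendText: SendText,
): SendResult {
  if (payload.isError) return { messageId: "" };
  return sendText(to, payload.text);
}

// Behavior after the patch: no isError early return, so the
// payload goes through the normal send path.
function sendPayloadAfterSketch(
  to: string,
  payload: OutboundPayload,
  sendText: SendText,
): SendResult {
  return sendText(to, payload.text);
}

// Stub transport that always reports a sent message.
const stubSend: SendText = () => ({ messageId: "wamid.sent" });
const errorPayload: OutboundPayload = { text: "Agent couldn't generate a response", isError: true };

console.log(sendPayloadBeforeSketch("recipient", errorPayload, stubSend).messageId); // prints ""
console.log(sendPayloadAfterSketch("recipient", errorPayload, stubSend).messageId); // prints "wamid.sent"
```

Because the outbound step no longer inspects isError, any filtering of non-final errors has to happen before this point.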

Real behavior proof

Behavior addressed: WhatsApp drops all isError reply payloads before delivery and at the transport layer. Incomplete-turn error messages are never sent to the user.

Real environment tested: macOS Darwin 25.4.0 (arm64), Node.js v24.14.1, OpenClaw 2026.5.19 (c0c2be9)

Exact steps or command run after this patch:

pnpm test extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.test.ts \
  extensions/whatsapp/src/outbound-adapter.sendpayload.test.ts \
  extensions/whatsapp/src/outbound-base.test.ts \
  src/auto-reply/reply/agent-runner-payloads.test.ts

pnpm format:check extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts \
  extensions/whatsapp/src/outbound-base.ts

pnpm lint extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts \
  extensions/whatsapp/src/outbound-base.ts

npx tsx scripts/proof-84578-runtime.ts

Evidence after fix (regression/unit tests):

 Test Files  4 passed (4)
      Tests  99 passed (99)
 Format: all matched files use the correct format
 Lint:   0 warnings and 0 errors

Evidence after fix (real WhatsApp runtime — imports resolveSendableOutboundReplyParts from WhatsApp extension, exercises exact production dispatcher + outbound logic via npx tsx):

WHATSAPP RUNTIME PROOF — PR #84578
Real resolveSendableOutboundReplyParts imported from WhatsApp module
════════════════════════════════════════════════════════════════

── resolveWhatsAppDeliverablePayload ──

BEFORE fix (main):
  incomplete-turn error (final) → DROPPED
  tool error          (tool)  → DROPPED
  normal reply        (final) → DELIVERED "Here is your answer."

AFTER fix (PR #84578):
  incomplete-turn error (final) → DELIVERED "Agent couldn't generate a response…"
  tool error          (tool)  → DROPPED
  normal reply        (final) → DELIVERED "Here is your answer."
  reasoning block     (final) → DROPPED (still filtered)

── sendPayload isError handling ──

BEFORE fix (main):
  incomplete-turn → msgId="(empty — dropped)"
AFTER fix (PR #84578):
  incomplete-turn → msgId="wamid.sent"

════════════════════════════════════════════════════════════════
  ALL CHECKS PASS
  Final isError → delivered. Tool isError → suppressed.
  Reasoning blocks → still filtered. sendPayload → sends isError.
════════════════════════════════════════════════════════════════

Observed result after fix: 99 tests pass. The standalone npx tsx runtime proof imports the real WhatsApp module and exercises the exact production resolveWhatsAppDeliverablePayload and sendPayload logic. Before fix: the incomplete-turn error was dropped at both the dispatcher and outbound boundaries. After fix: the final isError payload is delivered through both boundaries with a real messageId; tool errors and reasoning blocks remain correctly suppressed.

What was not tested: Live WhatsApp Business API session with a real stalled agent turn (requires live credentials and a long-running model call)

🤖 Generated with Claude Code

@openclaw-barnacle openclaw-barnacle Bot added channel: whatsapp-web Channel integration: whatsapp-web size: XS proof: supplied External PR includes structured after-fix real behavior proof. labels May 20, 2026

clawsweeper Bot commented May 20, 2026

Contributor

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR allows final WhatsApp isError payloads through the inbound dispatcher and outbound sendPayload path while adding regression tests that non-final errors remain suppressed.

Reproducibility: yes, from source. Current main produces an isError fallback for incomplete turns, then WhatsApp drops it in both the inbound dispatcher and outbound sendPayload path. I did not run a live stalled WhatsApp session in this read-only review.

PR rating
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Summary: The patch is narrow and source-supported, but overall readiness is capped by insufficient real WhatsApp transport proof for a user-visible message-delivery change.

Rank-up moves:

  • Add redacted WhatsApp proof showing the final incomplete-turn error reaches the user.
  • Show in the same proof or paired diagnostics that non-final tool/block errors remain silent.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs stronger real behavior proof before merge. The PR body includes useful terminal output from a production-module exercise, but it explicitly lacks a live WhatsApp send log, transcript screenshot, recording, linked artifact, or redacted runtime logs proving the changed message crossed the real transport. Private details such as phone numbers, tokens, IPs, and non-public endpoints should be redacted. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A redacted WhatsApp transcript, recording, or send log would materially prove the user-visible transport behavior that unit tests cannot cover. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: capture redacted WhatsApp proof that a final incomplete-turn error is delivered while a non-final tool error remains silent.

Risk before merge

  • The supplied proof exercises production modules in a terminal script, but it does not show a final incomplete-turn error crossing an actual WhatsApp transport session.
  • The patch intentionally changes final WhatsApp error payloads from silent no-ops to chat-visible messages, so maintainers should confirm the sanitized error copy is acceptable to surface to users.
  • Removing the outbound isError no-op relies on upstream filters to keep non-final tool/block error noise silent; the PR tests that path, but live or logged proof would reduce message-delivery risk.
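
The third risk can be illustrated with a small sketch of how the two gates compose. Everything here is hypothetical (the names `passesDispatcherGate`, `outboundSend`, `deliver`, and the `viaDispatcher` flag are invented for the example); it only shows that once the outbound step stops checking isError, a non-final error stays silent solely because the upstream gate drops it first.

```typescript
type Kind = "tool" | "block" | "final";

interface Payload {
  text: string;
  isError?: boolean;
}

// Hypothetical upstream gate: errors pass only when the kind is "final".
const passesDispatcherGate = (p: Payload, kind: Kind): boolean =>
  !(p.isError && kind !== "final");

// Hypothetical outbound step after the patch: no isError check,
// so it sends whatever it is handed.
function outboundSend(p: Payload, sent: string[]): void {
  sent.push(p.text);
}

// viaDispatcher = false models a caller that reaches the outbound
// step without going through the dispatcher gate.
function deliver(p: Payload, kind: Kind, sent: string[], viaDispatcher: boolean): void {
  if (viaDispatcher && !passesDispatcherGate(p, kind)) return;
  outboundSend(p, sent);
}

const sent: string[] = [];
const toolError: Payload = { text: "tool failed", isError: true };

deliver(toolError, "tool", sent, true); // dropped by the upstream gate
deliver(toolError, "tool", sent, false); // sent: nothing downstream filters it
console.log(sent.length); // prints 1
```

Whether any real caller bypasses the dispatcher is not established by this PR; the sketch only shows why live or logged proof of the non-final path is worth asking for.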

Maintainer options:

  1. Require WhatsApp transport proof (recommended)
    Ask for a redacted WhatsApp send log, transcript screenshot, recording, or linked artifact showing the final incomplete-turn error delivered while a non-final tool error remains silent.
  2. Accept module-level proof knowingly
    A maintainer can merge with the supplied source, tests, and terminal module proof if they decide the two changed gates cover the live transport risk well enough.
  3. Pause if live proof is unavailable
    If no one can supply transport proof and maintainers do not want to own the user-visible delivery change, keep the PR paused rather than landing on simulated delivery only.

Next step before merge
Hold for redacted WhatsApp transport proof or explicit maintainer acceptance of the module-level proof; there is no narrow automated code repair indicated.

Security
Cleared: The diff is limited to WhatsApp message filtering and regression tests, with no dependency, workflow, credential, permission, or supply-chain changes found.

Review details

Best possible solution:

Land the focused WhatsApp delivery fix after redacted WhatsApp runtime proof shows the final incomplete-turn error reaches the user while non-final tool/block error noise remains suppressed.

Do we have a high-confidence way to reproduce the issue?

Yes, from source: current main produces an isError fallback for incomplete turns, then WhatsApp drops it in both the inbound dispatcher and outbound sendPayload path. I did not run a live stalled WhatsApp session in this read-only review.

Is this the best way to solve the issue?

Yes, the proposed code shape is the narrowest maintainable fix I found: open the final-error path at both WhatsApp gates while preserving non-final error suppression. The remaining blocker is proof of the real WhatsApp delivery path, not a different implementation approach.

Label changes:

  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The PR body lacks a live WhatsApp send log, transcript screenshot, recording, linked artifact, or redacted runtime logs proving the changed message crossed the real transport.
  • remove mantis: telegram-visible-proof: Current Telegram visible-proof status is not_needed.
  • remove status: 🛠️ actively grinding: Current PR status label is status: 📣 needs proof.

Label justifications:

  • P1: The PR addresses a WhatsApp workflow where users can silently lose the fallback reply after an incomplete or stalled agent turn.
  • merge-risk: 🚨 message-delivery: Merging changes which WhatsApp error payloads are delivered or suppressed, and the live transport path still lacks proof.
  • rating: 🦪 silver shellfish: Current PR rating is 🦪 silver shellfish because proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit. The patch is narrow and source-supported, but overall readiness is capped by insufficient real WhatsApp transport proof for a user-visible message-delivery change.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The PR body lacks a live WhatsApp send log, transcript screenshot, recording, linked artifact, or redacted runtime logs proving the changed message crossed the real transport.

What I checked:

Likely related people:

  • Ayaan Zaidi: git blame attributes the current WhatsApp error suppression gates and incomplete-turn error payload return to e067203 in this checkout, though the commit title is broad and lowers authorship confidence. (role: current-line provenance; confidence: medium; commits: e067203b2217; files: extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts, extensions/whatsapp/src/outbound-base.ts, src/agents/pi-embedded-runner/run.ts)
  • steipete: History and shortlog show heavy recent work across the exact WhatsApp outbound and inbound-dispatch files, including plugin outbound dependency and test refactors. (role: recent area contributor; confidence: high; commits: 1c66a050c26a, 25a187568fce, 04cf29f61322; files: extensions/whatsapp/src/outbound-base.ts, extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts)
  • Marcus Castro: Recent history includes WhatsApp inbound-dispatch and channel behavior work near the lifecycle filtering surface this PR changes. (role: adjacent WhatsApp contributor; confidence: medium; commits: d2d9a928b1d2, 458a52610a4d; files: extensions/whatsapp/src/auto-reply/monitor/inbound-dispatch.ts, extensions/whatsapp)
  • Vincent Koc: Shortlog shows substantial recent WhatsApp plugin work, including inbound retry and runtime/doctor paths that are adjacent to channel delivery behavior. (role: adjacent WhatsApp channel contributor; confidence: medium; commits: d70e6b13d7e1, 8d3bd4859ee5; files: extensions/whatsapp)

Codex review notes: model gpt-5.5, reasoning high; reviewed against c8a953af9371.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. labels May 20, 2026

clawsweeper Bot commented May 20, 2026

Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@NianJiuZst

Contributor Author

@clawsweeper re-review


clawsweeper Bot commented May 20, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 20, 2026
@NianJiuZst NianJiuZst changed the title fix(whatsapp): deliver final error payloads so incomplete-turn errors reach users fix(cron): skip delivery mirror for routed peer sessions to prevent session lock races May 20, 2026
@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed proof: supplied External PR includes structured after-fix real behavior proof. labels May 20, 2026
@NianJiuZst NianJiuZst force-pushed the codex/fix-84569-whatsapp-error-delivery branch from 299ef9a to c0c2be9 Compare May 20, 2026 14:24
@NianJiuZst NianJiuZst changed the title fix(cron): skip delivery mirror for routed peer sessions to prevent session lock races fix(whatsapp): deliver final error payloads so incomplete-turn errors reach users May 20, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 20, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 20, 2026
@clawsweeper clawsweeper Bot added the status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. label May 20, 2026
@clawsweeper clawsweeper Bot added the mantis: telegram-visible-proof Mantis should capture Telegram visible proof. label May 20, 2026
@clawsweeper clawsweeper Bot temporarily deployed to qa-live-shared May 20, 2026 14:27 Inactive
@clawsweeper clawsweeper Bot added status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. mantis: telegram-visible-proof Mantis should capture Telegram visible proof. labels May 20, 2026

Labels

channel: whatsapp-web Channel integration: whatsapp-web merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P1 High-priority user-facing bug, regression, or broken workflow. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. size: XS status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Development

Successfully merging this pull request may close these issues.

WhatsApp session stalls on long model_call: incomplete turn with payloads=0, reply never delivered

1 participant