fix: harden subagent completion fallback delivery#79405

Open
bingo30008196 wants to merge 3 commits into
openclaw:mainfrom
bingo30008196:danbi/delivery-patch-v2-fallback-recovery-20260508

@bingo30008196 bingo30008196 commented May 8, 2026

Summary

  • Harden subagent completion delivery fallback around task-aware give-up paths.
  • Preserve taskId in pending delivery context and separate child run id from task id for idempotency.
  • Avoid persisting a transient primary announce failure as terminal before retry/give-up fallback can resolve.
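
A minimal sketch of the behavior described in the bullets above, not the code from this PR. Every name here (`PendingDelivery`, `idempotencyKey`, `deliverCompletion`, the `Announce` callback, the key format) is a hypothetical stand-in, and the retry loop is synchronous only to keep the example short. It illustrates three things: the task id travels with the pending delivery, the idempotency key keeps the child run id and task id as separate components, and a failed primary announce leaves the status at `pending` until the fallback has had its turn.

```typescript
// Hypothetical types and helpers for illustration only; not the PR's actual code.
type DeliveryStatus = "pending" | "delivered" | "failed";

interface PendingDelivery {
  childRunId: string; // identifies the subagent run
  taskId?: string; // identifies the task; kept separate from the run id
  status: DeliveryStatus;
  attempts: number;
}

// Returns true when the announce succeeded.
type Announce = (delivery: PendingDelivery) => boolean;

// Assumed key format: the run id and task id are distinct components,
// so one is never mistaken for the other when deduplicating deliveries.
function idempotencyKey(d: PendingDelivery): string {
  return d.taskId !== undefined
    ? `task:${d.taskId}:run:${d.childRunId}`
    : `run:${d.childRunId}`;
}

function deliverCompletion(
  d: PendingDelivery,
  primary: Announce,
  fallback: Announce,
  maxAttempts: number,
): PendingDelivery {
  let attempts = d.attempts;
  while (attempts < maxAttempts) {
    attempts += 1;
    if (primary(d)) {
      return { ...d, attempts, status: "delivered" };
    }
    // Transient primary failure: keep looping. The status is still
    // "pending" here and nothing terminal has been recorded.
  }
  // Give-up path: the fallback receives the same delivery, including taskId.
  const ok = fallback(d);
  // Only now is a terminal status decided.
  return { ...d, attempts, status: ok ? "delivered" : "failed" };
}
```

With a primary announce that always fails and `maxAttempts` set to 2, this sketch calls the fallback once after two attempts and records `failed` only if the fallback also fails.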

Verification

  • OPENCLAW_TEST_FAST=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-core.config.ts src/agents/subagent-registry-lifecycle.test.ts — 19/19 PASS
  • pnpm check:test-types — PASS
  • pnpm build — PASS

Hold / not included

  • No gateway restart
  • No live Telegram smoke
  • No deploy

Danbi evidence:

  • Closeout: /home/danbi/.openclaw/agents/main/logs/delivery-patch-v2-recovery-closeout-20260508-215411.md
  • Drive fileId: 1B4j8QKb0wtK1G0x17a1xgeVDF9r25_WmvdeMCXuMf-o

Real behavior proof

  • Behavior or issue addressed: Subagent completion fallback delivery can lose or withhold the requester-facing completion context when a task-aware give-up path retries after a transient primary announce failure.
  • Real environment tested: DESKTOP-3FEA2LO WSL2, OpenClaw-managed worktree /home/danbi/.openclaw/agents/main/worktrees/openclaw-c97b9f7-20260508, PR head 557f7544c8a4248348da7b4ec100500cd2ac782f.
  • Exact steps or command run after this patch: Inspected the real OpenClaw subagent trajectory/session logs for run 391cc989-7104-4eb5-8328-e4bb31ed04a3, confirmed the requester-visible finalization payload, then ran the targeted local verification commands listed above from the same worktree.
  • Evidence after fix: Terminal output / log excerpt from the real setup:
    2026-05-08T14:01:16.277Z model.completed 391cc989-7104-4eb5-8328-e4bb31ed04a3
    ✅ PR #79405 CI follow-up finalization complete.
    - Commit: 557f7544c8a4248348da7b4ec100500cd2ac782f
    - Message: fix: constrain task status delivery access
    - Pushed: fork danbi/delivery-patch-v2-fallback-recovery-20260508
    
    Local evidence logs: /home/danbi/.openclaw/agents/main/logs/delivery-patch-v2-dev-unreported-verify-corrected-20260508-224031.md and /home/danbi/.openclaw/agents/main/logs/pr79405-failed-log-excerpts-20260508-231522.md.
  • Observed result after fix: The actual subagent completion payload included the commit SHA, changed-files/cleanup outcome, verification result summary, and push result instead of only an incomplete raw test dump.
  • What was not tested: Gateway restart, live Telegram smoke, deploy, and merge were intentionally not performed.

@bingo30008196 bingo30008196 requested review from a team as code owners May 8, 2026 12:56
@openclaw-barnacle (bot) added labels on May 8, 2026: triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.); docs (Improvements or additions to documentation); channel: discord; channel: googlechat; channel: imessage; channel: line; channel: matrix; channel: mattermost; channel: msteams; channel: nextcloud-talk; channel: nostr; channel: signal; channel: slack; channel: telegram; channel: tlon; channel: voice-call; channel: whatsapp-web; channel: zalo; channel: zalouser; app: android; app: ios; app: macos; app: web-ui; gateway (Gateway runtime); extensions: copilot-proxy; extensions: diagnostics-otel; extensions: llm-task; extensions: lobster

clawsweeper Bot commented May 8, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 1, 2026, 1:09 AM ET / 05:09 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +526, Tests +152. Total +678 across 9 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
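
The "weaker of the two" rule can be sketched as a minimum over an ordered rank list. This is an illustration only, not ClawSweeper's implementation: the `RANKS` array, the `overallRating` function, and the handling of "off-meta tidepool" (treated here as "not applicable" whenever either input is not applicable) are assumptions. The rank names and their order come from the rank legend in this comment.

```typescript
// Ranks from weakest to strongest, as listed in the rank legend.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];
type Rating = Rank | "off-meta tidepool";

// Assumption: if either input does not apply, the overall rating does not apply.
function overallRating(proof: Rating, patchQuality: Rating): Rating {
  if (proof === "off-meta tidepool" || patchQuality === "off-meta tidepool") {
    return "off-meta tidepool";
  }
  // Otherwise the overall rating is the weaker (lower-indexed) of the two.
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality)
    ? proof
    : patchQuality;
}
```

Under this sketch, a patch rated "challenger crab" with proof rated "unranked krab" comes out as "unranked krab" overall, which is the capping effect the sentence above describes.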

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c0195f7ed579.

Label changes

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Evidence reviewed

PR surface:

Source +526, Tests +152. Total +678 across 9 files.

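| Area      | Files | Added | Removed | Net  |
|-----------|-------|-------|---------|------|
| Source    | 8     | 550   | 24      | +526 |
| Tests     | 1     | 157   | 5       | +152 |
| Docs      | 0     | 0     | 0       | 0    |
| Config    | 0     | 0     | 0       | 0    |
| Generated | 0     | 0     | 0       | 0    |
| Other     | 0     | 0     | 0       | 0    |
| Total     | 9     | 707   | 29      | +678 |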

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.
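
The command rules in the list above can be read as a small permission table. The sketch below is a hypothetical reading of that list, not ClawSweeper's code: the `Role` and `Action` types, the `classifyCommand` function, and the assumption that a maintainer also counts as a user with write access are all illustrative. Only the command words themselves come from the list.

```typescript
// Hypothetical roles and outcomes; only the command strings come from the list above.
type Role = "author" | "writer" | "maintainer" | "other";
type Action = "fresh-review" | "repair-or-merge" | "explain" | "stop" | "ignored";

const FRESH_REVIEW = ["re-review", "re-run"]; // authors and users with write access
const MAINTAINER_REVIEW = ["review"]; // maintainers only
const REPAIR_OR_MERGE = ["autofix", "automerge", "fix ci", "address review"];

function classifyCommand(comment: string, role: Role): Action {
  // Take the rest of the line after the bot mention.
  const match = /@clawsweeper\s+(.*)/i.exec(comment);
  if (!match) return "ignored";
  const rest = (match[1] ?? "").trim().toLowerCase();
  const matches = (commands: readonly string[]): boolean =>
    commands.some((c) => rest === c || rest.startsWith(c + " "));

  const isMaintainer = role === "maintainer";
  // Assumption: maintainers also have repository write access.
  const canRequestReview = isMaintainer || role === "author" || role === "writer";

  if (matches(FRESH_REVIEW)) return canRequestReview ? "fresh-review" : "ignored";
  if (matches(MAINTAINER_REVIEW)) return isMaintainer ? "fresh-review" : "ignored";
  if (matches(REPAIR_OR_MERGE)) return isMaintainer ? "repair-or-merge" : "ignored";
  if (matches(["explain"])) return isMaintainer ? "explain" : "ignored";
  if (matches(["stop"])) return isMaintainer ? "stop" : "ignored";
  return "ignored";
}
```

In this reading, `@clawsweeper re-review` from the PR author maps to a fresh review, while `@clawsweeper autofix` from the same author is ignored because repair flows are maintainer-only.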

@barnacle-openclaw

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

Labels

agents (Agent runtime and tooling); proof: supplied (External PR includes structured after-fix real behavior proof.); rating: 🌊 off-meta tidepool (PR readiness rating does not apply to this item.); size: L
