fix(codex): preserve native subagent completion results by 849261680 · Pull Request #91235 · openclaw/openclaw

849261680 · 2026-06-07T19:52:18Z

Summary

Codex native subagent notifications can report status: { completed: null } when the child completed without a final assistant message. OpenClaw was collapsing that state to (no output) and, when a notification arrived before transcript reconciliation, marking the child terminal before the transcript path could recover last_agent_message.

This PR keeps the change inside the Codex plugin surface:

classifies completed:null and blank completed strings as completed_without_final_message instead of (no output);
lets transcript reconciliation win before delivering an empty successful native completion when codexHome is configured;
adds a bounded fallback so missing/unreadable transcripts still deliver the typed no-final reason instead of leaving the parent stuck;
keeps failed/cancelled null statuses on the existing (no output) behavior.
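The classification rules listed above can be sketched as follows. This is a minimal illustration, not the plugin's actual parser: the `Outcome` type, the `classifyCompletion` name, the flattened `(outcome, text)` input, and the labels used for failed/cancelled outcomes are assumptions made for the example. Only the `completed_without_final_message` label, the `(no output)` placeholder, and the no-final result string are taken from this PR description and its terminal capture.

```typescript
// Hypothetical, simplified input: the real parser reads Codex
// <subagent_notification> payloads, which are not reproduced here.
type Outcome = "succeeded" | "failed" | "cancelled";

interface ParsedCompletion {
  status: Outcome;
  statusLabel: string;
  result: string;
}

const NO_OUTPUT = "(no output)";
const NO_FINAL_MESSAGE =
  "Codex native subagent completed without a final assistant message.";

function classifyCompletion(
  outcome: Outcome,
  text: string | null | undefined,
): ParsedCompletion {
  const trimmed = text?.trim() ?? "";

  // A non-blank message is always passed through as the result.
  if (trimmed.length > 0) {
    const statusLabel = outcome === "succeeded" ? "completed" : outcome;
    return { status: outcome, statusLabel, result: trimmed };
  }

  // Successful completion with null/blank text gets the typed no-final reason.
  if (outcome === "succeeded") {
    return {
      status: "succeeded",
      statusLabel: "completed_without_final_message",
      result: NO_FINAL_MESSAGE,
    };
  }

  // Failed/cancelled with no text keeps the existing placeholder.
  // (Using the outcome name as the label here is an assumption.)
  return { status: outcome, statusLabel: outcome, result: NO_OUTPUT };
}
```

Keeping the empty-success case distinct from `(no output)` is what lets a downstream consumer decide to wait for a transcript instead of treating the child as having produced nothing.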

Dependency contract checked directly in sibling ../codex:

../codex/codex-rs/protocol/src/protocol.rs:1574 defines AgentStatus::Completed(Option<String>).
../codex/codex-rs/core/src/context/subagent_notification.rs:33 serializes subagent notifications with agent_path and status.
../codex/codex-rs/core/src/tools/handlers/multi_agents/wait.rs:178 builds wait_agent results from final AgentStatus entries.

Real behavior proof

Behavior addressed: Codex native subagent completion notification and wait_agent/transcript result can disagree when an early completed:null notification is delivered as (no output) before a later transcript final is available.
Real environment tested: macOS local OpenClaw source checkout in the dedicated worktree openclaw-91120, with repo dependencies installed and OpenClaw Codex plugin modules executed through Node/tsx.
Exact steps or command run after this patch:
- node --import tsx --input-type=module terminal run importing extensions/codex/src/app-server/native-subagent-notification.ts and parsing real Codex native <subagent_notification> payloads.
- node --import tsx --input-type=module terminal run importing extensions/codex/src/app-server/native-subagent-monitor.ts, creating a local Codex transcript under a temp codexHome, sending a real rawResponseItem/completed notification with completed:null, and printing the monitor delivery payload.
- Supplemental checks: node scripts/run-vitest.mjs extensions/codex/src/app-server/native-subagent-notification.test.ts extensions/codex/src/app-server/native-subagent-monitor.test.ts; targeted oxfmt; targeted oxlint; pnpm tsgo:test:extensions; .agents/skills/autoreview/scripts/autoreview --mode local.
Evidence after fix: Terminal capture from the parser run:

OpenClaw Codex native subagent notification live output
[
  {
    "agentPath": "child-null",
    "status": "succeeded",
    "statusLabel": "completed_without_final_message",
    "result": "Codex native subagent completed without a final assistant message."
  },
  {
    "agentPath": "child-final",
    "status": "succeeded",
    "statusLabel": "completed",
    "result": "final research note"
  }
]
observed_null_status_label=completed_without_final_message
observed_null_result=Codex native subagent completed without a final assistant message.
observed_final_result=final research note

Terminal capture from the monitor/transcript run:

OpenClaw Codex native subagent monitor live output
{
  "deliveredResult": "transcript final beats early null notification",
  "deliveredStatusLabel": "task_complete",
  "deliveredStatus": "succeeded"
}

Supplemental verification output:

Vitest: Test Files 2 passed (2) / Tests 29 passed (29)
oxfmt targeted check: All matched files use the correct format.
targeted oxlint wrapper exited 0.
tsgo:test:extensions exited 0 after waiting on the local heavy-check lock.
autoreview rerun: autoreview clean: no accepted/actionable findings reported.

Observed result after fix: completed:null is no longer surfaced as (no output); it becomes statusLabel: "completed_without_final_message". When a transcript final exists, the parent delivery uses the transcript task_complete result (transcript final beats early null notification) instead of the early empty notification. If transcript reconciliation is unavailable, bounded fallback delivers the typed no-final reason rather than leaving the parent without a completion wakeup.
What was not tested: Live Telegram/Codex roundtrip with real native subagents was not run. Remote Crabbox/Testbox check:changed could not be started because this environment has no usable crabbox or blacksmith binary; node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox ... pnpm check:changed failed at wrapper sanity check before provisioning.
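The delivery precedence described in this proof section can be sketched as a small pure function. This is an illustration under stated assumptions rather than the monitor's real code: `chooseDelivery` and `nonBlank` are hypothetical names, and the `Delivery` fields simply mirror the keys printed in the monitor capture above.

```typescript
// Field names mirror the monitor terminal capture; the interface itself is
// a hypothetical shape for this example.
interface Delivery {
  deliveredResult: string;
  deliveredStatusLabel:
    | "completed"
    | "task_complete"
    | "completed_without_final_message";
  deliveredStatus: "succeeded";
}

const NO_FINAL_REASON =
  "Codex native subagent completed without a final assistant message.";

// Treat null, undefined, and whitespace-only strings as "no text".
function nonBlank(text: string | null | undefined): string | undefined {
  const trimmed = text?.trim();
  return trimmed ? trimmed : undefined;
}

function chooseDelivery(
  notificationText: string | null,
  transcriptFinal: string | null | undefined,
): Delivery {
  // A notification that already carries a final message is delivered as-is.
  const fromNotification = nonBlank(notificationText);
  if (fromNotification !== undefined) {
    return {
      deliveredResult: fromNotification,
      deliveredStatusLabel: "completed",
      deliveredStatus: "succeeded",
    };
  }

  // Empty successful notification: the transcript final wins when present.
  const fromTranscript = nonBlank(transcriptFinal);
  if (fromTranscript !== undefined) {
    return {
      deliveredResult: fromTranscript,
      deliveredStatusLabel: "task_complete",
      deliveredStatus: "succeeded",
    };
  }

  // No transcript text available: deliver the typed no-final reason.
  return {
    deliveredResult: NO_FINAL_REASON,
    deliveredStatusLabel: "completed_without_final_message",
    deliveredStatus: "succeeded",
  };
}
```

Calling `chooseDelivery(null, "transcript final beats early null notification")` yields the `task_complete` label, matching the shape of the monitor capture.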

clawsweeper · 2026-06-07T19:54:26Z

Codex review: passed. Reviewed June 7, 2026, 10:21 PM ET / 02:21 UTC.

Summary
The branch updates the Codex plugin native subagent parser, monitor, and tests so successful null or blank completions get a typed no-final result and transcript reconciliation can override early empty notifications before fallback delivery.

PR surface: Source +92, Tests +176. Total +268 across 4 files.

Reproducibility: yes, at source level: current main maps successful null/blank Codex completions to (no output) and processes them before transcript reconciliation can recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
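The "weaker of the two" rule can be written out as a short sketch. The rank ordering is read off the rank descriptions later in this comment, and the `overallRank` helper is a hypothetical name for illustration; the non-applicable "off-meta tidepool" rating is left out on the assumption that it does not take part in the comparison.

```typescript
// Ranks ordered from weakest to strongest, per the rank legend in this review.
const RANKS_WEAKEST_FIRST = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS_WEAKEST_FIRST)[number];

// Overall readiness is capped by whichever input ranks lower.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  const proofIndex = RANKS_WEAKEST_FIRST.indexOf(proof);
  const patchIndex = RANKS_WEAKEST_FIRST.indexOf(patchQuality);
  return proofIndex <= patchIndex ? proof : patchQuality;
}
```

With proof at "diamond lobster" and patch quality at "platinum hermit", `overallRank` returns "platinum hermit", which agrees with the ratings shown above.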

Rank-up moves:

Run a live Telegram/Codex native subagent roundtrip only if maintainers want transport-level proof before automerge.

Mantis proof suggestion
A live Telegram/Codex roundtrip would reduce the only remaining transport-level uncertainty around parent-visible completion delivery. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify that a Telegram parent receiving Codex native subagent completed:null uses transcript final text when available and shows the typed no-final reason otherwise.

Risk before merge

[P1] No live Telegram/Codex roundtrip is supplied, so the full transport-visible finalizer behavior is inferred from the Codex plugin monitor path and terminal proof.
[P1] The linked issue also asks for a stable diagnostic/history path for native subagent ids; this PR fixes the completion-result mismatch but does not implement that broader surface.

Maintainer options:

Land the narrow delivery fix (recommended)
Accept the remaining transport-proof risk, merge after required gates, and keep any diagnostic/history follow-up separate from this completion delivery repair.
Ask for a live transport proof
Before merging, request a live Telegram/Codex native subagent roundtrip that shows transcript-final recovery and typed no-final delivery in an actual chat flow.
Pause for full linked-issue scope
Pause this PR if maintainers want one change to also add the stable native-subagent diagnostic/history path requested by the linked issue.

Next step before merge

No ClawSweeper repair lane is needed; the patch has no blocking findings and should stay in normal automerge/maintainer gates.

Security
Cleared: The diff stays within Codex plugin parser/monitor code and tests, with no dependency, workflow, secret, package, or external code-execution surface changes.

Review details

Best possible solution:

Land the narrow Codex plugin delivery fix after required checks if maintainers accept module-level proof, and track the diagnostic/history acceptance criterion separately if it remains required.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main maps successful null/blank Codex completions to (no output) and processes them before transcript reconciliation can recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Is this the best way to solve the issue?

Yes for the parser/monitor mismatch: keeping the fix inside the Codex plugin boundary is the narrowest maintainable path and the tests cover the changed behavior. It is not a complete solution for the linked issue's separate diagnostic-history request.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9caff5f873cd.

Label changes

Label changes:

add merge-risk: 🚨 message-delivery: The diff intentionally changes when and what Codex native subagent completions are delivered to parent agents, and no live transport roundtrip proves the full channel-visible path.

Label justifications:

P1: The PR targets a broken Codex native subagent completion workflow where parent agents can miss usable child output in real delegated-agent runs.
merge-risk: 🚨 message-delivery: The diff intentionally changes when and what Codex native subagent completions are delivered to parent agents, and no live transport roundtrip proves the full channel-visible path.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (terminal): The PR body includes after-fix terminal output from Node/tsx parser and monitor/transcript runs that exercise the changed modules with real Codex-shaped payloads.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal output from Node/tsx parser and monitor/transcript runs that exercise the changed modules with real Codex-shaped payloads.

Evidence reviewed

PR surface:

Source +92, Tests +176. Total +268 across 4 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	2	104	12	+92
Tests	2	176	0	+176
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	4	280	12	+268

What I checked:

Repository policy read: Root AGENTS.md and extensions/AGENTS.md were read; the review applied the Codex dependency-inspection gate and extension boundary guidance. (AGENTS.md:1, 9caff5f873cd)
Current main parser behavior: Current main maps blank, null, or undefined native completion values to (no output) without distinguishing a successful completed-without-final-message state. (extensions/codex/src/app-server/native-subagent-notification.ts:152, 9caff5f873cd)
Current main terminalization path: Current main processes parsed native completions immediately; processCompletion marks the child transcript terminal, which can prevent a later transcript result from replacing an early empty notification. (extensions/codex/src/app-server/native-subagent-monitor.ts:304, 9caff5f873cd)
PR parser fix: The PR classifies successful null or blank completion payloads as completed_without_final_message while preserving (no output) for non-successful empty statuses. (extensions/codex/src/app-server/native-subagent-notification.ts:119, f9270c28e735)
PR monitor fix: The PR waits for transcript reconciliation on typed successful no-final completions, schedules transcript polling, and adds a bounded no-final fallback rather than delivering the early empty notification immediately. (extensions/codex/src/app-server/native-subagent-monitor.ts:306, f9270c28e735)
Regression tests: The PR adds tests for transcript text beating an early null notification, no-transcript typed no-final delivery, fallback delivery, and parser normalization for null/blank completions. (extensions/codex/src/app-server/native-subagent-monitor.test.ts:268, f9270c28e735)
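The "schedule transcript polling with a bounded no-final fallback" behavior noted in the monitor fix can be sketched as a pure per-tick decision. Everything named here (`decideOnTick`, `PendingNoFinal`, `ReconcileOptions`, and the interval and deadline values used when calling it) is hypothetical; the real monitor's timers and limits are not shown in this review.

```typescript
// Hypothetical bookkeeping for a child that reported an empty successful
// completion and is waiting on transcript reconciliation.
interface PendingNoFinal {
  firstSeenMs: number; // when the empty completion notification arrived
}

interface ReconcileOptions {
  pollIntervalMs: number; // how often to re-check the transcript
  fallbackAfterMs: number; // upper bound on how long to wait
}

type TickDecision =
  | { kind: "deliver"; source: "transcript" | "fallback"; result: string }
  | { kind: "wait"; retryInMs: number };

const FALLBACK_REASON =
  "Codex native subagent completed without a final assistant message.";

function decideOnTick(
  pending: PendingNoFinal,
  nowMs: number,
  transcriptFinal: string | undefined,
  options: ReconcileOptions,
): TickDecision {
  // Transcript text, once available, is delivered immediately.
  const text = transcriptFinal?.trim();
  if (text) {
    return { kind: "deliver", source: "transcript", result: text };
  }

  // Past the deadline: stop waiting so the parent is never left stuck.
  const remainingMs = pending.firstSeenMs + options.fallbackAfterMs - nowMs;
  if (remainingMs <= 0) {
    return { kind: "deliver", source: "fallback", result: FALLBACK_REASON };
  }

  // Otherwise poll again, never sleeping past the deadline.
  return { kind: "wait", retryInMs: Math.min(options.pollIntervalMs, remainingMs) };
}
```

Passing the clock in as `nowMs` keeps the decision deterministic, so the same cases the regression tests describe (transcript wins, missing transcript falls back) can be exercised without real timers.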

Likely related people:

steipete: Current-main blame and log history in this checkout point the Codex native monitor/parser files to commit 6f2b383 by Peter Steinberger, which added the current files into this repository snapshot. (role: recent area contributor; confidence: medium; commits: 6f2b3830f128; files: extensions/codex/src/app-server/native-subagent-monitor.ts, extensions/codex/src/app-server/native-subagent-notification.ts)
vincentkoc: The linked issue's prior ClawSweeper review cites git blame/git show evidence tying the same Codex-native monitor/parser (no output) behavior to commit 3597cfc across these files. (role: likely introduced current behavior; confidence: medium; commits: 3597cfc7bc26; files: extensions/codex/src/app-server/native-subagent-monitor.ts, extensions/codex/src/app-server/native-subagent-notification.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Takhoffman · 2026-06-08T01:53:14Z

@clawsweeper automerge

clawsweeper · 2026-06-08T01:53:17Z

🦞🧹
ClawSweeper automerge is enabled.

Head: f9270c28e735
Label: clawsweeper:automerge
Action: exact-head review queued (workflow sweep.yml, event repository_dispatch).
Flow: review this head, repair/rebase only if needed, then re-review the exact repaired head before merge.

Draft PRs stay fix-only until GitHub marks them ready for review. Pause with /clawsweeper stop.

Automerge progress:

2026-06-08 01:53:54 UTC review queued 60edfe6f32d6 (queued)

2026-06-08 02:01:10 UTC review passed 60edfe6f32d6 (structured ClawSweeper verdict: pass (sha=60edfe6f32d6cbbb0b4a925be9952b97f1ab5...)

2026-06-08 02:01:31 UTC repair queued 60edfe6f32d6 (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032

2026-06-08 02:03:18 UTC repair started (running) in 1s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 automerge-openclaw-openclaw-91235

2026-06-08 02:05:41 UTC validation plan (passed) in 2m 23s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 pnpm check:changed; pnpm lint; pnpm check:test-types

2026-06-08 02:05:54 UTC Codex write preflight (passed) in 2m 36s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 danger-full-access

2026-06-08 02:11:10 UTC Codex edit 1 0640d5625bd0 (complete) in 7m 53s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 exit 0

2026-06-08 02:14:48 UTC validation and review 1 f9270c28e735 (base moved) in 11m 30s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 rebased

2026-06-08 02:14:53 UTC repair completed f9270c28e735 (branch updated) in 11m 36s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 initial automerge rebase is delegated to Codex repair

2026-06-08 02:14:53 UTC review queued f9270c28e735 (after repair)

2026-06-08 02:22:20 UTC automerge wait f9270c28e735 (merged) in 19m 2s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 pull request already merged

2026-06-08 02:21:56 UTC review passed f9270c28e735 (structured ClawSweeper verdict: pass (sha=f9270c28e73548cca60aa499ca65ba1d94873...)

2026-06-08 02:22:08 UTC merged f9270c28e735 (merged by ClawSweeper automerge)

2026-06-08 02:22:11 UTC review queued f9270c28e735 (queued)

2026-06-08 02:22:21 UTC repair finished f9270c28e735 (pushed) in 19m 3s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111880032 repair_contributor_branch

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/27112458304
Updated: 2026-06-08T02:24:29.259Z

Summary:
- The branch updates the Codex plugin native subagent parser, monitor, and tests so successful null or blank c ... final result and transcript reconciliation can override early empty notifications before fallback delivery.
- PR surface: Source +92, Tests +176. Total +268 across 4 files.
- Reproducibility: yes. at source level: current main maps successful null/blank Codex completions to `(no out ... n recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(codex): preserve native subagent completion results

Validation:
- ClawSweeper review passed for head f9270c2.
- Required merge gates passed before the squash merge.

Prepared head SHA: f9270c2
Review: openclaw#91235 (comment)
Co-authored-by: 宇宙熊Yzx <53250620+849261680@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>

openclaw-barnacle Bot added extensions: codex size: M triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 7, 2026

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 7, 2026

849261680 and others added 2 commits June 8, 2026 02:14

fix(codex): preserve native subagent completion results

beb975f

fix(codex): preserve native subagent completion results

f9270c2

clawsweeper Bot force-pushed the fix/91120-codex-native-subagent-result branch from 60edfe6 to f9270c2 Compare June 8, 2026 02:14

clawsweeper Bot added the merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. label Jun 8, 2026

clawsweeper Bot merged commit 766c5b3 into openclaw:main Jun 8, 2026
159 checks passed

github-actions Bot mentioned this pull request Jun 8, 2026

📡 Upstream Digest — 2026-06-08 02:48 UTC curtismercier/openclaw-mods#1036

Open
