Skip to content

fix(codex): preserve native subagent completion results#91235

Merged
clawsweeper[bot] merged 2 commits into
openclaw:mainfrom
849261680:fix/91120-codex-native-subagent-result
Jun 8, 2026
Merged

fix(codex): preserve native subagent completion results#91235
clawsweeper[bot] merged 2 commits into
openclaw:mainfrom
849261680:fix/91120-codex-native-subagent-result

Conversation

@849261680

@849261680 849261680 commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #91120.
Refs #78656.

Codex native subagent notifications can report status: { completed: null } when the child completed without a final assistant message. OpenClaw was collapsing that state to (no output) and, when a notification arrived before transcript reconciliation, marking the child terminal before the transcript path could recover last_agent_message.

This PR keeps the change inside the Codex plugin surface:

  • classifies completed:null and blank completed strings as completed_without_final_message instead of (no output);
  • lets transcript reconciliation win before delivering an empty successful native completion when codexHome is configured;
  • adds a bounded fallback so missing/unreadable transcripts still deliver the typed no-final reason instead of leaving the parent stuck;
  • keeps failed/cancelled null statuses on the existing (no output) behavior.

Dependency contract checked directly in sibling ../codex:

  • ../codex/codex-rs/protocol/src/protocol.rs:1574 defines AgentStatus::Completed(Option<String>).
  • ../codex/codex-rs/core/src/context/subagent_notification.rs:33 serializes subagent notifications with agent_path and status.
  • ../codex/codex-rs/core/src/tools/handlers/multi_agents/wait.rs:178 builds wait_agent results from final AgentStatus entries.

Real behavior proof

  • Behavior addressed: Codex native subagent completion notification and wait_agent/transcript result can disagree when an early completed:null notification is delivered as (no output) before a later transcript final is available.
  • Real environment tested: macOS local OpenClaw source checkout in the dedicated worktree openclaw-91120, with repo dependencies installed and OpenClaw Codex plugin modules executed through Node/tsx.
  • Exact steps or command run after this patch:
    • node --import tsx --input-type=module terminal run importing extensions/codex/src/app-server/native-subagent-notification.ts and parsing real Codex native <subagent_notification> payloads.
    • node --import tsx --input-type=module terminal run importing extensions/codex/src/app-server/native-subagent-monitor.ts, creating a local Codex transcript under a temp codexHome, sending a real rawResponseItem/completed notification with completed:null, and printing the monitor delivery payload.
    • Supplemental checks: node scripts/run-vitest.mjs extensions/codex/src/app-server/native-subagent-notification.test.ts extensions/codex/src/app-server/native-subagent-monitor.test.ts; targeted oxfmt; targeted oxlint; pnpm tsgo:test:extensions; .agents/skills/autoreview/scripts/autoreview --mode local.
  • Evidence after fix: Terminal capture from the parser run:
OpenClaw Codex native subagent notification live output
[
  {
    "agentPath": "child-null",
    "status": "succeeded",
    "statusLabel": "completed_without_final_message",
    "result": "Codex native subagent completed without a final assistant message."
  },
  {
    "agentPath": "child-final",
    "status": "succeeded",
    "statusLabel": "completed",
    "result": "final research note"
  }
]
observed_null_status_label=completed_without_final_message
observed_null_result=Codex native subagent completed without a final assistant message.
observed_final_result=final research note

Terminal capture from the monitor/transcript run:

OpenClaw Codex native subagent monitor live output
{
  "deliveredResult": "transcript final beats early null notification",
  "deliveredStatusLabel": "task_complete",
  "deliveredStatus": "succeeded"
}

Supplemental verification output:

Vitest: Test Files 2 passed (2) / Tests 29 passed (29)
oxfmt targeted check: All matched files use the correct format.
targeted oxlint wrapper exited 0.
tsgo:test:extensions exited 0 after waiting on the local heavy-check lock.
autoreview rerun: autoreview clean: no accepted/actionable findings reported.
  • Observed result after fix: completed:null is no longer surfaced as (no output); it becomes statusLabel: "completed_without_final_message". When a transcript final exists, the parent delivery uses the transcript task_complete result (transcript final beats early null notification) instead of the early empty notification. If transcript reconciliation is unavailable, bounded fallback delivers the typed no-final reason rather than leaving the parent without a completion wakeup.
  • What was not tested: Live Telegram/Codex roundtrip with real native subagents was not run. Remote Crabbox/Testbox check:changed could not be started because this environment has no usable crabbox or blacksmith binary; node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox ... pnpm check:changed failed at wrapper sanity check before provisioning.

@openclaw-barnacle openclaw-barnacle Bot added extensions: codex size: M triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 7, 2026
@clawsweeper

clawsweeper Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Codex review: passed. Reviewed June 7, 2026, 10:21 PM ET / 02:21 UTC.

Summary
The branch updates the Codex plugin native subagent parser, monitor, and tests so successful null or blank completions get a typed no-final result and transcript reconciliation can override early empty notifications before fallback delivery.

PR surface: Source +92, Tests +176. Total +268 across 4 files.

Reproducibility: yes. at source level: current main maps successful null/blank Codex completions to (no output) and processes them before transcript reconciliation can recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Run a live Telegram/Codex native subagent roundtrip only if maintainers want transport-level proof before automerge.

Mantis proof suggestion
A live Telegram/Codex roundtrip would reduce the only remaining transport-level uncertainty around parent-visible completion delivery. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify that a Telegram parent receiving Codex native subagent completed:null uses transcript final text when available and shows the typed no-final reason otherwise.

Risk before merge

  • [P1] No live Telegram/Codex roundtrip is supplied, so the full transport-visible finalizer behavior is inferred from the Codex plugin monitor path and terminal proof.
  • [P1] The linked issue also asks for a stable diagnostic/history path for native subagent ids; this PR fixes the completion-result mismatch but does not implement that broader surface.

Maintainer options:

  1. Land the narrow delivery fix (recommended)
    Accept the remaining transport-proof risk, merge after required gates, and keep any diagnostic/history follow-up separate from this completion delivery repair.
  2. Ask for a live transport proof
    Before merging, request a live Telegram/Codex native subagent roundtrip that shows transcript-final recovery and typed no-final delivery in an actual chat flow.
  3. Pause for full linked-issue scope
    Pause this PR if maintainers want one change to also add the stable native-subagent diagnostic/history path requested by the linked issue.

Next step before merge

  • No ClawSweeper repair lane is needed; the patch has no blocking findings and should stay in normal automerge/maintainer gates.

Security
Cleared: The diff stays within Codex plugin parser/monitor code and tests, with no dependency, workflow, secret, package, or external code-execution surface changes.

Review details

Best possible solution:

Land the narrow Codex plugin delivery fix after required checks if maintainers accept module-level proof, and track the diagnostic/history acceptance criterion separately if it remains required.

Do we have a high-confidence way to reproduce the issue?

Yes at source level: current main maps successful null/blank Codex completions to (no output) and processes them before transcript reconciliation can recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Is this the best way to solve the issue?

Yes for the parser/monitor mismatch: keeping the fix inside the Codex plugin boundary is the narrowest maintainable path and the tests cover the changed behavior. It is not a complete solution for the linked issue's separate diagnostic-history request.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9caff5f873cd.

Label changes

Label changes:

  • add merge-risk: 🚨 message-delivery: The diff intentionally changes when and what Codex native subagent completions are delivered to parent agents, and no live transport roundtrip proves the full channel-visible path.

Label justifications:

  • P1: The PR targets a broken Codex native subagent completion workflow where parent agents can miss usable child output in real delegated-agent runs.
  • merge-risk: 🚨 message-delivery: The diff intentionally changes when and what Codex native subagent completions are delivered to parent agents, and no live transport roundtrip proves the full channel-visible path.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (terminal): The PR body includes after-fix terminal output from Node/tsx parser and monitor/transcript runs that exercise the changed modules with real Codex-shaped payloads.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal output from Node/tsx parser and monitor/transcript runs that exercise the changed modules with real Codex-shaped payloads.
Evidence reviewed

PR surface:

Source +92, Tests +176. Total +268 across 4 files.

View PR surface stats
Area Files Added Removed Net
Source 2 104 12 +92
Tests 2 176 0 +176
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 4 280 12 +268

What I checked:

Likely related people:

  • steipete: Current-main blame and log history in this checkout point the Codex native monitor/parser files to commit 6f2b383 by Peter Steinberger, which added the current files into this repository snapshot. (role: recent area contributor; confidence: medium; commits: 6f2b3830f128; files: extensions/codex/src/app-server/native-subagent-monitor.ts, extensions/codex/src/app-server/native-subagent-notification.ts)
  • vincentkoc: The linked issue's prior ClawSweeper review cites git blame/git show evidence tying the same Codex-native monitor/parser (no output) behavior to commit 3597cfc across these files. (role: likely introduced current behavior; confidence: medium; commits: 3597cfc7bc26; files: extensions/codex/src/app-server/native-subagent-monitor.ts, extensions/codex/src/app-server/native-subagent-notification.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 7, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. labels Jun 7, 2026
@Takhoffman

Copy link
Copy Markdown
Contributor

@clawsweeper automerge

@clawsweeper

clawsweeper Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper automerge is enabled.

  • Head: f9270c28e735
  • Label: clawsweeper:automerge
  • Action: exact-head review queued (workflow sweep.yml, event repository_dispatch).
  • Flow: review this head, repair/rebase only if needed, then re-review the exact repaired head before merge.

Draft PRs stay fix-only until GitHub marks them ready for review. Pause with /clawsweeper stop.

Automerge progress:

  • 2026-06-08 01:53:54 UTC review queued 60edfe6f32d6 (queued)
  • 2026-06-08 02:01:10 UTC review passed 60edfe6f32d6 (structured ClawSweeper verdict: pass (sha=60edfe6f32d6cbbb0b4a925be9952b97f1ab5...)
  • 2026-06-08 02:14:53 UTC review queued f9270c28e735 (after repair)
  • 2026-06-08 02:21:56 UTC review passed f9270c28e735 (structured ClawSweeper verdict: pass (sha=f9270c28e73548cca60aa499ca65ba1d94873...)
  • 2026-06-08 02:22:08 UTC merged f9270c28e735 (merged by ClawSweeper automerge)
  • 2026-06-08 02:22:11 UTC review queued f9270c28e735 (queued)

Re-review progress:

@clawsweeper clawsweeper Bot added clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 8, 2026
@clawsweeper clawsweeper Bot force-pushed the fix/91120-codex-native-subagent-result branch from 60edfe6 to f9270c2 Compare June 8, 2026 02:14
@clawsweeper clawsweeper Bot added the merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. label Jun 8, 2026
@clawsweeper clawsweeper Bot merged commit 766c5b3 into openclaw:main Jun 8, 2026
159 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 8, 2026
Summary:
- The branch updates the Codex plugin native subagent parser, monitor, and tests so successful null or blank c ... final result and transcript reconciliation can override early empty notifications before fallback delivery.
- PR surface: Source +92, Tests +176. Total +268 across 4 files.
- Reproducibility: yes. at source level: current main maps successful null/blank Codex completions to `(no out ... n recover final text. I did not run a live current-main Telegram/Codex reproduction in this read-only pass.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(codex): preserve native subagent completion results

Validation:
- ClawSweeper review passed for head f9270c2.
- Required merge gates passed before the squash merge.

Prepared head SHA: f9270c2
Review: openclaw#91235 (comment)

Co-authored-by: 宇宙熊Yzx <53250620+849261680@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge extensions: codex merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: M status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Codex native subagent completion can surface as no-output/null despite completed child result

2 participants