Skip to content

fix(acpx): surface Codex wrapper stderr on internal errors#80718

Merged
steipete merged 1 commit into
openclaw:mainfrom
leoge007:fix/acp-codex-diagnostics
May 14, 2026
Merged

fix(acpx): surface Codex wrapper stderr on internal errors#80718
steipete merged 1 commit into
openclaw:mainfrom
leoge007:fix/acp-codex-diagnostics

Conversation

@leoge007

@leoge007 leoge007 commented May 11, 2026

Copy link
Copy Markdown
Contributor

OpenClaw ACP Codex Runtime Fix

Related issue:

https://github.com/openclaw/openclaw/issues/80079

Bug

Direct acpx -> Codex can work while OpenClaw sessions_spawn(runtime="acp", agentId="codex") fails before Codex output with only:

AcpRuntimeError: Internal error
Result: no output
Tokens: 0 in / 0 out

That hides the real Codex/acpx stderr, so provider/config failures are not actionable from OpenClaw.

Root Cause

OpenClaw does not invoke exactly the same path as direct acpx.

Direct path:

acpx CLI -> Codex ACP adapter

OpenClaw path:

sessions_spawn -> src/agents/acp-spawn.ts -> ACP manager -> extensions/acpx runtime -> generated Codex wrapper -> Codex ACP adapter

The generated Codex ACP wrapper used isolated CODEX_HOME. Generic adapter failures surfaced as Internal error, while the underlying wrapper stderr was only piped and not captured for the ACP error path.

Fix

  • Capture generated Codex wrapper stderr to a per-lease log while preserving stderr mirroring.
  • Redact diagnostics before surfacing or persisting them.
  • Append redacted stderr tails to generic Codex ACP Internal error failures during session init, thrown first-turn failures, and terminal runTurn() error events.
  • Copy only safe Codex config routing keys into isolated CODEX_HOME, including model/provider selection but not auth, headers, query params, hooks, or notifications.
  • Preserve process lease identity on Windows through wrapper args.
  • Assert ACP metadata/session initialization dispatch order.
  • Document the new diagnostic behavior.

Changed Files

CHANGELOG.md
docs/tools/acp-agents.md
extensions/acpx/src/codex-auth-bridge.ts
extensions/acpx/src/codex-auth-bridge.test.ts
extensions/acpx/src/codex-trust-config.ts
extensions/acpx/src/process-lease.ts
extensions/acpx/src/process-lease.test.ts
extensions/acpx/src/runtime.ts
extensions/acpx/src/runtime.test.ts
src/agents/acp-spawn.test.ts

Real behavior proof

Behavior addressed: Codex ACP failures surfaced only as AcpRuntimeError: Internal error with no underlying stderr, making configuration/provider issues hard to diagnose.

Real environment tested: macOS/arm64 source checkout with @openclaw/acpx tests and agent-core ACP spawn coverage.

Exact steps or command run after this patch: Ran focused acpx runtime, auth-bridge, process-lease, and acp-spawn tests, docs formatting, and whitespace checks.

Evidence after fix:

OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs extensions/acpx/src/runtime.test.ts extensions/acpx/src/codex-auth-bridge.test.ts extensions/acpx/src/process-lease.test.ts src/agents/acp-spawn.test.ts
5 files passed, 167 tests passed

node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-acpx.config.ts extensions/acpx/src/runtime.test.ts extensions/acpx/src/codex-auth-bridge.test.ts extensions/acpx/src/process-lease.test.ts
3 files passed, 57 tests passed

node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-core.config.ts src/agents/acp-spawn.test.ts
1 file passed, 55 tests passed

git diff --check
passed

pnpm docs:list
passed

pnpm format:docs:check
Docs formatting clean (625 files).

Observed result after fix: Tests cover redacted stderr diagnostics for generic Codex ACP failures, persisted log redaction, safe Codex config bridging, Windows wrapper lease args, and ACP parent stream dispatch order.

What was not tested: A live installed gateway package running this PR branch against sessions_spawn(runtime="acp", agentId="codex"); local proof stayed at the source-level ACP runtime and agent spawn boundaries.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling extensions: acpx size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 11, 2026
@clawsweeper

clawsweeper Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR captures and redacts Codex ACP wrapper stderr, decorates generic ACP internal errors, preserves selected Codex config in isolated CODEX_HOME, and updates tests, docs, and changelog.

Reproducibility: yes. at source level: current main's generated Codex wrapper does not retain stderr and ACPX ensureSession/runTurn paths pass through generic Internal error failures without a retained stderr tail. I did not establish a live installed-gateway reproduction in this read-only review.

Real behavior proof
Needs real behavior proof before merge: Needs real behavior proof before merge: the PR body supplies focused test output only and explicitly says the installed gateway path was not tested; add a redacted terminal/log/recording proof and update the PR body so ClawSweeper re-reviews automatically, or ask a maintainer to comment @clawsweeper re-review if it does not. Redact private information such as IP addresses, API keys, phone numbers, non-public endpoints, and other private details.

Next step before merge
Human/contributor action is needed to address the sandbox inheritance finding and supply redacted installed-path behavior proof; automation cannot provide the contributor's real setup proof.

Security
Needs attention: The stderr capture is bounded and redacted, but the diff also copies Codex sandbox_mode into isolated ACP config, which can silently broaden child process permissions.

Review findings

  • [P1] Stop inheriting Codex sandbox policy — extensions/acpx/src/codex-trust-config.ts:166
Review details

Best possible solution:

Keep the lease-scoped redacted stderr-tail design, remove or explicitly justify sandbox policy inheritance from isolated Codex config, then land after redacted installed-path proof shows the actionable stderr tail in the sessions_spawn flow.

Do we have a high-confidence way to reproduce the issue?

Yes at source level: current main's generated Codex wrapper does not retain stderr and ACPX ensureSession/runTurn paths pass through generic Internal error failures without a retained stderr tail. I did not establish a live installed-gateway reproduction in this read-only review.

Is this the best way to solve the issue?

Mostly yes, but not as submitted: lease-scoped redacted stderr tails are a narrow maintainable fix for opaque generic Codex ACP failures. The sandbox policy inheritance should be removed or handled through an explicit OpenClaw-owned policy contract before merge.

Full review comments:

  • [P1] Stop inheriting Codex sandbox policy — extensions/acpx/src/codex-trust-config.ts:166
    This whitelist is described as copying safe model/provider routing into the isolated CODEX_HOME, but sandbox_mode is execution policy, not routing. If a host Codex config uses a broader value such as danger-full-access, ACP child sessions now silently inherit that permission posture even though the current ACP docs say isolated Codex config only carries trusted project entries. Keep the model/provider keys, but leave sandbox policy out unless OpenClaw has an explicit policy path for it.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

Security concerns:

  • [medium] Sandbox policy is inherited as routing config — extensions/acpx/src/codex-trust-config.ts:166
    Adding sandbox_mode to the inherited Codex config whitelist moves host Codex execution policy into ACP child sessions. That can change sandbox strength without an explicit OpenClaw policy decision, so it should be removed or separately approved and documented.
    Confidence: 0.86

What I checked:

Likely related people:

  • steipete: Recent current-main history repeatedly touches ACPX runtime, Codex wrapper/auth isolation, process leases, and bundled ACP adapter behavior central to this PR; the current PR head commit is also authored by this account with contributor co-author credit. (role: recent ACPX area contributor; confidence: high; commits: 42ecd5d95eae, d3e4640bedf5, 137a3629cc7f; files: extensions/acpx/src/runtime.ts, extensions/acpx/src/codex-auth-bridge.ts, extensions/acpx/src/process-lease.ts)
  • vincentkoc: Recent merged history includes ACP-only field handling at the sessions_spawn boundary adjacent to the acp-spawn behavior and tests touched here. (role: sessions_spawn ACP contributor; confidence: medium; commits: efc3a52947e9; files: src/agents/acp-spawn.ts)
  • hexsprite: Recent merged history preserved ACP runtime error detail across lifecycle boundaries, adjacent to this PR's diagnostic error-surfacing behavior. (role: adjacent ACP error propagation contributor; confidence: medium; commits: 86c1622a3a3c; files: src/agents/acp-spawn.ts, src/acp/runtime/errors.test.ts)

Remaining risk / open question:

  • No redacted installed-gateway proof currently shows the patched sessions_spawn(runtime="acp", agentId="codex") path surfacing the stderr tail.
  • The linked bug also includes Discord reply delivery behavior, which this diagnostic PR does not by itself prove fixed.
  • The sandbox_mode inheritance needs removal or explicit maintainer approval because it changes ACP child process permission policy.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 926bf66ee336.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 11, 2026
@leoge007 leoge007 force-pushed the fix/acp-codex-diagnostics branch from e1b0e87 to 51ef17e Compare May 12, 2026 12:04
@steipete steipete force-pushed the fix/acp-codex-diagnostics branch from 0af9469 to d188047 Compare May 14, 2026 19:19
@steipete steipete force-pushed the fix/acp-codex-diagnostics branch from d188047 to f62f429 Compare May 14, 2026 20:39
@openclaw-barnacle openclaw-barnacle Bot added the docs Improvements or additions to documentation label May 14, 2026
@steipete steipete force-pushed the fix/acp-codex-diagnostics branch 3 times, most recently from 1da0cdd to 25ff8cf Compare May 14, 2026 21:11
Co-authored-by: leoge007 <leoge@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation extensions: acpx proof: supplied External PR includes structured after-fix real behavior proof. size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants