fix(codex): route node exec through OpenClaw tools #85417

Merged
vincentkoc merged 1 commit into main from remote-mac-node-exec-proof-20260522
May 22, 2026
Conversation

@vincentkoc

Member

Summary

  • Disable Codex app-server native Code Mode when the effective OpenClaw exec host is node.
  • Re-expose OpenClaw exec and process dynamic tools for node-targeted Codex app-server runs so shell commands route through the selected node.
  • Add regression coverage and a changelog entry crediting the original contributor's direction in #85090 (fix(codex): disable native shell for node exec sessions).

Root Cause

/exec host=node updated the OpenClaw exec defaults, but the Codex app-server native shell remained enabled and still executed commands in the app-server/gateway environment. The node-aware OpenClaw exec path existed, but Codex could bypass it through its native shell surface.
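
The routing decision described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ExecHost`, `ToolSurface`, and `resolveToolSurface` are hypothetical names, and `"gateway"` stands in for any non-node exec host.

```typescript
// Hypothetical types; the real OpenClaw/Codex names may differ.
type ExecHost = "gateway" | "node";

interface ToolSurface {
  nativeCodeMode: boolean; // whether Codex app-server may use its native shell
  dynamicTools: string[];  // OpenClaw dynamic tools exposed to the run
}

// Decide the tool surface from the effective OpenClaw exec host.
function resolveToolSurface(effectiveHost: ExecHost, baseTools: string[]): ToolSurface {
  if (effectiveHost !== "node") {
    // Assumed unchanged behavior: native Code Mode stays on and the
    // OpenClaw exec/process tools are filtered out.
    return {
      nativeCodeMode: true,
      dynamicTools: baseTools.filter((t) => t !== "exec" && t !== "process"),
    };
  }
  // Node-targeted run: disable the native shell and re-expose exec/process
  // so shell commands go through the node-aware OpenClaw path.
  const tools = baseTools.slice();
  for (const toolName of ["exec", "process"]) {
    if (tools.indexOf(toolName) === -1) tools.push(toolName);
  }
  return { nativeCodeMode: false, dynamicTools: tools };
}
```

In this sketch the native shell and the OpenClaw shell tools are never exposed together, so a node-targeted session has a single shell surface.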

Validation

  • git diff --check origin/main
  • node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts (206 passed)
  • codex review --base origin/main (no actionable regressions)

Blocked in this shell:

  • Crabbox mac live proof: local Crabbox has no configured coordinator.
  • Testbox/changed gate: Blacksmith is not authenticated in this environment.

Follow-up Proof Required Before Merge

Run a real Linux gateway/container plus connected macOS node proof:

/exec host=node node=<mac-node> security=full ask=off
uname -s && hostname && pwd && whoami

Expected result: the visible shell command reports the selected macOS node environment, including Darwin, not the Linux gateway/container.
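
A captured proof transcript could be checked mechanically along these lines. This is only a sketch: `parseProofOutput` and `ranOnMacNode` are hypothetical helpers, and it assumes the four commands each print exactly one line, in order.

```typescript
// Hypothetical helper for output of: uname -s && hostname && pwd && whoami
interface ProofReport {
  kernel: string;
  hostname: string;
  cwd: string;
  user: string;
}

// Split the captured output into the four expected fields.
function parseProofOutput(output: string): ProofReport | null {
  const lines = output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length < 4) return null; // incomplete transcript
  const [kernel, hostname, cwd, user] = lines;
  return { kernel, hostname, cwd, user };
}

// `uname -s` prints "Darwin" on macOS and "Linux" on a Linux gateway/container.
function ranOnMacNode(output: string): boolean {
  const report = parseProofOutput(output);
  return report !== null && report.kernel === "Darwin";
}
```

The kernel name alone only shows that the command ran on some macOS host; the hostname line still has to be compared with the selected node by whoever reviews the proof.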

Fixes #85012.
Supersedes #85090.

@vincentkoc vincentkoc self-assigned this May 22, 2026
@vincentkoc vincentkoc added extensions: codex size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. impact:security Security boundary, credential, authz, sandbox, or sensitive-data risk. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. labels May 22, 2026
@openclaw-barnacle openclaw-barnacle Bot added size: S maintainer Maintainer-authored PR and removed size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. impact:security Security boundary, credential, authz, sandbox, or sensitive-data risk. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 22, 2026
@vincentkoc vincentkoc added status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. impact:security Security boundary, credential, authz, sandbox, or sensitive-data risk. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. labels May 22, 2026
@vincentkoc

Member Author

Verification note: the green "Real behavior proof" GitHub check on this maintainer PR reflects a maintainer bypass, not the requested Linux gateway/container plus macOS node proof. The job log says the PR author is an active maintainer and the proof gate was skipped.

Keeping this as a draft until a real node-routing proof is attached or run:

/exec host=node node=<mac-node> security=full ask=off
uname -s && hostname && pwd && whoami

Expected: the visible shell reports the selected macOS node environment, including Darwin.

@clawsweeper

clawsweeper Bot commented May 22, 2026

Contributor

Codex review: needs maintainer review before merge.

Latest ClawSweeper review: 2026-05-22 15:19 UTC / May 22, 2026, 11:19 AM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR disables Codex app-server native Code Mode for node-targeted exec sessions, re-adds OpenClaw exec/process dynamic tools for that case, and adds focused regression tests plus a changelog entry.

Reproducibility: yes. source-reproducible: current main receives session /exec state as execOverrides, but Codex app-server still enables native Code Mode and filters OpenClaw exec/process out. I did not run the live Linux-to-macOS node scenario.

PR rating
Overall: 🦐 gold shrimp
Proof: 🦐 gold shrimp
Patch quality: 🐚 platinum hermit
Summary: The patch is focused and source-backed with good regression coverage, but reviewer confidence is capped by the acknowledged missing live node-routing proof.

Rank-up moves:

  • Attach redacted terminal or log proof for /exec host=node node=<mac-node> security=full ask=off followed by uname -s && hostname && pwd && whoami, showing the macOS node environment.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Not applicable: This is a maintainer PR, so the external-contributor proof gate does not apply; the PR discussion itself still asks for live Linux gateway/container plus macOS node proof before merge.

Risk before merge

  • Live cross-host proof is still missing: the PR should show the visible shell command running through a Linux gateway/container to the selected macOS node and reporting Darwin, not Linux.
  • The change intentionally disables Codex native Code Mode for node-targeted app-server sessions; that is likely the right boundary, but it can affect users who expected native Code Mode while tools.exec.host or the session default is node.
  • Because the patch changes command execution routing, green unit tests do not settle the security-boundary question that the selected node, approvals, and shell environment are the actual runtime path.

Maintainer options:

  1. Attach live node-routing proof (recommended)
    Run the documented Linux gateway/container plus connected macOS node command and attach redacted terminal or log output showing the selected node reports Darwin.
  2. Accept the native Code Mode tradeoff
    Maintainers may intentionally merge with unit/source proof only, but should explicitly own that node-targeted Codex app-server sessions lose native Code Mode.
  3. Keep the draft paused
    If no real cross-host setup is available, keep this draft open and continue tracking the user bug at [Bug]: Agent shell tool ignores /exec host=node and still runs in container #85012 rather than landing an unproven routing change.

Next step before merge
This protected maintainer draft needs live cross-host proof or an explicit maintainer risk decision, not an automated code repair.

Security
Cleared: The diff changes command execution routing but reuses existing OpenClaw exec/process tools, honors explicit dynamic-tool excludes, and does not add dependencies, scripts, secrets handling, or new network code.
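
One way to read "honors explicit dynamic-tool excludes" is that excludes are applied after the shell tools are re-added, so an exclude always wins. The sketch below illustrates that ordering only; `NODE_SHELL_TOOLS` and `nodeDynamicTools` are hypothetical names, and the actual precedence in the diff was not verified here.

```typescript
// Hypothetical sketch: names are illustrative, not the PR's actual code.
const NODE_SHELL_TOOLS = ["exec", "process"];

// Re-expose the OpenClaw shell tools for a node-targeted run, then apply
// explicit excludes last so that an excluded tool is never exposed.
function nodeDynamicTools(baseTools: string[], excludes: string[]): string[] {
  const tools = baseTools.slice();
  for (const toolName of NODE_SHELL_TOOLS) {
    if (tools.indexOf(toolName) === -1) tools.push(toolName);
  }
  return tools.filter((toolName) => excludes.indexOf(toolName) === -1);
}
```

For example, with `process` explicitly excluded, a node-targeted run in this sketch would expose `exec` but not `process`.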

Review details

Best possible solution:

Keep this PR open and land it only after live node-routing proof is attached or maintainers explicitly accept the compatibility and command-routing risk for node-targeted Codex app-server sessions.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current main receives session /exec state as execOverrides, but Codex app-server still enables native Code Mode and filters OpenClaw exec/process out. I did not run the live Linux-to-macOS node scenario.

Is this the best way to solve the issue?

Yes, the proposed direction is the narrowest owner-boundary fix: disable the native Codex shell only when the effective OpenClaw exec host is node and expose the existing OpenClaw shell tools. The remaining gap is live upgrade/runtime proof, not a different code path.

Label changes:

  • add P1: The linked bug breaks an explicit multi-node shell workflow and can send commands to the wrong host for real Codex users.
  • add merge-risk: 🚨 compatibility: Merging changes node-targeted Codex app-server sessions from native Code Mode to OpenClaw dynamic shell tools.
  • add merge-risk: 🚨 security-boundary: The diff changes which execution surface runs shell commands and needs live proof that the selected node and OpenClaw approval path are used.
  • add rating: 🦐 gold shrimp: Current PR rating is 🦐 gold shrimp because proof is 🦐 gold shrimp and patch quality is 🐚 platinum hermit. The patch is focused and source-backed with good regression coverage, but reviewer confidence is capped by the acknowledged missing live node-routing proof.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: This is a maintainer PR, so the external-contributor proof gate does not apply; the PR discussion itself still asks for live Linux gateway/container plus macOS node proof before merge.
  • remove impact:session-state: Current review selected no impact labels.
  • remove impact:security: Current review selected no impact labels.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P1: The linked bug breaks an explicit multi-node shell workflow and can send commands to the wrong host for real Codex users.
  • merge-risk: 🚨 compatibility: Merging changes node-targeted Codex app-server sessions from native Code Mode to OpenClaw dynamic shell tools.
  • merge-risk: 🚨 security-boundary: The diff changes which execution surface runs shell commands and needs live proof that the selected node and OpenClaw approval path are used.
  • rating: 🦐 gold shrimp: Current PR rating is 🦐 gold shrimp because proof is 🦐 gold shrimp and patch quality is 🐚 platinum hermit. The patch is focused and source-backed with good regression coverage, but reviewer confidence is capped by the acknowledged missing live node-routing proof.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: This is a maintainer PR, so the external-contributor proof gate does not apply; the PR discussion itself still asks for live Linux gateway/container plus macOS node proof before merge.

Acceptance criteria:

  • Run a real Linux gateway/container plus connected macOS node proof: /exec host=node node=<mac-node> security=full ask=off, then uname -s && hostname && pwd && whoami, expecting the selected macOS node environment and Darwin.

What I checked:

Likely related people:

  • vincentkoc: Current-main blame for the relevant dynamic-tool filtering and native-surface path points to commit 01e7f64, and this PR modifies the same surface. (role: recent Codex app-server tool-surface contributor; confidence: high; commits: 01e7f6462910, 4e9721700492; files: extensions/codex/src/app-server/dynamic-tool-profile.ts, extensions/codex/src/app-server/run-attempt.ts)
  • steipete: Local history shows repeated foundational and hardening commits on extensions/codex/src/app-server/run-attempt.ts, including app-server controls and lifecycle refactors. (role: major Codex app-server feature-history owner; confidence: medium; commits: 31a0b7bd42a5, 8d72aafdbb8d, 9ac7a0398213; files: extensions/codex/src/app-server/run-attempt.ts)
  • Bryan P: Recent history shows a same-file Codex app-server fix for native subagent completions, making this person a plausible adjacent reviewer for app-server runtime interactions. (role: recent adjacent contributor; confidence: low; commits: f9d35dc68180; files: extensions/codex/src/app-server/run-attempt.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8fc48af09190.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. impact:security Security boundary, credential, authz, sandbox, or sensitive-data risk. labels May 22, 2026
@clawsweeper clawsweeper Bot added merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. labels May 22, 2026
@clawsweeper

clawsweeper Bot commented May 22, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Tiny Clawlet

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: hums during re-review.
Image traits: location green-check meadow; accessory green check lantern; palette seafoam, black, and opal; mood bright-eyed; pose balancing on a branch marker; shell glossy opal shell; lighting gentle morning glow; background smooth stones and checkmarks.
Copy: My PR egg hatched a 🥚 common Tiny Clawlet in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@vincentkoc vincentkoc marked this pull request as ready for review May 22, 2026 15:43
@vincentkoc vincentkoc merged commit 5cc0dbc into main May 22, 2026
174 of 196 checks passed
@vincentkoc vincentkoc deleted the remote-mac-node-exec-proof-20260522 branch May 22, 2026 15:43

Labels

  • extensions: codex
  • maintainer (Maintainer-authored PR)
  • merge-risk: 🚨 compatibility (🚨 May break existing users, config, migrations, defaults, or upgrade paths.)
  • merge-risk: 🚨 security-boundary (🚨 May affect sandboxing, authorization, credentials, or sensitive data.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • rating: 🦐 gold shrimp (Decent PR readiness signal, but merge confidence is limited.)
  • size: S
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
  • triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Agent shell tool ignores /exec host=node and still runs in container

1 participant