Skip to content

fix(gateway): resolve macOS App node shell command probing timeouts#83429

Closed
htjworld wants to merge 1 commit into
openclaw:mainfrom
htjworld:main
Closed

fix(gateway): resolve macOS App node shell command probing timeouts#83429
htjworld wants to merge 1 commit into
openclaw:mainfrom
htjworld:main

Conversation

@htjworld

Copy link
Copy Markdown

Summary

Resolves a performance regression where macOS App nodes experience 10-second timeout delays during shell command probing when co-located with the gateway. Since App nodes do not execute shell commands, system.run is excluded from their allowed command list, allowing the gateway to skip command probing entirely.

This change ensures that clientId and clientMode are correctly forwarded during both node connection and pairing reconciliation flows. It also corrects the E2E test validation in test/app-node-probe.e2e.test.ts to utilize the valid client mode "node" (conforming to GatewayClientModeSchema) in combination with the "openclaw-macos" client ID, which successfully and safely classifies the client as an App Node under the gateway command policy.

Related Issues

Real behavior proof

  • Behavior addressed: macOS App nodes experiencing 10-second timeout delays during gateway shell command probing.
  • Real environment tested: Local host environment using Vitest + Node.js 22.
  • Exact steps or command run after this patch:
pnpm test test/app-node-probe.e2e.test.ts
  • Evidence after fix:
✓ test/app-node-probe.e2e.test.ts (1 test) 4861ms
    ✓ does not timeout probing an app node  4814ms
Test Files  1 passed (1)
Tests       1 passed (1)
Duration    7.23s
  • Observed result after fix: The E2E test completed successfully under 5 seconds (4.8s total), confirming that the gateway skips shell probing entirely and does not assign system.run to App nodes.
  • What was not tested: Real macOS device hardware execution and full E2E pairing sequence via Crabbox.

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 18, 2026
@clawsweeper

clawsweeper Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR changes gateway node command policy to remove system-run commands from app-identified node allowlists, forwards client identity into those policy decisions, and adds a gateway app-node probe regression test.

Reproducibility: yes. Source inspection shows the gateway can start automatic remote-bin probing during node connect, and the linked issue’s live report confirms automatic probe timeout while manual system.which succeeds; I did not run live macOS hardware proof in this read-only review.

PR rating
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Summary: The PR is not merge-ready because the patch appears to break documented macOS app-node exec behavior and the supplied proof is test-only.

Rank-up moves:

  • Preserve remote macOS app-node system.run/system.which when declared, and move the fix to probe readiness/order or a co-location-specific skip.
  • Add redacted real macOS app plus gateway proof showing automatic connect-time probing no longer times out without disabling valid remote node exec.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs real behavior proof before merge: The PR body provides only a Vitest run against a synthetic gateway client; contributor still needs redacted real macOS app plus gateway proof before merge. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge
Why this matters: - Merging this PR can remove system.run, system.run.prepare, and system.which from valid remote macOS app-backed nodes during pairing or reconnect, making exec host=node fail for existing documented setups.

  • The PR closes the linked issue, but that issue has newer evidence for a first-connect readiness/timing problem that this allowlist removal does not solve.
  • Contributor proof is test-only; there is no redacted real macOS app plus gateway run showing the affected connect path after the patch.

Maintainer choices:

  1. (recommended) Fix the PR before merge with ClawSweeper automerge:
@clawsweeper automerge
Special instructions: Preserve app-backed node `system.run` and `system.which` when the node declares and handles them; fix the timeout by gating or deferring automatic remote-bin refresh until the app-backed node is ready, or by skipping only a proven co-located non-exec case without changing remote node command approval.
  1. Accept the merge risk explicitly:
Merge as-is only if maintainers intentionally accept this risk: Preserve app-backed node `system.run` and `system.which` when the node declares and handles them; fix the timeout by gating or deferring automatic remote-bin refresh until the app-backed node is ready, or by skipping only a proven co-located non-exec case without changing remote node command approval.
  1. Stop automation and require a human merge decision:
Do not automerge this PR. Ask a maintainer to decide whether to require changes, merge anyway, or close the PR as not worth the risk: Preserve app-backed node `system.run` and `system.which` when the node declares and handles them; fix the timeout by gating or deferring automatic remote-bin refresh until the app-backed node is ready, or by skipping only a proven co-located non-exec case without changing remote node command approval.

Next step before merge
This external PR needs human-guided revision plus contributor real behavior proof; automation should not take over while the supplied proof is test-only and the fix direction disables a documented command surface.

Security
Cleared: The diff does not add dependencies, workflows, package scripts, secret handling, or new code-execution sources; the exec impact is a compatibility regression, not a supply-chain concern.

Review findings

  • [P1] Preserve declared macOS app-node exec commands — src/gateway/node-command-policy.ts:354
Review details

Best possible solution:

Preserve app-backed node system.run and system.which when the node declares and handles them; fix the timeout by gating or deferring automatic remote-bin refresh until the app-backed node is ready, or by skipping only a proven co-located non-exec case without changing remote node command approval.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows the gateway can start automatic remote-bin probing during node connect, and the linked issue’s live report confirms automatic probe timeout while manual system.which succeeds; I did not run live macOS hardware proof in this read-only review.

Is this the best way to solve the issue?

No. The proposed path disables the documented macOS app-node exec surface, while the narrower maintainable solution is a readiness/order or co-location-specific probe fix that preserves valid remote app-node execution.

Full review comments:

  • [P1] Preserve declared macOS app-node exec commands — src/gateway/node-command-policy.ts:354
    This deletes system.run, system.run.prepare, and system.which for every node identified as openclaw-macos/openclaw-windows or clientMode: "app". Current main’s macOS app declares and handles system.run/system.which, docs describe macOS node mode as an exec host, and exec host=node requires nodeInfo.commands to include system.run; valid remote app-backed nodes would stop working instead of fixing the connect-time probe readiness bug.
    Confidence: 0.91

Overall correctness: patch is incorrect
Overall confidence: 0.9

Acceptance criteria:

  • node scripts/run-vitest.mjs src/gateway/node-command-policy.test.ts src/infra/skills-remote.test.ts test/app-node-probe.e2e.test.ts
  • Redacted live macOS app plus gateway proof showing connect-time probe behavior and exec host=node/system.which still work

What I checked:

Likely related people:

  • steipete: Blame and live commit metadata show recent ownership of node-command-policy.ts, the connect-time remote-bin probe call, and skills-remote.ts probe changes. (role: current area contributor; confidence: high; commits: bef33563756f, 84b34519a8d2, f6317fb747f5; files: src/gateway/node-command-policy.ts, src/gateway/server/ws-connection/message-handler.ts, src/infra/skills-remote.ts)
  • pgondhi987: Recent commit history for node-command-policy.ts includes canonical node platform ID policy work, which is directly adjacent to this PR’s app/client identity policy change. (role: adjacent node policy contributor; confidence: medium; commits: d656087b3156; files: src/gateway/node-command-policy.ts, src/gateway/server/ws-connection/message-handler.ts)
  • Takhoffman: Recent remote-skills history includes snapshot invalidation work in the same probe/cache subsystem affected by the linked timeout issue. (role: adjacent remote skills contributor; confidence: medium; commits: 39048e054de3; files: src/infra/skills-remote.ts, src/infra/skills-remote.test.ts)
  • shanselman: The PR also special-cases openclaw-windows; recent node-command-policy history includes Windows companion node command work. (role: adjacent Windows companion node contributor; confidence: medium; commits: 8f277e4b7f4f; files: src/gateway/node-command-policy.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against cd15ce35a0ee.

@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. P2 Normal backlog priority with limited blast radius. impact:message-loss Channel message delivery can be lost, duplicated, or misrouted. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels May 18, 2026
@vincentkoc

Copy link
Copy Markdown
Member

Closing this as not planned for the remote mac/node exec lane.

The risky bit is that this removes the app-node system.run / system.which command surface from macOS/Windows app nodes. That conflicts with the existing node exec contract: host=node requires the selected node to support system.run, and the macOS app node is expected to expose that command path.

The safer fix direction is #85417: keep the app-node command contract intact, and make Codex app-server node-targeted sessions route through OpenClaw exec / process instead of Codex native shell.

If the App node probe timeout still reproduces after that lands, the follow-up should be a narrower probe fix that does not remove declared node execution commands.

@vincentkoc vincentkoc closed this May 22, 2026
vincentkoc added a commit that referenced this pull request May 22, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes #85090 and closes out #83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Fixes https://github.com/openclaw/openclaw/issues/85012.\n\nSupersedes openclaw#85090 and closes out openclaw#83429 as the wrong direction.\n\nVerification before merge:\n- git diff --check origin/main\n- node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts\n- codex review --base origin/main\n\nNote: the GitHub Real behavior proof check on this maintainer PR was a maintainer bypass, not the live Linux gateway/container plus macOS node proof. User approved merge with this caveat preserved on the PR thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime impact:message-loss Channel message delivery can be lost, duplicated, or misrouted. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remote skill bin probe times out when node is co-located with gateway (macOS app as node on same host)

2 participants