fix(cli): guard OpenShell ssh-config calls#3859
Conversation
📝 WalkthroughWalkthroughExtracts sandbox-aware SSH-config capture into new client/runtime helpers, centralizes stdout+stderr handling, refactors callers/tests/state/diagnostics to use the helper, and updates CLI docs to prefer ChangesSSH Config Capture Extraction and Refactoring
Sequence Diagram (high level): sequenceDiagram
participant Caller as Caller (process-recovery / skill-install / version / state)
participant Runtime as src/lib/adapters/openshell/runtime.ts
participant Client as src/lib/adapters/openshell/client.ts
participant Openshell as openshell CLI
Caller->>Runtime: captureSandboxSshConfig(sandboxName, opts)
Runtime->>Client: captureSandboxSshConfigCommand(binary, sandboxName, opts)
Client->>Openshell: "sandbox get <name>" (ignoreError, includeStderr)
alt sandbox exists
Client->>Openshell: "sandbox ssh-config <name>"
Openshell-->>Client: ssh-config stdout
Client-->>Runtime: ssh-config output
Runtime-->>Caller: ssh-config output
else sandbox missing or get failed
Openshell-->>Client: get error output
Client-->>Runtime: "sandbox '<name>' not found" (or get error)
Runtime-->>Caller: error output
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/adapters/openshell/client.ts`:
- Around line 183-187: The current code always replaces an empty output for any
non-zero sandboxGet.status with "sandbox '<sandboxName>' not found", which hides
other failure modes; change the logic in the sandbox retrieval branch so you
only synthesize the "not found" message for a proven not-found condition (e.g.,
a specific exit code or when stderr/output explicitly contains "not found"), and
for any other non-zero status preserve sandboxGet.output (or include
sandboxGet.error/exit code in the returned object) instead of overwriting it;
update the handling around sandboxGet, sandboxName and sandboxGet.status to
return the original result for transient or timeout errors and only emit the
not-found string when you can definitively detect that condition.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 56954f5f-d7fc-4262-8804-8413bb15e52b
📒 Files selected for processing (11)
docs/reference/cli-selection-guide.mdxsrc/lib/actions/sandbox/process-recovery.tssrc/lib/actions/sandbox/skill-install.tssrc/lib/adapters/openshell/client.test.tssrc/lib/adapters/openshell/client.tssrc/lib/adapters/openshell/runtime.tssrc/lib/diagnostics/debug.tssrc/lib/sandbox/version.test.tssrc/lib/sandbox/version.tssrc/lib/share-command-deps.tssrc/lib/state/sandbox.ts
28e1c46 to
d2da090
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/adapters/openshell/client.ts`:
- Around line 35-37: The async capture path currently ignores includeStderr when
ignoreError is true, so update the async output-building logic (the code
handling CaptureOpenshellAsyncOptions / the function that assembles async
results) to respect the includeStderr flag even if ignoreError === true: when
includeStderr is true, append or merge stderr into the returned output/error
payload the same way the sync path does. Locate the code that checks ignoreError
and constructs the async response (references: CaptureOpenshellAsyncOptions,
includeStderr, ignoreError) and change the conditional so stderr is
preserved/attached when includeStderr is set.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 38ab85a1-548c-41a5-bdfa-df2508563f4c
📒 Files selected for processing (11)
docs/reference/cli-selection-guide.mdxsrc/lib/actions/sandbox/process-recovery.tssrc/lib/actions/sandbox/skill-install.tssrc/lib/adapters/openshell/client.test.tssrc/lib/adapters/openshell/client.tssrc/lib/adapters/openshell/runtime.tssrc/lib/diagnostics/debug.tssrc/lib/sandbox/version.test.tssrc/lib/sandbox/version.tssrc/lib/share-command-deps.tssrc/lib/state/sandbox.ts
✅ Files skipped from review due to trivial changes (1)
- docs/reference/cli-selection-guide.mdx
d2da090 to
96d0f41
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/lib/state/sandbox.ts (1)
472-476: Run the state lifecycle E2E set for this change path.Since this touches sandbox backup/restore SSH-config retrieval, please run
state-backup-restore-e2e,snapshot-commands-e2e, andrebuild-openclaw-e2eon this branch before merge.As per coding guidelines: "
src/lib/state/sandbox.ts: This file manages sandbox state (backup, restore, rebuild, snapshot). Changes affect data persistence across sandbox lifecycle operations." and the listed E2E test recommendation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/state/sandbox.ts` around lines 472 - 476, This change touches sandbox SSH-config retrieval in src/lib/state/sandbox.ts (specifically the call to captureSandboxSshConfigCommand and the result handling) and requires running the full state lifecycle E2E suites before merging: execute state-backup-restore-e2e, snapshot-commands-e2e, and rebuild-openclaw-e2e on this branch to validate backup/restore/snapshot/rebuild behavior; if any failures appear, reproduce them locally while inspecting captureSandboxSshConfigCommand and related backup/restore logic in sandbox.ts to determine whether the SSH-config fallback (ignoreError: true) or result.status handling should be adjusted and then re-run the E2Es until all pass.src/lib/sandbox/version.test.ts (1)
31-31: ⚡ Quick winRemove unused mock for
captureOpenshellCommand.The mock for
captureOpenshellCommandis no longer used after the refactor tocaptureSandboxSshConfigCommand. All tests now reference the new helper. Removing this line will prevent confusion for future maintainers.🧹 Proposed cleanup
versionGte: (left = "0.0.0", right = "0.0.0") => { const lhs = String(left).split(".").map((p) => parseInt(p, 10) || 0); const rhs = String(right).split(".").map((p) => parseInt(p, 10) || 0); const length = Math.max(lhs.length, rhs.length); for (let i = 0; i < length; i++) { const a = lhs[i] || 0; const b = rhs[i] || 0; if (a > b) return true; if (a < b) return false; } return true; }, - captureOpenshellCommand: vi.fn(), captureSandboxSshConfigCommand: vi.fn(), }));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/sandbox/version.test.ts` at line 31, Remove the unused mock property captureOpenshellCommand from the test mocks in version.test.ts: locate the mock object that currently defines captureOpenshellCommand: vi.fn() and delete that entry so only the active helper (captureSandboxSshConfigCommand) remains; ensure no other tests reference captureOpenshellCommand and run the test suite to confirm no breakage.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/sandbox/version.test.ts`:
- Line 31: Remove the unused mock property captureOpenshellCommand from the test
mocks in version.test.ts: locate the mock object that currently defines
captureOpenshellCommand: vi.fn() and delete that entry so only the active helper
(captureSandboxSshConfigCommand) remains; ensure no other tests reference
captureOpenshellCommand and run the test suite to confirm no breakage.
In `@src/lib/state/sandbox.ts`:
- Around line 472-476: This change touches sandbox SSH-config retrieval in
src/lib/state/sandbox.ts (specifically the call to
captureSandboxSshConfigCommand and the result handling) and requires running the
full state lifecycle E2E suites before merging: execute
state-backup-restore-e2e, snapshot-commands-e2e, and rebuild-openclaw-e2e on
this branch to validate backup/restore/snapshot/rebuild behavior; if any
failures appear, reproduce them locally while inspecting
captureSandboxSshConfigCommand and related backup/restore logic in sandbox.ts to
determine whether the SSH-config fallback (ignoreError: true) or result.status
handling should be adjusted and then re-run the E2Es until all pass.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1e258949-de29-47f2-ad5e-eaba572ef30b
📒 Files selected for processing (11)
docs/reference/cli-selection-guide.mdxsrc/lib/actions/sandbox/process-recovery.tssrc/lib/actions/sandbox/skill-install.tssrc/lib/adapters/openshell/client.test.tssrc/lib/adapters/openshell/client.tssrc/lib/adapters/openshell/runtime.tssrc/lib/diagnostics/debug.tssrc/lib/sandbox/version.test.tssrc/lib/sandbox/version.tssrc/lib/share-command-deps.tssrc/lib/state/sandbox.ts
✅ Files skipped from review due to trivial changes (1)
- docs/reference/cli-selection-guide.mdx
96d0f41 to
66a54f1
Compare
|
CodeRabbit follow-up:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lib/actions/sandbox/skill-install.ts (1)
171-173: ⚡ Quick winAdd an explicit timeout to SSH-config capture in skill install.
Line 171 performs a blocking OpenShell probe without an explicit timeout; a hung CLI call can stall the command.
♻️ Proposed patch
import { captureSandboxSshConfig } from "../../adapters/openshell/runtime"; +import { OPENSHELL_PROBE_TIMEOUT_MS } from "../../adapters/openshell/timeouts"; @@ const sshConfigResult = captureSandboxSshConfig(sandboxName, { ignoreError: true, + timeout: OPENSHELL_PROBE_TIMEOUT_MS, });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/actions/sandbox/skill-install.ts` around lines 171 - 173, The call to captureSandboxSshConfig(sandboxName, { ignoreError: true }) can block indefinitely; add an explicit timeout (e.g., 5–15s) so the skill install cannot hang: either extend the call to pass a timeout option if captureSandboxSshConfig supports it (e.g., { ignoreError: true, timeoutMs: 10000 } or an AbortSignal), or wrap the call in a Promise.race that rejects after the timeout and then handle the timeout error the same way ignoreError is handled; update the invocation in skill-install.ts (the captureSandboxSshConfig usage) to use the timeout and ensure any thrown timeout is caught/logged consistent with existing error handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/state/sandbox.ts`:
- Around line 472-474: captureSandboxSshConfigCommand is called without any
timeout which allows a hung OpenShell process to stall lifecycle operations;
update the call site that constructs result (calling
captureSandboxSshConfigCommand(openshellBinary, sandboxName, { ignoreError: true
})) to pass a probe timeout option (e.g., { ignoreError: true, timeoutMs:
<reasonable-ms> }) or use the existing timeout parameter shape accepted by
captureSandboxSshConfigCommand so the OpenShell invocation is cancelled after a
bounded time, and ensure the surrounding logic handles the timeout as an
ignorable error similar to other transient failures.
---
Nitpick comments:
In `@src/lib/actions/sandbox/skill-install.ts`:
- Around line 171-173: The call to captureSandboxSshConfig(sandboxName, {
ignoreError: true }) can block indefinitely; add an explicit timeout (e.g.,
5–15s) so the skill install cannot hang: either extend the call to pass a
timeout option if captureSandboxSshConfig supports it (e.g., { ignoreError:
true, timeoutMs: 10000 } or an AbortSignal), or wrap the call in a Promise.race
that rejects after the timeout and then handle the timeout error the same way
ignoreError is handled; update the invocation in skill-install.ts (the
captureSandboxSshConfig usage) to use the timeout and ensure any thrown timeout
is caught/logged consistent with existing error handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 78d5c329-ccd6-4763-a09c-2476c43310a4
📒 Files selected for processing (11)
docs/reference/cli-selection-guide.mdxsrc/lib/actions/sandbox/process-recovery.tssrc/lib/actions/sandbox/skill-install.tssrc/lib/adapters/openshell/client.test.tssrc/lib/adapters/openshell/client.tssrc/lib/adapters/openshell/runtime.tssrc/lib/diagnostics/debug.tssrc/lib/sandbox/version.test.tssrc/lib/sandbox/version.tssrc/lib/share-command-deps.tssrc/lib/state/sandbox.ts
✅ Files skipped from review due to trivial changes (1)
- docs/reference/cli-selection-guide.mdx
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
66a54f1 to
4b42805
Compare
|
@coderabbitai resume |
|
Resuming reviews and kicking off a fresh review now. ✅ Actions performedReviews resumed. Full review triggered. |
|
Correction/update on lifecycle E2E verification: I did attempt the CodeRabbit-requested lifecycle scripts locally with isolated state and unique sandbox names after the workflow dispatch was rejected:
Local preflight evidence: Docker is available ( |
…mmand-docs-ux Signed-off-by: Aaron Erickson <aerickson@nvidia.com> # Conflicts: # docs/reference/cli-selection-guide.mdx
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/state/sandbox.ts (1)
1376-1389: Run the targeted lifecycle E2E suites before merge.This restore-path update changes backup/restore persistence behavior; please run
state-backup-restore-e2e,snapshot-commands-e2e, andrebuild-openclaw-e2efor confidence on state integrity.As per coding guidelines:
src/lib/state/sandbox.tschanges should validatestate-backup-restore-e2e,snapshot-commands-e2e, andrebuild-openclaw-e2e.Also applies to: 1401-1406
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/state/sandbox.ts` around lines 1376 - 1389, This change around restore behavior (calling shouldPreserveOpenClawManagedExtensions, buildRestoreTarArgs and using spawnSync to run tar producing tarResult) requires running the targeted lifecycle E2E suites to validate state integrity; run state-backup-restore-e2e, snapshot-commands-e2e, and rebuild-openclaw-e2e against the branch and confirm backups/restores and snapshots behave as before (including the alternate code paths at the nearby block referenced around lines where buildRestoreTarArgs/spawnSync/tarResult are used), fix any failing assertions or state drift observed, and update tests or code (preserveManagedExtensions handling in shouldPreserveOpenClawManagedExtensions or argument construction in buildRestoreTarArgs) until all three suites pass.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/state/sandbox.ts`:
- Around line 1376-1389: This change around restore behavior (calling
shouldPreserveOpenClawManagedExtensions, buildRestoreTarArgs and using spawnSync
to run tar producing tarResult) requires running the targeted lifecycle E2E
suites to validate state integrity; run state-backup-restore-e2e,
snapshot-commands-e2e, and rebuild-openclaw-e2e against the branch and confirm
backups/restores and snapshots behave as before (including the alternate code
paths at the nearby block referenced around lines where
buildRestoreTarArgs/spawnSync/tarResult are used), fix any failing assertions or
state drift observed, and update tests or code (preserveManagedExtensions
handling in shouldPreserveOpenClawManagedExtensions or argument construction in
buildRestoreTarArgs) until all three suites pass.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7a640608-99ff-4f98-873e-4f2950559c51
📒 Files selected for processing (6)
docs/reference/cli-selection-guide.mdxsrc/lib/actions/sandbox/process-recovery.tssrc/lib/diagnostics/debug.tssrc/lib/sandbox/version.test.tssrc/lib/share-command-deps.tssrc/lib/state/sandbox.ts
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary
This PR guards NemoClaw's internal OpenShell SSH-config probes by verifying the sandbox with
openshell sandbox getfirst, avoiding the upstreamssh-configfalse-positive path for unknown sandbox names. It also updates the OpenShell command docs to use the workinglogs -nanddoctor checkforms.Related Issue
Fixes #3701
Changes
captureSandboxSshConfigCommand/runtime wrapper and use it from backup, restore, version probing, skill install, process recovery, and share-command paths.openshell sandbox getbefore collecting SSH-based internals.openshell logs --tail/openshell doctortoopenshell logs <name> -n 20/openshell doctor check.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Focused checks run:
npx vitest run src/lib/adapters/openshell/client.test.ts src/lib/sandbox/version.test.tsnpx vitest run src/lib/adapters/openshell/client.test.ts src/lib/sandbox/version.test.ts test/snapshot.test.tsnpx vitest run test/gateway-state-reconcile-2276.test.ts -t "skill install"NEMOCLAW_TEST_TIMEOUT=20000 npx vitest run src/lib/domain/sandbox/logs.test.ts test/cli.test.ts -t logsNEMOCLAW_TEST_TIMEOUT=20000 npx vitest run src/lib/share-command.test.ts test/share-command-remote-path.test.ts test/snapshot.test.ts test/repro-2201.test.ts test/rebuild-credential-preflight.test.ts test/cli.test.ts -t "connect --probe-only|recovers non-OpenClaw|share|snapshot|rebuild|2201"npx vitest run test/snapshot.test.ts test/repro-2201.test.ts test/rebuild-credential-preflight.test.tsnpx biome check src/lib/adapters/openshell/client.ts src/lib/adapters/openshell/client.test.ts src/lib/sandbox/version.test.ts src/lib/state/sandbox.ts src/lib/actions/sandbox/skill-install.tsnpm run build:clinpm run typecheck:clinpm run checksgit diff --checkcodex review -c sandbox_mode="danger-full-access" --uncommittedLifecycle E2E status:
state-backup-restore-e2e,snapshot-commands-e2e, andrebuild-openclaw-e2ewas rejected by GitHub withHTTP 403: Must have admin rights to Repository.HOME,NEMOCLAW_HOME, andNEMOCLAW_SANDBOX_NAMEvalues. Docker and OpenShell were available (docker_rc:0,openshell 0.0.39), but this shell has noNVIDIA_API_KEY; each script failed at preflight before sandbox creation:test/e2e/test-state-backup-restore.sh:ERROR: NVIDIA_API_KEY not set,state_backup_restore_rc:1test/e2e/test-snapshot-commands.sh:[FAIL] NVIDIA_API_KEY is required,snapshot_commands_rc:1test/e2e/test-rebuild-openclaw.sh:[FAIL] NVIDIA_API_KEY is required,rebuild_openclaw_rc:1Signed-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit
Documentation
openshell logs <sandbox-name> -n 20, addsopenshell doctor check, clarifies-nvs--tail, and updates lower-level debugging guidance.Bug Fixes
Tests