[WSL2][Install] install.sh does not auto-install OpenShell, aborts onboard with circular "Run the NemoClaw installer" advice #3989

@wangericnv

Description

On a fresh Windows ARM WSL2 install (no OpenShell present), running bash install.sh from the NemoClaw source tree completes [1/3] Node.js and [2/3] NemoClaw CLI successfully, then aborts at [3/3] Onboarding with:

[WARN]  Host preflight found issues that will prevent onboarding right now.
  - Install OpenShell: OpenShell is required before onboarding can create or manage a gateway.
    Run the NemoClaw installer or `scripts/install-openshell.sh`.

This advice is circular: the user is already running the NemoClaw installer (bash install.sh). The actual workaround is to run scripts/install-openshell.sh manually after install.sh completes, but the message sends the user back to the same installer that just printed it.

Either:

  • (a) install.sh should run scripts/install-openshell.sh itself as part of step [2/3] or [3/3], OR
  • (b) The error message should explicitly say "Run scripts/install-openshell.sh" and drop the "Run the NemoClaw installer" suggestion (which is the loop the user is already in).
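Option (a) could look roughly like the sketch below. It is only an illustration: `ensure_openshell` is a hypothetical helper name, not something taken from install.sh, and the `~/.local/bin` PATH handling is an assumption based on the user-local install path shown in the workaround log.

```shell
#!/usr/bin/env bash
# Sketch of option (a): make sure OpenShell exists before onboarding starts.
# ensure_openshell is a hypothetical helper; install.sh may be structured differently.

ensure_openshell() {
  local repo_root="$1"
  local installer="${repo_root}/scripts/install-openshell.sh"

  # Nothing to do when the CLI already resolves on PATH.
  if command -v openshell >/dev/null 2>&1; then
    return 0
  fi

  if [ ! -f "$installer" ]; then
    echo "[ERROR] OpenShell is missing and ${installer} was not found." >&2
    return 1
  fi

  echo "[INFO]  OpenShell not found; running scripts/install-openshell.sh"
  bash "$installer" || return 1

  # Assumption: the script installs to ~/.local/bin (as in the workaround log),
  # which may not be on PATH yet in the current shell.
  case ":${PATH}:" in
    *":${HOME}/.local/bin:"*) ;;
    *) PATH="${HOME}/.local/bin:${PATH}"; export PATH ;;
  esac

  # Succeed only if the CLI is now resolvable.
  command -v openshell >/dev/null 2>&1
}
```

Called as `ensure_openshell "$(pwd)"` from the source tree before [3/3] Onboarding, this would let the preflight find OpenShell in the same run instead of skipping onboarding.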

Environment

Device:        Windows ARM reference host (Snapdragon X laptop)
OS:            Ubuntu 24.04.4 LTS Noble Numbat inside WSL2
Architecture:  aarch64 (Snapdragon X)
Node.js:       v22.22.2
npm:           10.9.7
Docker:        29.1.3, build 29.1.3-0ubuntu3~24.04.2
OpenShell CLI: N/A (not installed at install.sh time)
NemoClaw:      v0.1.0 (main HEAD cfa817b — installed by this run)
OpenClaw:      N/A (onboard never completed)

Steps to Reproduce

  1. Fresh WSL2 Ubuntu-24.04 (no NemoClaw, no OpenShell, no nemoclaw artifacts in ~/.local/bin or ~/.nemoclaw).
  2. Clone NemoClaw main HEAD:
    git clone --depth 1 https://github.com/NVIDIA/NemoClaw.git
  3. Run install.sh:
    cd NemoClaw
    NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1 NEMOCLAW_NON_INTERACTIVE=1 \
      bash install.sh --yes-i-accept-third-party-software
  4. install.sh completes [1/3] Node.js and [2/3] NemoClaw CLI successfully. At [3/3] Onboarding, observe the error.

Expected Result

Either:

  • (a) install.sh runs scripts/install-openshell.sh as part of its bootstrap so the user gets OpenShell installed in the same flow, OR
  • (b) install.sh's "OpenShell missing" error says exactly which script to run, without suggesting the installer the user is already running.

In either case the install.sh flow should be self-sufficient on a fresh box (Node → NemoClaw CLI → OpenShell → onboard). Today it stops one step short and leaves a confusing breadcrumb.
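For option (b), one possible shape is to make the hint depend on whether the preflight was invoked from install.sh. This is a sketch under assumptions: `print_openshell_hint` and the `NEMOCLAW_INSTALLER_RUNNING` variable are hypothetical names used for illustration, and the real preflight message may be generated elsewhere in the CLI.

```shell
#!/usr/bin/env bash
# Sketch of option (b): avoid pointing the user at the installer they are already in.
# Both the function name and NEMOCLAW_INSTALLER_RUNNING are hypothetical.

print_openshell_hint() {
  echo "  - Install OpenShell: OpenShell is required before onboarding can create or manage a gateway."
  if [ "${NEMOCLAW_INSTALLER_RUNNING:-0}" = "1" ]; then
    # Inside install.sh: name the one script that actually fixes the problem.
    echo "    Run \`scripts/install-openshell.sh\` from the NemoClaw source tree, then \`nemoclaw onboard\`."
  else
    # Outside the installer the original wording is not circular.
    echo "    Run the NemoClaw installer or \`scripts/install-openshell.sh\`."
  fi
}
```

With this split, install.sh would export the variable before calling the onboard step, and a standalone `nemoclaw onboard` would keep the existing message.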

Actual Result

install.sh aborts at [3/3] Onboarding with:

[3/3] Onboarding
──────────────────────────────────────────────────
Detected container runtime: docker
Running under WSL
[WARN]  Host preflight found issues that will prevent onboarding right now.
  - Install OpenShell: OpenShell is required before onboarding can create or manage a gateway.
    Run the NemoClaw installer or `scripts/install-openshell.sh`.
  - Install NVIDIA Container Toolkit and generate CDI device specs: …
    …
    nemoclaw onboard      # or rerun with --no-gpu to skip GPU passthrough
[WARN]  Skipping onboarding until the host prerequisites above are fixed.
[INFO]  === Installation complete ===

NemoClaw  (30s)

NemoClaw CLI is installed.
Onboarding did not run.

To finish setup, run:
$ source /home/lab/.bashrc
$ nemoclaw onboard

User experience:

  • nemoclaw onboard still fails because OpenShell is genuinely not installed.
  • User has to discover separately that the fix is bash ~/NemoClaw-main/scripts/install-openshell.sh, not "re-run install.sh".
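A quick way to tell "not installed" apart from "installed but not on PATH" before retrying nemoclaw onboard is sketched below. `openshell_status` is a hypothetical helper, and the `~/.local/bin/openshell` location is assumed from the install-openshell.sh output in the Logs section.

```shell
#!/usr/bin/env bash
# Sketch: report where (or whether) the openshell CLI can be found.
# openshell_status is a hypothetical helper, not part of NemoClaw.

openshell_status() {
  if command -v openshell >/dev/null 2>&1; then
    echo "on-path: $(command -v openshell)"
  elif [ -x "${HOME}/.local/bin/openshell" ]; then
    # Assumed user-local install location; needs ~/.local/bin on PATH.
    echo "installed-not-on-path: ${HOME}/.local/bin/openshell"
  else
    echo "missing"
  fi
}
```

In the state described in this report the helper would print `missing`, which is the case where scripts/install-openshell.sh still has to be run.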

Logs

=== ~/ARM64_phase2.log excerpt — install.sh [3/3] Onboarding ===
[3/3] Onboarding
  ──────────────────────────────────────────────────

  Detected container runtime: docker
  Running under WSL
[WARN]  Host preflight found issues that will prevent onboarding right now.
  - Install OpenShell: OpenShell is required before onboarding can create or manage a gateway.
    Run the NemoClaw installer or `scripts/install-openshell.sh`.
  - Install NVIDIA Container Toolkit and generate CDI device specs: Docker is configured for CDI device injection (CDISpecDirs is set) but the `nvidia-container-toolkit` package (which provides `nvidia-ctk`) is not installed on the host. OpenShell's `gateway start --gpu` will fail with `unresolvable CDI devices nvidia.com/gpu=all` until the toolkit is installed and a CDI spec is generated.
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
    …
    nemoclaw onboard      # or rerun with --no-gpu to skip GPU passthrough
[WARN]  Skipping onboarding until the host prerequisites above are fixed.
[INFO]  === Installation complete ===

  NemoClaw  (30s)

  NemoClaw CLI is installed.
  Onboarding did not run.

  To finish setup, run:
  $ source /home/lab/.bashrc
  $ nemoclaw onboard

=== workaround (manually run install-openshell.sh) ===
$ cd ~/NemoClaw-main
$ bash scripts/install-openshell.sh
[install] Detected Linux (aarch64)
[install] Installing OpenShell from release 'v0.0.39'...
[install] Verifying SHA-256 checksum...
[install] Installed openshell to /home/lab/.local/bin/openshell (user-local path)
[install] openshell 0.0.39 installed

# Now nemoclaw onboard works (after also setting NEMOCLAW_SANDBOX_GPU=0 + Ollama systemd override; see NVB#6199809, NVB#6199755).

Related

Possibly a regression of NVB#6000851 / GH#708 (Closed Fixed 2026-04-16), which described "nemoclaw onboard will automatically install openshell in the PATH ~/.local/bin/openshell", suggesting that auto-install was working at that time. Today, on the Windows ARM reference host at main HEAD cfa817b, install.sh's onboard step does not auto-install OpenShell and instead aborts with the circular-advice error.

Discovered during ARM64 validation for PR #3925 QA, but not introduced by that PR: it reproduces on main HEAD cfa817b.

This is the fifth bug in the Windows ARM reference host install-experience series I'm filing today:

  • NVB#6198894 (PR-induced regression on Spark+Station ARM64)
  • NVB#6199735 (rebuild destroys --no-gpu sandbox)
  • NVB#6199755 (openshell-docker-gateway idle shutdown)
  • NVB#6199809 (preflight false-positive Snapdragon iGPU as NVIDIA GPU)
  • this one

NVB#6199869

Metadata

Labels

  • NV QA (bugs found by the NVIDIA QA Team)
  • area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
  • platform: wsl (affects Windows Subsystem for Linux)
