[WSL2][Onboard] install exits 1 after start-windows-ollama / install-windows-ollama provider handoff succeeds #4208

@wangericnv

Description

With NEMOCLAW_PROVIDER=start-windows-ollama or NEMOCLAW_PROVIDER=install-windows-ollama in a non-interactive install on WSL2 Ubuntu-24.04 with Docker Desktop, the Ollama-on-Windows provider handoff succeeds: Ollama is installed on the Windows host, started via PowerShell WSL interop, and reachable from WSL at host.docker.internal:11434. The overall install still exits with code 1. No nemoclaw binary is installed, no sandbox is registered, and onboard never reaches [8/8] Deployment verified. No clear error is surfaced.

Environment

Device:        Win 11 Enterprise host with WSL2 + Docker Desktop (2u2g-gen-0689 / 10.57.210.126)
OS:            Ubuntu 24.04.4 LTS in WSL2; Windows 11 Enterprise Build 26100 host
Architecture:  x86_64
Node.js:       (installed by NemoClaw v0.0.50 installer during run — version not captured because install failed before final report)
npm:           (same as above)
Docker:        Docker version 29.5.2 build 79eb04c (Docker Desktop WSL integration)
OpenShell CLI: openshell 0.0.44 (installed during run)
NemoClaw:      v0.0.50 (install attempted; binary NOT ultimately installed due to exit 1)
OpenClaw:      N/A (onboard not completed)

Steps to Reproduce

  1. Fresh WSL2 Ubuntu-24.04 with Docker Desktop integration. Confirm host.docker.internal resolves in WSL (add manually to /etc/hosts pointing at default gateway IP if Docker Desktop didn't auto-configure it).

  2. Put the Windows host into the precondition for the case under test.

    For start-windows-ollama (case 587897), with Ollama installed on Windows but stopped:

    powershell.exe -Command "Get-Process ollama | Stop-Process -Force; Get-Service Ollama | Stop-Service -Force"

    For install-windows-ollama (case 590915), with Ollama NOT installed on Windows:

    winget uninstall --id Ollama.Ollama --silent

  3. Wipe any prior NemoClaw state in WSL:

    rm -rf ~/.nemoclaw ~/.local/state/nemoclaw ~/.config/openshell ~/.npm-global
    rm -f ~/.local/bin/nemoclaw ~/.local/bin/openshell*
    docker images --format '{{.Repository}}:{{.Tag}}' | grep -E 'openshell|nemoclaw' | xargs -r docker rmi -f

  4. Run inside WSL (substitute the relevant provider env var):

    NEMOCLAW_INSTALL_TAG=v0.0.50 NEMOCLAW_PROVIDER=start-windows-ollama \
      NEMOCLAW_NON_INTERACTIVE=1 NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1 \
      bash -c 'curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash' > ~/install.log 2>&1
    echo "EXIT=$?"

  5. Inspect outcome:

    which nemoclaw
    nemoclaw list
    curl -fsS --max-time 5 http://host.docker.internal:11434/api/tags

Expected Result

  • Install exits 0
  • nemoclaw binary at ~/.local/bin/nemoclaw
  • Onboard reports [8/8] Deployment verified — gateway and dashboard are healthy
  • nemoclaw list shows a sandbox with provider=ollama-local
  • Ollama remains running on the Windows host
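
The file, sandbox, and reachability expectations above can be checked in one pass once the installer returns. The following is a minimal sketch, not part of NemoClaw: `check`, `mentions_ollama_local`, `sandbox_uses_ollama_local`, and `check_install` are hypothetical helper names, and matching on the string `ollama-local` assumes that `nemoclaw list` prints the provider name somewhere in its output (the exact output format is not shown in this report). The exit code and the [8/8] line still have to be read from the install log.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the expected results; not part of NemoClaw.

# Run a command and print PASS or FAIL. $1 = label, remaining args = command.
check() {
  local label=$1
  shift
  if "$@"; then
    printf 'PASS %s\n' "$label"
  else
    printf 'FAIL %s\n' "$label"
  fi
}

# Succeed when the text on stdin mentions the ollama-local provider.
# Assumption: `nemoclaw list` prints the provider name verbatim.
mentions_ollama_local() {
  grep -q 'ollama-local'
}

# Succeed when `nemoclaw list` reports a sandbox using ollama-local.
sandbox_uses_ollama_local() {
  nemoclaw list 2>/dev/null | mentions_ollama_local
}

# Run the checks from step 5 against the live system.
check_install() {
  check "nemoclaw binary at ~/.local/bin/nemoclaw" test -x "$HOME/.local/bin/nemoclaw"
  check "sandbox with provider=ollama-local" sandbox_uses_ollama_local
  check "Ollama reachable on host.docker.internal:11434" \
    curl -fsS --max-time 5 -o /dev/null http://host.docker.internal:11434/api/tags
}
```

Calling `check_install` after step 4 prints one PASS or FAIL line per expectation, so a run like the ones reported here would show three FAIL lines.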

Actual Result

The provider handoff succeeds — captured output during install shows:

[3/8] Configuring inference provider
[non-interactive] Provider: start-windows-ollama
Starting Ollama on Windows host via WSL interop...
Waiting for Ollama to respond on host.docker.internal...
✓ Using Ollama on host.docker.internal:11434
Loading Ollama model: qwen2.5:7b

Or for install-windows-ollama:

✓ Installed: C:\Users\<user>\AppData\Local\Programs\Ollama\ollama.exe
✓ Using Ollama on host.docker.internal:11434
Pulling Ollama model 'qwen2.5:7b' (4.36 GB).
Pulling Ollama model: qwen2.5:7b

Then install exits with EXIT=1. Afterward:

$ which nemoclaw            -> (nothing)
$ nemoclaw list             -> bash: nemoclaw: command not found
$ curl http://host.docker.internal:11434/api/tags -> Failed to connect

No clear error message appears in the captured output before the exit; model-pull progress dominates the tail of install.log.

Logs

Reproduced 4 times across cases 587897 and 590915 (same INSTALL_EXIT=1 pattern in every run):

Run 1 (case 587897, before host.docker.internal fix):

Starting Ollama on Windows host via WSL interop...
Waiting for Ollama to respond on host.docker.internal...
PowerShell launch via verified ollama.exe failed: Ollama did not become reachable
PowerShell launch via refreshed Windows PATH failed: Ollama did not become reachable
Timed out waiting for Ollama to start on the Windows host.
-> install exit 1 (root cause: host.docker.internal not resolving in WSL)

Run 2 (case 590915, before fix): similar timeout pattern.

After manually adding host.docker.internal -> WSL default gateway in /etc/hosts:

Run 3 (case 587897 retry):

[1/8] Preflight checks                  OK
[2/8] Starting OpenShell gateway        OK
[3/8] Configuring inference provider
  [non-interactive] Provider: start-windows-ollama
  Starting Ollama on Windows host via WSL interop...
  Waiting for Ollama to respond on host.docker.internal...
  ✓ Using Ollama on host.docker.internal:11434
  Loading Ollama model: qwen2.5:7b
  -> install exit 1 (no further error message captured)

Run 4 (case 590915 retry):

✓ Installed: C:\Users\local-mercl\AppData\Local\Programs\Ollama\ollama.exe
✓ Using Ollama on host.docker.internal:11434
Pulling Ollama model 'qwen2.5:7b' (4.36 GB).
Pulling Ollama model: qwen2.5:7b
-> install exit 1 (no further error message captured)

The actual error after model load/pull was not captured because the install log's tail was dominated by model-pull progress lines. The tail -150 / tail -300 captures of /tmp/install_*.log all cut off before reaching whatever line caused the non-zero exit. Reproducing with full log capture (no tail truncation) is needed to surface the exact failing step.
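
One way to get past the progress noise is to keep the whole log and filter the progress lines out when reading it, rather than relying on a fixed-size tail. The sketch below is an assumption-laden illustration, not part of NemoClaw: `strip_pull_progress` and `show_failure_context` are hypothetical helper names, and the filter assumes that pull-progress updates contain a percentage (for example "42%") and may be separated by carriage returns. Any non-progress line that happens to contain a percentage would also be dropped.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading an untruncated install log; not part of NemoClaw.

# Drop model-pull progress from the text on stdin.
# Assumption: progress updates contain a percentage such as "42%" and may be
# separated by carriage returns rather than newlines.
strip_pull_progress() {
  tr '\r' '\n' | grep -Ev '[0-9]+%' || true   # grep exits 1 when nothing is left
}

# Print the last N non-progress lines of a log. $1 = log path, $2 = N (default 40).
show_failure_context() {
  strip_pull_progress < "$1" | tail -n "${2:-40}"
}
```

With the log redirected to ~/install.log as in step 4, `show_failure_context ~/install.log` prints the final lines that are not progress updates, which is where the failing step would be expected to appear.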

Related observations

  • Same failure mode hit on both start-windows-ollama (case 587897) and install-windows-ollama (case 590915) — likely the same root cause in a step AFTER [3/8] provider config (possibly [6/8] Creating sandbox or a guard that checks Ollama state after model load).
  • host.docker.internal does NOT auto-resolve in this WSL distro even though Docker Desktop WSL integration is enabled; it had to be added manually to /etc/hosts, pointing at the default gateway. This may indicate a wider Docker Desktop integration gap on certain Win11 builds. The NemoClaw install path could surface a clearer error when the host is unreachable: it currently shows "PowerShell launch failed: Ollama did not become reachable" without naming host.docker.internal.
  • Bug filed alongside this NVBug: NVB#6217051 (cross-version upgrade workspace loss, also v0.0.50) — separate issue, different code path, but both filed against the same v0.0.50 testing cycle.
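
The manual /etc/hosts workaround from the host.docker.internal observation can be scripted so that each repro starts from the same state. This is a minimal sketch under stated assumptions, not an official fix: `parse_default_gateway`, `hosts_entry`, and `ensure_host_docker_internal` are hypothetical helper names, and it assumes, as in this report, that the WSL default gateway is the correct address for the Windows host, which may not hold in every WSL networking configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the host.docker.internal workaround; not part of NemoClaw.

# Extract the default gateway IP from `ip route show default` output on stdin.
parse_default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Format an /etc/hosts entry for host.docker.internal. $1 = IP address.
hosts_entry() {
  printf '%s host.docker.internal\n' "$1"
}

# Append the entry only when the name does not already resolve (needs sudo).
ensure_host_docker_internal() {
  if getent hosts host.docker.internal >/dev/null; then
    return 0
  fi
  local gw
  gw=$(ip route show default | parse_default_gateway)
  if [ -z "$gw" ]; then
    echo "no default gateway found; cannot map host.docker.internal" >&2
    return 1
  fi
  hosts_entry "$gw" | sudo tee -a /etc/hosts >/dev/null
}
```

Running `ensure_host_docker_internal` before step 4 leaves /etc/hosts untouched when the name already resolves and otherwise appends a single line.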

NVB#6220627

Metadata

Labels

  • NV QA (Bugs found by the NVIDIA QA Team)
  • platform: wsl (Affects Windows Subsystem for Linux)
  • provider: ollama (Ollama local model provider behavior)
  • v0.0.54 (Release target)
