[WSL2 x86_64][CLI&UX] After nemoclaw upgrade, sandbox openclaw stays old; status silent on drift #4429

@wangericnv

Description

DevTest case T642364 covers host-CLI / sandbox-openclaw version alignment after an in-place host upgrade. In v0.0.53, three real product gaps surface when you upgrade nemoclaw v0.0.50 → v0.0.53 against an existing sandbox:

  1. nemoclaw <name> status is SILENT about host/sandbox version drift — case spec explicitly states "never silent". The status output's Agent: OpenClaw v2026.5.22 line reports the bundled host-side metadata, not what is actually running inside the sandbox (verified via docker exec: sandbox is still OpenClaw 2026.5.18, the v0.0.50 era).

  2. The remediation command nemoclaw <name> start (recommended in the case spec's example actionable line) does NOT exist in v0.0.53. Valid actions are: channels, config, connect, dashboard-url, destroy, doctor, exec, gateway-token, hosts-*, logs, policy-*, rebuild, recover, share, shields, skill, snapshot, status — no start.

  3. The upgrade installer prints "All sandboxes are up to date." right after upgrading the host CLI, even though the sandbox's openclaw is verifiably still at the old version (2026.5.18) and the host now bundles 2026.5.22. The installer's sandbox upgrade check does not detect the mismatch.

Combined effect: a user who upgrades nemoclaw will see a green "installation complete" + "all sandboxes up to date", with no signal that their sandbox is actually behind the host, and no documented command to realign.

Environment

Device:        2u2g-gen-0689 (RTX 5090, 32607 MB GPU, 128 GB RAM)
OS:            Microsoft Windows 11 Enterprise (build 10.0.26100.0)
Architecture:  x86_64
Node.js:       v22.22.3
npm:           10.9.8
Docker:        29.5.2 (native docker-ce in WSL2)
OpenShell CLI: openshell 0.0.44
NemoClaw:      v0.0.50 → v0.0.53 (in-place upgrade via curl|bash)
OpenClaw:      2026.5.18 (sandbox built at v0.0.50) vs 2026.5.22 (host bundle after upgrade)
WSL:           Ubuntu 24.04.4 LTS x86_64

Steps to Reproduce

  1. Install NemoClaw v0.0.50:

    rm -f /home/lab/.local/bin/nemoclaw
    NEMOCLAW_NON_INTERACTIVE=1 NEMOCLAW_NO_EXPRESS=1 \
      NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1 \
      NEMOCLAW_INSTALL_TAG=v0.0.50 \
      bash <(curl -fsSL https://www.nvidia.com/nemoclaw.sh)
    nemoclaw --version    # → nemoclaw v0.0.50

  2. Onboard a sandbox at v0.0.50 using NVIDIA Endpoints:

    export NVIDIA_API_KEY=...
    export NEMOCLAW_PROVIDER=build
    export NEMOCLAW_SANDBOX_NAME=t364-vold
    nemoclaw onboard --fresh --non-interactive \
      --yes-i-accept-third-party-software

  3. Capture V_old:

    nemoclaw --version
    # nemoclaw v0.0.50
    CONTAINER=$(docker ps --filter "name=t364-vold" --format '{{.Names}}' | head -1)
    docker exec "$CONTAINER" openclaw --version
    # OpenClaw 2026.5.18 (50a2481)

  4. Upgrade nemoclaw to v0.0.53 in place:

    NEMOCLAW_NON_INTERACTIVE=1 NEMOCLAW_NO_EXPRESS=1 \
      NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1 \
      bash <(curl -fsSL https://www.nvidia.com/nemoclaw.sh)
    nemoclaw --version    # → nemoclaw v0.0.53

  5. Observe the upgrade installer's final messages.

  6. Run: nemoclaw t364-vold status

  7. Read actual sandbox openclaw version:

    docker exec "$CONTAINER" openclaw --version

  8. Try recommended remediation: nemoclaw t364-vold start

Expected Result (per T642364 spec)

  • Step 5: upgrade installer correctly identifies that sandbox openclaw is behind host nemoclaw and surfaces it (either warns or auto-upgrades).
  • Step 6: nemoclaw t364-vold status detects drift and prints an actionable line, e.g. "host nemoclaw v0.0.53 vs sandbox openclaw 2026.5.18 — restart with nemoclaw t364-vold start". Never silent.
  • Step 7: Pre-restart, sandbox openclaw is still 2026.5.18 (drift confirmed).
  • Step 8: The recommended remediation runs cleanly, sandbox openclaw becomes 2026.5.22, and status no longer warns.

Actual Result

Step 5 (upgrade installer):

[INFO]  Checking for sandboxes that need upgrading…
All sandboxes are up to date.
[INFO]  === Installation complete ===
NemoClaw  (58s)

← Claim "all sandboxes up to date" is wrong. Sandbox is still at 2026.5.18.

Step 6 (status output, abridged):

Sandbox: t364-vold
  Model:    nvidia/nemotron-3-super-120b-a12b
  Provider: nvidia-prod
  Inference: healthy (https://integrate.api.nvidia.com/v1/models)
  Host GPU: yes
  Sandbox GPU: enabled (auto)
  OpenShell: 0.0.44 (docker)
  Policies: npm, pypi, huggingface, brew, openclaw-pricing
  Connected: no
  Permissions: not configured (default mutable state)
  Agent:    OpenClaw v2026.5.22       ← reports host-bundled version, not
                                        the actually-running 2026.5.18

No drift warning anywhere in the full status output (full 7521-byte dump searched for /drift/i, /mismatch/i, /update/i, /outdated/i — zero hits outside of unrelated Resource version: 9 and policy version: 5).
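For anyone re-running the search, a minimal sketch of the keyword scan is below. It assumes the status output has been saved to a file such as status_post_upgrade.txt; the function name scan_status_for_drift is hypothetical and only the four patterns listed above are used.

```shell
# Print status lines (with line numbers) that mention any drift-related keyword.
# Reads the status dump on stdin; grep exits 1 when there are no hits.
scan_status_for_drift() {
  grep -Ein 'drift|mismatch|update|outdated'
}

# Example (not executed here):
#   scan_status_for_drift < status_post_upgrade.txt
```

An empty result with exit status 1 corresponds to the "zero hits" observation in this report.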

Step 7 (actual sandbox state):

$ docker exec openshell-t364-vold-... openclaw --version
OpenClaw 2026.5.18 (50a2481)        ← Sandbox is still v0.0.50 era.

So status's Agent: OpenClaw v2026.5.22 is misleading — it is the host-side metadata, not what the sandbox container actually has.
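The comparison between the version status reports and the version the container runs can be sketched as follows. This is only an illustration of the manual check done in steps 6 and 7, not NemoClaw code: the function names extract_openclaw_version and check_reported_vs_running are hypothetical, and the parsing assumes the two output formats shown in this report.

```shell
# Extract the bare OpenClaw version (e.g. 2026.5.18) from a line of text.
# Handles "OpenClaw 2026.5.18 (50a2481)" and "Agent:    OpenClaw v2026.5.22".
extract_openclaw_version() {
  printf '%s\n' "$1" \
    | sed -nE 's/.*OpenClaw v?([0-9]+(\.[0-9]+)+).*/\1/p' \
    | head -n 1
}

# Compare what status reports with what the sandbox actually runs.
# $1: the "Agent:" line from status; $2: output of `openclaw --version` in the sandbox.
# Exit status: 0 aligned, 1 drift, 2 could not parse one of the inputs.
check_reported_vs_running() {
  reported=$(extract_openclaw_version "$1")
  running=$(extract_openclaw_version "$2")
  if [ -z "$reported" ] || [ -z "$running" ]; then
    echo "unknown"
    return 2
  fi
  if [ "$reported" = "$running" ]; then
    echo "aligned $running"
  else
    echo "drift reported=$reported running=$running"
    return 1
  fi
}

# Example wiring with the commands from the repro steps (not executed here):
#   agent_line=$(nemoclaw t364-vold status | grep 'Agent:')
#   running=$(docker exec "$CONTAINER" openclaw --version)
#   check_reported_vs_running "$agent_line" "$running"
```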

Step 8 (try remediation):

$ nemoclaw t364-vold start
  Unknown action: start
  Valid actions: channels, config, connect, dashboard-url, destroy,
  doctor, exec, gateway-token, hosts-add, hosts-list, hosts-remove,
  logs, policy-add, policy-list, policy-remove, rebuild, recover,
  share, shields, skill, snapshot, status

start is not a valid command in v0.0.53. The case spec's example actionable remediation does not exist. The closest equivalents in the action list are rebuild and recover, but status does not surface either of them.
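A test harness could guard against recommending a nonexistent action by checking the "Valid actions" text before asserting on a remediation. The helper below is a sketch under that assumption; has_action is a hypothetical name, and it parses the error text in the format shown above rather than any documented NemoClaw interface.

```shell
# Succeed when action $1 appears in the "Valid actions:" text passed as $2.
has_action() {
  printf '%s\n' "$2" \
    | sed 's/Valid actions://' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -qxF -- "$1"
}

# Example (not executed here):
#   msg=$(nemoclaw t364-vold start 2>&1)
#   has_action rebuild "$msg" && echo "rebuild is available"
```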

Logs

Full session in /home/lab/eric-test-session/v053-wsl/T642364/:

  • downgrade.log: confirms v0.0.50 install
  • onboard_vold.log: confirms sandbox built at v0.0.50, openclaw 2026.5.18
  • upgrade.log: shows "All sandboxes are up to date" claim post-upgrade
  • status_post_upgrade.txt: full status output, no drift warning

Reproducible: this is structural, not flaky. The same observations should hold on any v0.0.50→v0.0.53 upgrade with an existing sandbox.

Suggested fix paths

  • (a) Restore drift detection in the status command: compare the host-bundled openclaw version to the openclaw actually running in each sandbox (via docker exec or a sandbox-state file). When drift is detected, print a clear, actionable line.
  • (b) Make the upgrade installer's "Checking for sandboxes" check compare against the actual sandbox-container openclaw version, not just metadata. If drift is detected, either auto-upgrade or print a clear "sandboxes <name> need upgrade — run …" message.
  • (c) Add a nemoclaw <name> start (or rename) command to do the restart-with-realignment that the case spec describes, or document rebuild / recover as the actual remediation in the status drift warning.
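To make the suggestions above concrete, here is a rough sketch of the comparison and the warning line. It is not the NemoClaw implementation: version_lt and drift_warning are hypothetical names, the versions are assumed to have three dotted numeric fields, and pointing the user at rebuild is an assumption, since this report has not verified that rebuild (or recover) actually realigns the sandbox.

```shell
# Succeed when version $1 is strictly older than version $2.
# Assumes dotted numeric versions with three fields, e.g. 2026.5.18.
version_lt() {
  if [ "$1" = "$2" ]; then
    return 1
  fi
  oldest=$(printf '%s\n%s\n' "$1" "$2" \
    | sort -t. -k1,1n -k2,2n -k3,3n \
    | head -n 1)
  [ "$oldest" = "$1" ]
}

# Print a one-line drift warning, or nothing when the sandbox is not behind.
# $1 sandbox name, $2 host nemoclaw version,
# $3 host-bundled openclaw version, $4 openclaw version running in the sandbox.
# The "rebuild" remediation in the message is an unverified assumption.
drift_warning() {
  if version_lt "$4" "$3"; then
    printf 'host nemoclaw %s bundles openclaw %s but sandbox %s runs openclaw %s; run: nemoclaw %s rebuild\n' \
      "$2" "$3" "$1" "$4" "$1"
  fi
}

# Example with the values from this report:
#   drift_warning t364-vold v0.0.53 2026.5.22 2026.5.18
```

Comparing numerically per field rather than as strings matters once a component reaches two digits, for example 2026.5.9 versus 2026.5.18.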

NVB#6236086

Metadata

Assignees

No one assigned

    Labels

    NV QA (Bugs found by the NVIDIA QA Team)
    area: cli (Command line interface, flags, terminal UX, or output)
    area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
    platform: wsl (Affects Windows Subsystem for Linux)
