Skip to content

fix(upgrade): pin base image via pull-then-inspect and rebuild stale sandboxes#1943

Merged
ericksoa merged 4 commits into
mainfrom
fix/digest-pin-v2b
Apr 16, 2026
Merged

fix(upgrade): pin base image via pull-then-inspect and rebuild stale sandboxes#1943
ericksoa merged 4 commits into
mainfrom
fix/digest-pin-v2b

Conversation

@ericksoa

@ericksoa ericksoa commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Re-implements digest pinning and upgrade-sandboxes reverted in fix: revert base image digest pinning that broke all e2e tests #1938
  • Fixes the root cause: original PR read a digest from blueprint.yaml (belongs to ghcr.io/nvidia/openshell-community) and applied it to ghcr.io/nvidia/nemoclaw/sandbox-base — a different registry. Docker digests are repo-specific, so every sandbox creation failed with "manifest unknown"
  • New approach: pullAndResolveBaseImageDigest() pulls sandbox-base:latest from GHCR, then docker inspect extracts the actual repo digest. The digest is self-consistent by construction — it always comes from the same registry we pin to
  • Falls back to unpinned :latest when GHCR is unreachable (offline/firewall users)
  • Re-adds nemoclaw upgrade-sandboxes command with --check/--auto/--yes flags and fixes from CodeRabbit review (sandbox list error handling, throwOnError for batch rebuilds, --auto in install.sh)

Closes #1904

Test plan

Summary by CodeRabbit

  • New Features

    • Added a CLI command to detect and batch-upgrade stale sandboxes (report-only and auto/yes modes).
    • Installer now performs a post-onboarding scan and can trigger non-blocking sandbox upgrades.
  • Improvements

    • Sandbox base images are pinned to resolved digests when available for more reproducible builds.
    • Rebuild flow updated to continue processing multiple sandboxes and surface per-sandbox failures without aborting the whole run.
  • Tests

    • New unit/integration and end-to-end tests covering detection, reporting, upgrade, and rebuild workflows.
  • Chores

    • Nightly workflow extended with new E2E jobs; CI failure artifacts and notifications updated.

@coderabbitai

coderabbitai Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

After onboarding, the installer may run nemoclaw upgrade-sandboxes --auto when sandboxes exist. The CLI adds upgrade-sandboxes to detect and rebuild running stale sandboxes. Sandbox creation now attempts to pull ghcr.io/nvidia/nemoclaw/sandbox-base:latest and pins staged Dockerfiles to the resolved image digest when available.

Changes

Cohort / File(s) Summary
Installer Post-Onboard Logic
scripts/install.sh
Runs nemoclaw upgrade-sandboxes --auto after successful onboarding when _has_sandboxes > 0 and nemoclaw is available; failures are warned, not fatal.
Onboard: Base Image Pinning
src/lib/onboard.ts
Adds SANDBOX_BASE_IMAGE/SANDBOX_BASE_TAG and pullAndResolveBaseImageDigest(); patchStagedDockerfile(..., baseImageRef) can pin ARG BASE_IMAGE to a resolved @sha256: ref; createSandbox() resolves/pulls and passes baseImageRef, with warnings and fallback if resolution fails.
CLI: Upgrade Flow & Rebuilds
src/nemoclaw.ts
Adds upgrade-sandboxes command (--check, --auto/--yes); discovers registry & live sandboxes via openshell, classifies stale vs unknown, and sequentially rebuilds running stale sandboxes; sandboxRebuild gains opts.throwOnError and bail() to avoid global exits during batch runs.
Unit & Integration Tests
test/cli.test.ts, test/onboard.test.ts
Adds tests for upgrade-sandboxes --check (stale vs up-to-date) and regression tests validating Dockerfile ARG BASE_IMAGE digest pinning, null-ref/no-ARG behavior, non-overwrite of custom BASE_IMAGE, and that pull/resolve is invoked before patching.
E2E: Nightly Workflow & Script
.github/workflows/nightly-e2e.yaml, test/e2e/test-upgrade-stale-sandbox.sh
Adds nightly E2E jobs shields-config-e2e and upgrade-stale-sandbox-e2e; new E2E script reproduces Issue #1904 by creating an old sandbox, marking onboarding complete, running upgrade-sandboxes --check, rebuilding, and verifying OpenClaw version update.
Misc Test Config
.gitleaksignore
Ignores a specific generic-api-key finding from an e2e test script.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Installer as "Install Script"
    participant CLI as "nemoclaw CLI"
    participant Manager as "upgradeSandboxes"
    participant Registry as "Sandboxes Registry"
    participant Openshell as "openshell"
    participant Rebuilder as "sandboxRebuild"
    participant Docker as "Docker Engine"

    User->>Installer: run installer (post-onboard)
    Installer->>CLI: "nemoclaw upgrade-sandboxes --auto"
    CLI->>Manager: start upgrade-sandboxes
    Manager->>Registry: list registered sandboxes
    Manager->>Openshell: "openshell sandbox list"
    Openshell-->>Manager: running sandbox names
    Manager->>Manager: compute stale vs current agentVersion
    loop each stale running sandbox
        Manager->>Rebuilder: sandboxRebuild(name, ["--yes"], {throwOnError:true})
        Rebuilder->>Docker: build/pull base image / create image
        Docker-->>Rebuilder: build result / image digest
        Rebuilder-->>Manager: success/failure
    end
    Manager-->>CLI: summary (counts)
    CLI-->>User: output + exit
Loading
sequenceDiagram
    participant createSandbox as "createSandbox()"
    participant DockerPull as "docker pull"
    participant DockerInspect as "docker inspect"
    participant Patch as "patchStagedDockerfile()"
    participant Filesystem as "Staged Dockerfile"

    createSandbox->>DockerPull: pull ghcr.io/nvidia/nemoclaw/sandbox-base:latest
    alt pull & inspect succeed
        DockerPull-->>DockerInspect: inspect for RepoDigests
        DockerInspect-->>createSandbox: repo digest (ghcr.io/...@sha256:...)
        createSandbox->>Patch: patchStagedDockerfile(..., baseImageRef=ghcr.io/...@sha256:...)
        Patch->>Filesystem: rewrite ARG BASE_IMAGE to `@sha256:...`
        Filesystem-->>Patch: patched Dockerfile
    else pull/inspect fails
        createSandbox->>createSandbox: warn and fall back to :latest
        createSandbox->>Patch: patchStagedDockerfile(..., baseImageRef=null)
        Patch->>Filesystem: leave ARG BASE_IMAGE unchanged
    end
    Patch-->>createSandbox: staged Dockerfile ready
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • cv

"I hopped through logs in morning light,
Pinned base images so builds take flight.
Stale sandboxes I nudge and mend,
Rebuilds and retries until the end.
Fresh OpenClaw now spins — hooray! 🐇🥕"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main changes: implementing digest pinning via pull-then-inspect and restoring sandbox upgrade functionality.
Linked Issues check ✅ Passed The PR addresses all requirements from issue #1904: detects stale sandboxes, provides rebuild mechanism via upgrade-sandboxes command, implements digest pinning via pull-then-inspect, and handles offline scenarios with fallback behavior.
Out of Scope Changes check ✅ Passed All changes align with PR objectives: base image pinning logic, upgrade-sandboxes command restoration, E2E testing, CI workflow updates, and supporting infrastructure changes are all directly scoped to issue #1904.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/digest-pin-v2b

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
test/cli.test.ts (1)

2199-2313: Tighten stale/current assertions to prevent false positives.

Both new tests assert expected substrings, but neither asserts the opposite state is absent. Adding negative assertions will make regressions much harder to slip through.

Suggested assertion hardening
       expect(r.code).toBe(0);
       // Should report the stale sandbox with version info
       expect(r.out).toContain("my-agent");
       expect(r.out).toContain("2026.3.11");
       expect(r.out).toMatch(/stale|need upgrading/i);
+      expect(r.out).not.toMatch(/up to date/i);
@@
       const r = runWithEnv("upgrade-sandboxes --check 2>&1", {
         HOME: home,
         PATH: `${localBin}:${process.env.PATH || ""}`,
       });

       expect(r.code).toBe(0);
       expect(r.out).toContain("up to date");
+      expect(r.out).not.toMatch(/stale|need upgrading/i);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/cli.test.ts` around lines 2199 - 2313, The tests for "upgrade-sandboxes
--check" should add negative assertions to avoid false positives: in the first
test (the one expecting a stale sandbox) assert that r.out does NOT contain the
"up to date" message (or equivalent string), and in the second test (the one
expecting all-current) assert that r.out does NOT contain the stale sandbox
indicators (e.g., the sandbox name "my-agent", the old version "2026.3.11", or
the regex /stale|need upgrading/i). Locate the two specs by their it(...)
descriptions and add these complementary expect(r.out).not.toContain /
not.toMatch assertions using the existing r variable to harden the checks.
test/onboard.test.ts (1)

4703-4728: Add direct behavior tests for pullAndResolveBaseImageDigest().

Current checks are mostly structural/constant-based. Add runtime-path tests for digest resolution and GHCR-unreachable fallback to fully protect the regression fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 4703 - 4728, The test suite only checks
structure/constants and misses exercising pullAndResolveBaseImageDigest
behavior; add new runtime tests that call pullAndResolveBaseImageDigest (and
indirectly createSandbox if needed) to assert it returns a resolved image digest
when GHCR is reachable and that it falls back to the sandbox-base registry or a
local default when GHCR is unreachable/unresolvable; ensure tests stub/mock
network/Docker registry calls (or inject a mock registry client) so you can
simulate a successful digest resolution and a GHCR failure, and add assertions
for the returned image string containing a digest and for the fallback branch,
referencing pullAndResolveBaseImageDigest, createSandbox (if used),
patchStagedDockerfile, and SANDBOX_BASE_IMAGE to locate relevant code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 482-496: The current logic reads a single RepoDigests entry (via
runCapture into inspectOutput) and picks the first sha256, which can come from
any registry; modify the parsing after inspectOutput is set so you
iterate/filter the RepoDigests entries returned by docker inspect and select the
digest whose repository prefix exactly matches the expected SANDBOX_BASE_IMAGE
(i.e., "ghcr.io/nvidia/nemoclaw/sandbox-base"); then extract the sha256 from
that matched entry, build ref = `${SANDBOX_BASE_IMAGE}@${digest}`, and return as
before (keep function names/variables inspectOutput, runCapture,
SANDBOX_BASE_IMAGE, digest, ref).

In `@src/nemoclaw.ts`:
- Around line 1844-1848: The current handler around captureOpenshell(["sandbox",
"list"], { ignoreError: true }) (liveResult) and the subsequent rebuild loop
returns normally on failure, so the process exits with code 0 even when
OpenShell query or any sandbox rebuild fails; change these code paths to exit
with a non-zero status: when liveResult.status !== 0 call process.exit(1) (or
propagate an error) instead of returning, and in the sandbox rebuild logic
detect any failed rebuilds (e.g., track non-zero statuses from the rebuild
command or function used for rebuilding sandboxes) and call process.exit(1)
after finishing the loop if any rebuild failed so --auto scripts receive a
non-zero exit code; update both the captureOpenshell failure block and the
rebuild-result aggregation logic (referencing captureOpenshell, liveResult and
the rebuild loop/function) accordingly.
- Around line 1854-1864: The loop over sandboxes currently only pushes entries
when versionCheck.isStale is true, but checkAgentVersion() can return
detectionMethod: "unavailable" (no cached agentVersion) and isStale: false;
update the condition to treat detectionMethod === "unavailable" as needing
attention. Change the if in the for-loop that references
sandboxVersion.checkAgentVersion(sb.name) to push the sandbox when
versionCheck.isStale || versionCheck.detectionMethod === "unavailable", and
populate the stale entry using versionCheck.sandboxVersion (or a fallback like
"unknown") and versionCheck.expectedVersion so legacy/stopped sandboxes appear
in the report.

---

Nitpick comments:
In `@test/cli.test.ts`:
- Around line 2199-2313: The tests for "upgrade-sandboxes --check" should add
negative assertions to avoid false positives: in the first test (the one
expecting a stale sandbox) assert that r.out does NOT contain the "up to date"
message (or equivalent string), and in the second test (the one expecting
all-current) assert that r.out does NOT contain the stale sandbox indicators
(e.g., the sandbox name "my-agent", the old version "2026.3.11", or the regex
/stale|need upgrading/i). Locate the two specs by their it(...) descriptions and
add these complementary expect(r.out).not.toContain / not.toMatch assertions
using the existing r variable to harden the checks.

In `@test/onboard.test.ts`:
- Around line 4703-4728: The test suite only checks structure/constants and
misses exercising pullAndResolveBaseImageDigest behavior; add new runtime tests
that call pullAndResolveBaseImageDigest (and indirectly createSandbox if needed)
to assert it returns a resolved image digest when GHCR is reachable and that it
falls back to the sandbox-base registry or a local default when GHCR is
unreachable/unresolvable; ensure tests stub/mock network/Docker registry calls
(or inject a mock registry client) so you can simulate a successful digest
resolution and a GHCR failure, and add assertions for the returned image string
containing a digest and for the fallback branch, referencing
pullAndResolveBaseImageDigest, createSandbox (if used), patchStagedDockerfile,
and SANDBOX_BASE_IMAGE to locate relevant code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bf8a27d9-ac86-4345-aca4-478be0bbdbcc

📥 Commits

Reviewing files that changed from the base of the PR and between bb5fdb4 and 3a89ee4.

📒 Files selected for processing (5)
  • scripts/install.sh
  • src/lib/onboard.ts
  • src/nemoclaw.ts
  • test/cli.test.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts
Comment thread src/nemoclaw.ts Outdated
Comment thread src/nemoclaw.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
test/e2e/test-upgrade-stale-sandbox.sh (2)

169-175: Wait loop lacks explicit timeout failure.

Both wait loops (lines 105-110 and 170-175) iterate 30 times with 5-second sleeps but don't explicitly fail if the sandbox never becomes ready. The subsequent grep -q check (line 111-112) will catch the failure, but having an explicit timeout message would improve debuggability.

💡 Optional: Add explicit timeout handling
 # Wait for sandbox to be ready after rebuild
+ready=0
 for _i in $(seq 1 30); do
   if openshell sandbox list 2>/dev/null | grep -q "${SANDBOX_NAME}.*Ready\|${SANDBOX_NAME}.*Running"; then
+    ready=1
     break
   fi
   sleep 5
 done
+[ "$ready" = "1" ] || fail "Sandbox ${SANDBOX_NAME} did not become ready within 150s after rebuild"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 169 - 175, The wait
loops that run `for _i in $(seq 1 30); do ... openshell sandbox list ... | grep
-q "${SANDBOX_NAME}.*Ready\|${SANDBOX_NAME}.*Running"; done` lack explicit
timeout failure handling; after the loop completes without a break add an
explicit check for the sandbox state and emit a clear error message to stderr
(including ${SANDBOX_NAME}) and exit non‑zero (e.g., `echo "Timed out waiting
for ${SANDBOX_NAME} to be Ready/Running" >&2; exit 1`). Apply this to both
occurrences of the loop so the script fails fast with a descriptive message when
the sandbox never becomes ready.

119-127: Shell variable not expanded inside Python heredoc.

The ${REGISTRY_FILE} and ${SANDBOX_NAME} variables are inside single quotes of the Python string, but they're being expanded by the shell because the outer quotes are double quotes. This actually works correctly here because the shell expands them before Python sees them. However, this pattern can be confusing and error-prone.

Additionally, consider adding a timeout failure check after the wait loop (lines 105-110) to fail explicitly if the sandbox never becomes ready, rather than continuing silently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 119 - 127, Replace the
inline Python invocation that embeds shell vars into the quoted Python string
(producing OLD_AGENT_VERSION) with a safer pattern that passes ${REGISTRY_FILE}
and ${SANDBOX_NAME} as explicit arguments or reads them from the environment
(e.g., call python3 with "$REGISTRY_FILE" "$SANDBOX_NAME" and use sys.argv or
os.environ inside Python) so shell expansion isn't implicitly relied upon; also
add an explicit timeout/failure check immediately after the sandbox readiness
wait loop (the loop that waits for the sandbox to become ready) so the script
exits non‑zero and logs a clear error if the sandbox never becomes ready instead
of continuing silently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 169-175: The wait loops that run `for _i in $(seq 1 30); do ...
openshell sandbox list ... | grep -q
"${SANDBOX_NAME}.*Ready\|${SANDBOX_NAME}.*Running"; done` lack explicit timeout
failure handling; after the loop completes without a break add an explicit check
for the sandbox state and emit a clear error message to stderr (including
${SANDBOX_NAME}) and exit non‑zero (e.g., `echo "Timed out waiting for
${SANDBOX_NAME} to be Ready/Running" >&2; exit 1`). Apply this to both
occurrences of the loop so the script fails fast with a descriptive message when
the sandbox never becomes ready.
- Around line 119-127: Replace the inline Python invocation that embeds shell
vars into the quoted Python string (producing OLD_AGENT_VERSION) with a safer
pattern that passes ${REGISTRY_FILE} and ${SANDBOX_NAME} as explicit arguments
or reads them from the environment (e.g., call python3 with "$REGISTRY_FILE"
"$SANDBOX_NAME" and use sys.argv or os.environ inside Python) so shell expansion
isn't implicitly relied upon; also add an explicit timeout/failure check
immediately after the sandbox readiness wait loop (the loop that waits for the
sandbox to become ready) so the script exits non‑zero and logs a clear error if
the sandbox never becomes ready instead of continuing silently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b752cc4a-56f3-4eeb-9753-42b196f08e2f

📥 Commits

Reviewing files that changed from the base of the PR and between 3a89ee4 and 8d36fc1.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-upgrade-stale-sandbox.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
test/e2e/test-upgrade-stale-sandbox.sh (1)

119-127: Registry JSON lookup is brittle for sandboxes shape.

The Python snippet assumes sandboxes is a key-addressable object (.get(name)), but runtime code paths consume it as an iterable list of sandbox objects. This can degrade diagnostics to error: ... strings.

Proposed fix
 OLD_AGENT_VERSION=$(python3 -c "
 import json, sys
+name = '${SANDBOX_NAME}'
 try:
     d = json.load(open('${REGISTRY_FILE}'))
-    sb = d.get('sandboxes', {}).get('${SANDBOX_NAME}', {})
+    sandboxes = d.get('sandboxes', {})
+    if isinstance(sandboxes, list):
+        sb = next((s for s in sandboxes if s.get('name') == name), {})
+    elif isinstance(sandboxes, dict):
+        sb = sandboxes.get(name, {})
+    else:
+        sb = {}
     print(sb.get('agentVersion', 'unknown'))
 except Exception as e:
     print(f'error: {e}')
 " 2>/dev/null || echo "unknown")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 119 - 127, The Python
snippet that sets OLD_AGENT_VERSION treats the registry's "sandboxes" as a dict
and can emit "error: ..." into OLD_AGENT_VERSION; update the lookup to handle
"sandboxes" as a list by iterating d.get('sandboxes', []) and finding the object
whose "name" equals the SANDBOX_NAME, then print its "agentVersion" (or
"unknown" if missing); also ensure any exception prints nothing useful to STDOUT
(or print "unknown") so OLD_AGENT_VERSION never captures raw exception strings
from REGISTRY_FILE.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 115-116: The version probe commands for OLD_OPENCLAW_VERSION and
NEW_OPENCLAW_VERSION currently use "|| true" which hides failures; remove the
"|| true" from the openshell sandbox exec calls (the ones assigning
OLD_OPENCLAW_VERSION and NEW_OPENCLAW_VERSION), capture both output and the
command exit status, and if the exit status is non‑zero log an error (including
the captured output) and exit non‑zero so the subsequent comparison (the version
equality check) is reliable; ensure you update any surrounding logic that
assumed a silent failure so the script fails fast on openshell sandbox exec
errors.
- Around line 190-197: The test currently treats a non-clean result from
`nemoclaw upgrade-sandboxes --check` as non-fatal (using `info`) so the E2E can
pass while stale sandboxes remain; change the behavior so when `RECHECK_OUTPUT`
does not contain "up to date" (the grep check around `RECHECK_OUTPUT`), the
script fails the test (use the test harness `fail` function or `exit 1`) and
include `RECHECK_OUTPUT` in the failure message to aid debugging; update the
block that calls `upgrade-sandboxes --check` and the conditional that currently
calls `pass`/`info` to instead call `pass` on success and `fail` (with output)
on any other result.

---

Nitpick comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 119-127: The Python snippet that sets OLD_AGENT_VERSION treats the
registry's "sandboxes" as a dict and can emit "error: ..." into
OLD_AGENT_VERSION; update the lookup to handle "sandboxes" as a list by
iterating d.get('sandboxes', []) and finding the object whose "name" equals the
SANDBOX_NAME, then print its "agentVersion" (or "unknown" if missing); also
ensure any exception prints nothing useful to STDOUT (or print "unknown") so
OLD_AGENT_VERSION never captures raw exception strings from REGISTRY_FILE.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1068226c-08ea-41e4-bef7-7bead4747463

📥 Commits

Reviewing files that changed from the base of the PR and between 8d36fc1 and 1ff8061.

📒 Files selected for processing (1)
  • test/e2e/test-upgrade-stale-sandbox.sh

Comment thread test/e2e/test-upgrade-stale-sandbox.sh Outdated
Comment thread test/e2e/test-upgrade-stale-sandbox.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
test/e2e/test-upgrade-stale-sandbox.sh (2)

120-121: ⚠️ Potential issue | 🟠 Major

Fail fast when OpenClaw version probes fail.

Line 120 and Line 182 still use || true, which masks openshell sandbox exec failures and can make the version comparison unreliable.

Proposed fix
-OLD_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1 || true)
+OLD_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1) \
+  || fail "Failed to read old sandbox OpenClaw version: ${OLD_OPENCLAW_VERSION}"

@@
-NEW_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1 || true)
+NEW_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1) \
+  || fail "Failed to read new sandbox OpenClaw version after rebuild: ${NEW_OPENCLAW_VERSION}"

Also applies to: 182-183

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 120 - 121, The version
probe currently masks failures by appending "|| true" to the "openshell sandbox
exec --name \"${SANDBOX_NAME}\" -- openclaw --version" calls (used to populate
OLD_OPENCLAW_VERSION and the later NEW_OPENCLAW_VERSION), which can produce
unreliable comparisons; remove "|| true" and instead immediately fail fast when
the command exits non-zero: run the openshell command without "|| true", check
its exit status, and if it fails log an error including the SANDBOX_NAME and the
raw command output and exit the script with a non-zero status so the test never
compares invalid version strings.

198-202: ⚠️ Potential issue | 🟠 Major

Make Phase 6 a hard failure when stale state remains.

Line 201 treats a non-clean upgrade-sandboxes --check result as non-fatal, so this E2E can still report success incorrectly.

Proposed fix
 if echo "$RECHECK_OUTPUT" | grep -qi "up to date"; then
   pass "Phase 6: upgrade-sandboxes --check reports all up to date after rebuild"
 else
-  info "Phase 6: Sandbox may still appear stale (non-fatal)"
+  fail "Phase 6: upgrade-sandboxes --check did not report 'up to date' after rebuild. Output: ${RECHECK_OUTPUT}"
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 198 - 202, Currently the
Phase 6 check inspects RECHECK_OUTPUT with grep -qi "up to date" and calls
pass(...) on success but only logs info(...) on failure; change the failure
branch to make the test a hard failure by invoking the test-failure mechanism
(replace info "Phase 6: Sandbox may still appear stale (non-fatal)" with fail
"Phase 6: upgrade-sandboxes --check reports stale state after rebuild" or
otherwise exit with non-zero status), keeping the same condition that inspects
RECHECK_OUTPUT and the existing pass(...) call on success.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Line 48: The diagnostics pipeline invoked in the fail() path (the echo command
that runs docker images | grep ...) can fail the whole pipeline under set -o
pipefail when grep finds no matches; change that line so the pipeline cannot
return non-zero by appending "|| true" to the pipeline (or capture output into a
variable and echo that), e.g. modify the echo invocation that contains "docker
images --format ... | grep -i 'sandbox\|nemoclaw\|openclaw' | head -10" to
ensure it always succeeds (pipeline || true) so fail() diagnostics never get
short-circuited by pipefail.

---

Duplicate comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 120-121: The version probe currently masks failures by appending
"|| true" to the "openshell sandbox exec --name \"${SANDBOX_NAME}\" -- openclaw
--version" calls (used to populate OLD_OPENCLAW_VERSION and the later
NEW_OPENCLAW_VERSION), which can produce unreliable comparisons; remove "||
true" and instead immediately fail fast when the command exits non-zero: run the
openshell command without "|| true", check its exit status, and if it fails log
an error including the SANDBOX_NAME and the raw command output and exit the
script with a non-zero status so the test never compares invalid version
strings.
- Around line 198-202: Currently the Phase 6 check inspects RECHECK_OUTPUT with
grep -qi "up to date" and calls pass(...) on success but only logs info(...) on
failure; change the failure branch to make the test a hard failure by invoking
the test-failure mechanism (replace info "Phase 6: Sandbox may still appear
stale (non-fatal)" with fail "Phase 6: upgrade-sandboxes --check reports stale
state after rebuild" or otherwise exit with non-zero status), keeping the same
condition that inspects RECHECK_OUTPUT and the existing pass(...) call on
success.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6bedb28e-302d-43e0-a79e-a8675e5305e7

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff8061 and bc38da0.

📒 Files selected for processing (1)
  • test/e2e/test-upgrade-stale-sandbox.sh

Comment thread test/e2e/test-upgrade-stale-sandbox.sh Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
test/e2e/test-upgrade-stale-sandbox.sh (3)

201-208: ⚠️ Potential issue | 🟠 Major

Phase 6 should fail when --check is not clean.

At Lines 201-208, the test can pass even if stale sandboxes remain because the non-clean result is logged as info instead of failing.

Suggested fix
 if echo "$RECHECK_OUTPUT" | grep -qi "up to date"; then
   pass "Phase 6: upgrade-sandboxes --check reports all up to date after rebuild"
 else
-  info "Phase 6: Sandbox may still appear stale (non-fatal)"
+  fail "Phase 6: upgrade-sandboxes --check did not report 'up to date' after rebuild. Output: ${RECHECK_OUTPUT}"
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 201 - 208, The test
currently treats a non-clean "nemoclaw upgrade-sandboxes --check" result as
non-fatal by calling info, so Phase 6 can pass incorrectly; change the else
branch that currently calls info to instead fail the test (use the existing fail
function or exit with non-zero) so that when RECHECK_OUTPUT does not contain "up
to date" the script fails (update the block referencing RECHECK_OUTPUT, the grep
conditional, and replace info "Phase 6: Sandbox may still appear stale
(non-fatal)" with a failing call such as fail "Phase 6: upgrade-sandboxes
--check reports stale sandboxes" or exit 1).

48-48: ⚠️ Potential issue | 🟡 Minor

Guard diagnostics pipeline from pipefail short-circuit.

At Line 48, grep can return non-zero when no matches exist, which can interrupt diagnostics in fail() under strict mode.

Suggested fix
-  echo -e "${YELLOW}[DIAG]${NC} Docker images: $(docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}' | grep -i 'sandbox\|nemoclaw\|openclaw' | head -10)" >&2
+  echo -e "${YELLOW}[DIAG]${NC} Docker images: $(docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}' | grep -Ei 'sandbox|nemoclaw|openclaw' | head -10 || true)" >&2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` at line 48, The diagnostics pipeline
using "docker images ... | grep -i 'sandbox\|nemoclaw\|openclaw' | head -10" can
return non-zero and short-circuit under set -o pipefail (affecting the fail()
diagnostics); update that echo/substitution to guard the pipeline by making the
grep failure non-fatal (for example append "|| true" to the grep or to the whole
substitution) so the echo command always succeeds; locate the line containing
the docker images + grep pipeline in test/e2e/test-upgrade-stale-sandbox.sh and
add the "|| true" guard to the pipeline expression.

120-121: ⚠️ Potential issue | 🟠 Major

Do not mask OpenClaw version probe failures.

Lines 120 and 188 use || true, which can hide openshell sandbox exec failures and make the before/after comparison unreliable.

Suggested fix
-OLD_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1 || true)
+OLD_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1) \
+  || fail "Failed to read OpenClaw version from old sandbox"

@@
-NEW_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1 || true)
+NEW_OPENCLAW_VERSION=$(openshell sandbox exec --name "${SANDBOX_NAME}" -- openclaw --version 2>&1) \
+  || fail "Failed to read OpenClaw version after rebuild"

Also applies to: 188-189

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 120 - 121, The test
currently masks failures by appending "|| true" to the OpenClaw probe calls (the
assignment to OLD_OPENCLAW_VERSION and the later probe around lines referenced),
which can hide command failures; remove the "|| true" from the openshell sandbox
exec invocations (the commands that set OLD_OPENCLAW_VERSION and the later
NEW_OPENCLAW_VERSION) and instead explicitly check the command exit status or
the captured variable (e.g. if [ -z "$OLD_OPENCLAW_VERSION" ] || [ $? -ne 0 ];
then error "openshell sandbox exec failed for SANDBOX_NAME"; exit 1; fi) so any
probe failure fails the test and the before/after comparison remains reliable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 201-208: The test currently treats a non-clean "nemoclaw
upgrade-sandboxes --check" result as non-fatal by calling info, so Phase 6 can
pass incorrectly; change the else branch that currently calls info to instead
fail the test (use the existing fail function or exit with non-zero) so that
when RECHECK_OUTPUT does not contain "up to date" the script fails (update the
block referencing RECHECK_OUTPUT, the grep conditional, and replace info "Phase
6: Sandbox may still appear stale (non-fatal)" with a failing call such as fail
"Phase 6: upgrade-sandboxes --check reports stale sandboxes" or exit 1).
- Line 48: The diagnostics pipeline using "docker images ... | grep -i
'sandbox\|nemoclaw\|openclaw' | head -10" can return non-zero and short-circuit
under set -o pipefail (affecting the fail() diagnostics); update that
echo/substitution to guard the pipeline by making the grep failure non-fatal
(for example append "|| true" to the grep or to the whole substitution) so the
echo command always succeeds; locate the line containing the docker images +
grep pipeline in test/e2e/test-upgrade-stale-sandbox.sh and add the "|| true"
guard to the pipeline expression.
- Around line 120-121: The test currently masks failures by appending "|| true"
to the OpenClaw probe calls (the assignment to OLD_OPENCLAW_VERSION and the
later probe around lines referenced), which can hide command failures; remove
the "|| true" from the openshell sandbox exec invocations (the commands that set
OLD_OPENCLAW_VERSION and the later NEW_OPENCLAW_VERSION) and instead explicitly
check the command exit status or the captured variable (e.g. if [ -z
"$OLD_OPENCLAW_VERSION" ] || [ $? -ne 0 ]; then error "openshell sandbox exec
failed for SANDBOX_NAME"; exit 1; fi) so any probe failure fails the test and
the before/after comparison remains reliable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c549fd91-c296-4557-8096-6a880c7c3c3b

📥 Commits

Reviewing files that changed from the base of the PR and between bc38da0 and ab326f5.

📒 Files selected for processing (1)
  • test/e2e/test-upgrade-stale-sandbox.sh

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/nemoclaw.ts (2)

1612-1618: Preserve the exit code on the throw path.

code is accepted here but discarded when throwOnError is enabled, so batch callers only get a generic Error. Attaching it to the thrown error also removes the unused-parameter guideline violation.

Suggested change
   const bail = opts.throwOnError
     ? (msg, code = 1) => {
-        throw new Error(msg);
+        const error = new Error(msg);
+        error.exitCode = code;
+        throw error;
       }
     : (_msg, code = 1) => process.exit(code);

As per coding guidelines, "Prefix unused variables with underscore (_) in JavaScript/TypeScript".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1612 - 1618, The bail function currently
discards the exit code when opts.throwOnError is true; modify the throw branch
(the function assigned to bail when opts.throwOnError is truthy) to include the
provided code on the thrown error so callers (e.g., upgradeSandboxes) can
inspect the exit code; specifically, in the throw path of bail, construct an
Error with msg and attach the code (e.g., set error.code = code or use
Object.assign) before throwing so the code parameter is used and preserved.

1832-1938: Split upgradeSandboxes() into smaller helpers.

This function now handles parsing, classification, reporting, prompting, and execution in one block. It is already at/over the repo complexity cap, so extracting helpers here would make the failure paths much easier to maintain.

As per coding guidelines, "Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1832 - 1938, The upgradeSandboxes function is
too large—extract smaller helpers to lower cyclomatic complexity and improve
maintainability: create (1) a gatherSandboxes helper that wraps
registry.listSandboxes, calls captureOpenshell(["sandbox","list"]) and
parseLiveSandboxNames, and returns { sandboxes, liveNames } and any error state;
(2) a classifySandboxes(sandboxes, liveNames) that uses
sandboxVersion.checkAgentVersion to build and return { stale, unknown }; (3) a
reportSandboxStatus(stale, unknown, checkOnly) that prints the status/summary
lines currently in the function; and (4) a rebuildStaleSandboxes(rebuildable,
stopped, skipConfirm) that encapsulates prompting via askPrompt and calling
sandboxRebuild, counting rebuilt/failed and returning the result (and throwing
or returning an exit code). Then make upgradeSandboxes orchestrate these
helpers, preserving existing calls/behavior to sandboxRebuild, askPrompt,
sandboxVersion.checkAgentVersion, captureOpenshell, parseLiveSandboxNames, and
registry.listSandboxes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/nemoclaw.ts`:
- Around line 1894-1912: The current logic ignores `unknown` sandboxes when not
in `--check` mode, causing a no-op when all sandboxes are `unknown`; update the
rebuild selection so `unknown` running sandboxes are considered for rebuild:
when computing `rebuildable` (currently `const rebuildable = stale.filter((s) =>
s.running)`), include `unknown.filter((s) => s.running)` as well (or merge
`unknown` into `stale` beforehand), and adjust the `stopped` computation to
include stopped `unknown` entries so the `Skipping ... stopped sandbox(es)` and
the "No running stale sandboxes to rebuild." message reflect both `stale` and
`unknown` states (refer to variables `checkOnly`, `stale`, `unknown`,
`rebuildable`, and `stopped`).

In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 101-114: The script currently can exit early on a failing docker
build (due to set -e) and never restore the original BLUEPRINT; modify the build
block so cleanup always runs: either add a trap that restores BLUEPRINT from
BLUEPRINT_BAK (use trap 'mv "${BLUEPRINT_BAK}" "${BLUEPRINT}"' EXIT) before
running the docker build, or wrap the docker build in an if ! docker build ...;
then BUILD_RC=$?; mv "${BLUEPRINT_BAK}" "${BLUEPRINT}"; fail "Failed to build
old base image"; fi flow; ensure BUILD_RC is still set and the final mv cleanup
happens in all paths referencing BLUEPRINT, BLUEPRINT_BAK, BUILD_RC and the
docker build invocation.

---

Nitpick comments:
In `@src/nemoclaw.ts`:
- Around line 1612-1618: The bail function currently discards the exit code when
opts.throwOnError is true; modify the throw branch (the function assigned to
bail when opts.throwOnError is truthy) to include the provided code on the
thrown error so callers (e.g., upgradeSandboxes) can inspect the exit code;
specifically, in the throw path of bail, construct an Error with msg and attach
the code (e.g., set error.code = code or use Object.assign) before throwing so
the code parameter is used and preserved.
- Around line 1832-1938: The upgradeSandboxes function is too large—extract
smaller helpers to lower cyclomatic complexity and improve maintainability:
create (1) a gatherSandboxes helper that wraps registry.listSandboxes, calls
captureOpenshell(["sandbox","list"]) and parseLiveSandboxNames, and returns {
sandboxes, liveNames } and any error state; (2) a classifySandboxes(sandboxes,
liveNames) that uses sandboxVersion.checkAgentVersion to build and return {
stale, unknown }; (3) a reportSandboxStatus(stale, unknown, checkOnly) that
prints the status/summary lines currently in the function; and (4) a
rebuildStaleSandboxes(rebuildable, stopped, skipConfirm) that encapsulates
prompting via askPrompt and calling sandboxRebuild, counting rebuilt/failed and
returning the result (and throwing or returning an exit code). Then make
upgradeSandboxes orchestrate these helpers, preserving existing calls/behavior
to sandboxRebuild, askPrompt, sandboxVersion.checkAgentVersion,
captureOpenshell, parseLiveSandboxNames, and registry.listSandboxes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 820a0dcd-5684-496d-ae2f-200784b365aa

📥 Commits

Reviewing files that changed from the base of the PR and between bc38da0 and 73e3644.

📒 Files selected for processing (5)
  • .github/workflows/nightly-e2e.yaml
  • .gitleaksignore
  • src/lib/onboard.ts
  • src/nemoclaw.ts
  • test/e2e/test-upgrade-stale-sandbox.sh
✅ Files skipped from review due to trivial changes (1)
  • .gitleaksignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/nightly-e2e.yaml

Comment thread src/nemoclaw.ts
Comment thread test/e2e/test-upgrade-stale-sandbox.sh
…sandboxes (#1904)

Re-implements the digest pinning and upgrade-sandboxes features reverted
in #1938. The original PR (#1937) broke all e2e tests because it read a
digest from blueprint.yaml (belongs to ghcr.io/nvidia/openshell-community)
and applied it to ghcr.io/nvidia/nemoclaw/sandbox-base — a different
registry. Docker digests are repo-specific, so every pull returned
"manifest unknown".

This fix never reads blueprint.yaml for the base image digest. Instead:
- pullAndResolveBaseImageDigest() pulls sandbox-base:latest from GHCR
- docker inspect extracts the actual repo digest
- patchStagedDockerfile() pins ARG BASE_IMAGE to the inspected digest

The digest is self-consistent by construction — it always comes from the
same registry we pin to. Falls back to unpinned :latest when GHCR is
unreachable (offline/firewall users).

Also re-adds the upgrade-sandboxes command with fixes from code review:
- --auto flag for non-interactive installer contexts
- sandbox list failure handling before classifying sandboxes
- throwOnError option for sandboxRebuild to prevent process.exit in loops
- RepoDigests filtered by SANDBOX_BASE_IMAGE prefix (ordering not guaranteed)
- Exit non-zero from upgrade-sandboxes when sandbox list or rebuilds fail
- Report sandboxes with unavailable version detection

Closes #1904

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Adds test-upgrade-stale-sandbox.sh that reproduces the reporter's exact
scenario: install current NemoClaw, create a sandbox with an older
OpenClaw, run upgrade-sandboxes --check to detect staleness, rebuild,
and verify the sandbox runs the new version.

Adds the upgrade-stale-sandbox-e2e job and shields-config-e2e job to
the nightly workflow, and wires them into the failure notification.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa force-pushed the fix/digest-pin-v2b branch from 73e3644 to 1aa450a Compare April 16, 2026 12:33

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
test/e2e/test-upgrade-stale-sandbox.sh (1)

101-114: ⚠️ Potential issue | 🟡 Minor

Blueprint restore may be skipped if docker build fails under set -e.

With set -euo pipefail, a failing docker build at lines 106-110 will exit the script before BUILD_RC=$? (line 111) and before the mv restore (line 113). This leaves the repository with a modified blueprint.yaml.

Consider using a trap to ensure cleanup:

Suggested fix
 cp "${BLUEPRINT}" "${BLUEPRINT_BAK}"
+restore_blueprint() {
+  [ -f "${BLUEPRINT_BAK}" ] && mv "${BLUEPRINT_BAK}" "${BLUEPRINT}"
+}
+trap restore_blueprint EXIT
+
 sed "s/min_openclaw_version:.*/min_openclaw_version: \"${OLD_OPENCLAW_VERSION}\"/" "${BLUEPRINT}" >"${BLUEPRINT}.tmp"
 mv "${BLUEPRINT}.tmp" "${BLUEPRINT}"
 
-docker build \
+if ! docker build \
   --build-arg "OPENCLAW_VERSION=${OLD_OPENCLAW_VERSION}" \
   -f "${REPO_ROOT}/Dockerfile.base" \
   -t "${OLD_BASE_TAG}" \
-  "${REPO_ROOT}"
-BUILD_RC=$?
-
-mv "${BLUEPRINT_BAK}" "${BLUEPRINT}"
-[ "$BUILD_RC" -eq 0 ] || fail "Failed to build old base image"
+  "${REPO_ROOT}"; then
+  fail "Failed to build old base image"
+fi
+
+restore_blueprint
+trap - EXIT
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-upgrade-stale-sandbox.sh` around lines 101 - 114, The script
can exit before restoring BLUEPRINT if docker build fails under set -e; add a
cleanup trap that always restores the original file (move BLUEPRINT_BAK back to
BLUEPRINT) and captures the exit status so BUILD_RC is set correctly after the
docker build attempt; specifically wrap the docker build block so that either a
trap (using trap 'mv "${BLUEPRINT_BAK}" "${BLUEPRINT}"' EXIT) or a local cleanup
function is installed before running docker build, ensure BUILD_RC is assigned
from the docker build exit code (or from $? immediately after the build) and the
trap is cleared/reset after successful restore, referencing variables and steps
BLUEPRINT, BLUEPRINT_BAK, docker build, and BUILD_RC to locate where to add the
trap and restore logic.
🧹 Nitpick comments (2)
test/onboard.test.ts (1)

4703-4714: Add a behavioral test for pullAndResolveBaseImageDigest, not just constant shape.

This check validates SANDBOX_BASE_IMAGE, but it doesn’t exercise the helper’s pull+inspect flow or its null fallback path. A focused unit test for pullAndResolveBaseImageDigest() would better protect the core fix path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 4703 - 4714, Add a focused behavioral unit
test that calls pullAndResolveBaseImageDigest (not just checking
SANDBOX_BASE_IMAGE string) and verifies both the successful pull+inspect path
and the null fallback; mock/stub the Docker interactions (the pull and image
inspect/resolve used inside pullAndResolveBaseImageDigest) to simulate a
resolved digest from the nemoclaw/sandbox-base registry and assert the returned
digest and that the registry host in the pulled image matches
SANDBOX_BASE_IMAGE, then add a second sub-case where the pull/inspect mocks
fail/throw and assert pullAndResolveBaseImageDigest returns null to cover the
fallback path.
src/nemoclaw.ts (1)

1894-1903: Minor UX improvement: Clarify rebuild guidance when unknown sandboxes are present.

Line 1901 says "Run nemoclaw upgrade-sandboxes to rebuild them" even when the only sandboxes reported are unknown (version could not be checked). Since unknown sandboxes are not included in the rebuild path (only stale are), this message could be misleading.

Consider conditionally showing this message only when stale.length > 0:

   if (checkOnly) {
     if (stale.length > 0) console.log(`  ${stale.length} sandbox(es) need upgrading.`);
     if (unknown.length > 0) {
       console.log(
         `  ${unknown.length} sandbox(es) could not be version-checked; start them and rerun, or rebuild manually.`,
       );
     }
-    console.log("  Run `nemoclaw upgrade-sandboxes` to rebuild them.");
+    if (stale.length > 0) {
+      console.log("  Run `nemoclaw upgrade-sandboxes` to rebuild them.");
+    }
     return;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1894 - 1903, The message telling users to run
`nemoclaw upgrade-sandboxes` is shown even when only unknown sandboxes exist;
update the checkOnly branch in the function containing variables checkOnly,
stale, and unknown so that the "Run `nemoclaw upgrade-sandboxes` to rebuild
them." line is only printed when stale.length > 0 (i.e., when there are actually
sandboxes eligible for upgrade); keep the existing unknown diagnostic and
overall early return behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 553-554: The notify-on-failure aggregation is missing the
shields-config-e2e job—add "shields-config-e2e" into the needs array alongside
cloud-e2e, cloud-experimental-e2e, etc., and include the corresponding OR clause
in the if condition (needs.shields-config-e2e.result == 'failure') so a failure
in shields-config-e2e will trigger the failure path; update the needs list and
the if expression in the nightly-e2e workflow accordingly.

In `@scripts/install.sh`:
- Around line 1270-1273: The current block around the sandbox upgrade call
treats failures as non-fatal by using "|| warn", which lets critical upgrade
errors go unnoticed; change the logic in the block that checks _has_sandboxes
and command_exists nemoclaw so that when nemoclaw upgrade-sandboxes --auto fails
you log the failure (including stderr) via the existing error logging function
and exit with a non-zero status (e.g., call error or exit 1) instead of calling
warn; look for the symbols _has_sandboxes, command_exists, nemoclaw and the
invocation upgrade-sandboxes --auto to update the error handling accordingly.

In `@src/lib/onboard.ts`:
- Around line 2817-2825: The current code treats any falsy result from
pullAndResolveBaseImageDigest() as "use cached :latest"; change that so when
resolved is falsy you explicitly check for a local image named
`${SANDBOX_BASE_IMAGE}:${SANDBOX_BASE_TAG}` (e.g. via an
imageExistsLocally/inspectImage call or Docker daemon query) and only fall back
if that check returns true—otherwise fail early with a clear error/exit. Update
the block around pullAndResolveBaseImageDigest() and the resolved handling to
perform this local-image existence check and emit a fatal error (including the
image:tag) when neither pulling nor a local image is available. Ensure
references to pullAndResolveBaseImageDigest, resolved, and
SANDBOX_BASE_IMAGE/SANDBOX_BASE_TAG are used so the change is applied at the
correct location.
- Around line 1051-1056: The current replacement unconditionally overrides any
ARG BASE_IMAGE in the Dockerfile; change the logic in onboard.ts so before
replacing you parse the existing ARG BASE_IMAGE from the dockerfile (use the
same regex that finds /^ARG BASE_IMAGE=.*$/m), and only perform the replace with
baseImageRef when the extracted value already references the built-in image
(contains "ghcr.io/nvidia/nemoclaw/sandbox-base"); otherwise leave the ARG
untouched (alternatively, gate passing baseImageRef only for the built-in
Dockerfile path where pullAndResolveBaseImageDigest() is used). Ensure you still
use the same dockerfile and baseImageRef variables and preserve existing
behavior when the condition matches.

---

Duplicate comments:
In `@test/e2e/test-upgrade-stale-sandbox.sh`:
- Around line 101-114: The script can exit before restoring BLUEPRINT if docker
build fails under set -e; add a cleanup trap that always restores the original
file (move BLUEPRINT_BAK back to BLUEPRINT) and captures the exit status so
BUILD_RC is set correctly after the docker build attempt; specifically wrap the
docker build block so that either a trap (using trap 'mv "${BLUEPRINT_BAK}"
"${BLUEPRINT}"' EXIT) or a local cleanup function is installed before running
docker build, ensure BUILD_RC is assigned from the docker build exit code (or
from $? immediately after the build) and the trap is cleared/reset after
successful restore, referencing variables and steps BLUEPRINT, BLUEPRINT_BAK,
docker build, and BUILD_RC to locate where to add the trap and restore logic.

---

Nitpick comments:
In `@src/nemoclaw.ts`:
- Around line 1894-1903: The message telling users to run `nemoclaw
upgrade-sandboxes` is shown even when only unknown sandboxes exist; update the
checkOnly branch in the function containing variables checkOnly, stale, and
unknown so that the "Run `nemoclaw upgrade-sandboxes` to rebuild them." line is
only printed when stale.length > 0 (i.e., when there are actually sandboxes
eligible for upgrade); keep the existing unknown diagnostic and overall early
return behavior unchanged.

In `@test/onboard.test.ts`:
- Around line 4703-4714: Add a focused behavioral unit test that calls
pullAndResolveBaseImageDigest (not just checking SANDBOX_BASE_IMAGE string) and
verifies both the successful pull+inspect path and the null fallback; mock/stub
the Docker interactions (the pull and image inspect/resolve used inside
pullAndResolveBaseImageDigest) to simulate a resolved digest from the
nemoclaw/sandbox-base registry and assert the returned digest and that the
registry host in the pulled image matches SANDBOX_BASE_IMAGE, then add a second
sub-case where the pull/inspect mocks fail/throw and assert
pullAndResolveBaseImageDigest returns null to cover the fallback path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a2da5716-9774-432e-8ebf-798f6a55cf79

📥 Commits

Reviewing files that changed from the base of the PR and between 73e3644 and 1aa450a.

📒 Files selected for processing (8)
  • .github/workflows/nightly-e2e.yaml
  • .gitleaksignore
  • scripts/install.sh
  • src/lib/onboard.ts
  • src/nemoclaw.ts
  • test/cli.test.ts
  • test/e2e/test-upgrade-stale-sandbox.sh
  • test/onboard.test.ts
✅ Files skipped from review due to trivial changes (1)
  • .gitleaksignore

Comment thread .github/workflows/nightly-e2e.yaml
Comment thread scripts/install.sh
Comment thread src/lib/onboard.ts
Comment thread src/lib/onboard.ts
Address two CodeRabbit findings:

1. patchStagedDockerfile now only rewrites ARG BASE_IMAGE when the
   current value already references ghcr.io/nvidia/nemoclaw/sandbox-base.
   Custom --from Dockerfiles using a different base image are preserved.

2. When pullAndResolveBaseImageDigest fails, verify the base image
   exists locally before falling back to unpinned :latest. On first-time
   installs behind a firewall with no cached image, fail early with a
   clear error instead of deferring to a confusing Docker build failure.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The process.exit(1) in the offline fallback broke CI unit tests — the
basic-checks runner has no Docker daemon, so docker image inspect always
fails, causing 10+ createSandbox tests to abort. The Docker build itself
will fail clearly if the image is truly unavailable, so a warning with
actionable guidance is sufficient here.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/lib/onboard.ts (1)

2825-2848: Consider extracting this fallback block to keep createSandbox() from growing further.

This is good logic, but adding more branch-heavy behavior inside createSandbox() makes future maintenance/testing harder.

As per coding guidelines: "Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2825 - 2848, The fallback logic checking
pullAndResolveBaseImageDigest(), invoking runCapture(...) to inspect
SANDBOX_BASE_IMAGE and SANDBOX_BASE_TAG, and logging/errors should be extracted
out of createSandbox() into a new helper (e.g., ensureBaseImageAvailableOrExit
or resolveBaseImageWithFallback) that returns the resolved digest or null and
performs the local-check logging and process.exit(1) on hard failure; update
createSandbox() to call this helper and use its return value, keeping all
references to pullAndResolveBaseImageDigest, runCapture, SANDBOX_BASE_IMAGE and
SANDBOX_BASE_TAG inside the new function so the main createSandbox() body is
shorter and cyclomatic complexity is reduced.
test/onboard.test.ts (2)

4753-4764: Test name overstates what is being verified.

This case validates SANDBOX_BASE_IMAGE only; it does not exercise pullAndResolveBaseImageDigest directly. Consider renaming for precision.

✏️ Suggested rename
-it("regression `#1904`: pullAndResolveBaseImageDigest uses sandbox-base registry", () => {
+it("regression `#1904`: SANDBOX_BASE_IMAGE uses sandbox-base registry", () => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 4753 - 4764, The test title is misleading:
it claims to verify pullAndResolveBaseImageDigest but only asserts the
SANDBOX_BASE_IMAGE constant; update the test name/description to reflect that it
validates the SANDBOX_BASE_IMAGE value (e.g., "SANDBOX_BASE_IMAGE references
nemoclaw/sandbox-base and not openshell-community") so the behavior matches the
assertions, and leave the assertions on SANDBOX_BASE_IMAGE unchanged; reference
the test block containing the assertions and the SANDBOX_BASE_IMAGE symbol when
making the rename.

4766-4778: Order assertion is too broad and can false-pass.

The current indexOf scan is file-wide; it can match unrelated occurrences outside createSandbox. Scope the check to createSandbox source text first.

🎯 More robust scoped assertion
   const source = fs.readFileSync(
     path.join(import.meta.dirname, "..", "src", "lib", "onboard.ts"),
     "utf-8",
   );
-  const pullPos = source.indexOf("pullAndResolveBaseImageDigest()");
-  assert.ok(pullPos !== -1, "pullAndResolveBaseImageDigest() call not found in onboard.ts");
-  const patchPos = source.indexOf("patchStagedDockerfile(", pullPos);
+  const fnMatch = source.match(/async function createSandbox\([\s\S]*?\n\}/);
+  assert.ok(fnMatch, "createSandbox() definition not found in onboard.ts");
+  const fnSource = fnMatch[0];
+  const pullPos = fnSource.indexOf("pullAndResolveBaseImageDigest()");
+  assert.ok(pullPos !== -1, "pullAndResolveBaseImageDigest() call not found in createSandbox()");
+  const patchPos = fnSource.indexOf("patchStagedDockerfile(", pullPos);
   assert.ok(
     patchPos > pullPos,
     "pullAndResolveBaseImageDigest must be called BEFORE patchStagedDockerfile — regression `#1904`",
   );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 4766 - 4778, The current test searches the
whole file for pullAndResolveBaseImageDigest() and patchStagedDockerfile( which
can match unrelated code; instead extract the source of createSandbox and run
the ordering assertions inside that scope: locate the createSandbox start by
finding "createSandbox(" or its declaration token (e.g., "function
createSandbox" / "const createSandbox"), compute the function body bounds (e.g.,
find the matching closing brace or use a small regex to capture the function
block), then call indexOf("pullAndResolveBaseImageDigest()") and
indexOf("patchStagedDockerfile(", startAt=0) on that extracted createSandbox
source and assert that pullPos !== -1 and patchPos > pullPos to ensure
pullAndResolveBaseImageDigest is called before patchStagedDockerfile only within
createSandbox.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 2842-2847: The early process.exit(1) in the offline-image failure
path (where SANDBOX_BASE_IMAGE and SANDBOX_BASE_TAG are logged) causes the
temporary staged build context to be left behind; change this path to run the
same cleanup used elsewhere (the run("rm -rf ...") cleanup) before exiting or
restructure the surrounding function to use a try/finally that always calls the
cleanup routine, then exit with the same non-zero code if the image is
unavailable; ensure you update the branch that currently calls process.exit(1)
so it invokes the existing cleanup logic (or the cleanup function) prior to
terminating.

---

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 2825-2848: The fallback logic checking
pullAndResolveBaseImageDigest(), invoking runCapture(...) to inspect
SANDBOX_BASE_IMAGE and SANDBOX_BASE_TAG, and logging/errors should be extracted
out of createSandbox() into a new helper (e.g., ensureBaseImageAvailableOrExit
or resolveBaseImageWithFallback) that returns the resolved digest or null and
performs the local-check logging and process.exit(1) on hard failure; update
createSandbox() to call this helper and use its return value, keeping all
references to pullAndResolveBaseImageDigest, runCapture, SANDBOX_BASE_IMAGE and
SANDBOX_BASE_TAG inside the new function so the main createSandbox() body is
shorter and cyclomatic complexity is reduced.

In `@test/onboard.test.ts`:
- Around line 4753-4764: The test title is misleading: it claims to verify
pullAndResolveBaseImageDigest but only asserts the SANDBOX_BASE_IMAGE constant;
update the test name/description to reflect that it validates the
SANDBOX_BASE_IMAGE value (e.g., "SANDBOX_BASE_IMAGE references
nemoclaw/sandbox-base and not openshell-community") so the behavior matches the
assertions, and leave the assertions on SANDBOX_BASE_IMAGE unchanged; reference
the test block containing the assertions and the SANDBOX_BASE_IMAGE symbol when
making the rename.
- Around line 4766-4778: The current test searches the whole file for
pullAndResolveBaseImageDigest() and patchStagedDockerfile( which can match
unrelated code; instead extract the source of createSandbox and run the ordering
assertions inside that scope: locate the createSandbox start by finding
"createSandbox(" or its declaration token (e.g., "function createSandbox" /
"const createSandbox"), compute the function body bounds (e.g., find the
matching closing brace or use a small regex to capture the function block), then
call indexOf("pullAndResolveBaseImageDigest()") and
indexOf("patchStagedDockerfile(", startAt=0) on that extracted createSandbox
source and assert that pullPos !== -1 and patchPos > pullPos to ensure
pullAndResolveBaseImageDigest is called before patchStagedDockerfile only within
createSandbox.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 88b1deca-7792-4c5c-bc18-b036c8dab096

📥 Commits

Reviewing files that changed from the base of the PR and between 1aa450a and 4eb28f2.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/onboard.test.ts

Comment thread src/lib/onboard.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard.ts (1)

2825-2859: ⚠️ Potential issue | 🟡 Minor

Only resolve/pull sandbox-base when the staged Dockerfile actually uses it.

This now runs for built-in, --from, and agent Dockerfiles. For a custom Dockerfile with a different base image, it still does an unnecessary GHCR pull and can emit a false warning that the build may fail because ghcr.io/nvidia/nemoclaw/sandbox-base:latest is unavailable, even though that image is never referenced.

Suggested fix
-  const resolved = pullAndResolveBaseImageDigest();
-  if (resolved) {
+  const dockerfileText = fs.readFileSync(stagedDockerfile, "utf8");
+  const currentBaseImage = dockerfileText.match(/^ARG BASE_IMAGE=(.*)$/m)?.[1]?.trim() || "";
+  const usesSandboxBase =
+    currentBaseImage.startsWith(`${SANDBOX_BASE_IMAGE}:`) ||
+    currentBaseImage.startsWith(`${SANDBOX_BASE_IMAGE}@`) ||
+    dockerfileText.includes(`FROM ${SANDBOX_BASE_IMAGE}:`) ||
+    dockerfileText.includes(`FROM ${SANDBOX_BASE_IMAGE}@`);
+
+  const resolved = usesSandboxBase ? pullAndResolveBaseImageDigest() : null;
+  if (resolved) {
     console.log(`  Pinning base image to ${resolved.digest.slice(0, 19)}...`);
-  } else {
+  } else if (usesSandboxBase) {
     // Check if the image exists locally before falling back to unpinned :latest.
     // On a first-time install behind a firewall with no cached image, warn early
     // so the user knows the build will likely fail.
     const localCheck = runCapture(
       ["docker", "image", "inspect", `${SANDBOX_BASE_IMAGE}:${SANDBOX_BASE_TAG}`],
       { ignoreError: true },
     );

Based on learnings: patchStagedDockerfile is called on built-in, --from custom, and agent-generated Dockerfiles in createSandbox.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: db6fda70-e00d-4ee7-b98f-f772fe09dd6e

📥 Commits

Reviewing files that changed from the base of the PR and between 4eb28f2 and a3b93ef.

📒 Files selected for processing (1)
  • src/lib/onboard.ts

@ericksoa ericksoa merged commit c038795 into main Apr 16, 2026
23 of 27 checks passed
miyoungc added a commit that referenced this pull request Apr 17, 2026
Refresh user-facing docs against the 34 commits merged between v0.0.17
and v0.0.18. Highlights:

- Replace the Ollama 0.0.0.0 binding guidance with the new authenticated
  reverse proxy on 127.0.0.1:11435 (#1922).
- Document the compatible-endpoint provider defaulting to
  /v1/chat/completions and the NEMOCLAW_PREFERRED_API=openai-responses
  opt-in (#1984).
- Add the new nemoclaw upgrade-sandboxes command with --check, --auto,
  and --yes flags (#1943).
- Note the cross-sandbox messaging overlap warning and 409 detection in
  nemoclaw <name> status (#1953).
- Document the messaging-token rotation auto-rebuild flow (#1967).
- Cover new troubleshooting entries for the Ollama auth proxy, IPv6
  localhost resolution, orphan SSH port-forward cleanup on re-onboard,
  and rotated messaging credentials (#1978, #1950).
- Note tar failure exit code for nemoclaw debug --output (#1770) and the
  orphaned openshell process cleanup in nemoclaw uninstall (#1940).

Also:

- Extend docs/.docs-skip to exclude the experimental sandbox-mgmt
  shields and config commands (#1976).
- Fix a sphinx-autobuild infinite rebuild loop in docs/conf.py by
  writing docs/project.json only when its contents change.
- Bump the docs version switcher preferred entry to 0.0.18.
- Regenerate nemoclaw-user-* agent skills from docs/.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Made-with: Cursor
miyoungc added a commit that referenced this pull request Apr 17, 2026
## Summary

Refresh user-facing documentation against the 34 commits merged between
v0.0.17 and v0.0.18, bump the docs version switcher to v0.0.18, and fix
a
`sphinx-autobuild` infinite-rebuild loop triggered by `docs/conf.py`.

## Changes

- **Ollama authenticated reverse proxy** (#1922): Replace the
`0.0.0.0:11434` guidance in `docs/inference/use-local-inference.md` with
the new token-gated proxy on `127.0.0.1:11435`, including persisted
token,
health-check exemption, and sandbox provider wiring. Replace the
matching
  troubleshooting entry in `docs/reference/troubleshooting.md`.
- **Compatible-endpoint default API path** (#1984): Document that the
compatible-endpoint provider now defaults to `/v1/chat/completions` and
  update `NEMOCLAW_PREFERRED_API` to describe `openai-responses` as the
  opt-in instead of `openai-completions`. Updates in
  `use-local-inference.md`, `switch-inference-providers.md`, and
  `troubleshooting.md`.
- **`nemoclaw upgrade-sandboxes` command** (#1943): Add a new reference
entry in `docs/reference/commands.md` covering `--check`, `--auto`, and
  `--yes` flags.
- **Messaging token rotation auto-rebuild** (#1967, #1953): Note the
  automatic rebuild behavior and cross-sandbox overlap warning in
  `docs/deployment/set-up-telegram-bridge.md`, `commands.md`, and
  `troubleshooting.md`.
- **Other troubleshooting additions**:
  - `localhost` → `127.0.0.1` IPv6 note (#1978)
  - Orphan SSH port-forward cleanup on re-onboard (#1950)
  - Orphan `openshell` process cleanup in `nemoclaw uninstall` (#1940)
  - Non-zero exit on tar failure in `nemoclaw debug --output` (#1770)
- **Skip list**: Extend `docs/.docs-skip` to exclude the experimental
  sandbox-mgmt shields and config commands feature (#1976), which was
  explicitly merged as not-yet-documented.
- **Build stability**: `docs/conf.py` now writes `docs/project.json`
only
when contents change, so `make docs-live` / `sphinx-autobuild` no longer
detects its own generated file as a source change and enters an infinite
  rebuild loop.
- **Version switcher**: Bump `docs/versions1.json` and
`docs/project.json`
preferred entry to v0.0.18 so this refresh renders under the new
version.
- **Agent skills**: Regenerate `nemoclaw-user-*` skills from `docs/`
with
  `scripts/docs-to-skills.py`.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes (ran via pre-commit hook on
staged files)
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure

- [x] AI-assisted — tool: Cursor

---

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added `nemoclaw upgrade-sandboxes` command to rebuild sandboxes when
base-image digests change.
* Introduced authenticated reverse proxy for local Ollama inference with
token-based access control.
* Automatic sandbox backup, recreation, and restore when messaging
credentials are updated.
* Cross-sandbox messaging token overlap detection with status warnings.

* **Improvements**
* Compatible-endpoint provider now defaults to `/v1/chat/completions`
API path.
* Enhanced troubleshooting documentation with new diagnostics sections.

* **Documentation**
  * Updated onboarding and configuration guides.
  * Expanded version documentation to 0.0.18.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[All Platform][Upgrade][Github Issue #1904] The sandbox OpenClaw version is not upgraded after NemoClaw upgrade

2 participants