Change small local model to qwen3.5:9b by ericksoa · Pull Request #3 · NVIDIA/NemoClaw

ericksoa · 2026-03-16T18:26:07Z

Migrated from NVIDIA/openshell-openclaw-plugin#13 by @jacobtomlinson

qwen3.5:9b

jacobtomlinson · 2026-03-20T16:23:26Z

We should hold on this until decisions are made about which models are best to use here.

Add `openclaw nemoclaw onboard` command

… (#1305)

## Summary

Fixes the four issues reported in #1114: EACCES permission errors and a missing gateway token when running inside the NemoClaw sandbox.

### Issue mapping

| # | Reported error | Fix |
|---|----------------|-----|
| 1 | `EACCES: open '/sandbox/.openclaw/openclaw.json.*.tmp'` | `install_configure_guard`: intercepts `openclaw configure` with a clear error and directs users to `nemoclaw onboard --resume` on the host |
| 2 | Same as #1 (different PID) | Same fix |
| 3 | `EACCES: mkdir '/sandbox/.openclaw/credentials'` | Already resolved on main via #1519 (credentials symlink to `.openclaw-data/`) |
| 4 | No WhatsApp QR code | Consequence of #3, also resolved by #1519 |

### Root cause (issues 1 & 2)

OpenClaw's `configure` command performs atomic writes: it creates a temp file (`openclaw.json.PID.UUID.tmp`) in the same directory as the config. Since `/sandbox/.openclaw/` is Landlock read-only at the kernel level, file creation is rejected with EACCES. This is by design: the sandbox config is intentionally immutable at runtime.

Rather than weakening Landlock (a security regression), we intercept the command in the sandbox shell and guide users to the correct host-side workflow.

### Changes

**1. `install_configure_guard()`**: Writes a shell function wrapper to `.bashrc`/`.profile` that intercepts `openclaw configure` and prints:

```
Error: 'openclaw configure' cannot modify config inside the sandbox.
The sandbox config is read-only (Landlock enforced) for security.

To change your configuration, exit the sandbox and run:
  nemoclaw onboard --resume

This rebuilds the sandbox with your updated settings.
```

All other `openclaw` subcommands pass through to the real binary.

**2. `export_gateway_token()`**: Reads `gateway.auth.token` from `openclaw.json` and exports it as `OPENCLAW_GATEWAY_TOKEN`, so interactive sessions (`openshell sandbox connect`) can authenticate with the gateway. Persists to `.bashrc`/`.profile` using idempotent marker blocks and cleans stale tokens on revocation.

**3. `_read_gateway_token()` helper**: Shared Python snippet used by both `export_gateway_token` and `print_dashboard_urls` (deduplication, uses the `with open()` context manager).

All three are called in both root and non-root startup paths.

## Security properties preserved

- `/sandbox/.openclaw` remains root-owned, Landlock read-only
- `openclaw.json` remains chmod 444 (immutable)
- No new attack surface: the token is read-only from existing config
- `command openclaw` bypass preserves all non-configure functionality

Fixes #1114

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
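The idempotent marker-block persistence mentioned for `export_gateway_token()` can be illustrated as a pure text transform. This is only a sketch under stated assumptions, not the PR's shell implementation: the marker strings and the name `upsertMarkerBlock` are hypothetical, and the real script's marker text is not shown in this description.

```typescript
// Hypothetical marker lines delimiting the managed block in a shell rc file.
const BEGIN_MARKER = "# >>> nemoclaw gateway token >>>";
const END_MARKER = "# <<< nemoclaw gateway token <<<";

/**
 * Return the rc-file text with at most one managed block.
 * - `body` is the line to place inside the block (for example an export statement).
 * - Passing `null` removes the block, which models cleanup after token revocation.
 * Running the function twice with the same input yields the same text.
 */
function upsertMarkerBlock(rc: string, body: string | null): string {
  const kept: string[] = [];
  let insideBlock = false;
  for (const line of rc.split("\n")) {
    if (line === BEGIN_MARKER) {
      insideBlock = true; // start skipping the stale block
      continue;
    }
    if (line === END_MARKER) {
      insideBlock = false; // stale block ends here
      continue;
    }
    if (!insideBlock) kept.push(line);
  }
  // Drop trailing empty lines so repeated runs do not accumulate blank lines.
  while (kept.length > 0 && kept[kept.length - 1] === "") kept.pop();

  if (body === null) {
    return kept.length > 0 ? kept.join("\n") + "\n" : "";
  }
  return [...kept, BEGIN_MARKER, body, END_MARKER].join("\n") + "\n";
}
```

Stripping any existing block before appending a fresh one is what makes the write idempotent: a rotated token replaces the old value instead of stacking a second export below it.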

- Guard runArgv/runArgvCapture against shell:true to prevent security bypass (finding #1): throws if a caller attempts to re-enable shell interpretation. Added 2 tests.
- Document the intentional bash -c exception in getOllamaWarmupCommand explaining why it's safe (finding #2).
- Remove dead getOpenshellCommand() from policies.ts (finding #3).
- Remove unused shellQuote import from nim.ts (finding #4).
- Fix brittle indexOf assertion in onboard-readiness test (finding #5).
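
A guard of the kind described for finding #1 might look like the following. This is a minimal sketch, assuming an options object with an optional `shell` field; `RunOptions` and `assertNoShell` are hypothetical names, and the real `runArgv` signature is not shown in this commit message.

```typescript
// Hypothetical options shape. `shell` is declared only so the guard can reject it.
interface RunOptions {
  cwd?: string;
  env?: Record<string, string>;
  shell?: boolean | string;
}

/**
 * Reject any attempt to re-enable shell interpretation and return options
 * with `shell` pinned to false, so arguments are always passed as an argv array.
 */
function assertNoShell(opts: RunOptions = {}): RunOptions {
  if (opts.shell) {
    throw new Error("runArgv: `shell` must not be enabled; pass arguments as an argv array");
  }
  return { ...opts, shell: false };
}
```

Failing loudly, rather than silently overriding the flag, surfaces the offending call site during tests instead of hiding it.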

- Remove unused getForwardList() call from getActiveSandboxSessions: only pgrep/ps is needed for SSH session detection (warning #1)
- Consolidate double-prompt in sandboxDestroy into single enriched confirmation prompt (warning #2)
- Remove noisy cleanupGatewayAfterLastSandbox forward check that would always fire due to dashboard forward (warning #3)
- Use word-boundary regex in parseSshProcesses to prevent false positives when sandbox names share prefixes (warning #4)
- Export SessionClassification as named interface (suggestion #1)
- Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS compatibility (suggestion #2)
- Add forwardCount to SessionClassification for future consumers
- Add tests for word-boundary matching edge cases
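
The prefix false positive behind warning #4 can be reproduced with a small matcher. This is a sketch under assumptions: the regex used by `parseSshProcesses` is not shown here, the `openshell-<name>` host alias form is borrowed from another commit on this page, and `matchesSandbox` is a hypothetical helper. It uses lookarounds rather than a plain `\b`, because `\b` still treats the hyphen in `dev-2` as a boundary after `dev`.

```typescript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/**
 * Return true when a `ps` output line refers to the SSH alias of exactly this
 * sandbox. The alias must not be preceded or followed by a word character or
 * a hyphen, so "dev" does not match a line that mentions "dev-2".
 */
function matchesSandbox(psLine: string, sandboxName: string): boolean {
  const alias = `openshell-${escapeRegExp(sandboxName)}`;
  const pattern = new RegExp(`(?<![\\w-])${alias}(?![\\w-])`);
  return pattern.test(psLine);
}
```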

- Deduplicate: timer now imports lockAgentConfig from shields.ts instead of reimplementing ~60 lines of kubectlExec + stat/lsattr verification inline. Removes the duplicated kubectlExec and K3S_CONTAINER constant from shields-timer.ts.
- Fix timer state gap (Blocker #3): the !lockVerified path now explicitly writes updateState({ shieldsDown: true }) before exiting, rather than relying on the absence of an update from shieldsDown().
- Fix rollback state lie (CodeRabbit): shieldsDown rollback no longer marks shields as UP when policy restore or lock verification fails. If either fails, state stays shieldsDown: true with guidance for manual intervention.
- Add lsattr format comment (Warning #3) for the flag-parsing line.

Eight findings from a fresh-context adversarial review. Resolutions:

1. Memoize applyOverlayfsAutoFix per upstream image. recoverGatewayRuntime's second call to getGatewayStartEnv now returns the cached patched-tag without re-running assessHost or re-attempting the build. (The reviewer's "45-minute retry storm" was overstated: pRetry captures gatewayEnv from outer scope, so the build attempt is once per startGatewayWithOptions, not once per retry. The recovery-path redundancy is real, though, and worth deduping.)

2. Bind the patched-image cache key to the upstream image's content digest. computePatchedTag now SHAs over (upstreamImage, upstreamDigest, snapshotter, dockerfile). ensurePatchedClusterImage resolves the digest via `docker image inspect <upstream>` (zero network cost when warm; air-gap-safe with pre-staged images). If the local upstream isn't there, a `docker manifest inspect` reachability probe (bounded by inspectTimeoutMs, default 30s) runs BEFORE the long pull, so air-gapped/restricted hosts fail in seconds with a documented error instead of hanging through a 10-minute pull timeout. New unit test: "differs when only the upstream digest changes". New unit test: "fails fast with a documented error when upstream is unreachable on a cache miss".

3. Same `docker manifest inspect` probe doubles as the air-gap UX fix from finding #3: fast failure mode plus an actionable error message that points at the troubleshooting doc.

4. Exclude WSL2 hosts from hasNestedOverlayConflict. We don't have a confirmed reproducer there, and the WSL kernel's overlay story is different from bare Linux. Conservative: leave WSL on the upstream image. New unit test: "does not flag a WSL2 Linux host as a conflict".

5. applyOverlayfsAutoFix now logs a console.warn breadcrumb when assessHost throws, instead of silently returning null. Future regressions in host assessment won't make the auto-fix mysteriously stop firing without any user-visible signal.

6. Tighten the e2e negative phase. Was: "any non-zero install.sh exit" passes (SKIP on the canonical-error-string check). Now: requires at least one of three nested-overlay-failure signatures in the cluster log or the install log:
   - "overlayfs snapshotter cannot be enabled" (k3s init)
   - "CreateDiff: Canceled" (sandbox image build)
   - "failed to mount overlay" (catch-all)

   Otherwise FAIL. Distinguishes a real reproduction from unrelated flakes (NVIDIA_API_KEY rejection, GHCR rate-limit, daemon blip).

7. Docs note that switching the host's storage driver via daemon.json doesn't just kill running containers; it also rebuilds the entire local image graph, so previously-pulled images become unusable until re-pulled. Documented under the manual workaround.

8. parseDockerStorageDriver now falls back to the plain-text `Storage Driver: <name>` form. assessHost still passes `--format '{{json .}}'` (the canonical path), but a future caller injecting raw `docker info` output won't silently miss the conflict. New unit test for the plain-text fixture.

Local sanity-build of the patched Dockerfile (with `ubuntu:24.04` as the UPSTREAM stand-in) still produces a working `fuse-overlayfs --version` binary in the final image.

Refs #2481.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
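
The JSON-then-plain-text fallback from resolution 8 could be sketched as below. This is an illustration under assumptions, not the project's `parseDockerStorageDriver`: it reads the `Driver` field that `docker info --format '{{json .}}'` emits and otherwise looks for the `Storage Driver: <name>` line of the plain-text output; the return type and error handling are hypothetical.

```typescript
/**
 * Extract the storage driver name from `docker info` output.
 * Tries the JSON form first, then falls back to the plain-text form.
 * Returns null when neither form yields a driver name.
 */
function parseDockerStorageDriver(output: string): string | null {
  const text = output.trim();
  if (text.startsWith("{")) {
    try {
      const info: unknown = JSON.parse(text);
      if (typeof info === "object" && info !== null && "Driver" in info) {
        const driver = (info as { Driver: unknown }).Driver;
        if (typeof driver === "string" && driver.length > 0) return driver;
      }
    } catch {
      // Not valid JSON after all: fall through to the plain-text form.
    }
  }
  // Plain-text `docker info` prints an indented "Storage Driver: <name>" line.
  const match = /^\s*Storage Driver:\s*(\S+)/m.exec(output);
  return match?.[1] ?? null;
}
```

Trying the structured form first keeps the canonical path strict, while the fallback only widens what a caller with raw output can pass in.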

shouldShowLine recognized classic Docker pull lines (Pulling from / hex-prefix layer status / Status:) but dropped BuildKit progress (#3 resolve … / #3 sha256:… 12.34MB / 45.67MB), so users on BuildKit-enabled engines saw the "Pulling base image from registry..." phase banner but no actual progress until the pull completed.

Extract the BuildKit pull-line regex into a single BUILDKIT_PULL_LINE const used by both shouldShowLine and isPullLine. This fixes the forwarding gap and removes the previous duplicate inline regex between the two predicates.

Tighten the BuildKit test to assert sawProgress: true and that both emitted progress lines actually reach logLine, locking in the fix against silent regressions.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
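
The shared-constant refactor can be pictured with the sketch below. The regexes are assumptions reconstructed from the two example lines quoted above (`#3 resolve …` and `#3 sha256:… 12.34MB / 45.67MB`) and from the three classic line kinds named in the message; the patterns in the repository may differ, and the predicates here are simplified so that `shouldShowLine` only covers pull lines.

```typescript
// Assumed pattern for BuildKit pull progress: "#<step> resolve ..." or "#<step> sha256:<hex> ...".
const BUILDKIT_PULL_LINE = /^#\d+ (?:resolve\s|sha256:[0-9a-f]+\s)/;

// Approximation of the classic Docker pull lines mentioned in the commit message.
const CLASSIC_PULL_LINE = /Pulling from |^[0-9a-f]{12}: |^Status: /;

/** True when the line reports image-pull progress in either output format. */
function isPullLine(line: string): boolean {
  return CLASSIC_PULL_LINE.test(line) || BUILDKIT_PULL_LINE.test(line);
}

/** Simplified forwarding predicate: in this sketch, only pull lines are shown. */
function shouldShowLine(line: string): boolean {
  return isPullLine(line);
}
```

Because both predicates consult the same constant, a line cannot be classified as pull progress by one and dropped by the other.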

)

## Summary

NemoClaw's sandbox create stream only recognized the legacy Docker builder format, so BuildKit output would not be treated as active build progress once OpenShell emits it.

This adds BuildKit progress markers to the same parser path as the existing legacy builder output. It keeps the current legacy behavior and makes `#1 [internal] ...`, `#2 CACHED`, and `#3 DONE ...` visible as build progress.

## Changes

- `src/lib/sandbox-create-stream.ts`: recognize BuildKit step and completion lines while tracking the build phase.
- `src/lib/sandbox-create-stream.test.ts`: cover BuildKit progress output and verify it is streamed to the user.

## Testing

- `npm run build:cli` passed
- `npm run typecheck:cli` passed
- `npm test -- src/lib/sandbox-create-stream.test.ts` passed
- `npm test` was also attempted. The full suite is not green on current main in this environment; failures are in existing installer/onboard/legacy-guard tests outside this change.

## Evidence it works

The new focused test feeds BuildKit-style output into `streamSandboxCreate` and verifies that the lines are logged, collected in output, and mark sandbox creation as having seen progress.

Fixes #2311

Signed-off-by: Deepak Jain <deepujain@gmail.com>

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved detection and display of BuildKit and upload progress so progress markers and completion states are recognized reliably.
* **Refactor**
  * Centralized progress-detection logic for more consistent handling of build and upload output.
* **Tests**
  * Added a test ensuring BuildKit-formatted progress lines are captured, included in output, and reported to the log callback.

Signed-off-by: Deepak Jain <deepujain@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>

## Summary

Upgrades OpenClaw from **2026.4.9** to **2026.4.24** (latest stable, CalVer).

### Fixes in this PR

1. **Version bumps**: `Dockerfile.base`, `nemoclaw-blueprint/blueprint.yaml`, `agents/openclaw/manifest.yaml`, `src/lib/sandbox-version.test.ts`.

2. **Patch 4 updated**: OpenClaw 2026.4.24 restructured `replaceConfigFile` to first attempt `tryWriteSingleTopLevelIncludeMutation` (writes to a `$include` file like `plugins.json5`) before falling back to `writeConfigFile`. The old patch matched an exact tab-indented `writeConfigFile(params.nextConfig, {...})` string that no longer exists. Updated to match the new `if (!await tryWriteSingleTopLevelIncludeMutation(...)) await writeConfigFile(...)` block and wrap the entire write path in the OPENSHELL_SANDBOX-gated EACCES try/catch.

3. **`plugin-runtime-deps` symlink**: OpenClaw 2026.4.24 introduced lazy plugin runtime-dep installation (Jiti loader). The CLI writes to `~/.openclaw/plugin-runtime-deps/openclaw-<version>-<hash>/` on first invocation. NemoClaw locks `/sandbox/.openclaw` to `444 root:root`, so every bundled provider failed to load with `EACCES`. Fix: created the dir in the writable `.openclaw-data` tree and symlinked it from the immutable config tree, mirroring the existing pattern used for `logs`, `credentials`, `extensions`, etc. Added in both `Dockerfile.base` (canonical) and `Dockerfile` (idempotent fixup for stale GHCR base).

4. **Selective sandbox safety-net**: `_SANDBOX_SAFETY_NET` (a Node `--require` preload from `nemoclaw-start.sh`) used to be a catch-all swallow + `process.exit` interceptor. Rewritten to:
   (a) gate to gateway processes only (`OPENSHELL_SANDBOX=1` + `argv[2]==='gateway'`) so CLI commands keep default Node crash behaviour;
   (b) match documented known-benign patterns (currently `ciao`/mDNS, produced when bonjour's probe state machine cancels itself, since the sandbox netns has no multicast);
   (c) for unknown errors, log full stack but keep gateway alive (gateway is shared infrastructure, user-initiated actions must not take it down);
   (d) drop `process.exit` interception entirely.

   The CIAO guard's `uncaughtException` listener was similarly gated to gateway processes: registering one in CLI processes turns Node's default crash-on-uncaught into silent absorb, which would silently hang `openclaw agent`.

5. **Disable bonjour and qqbot bundled plugins**: both ship enabled-by-default in 2026.4.24 and break in the sandbox netns:
   - **bonjour**: introduced in 2026.4.15, uses `@homebridge/ciao` for mDNS announcement. Sandbox netns has no multicast, so ciao's probe state machine fails at startup.
   - **qqbot**: has `stageRuntimeDependencies=true`, so its npm deps (`@tencent-connect/qqbot-connector`, `silk-wasm`, etc.) install on first load. The sandbox L7 proxy denies the registry URL with `403 policy_denied`, the install retries for ~6 minutes, and while channel loading is stuck the gateway can't service `openclaw agent` requests.

   Both disabled via `plugins.entries.<id>.enabled = false` in `scripts/generate-openclaw-config.py`.

6. **Build-context fix for `generate-openclaw-config.py`**: main's PR #2449 (commit `f5ee8a4d`) extracted the inline Python config-generator from Dockerfile into `scripts/generate-openclaw-config.py` and added `COPY scripts/generate-openclaw-config.py …` to Dockerfile, but did not update `src/lib/sandbox-build-context.ts` which curates the optimized build context for sandbox image builds. Without this, every nightly E2E job (and any sandbox onboard) fails with `COPY failed: file not found in build context`. Added the file to `stageOptimizedSandboxBuildContext()` next to `nemoclaw-start.sh` and added a test assertion so the staging stays in sync.

### Status

Most recent un-rate-limited run (25015126555 with build-context fix): **13 of 18 jobs pass**.

`sandbox-operations-e2e` still fails, but only TC-SBX-02 (Connect & Chat) within it. All other TC-SBX cases (03, 04, 05, 06, 07, 08, 10, 11, 12) pass on `test-sbx-a`, confirming the gateway is functional.

After the `sandbox-build-context.ts` fix and the qqbot disable, the failure mode of TC-SBX-02 changed from `SSH command timed out after 60s` to `Expected '42' in agent reply; reply=''`: the same 60-90 second hang, but now hitting the test's outer `run_with_timeout` rather than producing a stack trace. The test drops stderr (`2>/dev/null`), and the gateway-log streamer/snapshot infrastructure has been unable to capture `test-sbx-a`'s `/tmp/openclaw-998/openclaw-*.log` reliably (the post-test openshell state has no active gateway after TC-SBX-06's docker kill, and the streamer's connection to test-sbx-a races and gets `Connection refused`). Still root-causing.

### Notable upstream changes (2026.4.9 → 2026.4.24)

- Google Meet bundled plugin, DeepSeek V4 Flash/Pro, realtime voice loops (Talk/Voice Call/Google Meet), Gemini Live, browser automation improvements.
- Lighter startup: static model catalogs, manifest-backed model rows, **lazy provider dependencies** (the new plugin-runtime-deps mechanism, root cause of fix #3).
- **Breaking:** Plugin SDK tool-result transforms migrated from `registerEmbeddedExtensionFactory()` to `registerAgentToolResultMiddleware()`. Verified NemoClaw uses neither.
- **Breaking:** Plugin registry migrated from `plugins.installs` config key to managed `plugins/installs.json` ledger. `openclaw doctor --fix` migrates automatically.
- Config writes restructured to use single-file `$include` mutations before falling back to full config write (root cause of fix #2).
- CVE-2026-41349, CVE-2026-22181 fixes; exec-approvals chat enablement (2026.4.22); cron `jobs-state.json` separation (2026.4.20).
- bonjour mDNS plugin added in 2026.4.15 (root cause of fix #5a).

### User sandbox state migration on rebuild

Existing user sandboxes upgrade via `nemoclaw <name> rebuild`. State (memory/, workspace/, agents/, extensions/, etc.) is backed up via tar, sandbox is destroyed and recreated with the new image, state is restored, `openclaw doctor --fix` runs post-restore.

**Handled automatically:** memory, cron job definitions, plugin auto-discovery, plugin registry migration.

**Existing reset behavior (not new):** exec-approvals, credentials, device pairing.

**New minor behavior change:** cron runtime state (`jobs-state.json`) absent in pre-2026.4.20 backups. Job execution history resets, and jobs may re-fire once after upgrade.

## Test plan

- [x] CI lint, typecheck, unit tests pass
- [x] Docker base image and sandbox image build with all dist patches applied
- [x] 13/18 nightly E2E jobs pass cleanly with all six fixes
- [ ] **TC-SBX-02**: root cause for the residual `reply=''` hang under investigation; the gateway-log capture infrastructure needs to work reliably post-test before we can read what's happening server-side
- [ ] Manual smoke test via `nemoclaw <sandbox> connect` interactive flow
- [ ] Rebuild test: existing 2026.4.9 sandbox → rebuild → verify state preserved (rebuild-openclaw-e2e covers this)
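
The decision order of the selective safety-net in fix 4 can be sketched as a pure classifier. This is an illustration under assumptions, not the preload in `nemoclaw-start.sh`: the `Verdict` names, `classifyUncaught`, and the placeholder benign patterns are hypothetical; only the gating condition (`OPENSHELL_SANDBOX=1` plus `argv[2] === 'gateway'`) and the three-way behaviour come from the description above.

```typescript
type Verdict = "default-crash" | "ignore-benign" | "log-and-continue";

// Placeholder patterns. The PR states only that the known-benign set is currently ciao/mDNS.
const KNOWN_BENIGN_PATTERNS: readonly RegExp[] = [/ciao/i, /mdns/i];

/** Gate: only gateway processes inside the sandbox get the safety-net. */
function isGatewayProcess(
  env: Record<string, string | undefined>,
  argv: readonly string[],
): boolean {
  return env["OPENSHELL_SANDBOX"] === "1" && argv[2] === "gateway";
}

/**
 * Decide what to do with an uncaught error:
 * - CLI processes keep Node's default crash behaviour,
 * - known-benign gateway errors are ignored,
 * - any other gateway error is logged while the gateway stays alive.
 */
function classifyUncaught(
  err: Error,
  env: Record<string, string | undefined>,
  argv: readonly string[],
): Verdict {
  if (!isGatewayProcess(env, argv)) return "default-crash";
  const text = `${err.message}\n${err.stack ?? ""}`;
  if (KNOWN_BENIGN_PATTERNS.some((pattern) => pattern.test(text))) {
    return "ignore-benign";
  }
  return "log-and-continue";
}
```

Checking the process gate first is the important part: a CLI command never reaches the swallow path, so an uncaught error there still terminates the process instead of hanging it.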

- Fix recovery scripts in agent-runtime.ts that still used curl -sf on / instead of the new HTTP status code pattern on /health (#3)
- Add device-auth-health-e2e to scorecard.needs (#8)
- Use openshell-${SANDBOX_NAME} SSH host alias in E2E test (#7)

Resolve the two output threads in #3456 left after the core dead-loop fix landed via #3459 + #3434:

Sub-bug #3: `src/lib/onboard.ts` printed `nemoclaw <name> destroy --yes && nemoclaw onboard --gpu` with a literal `<name>` placeholder, and assumed at least one sandbox was registered. When the GPU-passthrough mismatch hit on the State B re-run path with an empty registry (the dead-loop case), the hint was not actionable. Replace with a registry-aware helper at `src/lib/onboard/gpu-recovery.ts` that renders the right shape:

- empty registry → suggest `nemoclaw uninstall && nemoclaw onboard --gpu`
- one sandbox → suggest destroy --yes --cleanup-gateway for that name
- multiple sandboxes → list each, only the last gets --cleanup-gateway

Sub-bug #4: `src/lib/actions/uninstall/run-plan.ts` printed `Destroyed gateway 'nemoclaw' skipped` when the openshell destroy no-op'd (gateway already gone). The "Destroyed … skipped" wording was self-contradictory. Extend `runOptional` with an `onSkip` option; route the gateway destroy to emit `Gateway 'nemoclaw' already removed or unreachable` on no-op.

Tests:

- `src/lib/onboard/gpu-recovery.test.ts` (6 tests): forbid literal `<name>` placeholder anywhere in the output; cover empty / single / multi-sandbox cases; defensive filter on whitespace names so a `nemoclaw destroy` rendering can never happen.
- `src/lib/actions/uninstall/run-plan.test.ts`: assert the new "already removed or unreachable" wording and the absence of the "Destroyed gateway 'nemoclaw' skipped" string.

The core dead loop itself (sub-bugs #1, #2 and State B GPU mismatch) is already addressed by #3459 + #3434 + #3483; #3456 will close once this lands. See the #3456 status comment for the full mapping.

Refs #3456. Mirrors (and tightens) the approach in the closed PR #3464, which left the literal `<name>` placeholder in tests per CodeRabbit feedback that was never addressed.

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
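
The `onSkip` extension for sub-bug #4 could take roughly the following shape. This is a sketch under assumptions: the real `runOptional` signature in `run-plan.ts` is not shown, so the parameters, the boolean "did anything happen" return from the action, and the string result are all hypothetical. The two messages are the ones quoted in the commit message.

```typescript
// Hypothetical options bag; only `onSkip` is named in the commit message.
interface RunOptionalOptions {
  /** Custom message to emit when the action turned out to be a no-op. */
  onSkip?: () => string;
}

/**
 * Run a best-effort step and return the status line to print.
 * `action` returns true when it did real work and false on a no-op;
 * a thrown error is also treated as a skip in this sketch.
 */
function runOptional(
  label: string,
  action: () => boolean,
  options: RunOptionalOptions = {},
): string {
  let didWork = false;
  try {
    didWork = action();
  } catch {
    didWork = false;
  }
  if (didWork) return label;
  // Callers without onSkip keep the original "<label> skipped" wording.
  return options.onSkip ? options.onSkip() : `${label} skipped`;
}
```

Keeping the default wording for callers that pass no `onSkip` is what makes the extension backwards-compatible.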

…3520)

> **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will mark ready once batch self-review + CI complete.

## Summary

Closes the two remaining output threads in #3456 after the core dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full sub-bug mapping in the [#3456 status comment](#3456 (comment)).

- **Sub-bug #3**: `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper.
- **Sub-bug #4**: `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`.

## Acceptance criteria mapping

| Sub-bug | Resolution | Evidence |
|---|---|---|
| #1 dead loop | Already fixed on main (#3459) | out of scope |
| #2 firewall diagnostic | Already fixed on main (#3459) | out of scope |
| **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` |
| **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` |
| #5 uninstall residuals | Already fixed on main (#3483) | out of scope |

## Behavior matrix

`gpuPassthroughRecoveryLines(names)`:

| Input | Suggestion |
|---|---|
| `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` |
| one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` |
| many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` |

## Test plan

```
npm run typecheck:cli
npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts
```

22 tests pass (6 new + 16 existing).

## Notes for reviewers

- This is the work [#3464 attempted](#3464); that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement.
- `runOptional` extension is backwards-compatible; existing callers without `onSkip` get the original wording.

Closes #3456 once merged.

---------

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
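The backwards-compatible `onSkip` extension mentioned in the reviewer notes could look roughly like the sketch below. Everything here is an assumption for illustration: the name `runOptionalSketch`, the `StepOutcome` type, the injected `log` callback, and the option shape are hypothetical, and the real `runOptional` in `run-plan.ts` may be structured quite differently.

```typescript
// Hypothetical option bag: onSkip supplies the message for a no-op step.
interface RunOptionalOptions {
  onSkip?: (label: string) => string;
}

// Hypothetical outcome of an optional step.
type StepOutcome = "done" | "skipped";

function runOptionalSketch(
  label: string,
  step: () => StepOutcome,
  log: (line: string) => void,
  options: RunOptionalOptions = {},
): StepOutcome {
  const outcome = step();
  if (outcome === "skipped") {
    // Callers that pass no onSkip keep the legacy "<label> skipped" wording.
    log(options.onSkip ? options.onSkip(label) : `${label} skipped`);
  } else {
    log(label);
  }
  return outcome;
}

// Usage: the gateway destroy opts into a message that is not self-contradictory.
const gatewayLog: string[] = [];
runOptionalSketch(
  "Destroyed gateway 'nemoclaw'",
  () => "skipped",
  (line) => gatewayLog.push(line),
  { onSkip: () => "Gateway 'nemoclaw' already removed or unreachable" },
);
```

Making the option optional is what keeps existing call sites unchanged: only the gateway destroy needs to pass `onSkip`.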

)

Advisor finding: scenarios with expectedFailure metadata declared phase/errorClass/forbiddenSideEffects, but nothing in the typed runner inspected observed phase results to verify the right phase failed for the right reason. A scenario named ubuntu-no-docker-preflight-negative could fail because DNS broke and the run would still show 'failed' without catching the mismatch.

Add framework-owned negative-scenario contract verification, in the spirit of redaction.ts and context.ts (typed orchestrator infra, not shell):

- types.ts: ExpectedFailureContract typed shape replaces the prior Record<string, unknown> on ScenarioDefinition.expectedFailure and RunPlan.expectedFailure. Adds ExpectedFailurePhase (PhaseName | 'preflight') so manifests speak the user vocabulary while internal PhaseName stays narrow. Adds NegativeContractPhase / PhaseResultName so the synthetic phase result the runner emits cannot accidentally be declared by a scenario builder.
- orchestrators/negative-matcher.ts (new): pure function evaluateNegativeContract(plan, results) returning NegativeContractResult with outcome in {matched, no-failure-observed, wrong-phase, wrong-error-class}. Resolves expected.phase='preflight' to the onboarding orchestrator (where preflight assertions live). Substring-with-case-fold, separator-tolerant errorClass match. Excludes the runtime side-effect probe step from observed-failure detection so the matcher is not confused by its own enforcement scaffolding.
- orchestrators/runner.ts: after phases run, if plan.expectedFailure is set, call evaluateNegativeContract and append a synthetic PhaseResult with phase='negative-contract'. Emits .e2e/negative-contract.json artifact alongside per-phase results. Positive scenarios are untouched.
- run.ts: planFailed() consults the synthetic contract phase for negative scenarios. A negative scenario is green iff the contract matched AND the runtime control group's required no-side-effects step passed. Until the forbidden-side-effect probe lands the required pending step keeps that piece red, so matched-failure-mode alone still cannot flip a negative scenario green.
- builder.ts / scenarios/baseline.ts: thread the typed contract through the builder API and the canonical input shape.
- 15 new tests in e2e-negative-matcher.test.ts cover: matched, preflight->onboarding mapping, no-failure-observed, wrong-phase, wrong-errorClass, side-effect probe step ignored, case-insensitive matching, runner integration (matched + mismatched + positive unaffected), registry contract (every negative scenario opts into the side-effect probe step), and compiler validation rejects bad shapes.

Spec ownership boundaries kept honest:

- Failure injection (uninstalling docker, planting a bad key, occupying a port) stays runner-environment prep, not framework code. Matcher only inspects observed results.
- Forbidden-side-effect verification stays the expectedFailureNoSideEffectsProbe's job. The matcher reports phase + errorClass independently; the required pending step from cc6b7a2 keeps the side-effect axis visibly red until the probe lands.

354 framework tests pass (15 new). tsc clean.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
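The matching rules described for `negative-matcher.ts` can be sketched as a small pure function. This is a simplified illustration, not the framework's code: the types below, the probe step name `PROBE_STEP`, the set of separators that `normalize` strips, and the function name `evaluateNegativeContractSketch` are all assumptions, and the real `evaluateNegativeContract(plan, results)` takes a full run plan rather than a bare contract.

```typescript
type Outcome = "matched" | "no-failure-observed" | "wrong-phase" | "wrong-error-class";

// Simplified, hypothetical shapes; the real types live in types.ts.
interface ExpectedFailure { phase: string; errorClass: string; }
interface StepResult { name: string; ok: boolean; error?: string; }
interface PhaseResult { phase: string; steps: StepResult[]; }

// Hypothetical name for the side-effect probe step the matcher must ignore.
const PROBE_STEP = "expected-failure-no-side-effects-probe";

// Case-fold and strip separators so "docker-missing" matches "Docker missing".
const normalize = (text: string): string =>
  text.toLowerCase().replace(/[\s_\-.:]+/g, "");

function evaluateNegativeContractSketch(
  expected: ExpectedFailure,
  results: PhaseResult[],
): Outcome {
  // Manifests say "preflight"; those assertions run in the onboarding phase.
  const targetPhase = expected.phase === "preflight" ? "onboarding" : expected.phase;

  // Collect observed failures, skipping the probe step.
  const failures: { phase: string; error: string }[] = [];
  for (const result of results) {
    for (const step of result.steps) {
      if (!step.ok && step.name !== PROBE_STEP) {
        failures.push({ phase: result.phase, error: step.error ?? "" });
      }
    }
  }

  if (failures.length === 0) return "no-failure-observed";

  const inPhase = failures.filter((failure) => failure.phase === targetPhase);
  if (inPhase.length === 0) return "wrong-phase";

  // Substring match on the normalized error text.
  const wanted = normalize(expected.errorClass);
  return inPhase.some((failure) => normalize(failure.error).includes(wanted))
    ? "matched"
    : "wrong-error-class";
}
```

With this shape, the DNS example from the advisor finding resolves to `wrong-error-class` instead of silently counting as the expected failure.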

NVIDIA#4538)

Two acceptance gaps from NVIDIA#4538 not closed by the original PR:

- Troubleshooting: contrast the three relevant perm states (mutable default 2770/660, shields-up locked 444/755 root, the 700/600 drift) so issue ask NVIDIA#3, "docs cover both mutable and locked-down", is actually answered. Prior copy only documented mutable.
- mutable-config-perms: explain why NemoClaw uses 2770/660 vs the issue's expected 2775/664 (gateway shares the sandbox group, so the "other" bit is intentionally dropped). The predicate test already rejects 664; the rationale belonged in code.

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
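To make the three permission states concrete, here is a rough classifier over numeric modes. It is a sketch under stated assumptions, not NemoClaw's predicate: the function name `classifyPermsSketch` is hypothetical, the reading of "444/755" as file mode 444 under a directory of mode 755 is an interpretation of the commit message, and the root-ownership part of the locked state is not checked at all.

```typescript
type PermState = "mutable" | "locked" | "drift" | "unknown";

// Hypothetical classifier over (directory mode, file mode) pairs.
function classifyPermsSketch(dirMode: number, fileMode: number): PermState {
  // Keep only permission, setuid/setgid, and sticky bits; drop file-type bits.
  const dir = dirMode & 0o7777;
  const file = fileMode & 0o7777;

  // Mutable default: setgid directory, group read/write, nothing for "other".
  if (dir === 0o2770 && file === 0o660) return "mutable";
  // Shields-up locked (ownership by root is assumed, not verified here).
  if (dir === 0o755 && file === 0o444) return "locked";
  // Drift: group access has been lost.
  if (dir === 0o700 && file === 0o600) return "drift";
  return "unknown";
}
```

Under this sketch the issue's expected `2775/664` pair falls through to `unknown`, consistent with the statement that the predicate test rejects 664.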

Pass 1 of phase-5 convergence on dispatch 27283289318. First blocking assertion: openshell-installed (#3 in chain), failing with `spawn openshell ENOENT`.

Root cause (shell-probe.ts:127-129): when inheritEnv is unset, the child gets only options.env. My prereq probes passed no env, so spawn had no PATH to resolve `openshell` against. The onboard runs worked only because they used inheritEnv:true.

Fix: use buildAvailabilityProbeEnv() everywhere (framework allowlist incl. PATH/HOME/CI; explicitly excludes NVIDIA_API_KEY). Layer the secret explicitly only on the first onboard; the resume run's env deliberately omits it to test credential hydration from the session file. This is the typed expression of the bash test's `env -u NVIDIA_API_KEY` invariant. Also drops inheritEnv:true on the onboard runs in favor of the same allowlist composition pattern, matching OnboardingPhaseFixture.commandEnv.

Refs: #4348, #5098
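The allowlist composition pattern described in the fix might be sketched as follows. This is illustrative only: `buildProbeEnvSketch` is a hypothetical stand-in for `buildAvailabilityProbeEnv()`, the allowlist contains only the three keys the commit message names (the real list includes more), and the `hostEnv` values are made up.

```typescript
// Hypothetical allowlist; the source says the real one includes PATH/HOME/CI.
const ENV_ALLOWLIST = ["PATH", "HOME", "CI"] as const;

// Copy only allowlisted keys, so secrets such as NVIDIA_API_KEY never leak in
// implicitly, while PATH is always present for resolving binaries like `openshell`.
function buildProbeEnvSketch(
  source: Record<string, string | undefined>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of ENV_ALLOWLIST) {
    const value = source[key];
    if (value !== undefined) env[key] = value;
  }
  return env;
}

// Made-up host environment for the example.
const hostEnv = {
  PATH: "/usr/local/bin:/usr/bin",
  HOME: "/home/tester",
  NVIDIA_API_KEY: "example-secret",
  UNRELATED: "ignored",
};

// First onboard: allowlist plus the secret, layered on explicitly.
const firstOnboardEnv = {
  ...buildProbeEnvSketch(hostEnv),
  NVIDIA_API_KEY: hostEnv.NVIDIA_API_KEY,
};

// Resume run: allowlist only, so credentials must come from the session file.
const resumeEnv = buildProbeEnvSketch(hostEnv);
```

Building both environments from the same base keeps the only difference between the two runs the presence of the secret, which is the invariant the resume test is meant to exercise.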

Change small local model to qwen3.5:9b

653240c

jacobtomlinson closed this Mar 20, 2026

jessesanford pushed a commit to jessesanford/NemoClaw that referenced this pull request Mar 24, 2026

Merge pull request NVIDIA#3 from NVIDIA/dnandakumar-onbaording

ef02d3d

Add `openclaw nemoclaw onboard` command

zNeill mentioned this pull request Mar 26, 2026

[MacOS] Gateway fails to proxy inference requests to integrate.api.nvidia.com from sandbox #997

Open

drmarcopapa mentioned this pull request Mar 31, 2026

npm install inside sandbox extremely slow — 310s timeouts on FETCH_ERROR retries #1043

Closed

2 tasks

prekshivyas mentioned this pull request Apr 6, 2026

docs: split version-agnostic OpenShell lifecycle guidance #1263

Merged

16 tasks

Dongni-Yang mentioned this pull request Apr 7, 2026

fix(sandbox): guard openclaw configure and export gateway token (#1114) #1305

Merged

This was referenced Apr 16, 2026

fix(onboard): propagate rotated messaging credentials to sandbox L7 proxy #1967

Merged

fix(shields): verify config lock and fail hard on re-lock failure #2066

Merged

wscurran mentioned this pull request Apr 24, 2026

fix(cli): recognize BuildKit sandbox build progress (Fixes #2311) #2404

Merged

ericksoa mentioned this pull request Apr 26, 2026

chore: upgrade OpenClaw from 2026.4.9 to 2026.4.24 #2484

Merged

6 tasks

jyaunches mentioned this pull request May 6, 2026

Brev onboard UI: OpenClaw preflight false red, openshell exec argv, SSE/nginx + cloudflared EOF #2258

Closed

2 tasks

This was referenced May 7, 2026

fix(preflight): replace reinvented OpenClaw health check with NemoClaw CLI calls #3208

Closed

feat(e2e): introduce scenario-based setup matrix and runner #3290

Closed

cjagwani mentioned this pull request May 14, 2026

[All Platforms][Onboard] ./install.sh and nemoclaw onboard enter dead loop — sandbox can't reach gateway, "firewall" hint is misleading #3456

Closed

cjagwani mentioned this pull request May 14, 2026

fix(onboard,uninstall): replace misleading recovery messages (#3456) #3520

Merged

cjagwani mentioned this pull request May 14, 2026

[Linux][Uninstall] nemoclaw uninstall leaves ~/.local/state/nemoclaw/ behind (residual of #3456 #5c) #3535

Closed

3 tasks

hulynn mentioned this pull request May 28, 2026

[All Platforms][Docs] quickstart.mdx wizard description still out of sync with v0.0.53 (follow-up of NVB 6187477) #4417

Closed

PrachiShevate-nv mentioned this pull request May 29, 2026

[All Platforms][Policy&Network] Local vLLM sandbox can run inference but direct curl to host alias is policy_denied (host.docker.internal blocked) #4542

Open

Dongni-Yang mentioned this pull request Jun 2, 2026

fix(onboard): don't leave a default sandbox when cancelling at policy presets #4642

Merged

4 tasks

prekshivyas mentioned this pull request Jun 3, 2026

[WSL2 x86_64][Sandbox] OpenShell supervisor fails to reconnect to GPU-patched sandbox container; sandbox enters Error phase #4664

Closed

cv mentioned this pull request Jun 5, 2026

feat(hermes): expose built-in dashboard #4811

Merged

12 tasks

jyaunches mentioned this pull request Jun 10, 2026

test(e2e): add Vitest live coverage for onboard resume #5147

Merged

4 tasks

This was referenced Jun 12, 2026

[All Platforms][Docs] Commands reference: broken link + CLI mismatch for tunnel status and sessions #5080

Closed

docs: fix commands reference install-plugins link (#5080) #5193

Closed

ericksoa wants to merge 1 commit into main from change-small-mode-qwen

3 participants
