Bootstrap sandbox OpenClaw config after launch and migrate by ericksoa · Pull Request #5 · NVIDIA/NemoClaw

ericksoa · 2026-03-16T18:26:42Z

Migrated from NVIDIA/openshell-openclaw-plugin#12 by @vincentkoc

Summary

run openclaw setup inside the sandbox after a fresh launch so NemoClaw creates openclaw.json, the default workspace, and session directories before first use
run the same sandbox bootstrap step after migrate copies host state so config-less installs still get a valid sandbox config
add a Docker E2E regression with a fake openshell harness that proves the bootstrap path invokes openclaw setup and materializes openclaw.json

Testing

cd nemoclaw && npm install
cd nemoclaw && npm run build
cd nemoclaw && npx eslint src/commands/launch.ts src/commands/migrate.ts src/commands/sandbox-bootstrap.ts
docker build -f test/Dockerfile.sandbox -t nemoclaw-e2e .
docker run --rm -v "$PWD/test/e2e-test.sh:/tmp/e2e-test.sh:ro" nemoclaw-e2e /tmp/e2e-test.sh

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

(cherry picked from commit 0662c8cc12ca42c6791a771be5d3e5f4fe14bbd9)

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

wscurran · 2026-03-20T23:18:23Z

Thanks for enhancing the sandbox setup process by running openclaw setup after launch and migrate, ensuring that NemoClaw creates the necessary OpenClaw config and directories for a smooth user experience.

cv · 2026-03-21T19:28:14Z

Hey @ericksoa, wanted to touch base on this PR as well. I see you've been active with several contributions and we really appreciate that. The codebase has changed quite a bit since this was opened — we've landed CI checks and a number of new features that may affect the sandbox bootstrap flow. When you have a moment, a rebase onto main would help us review this with the current state of things in mind. No rush, just whenever you get a chance!

…migrate-hooks Fix NemoClaw migrate state sync for hooks and local assets

- Guard runArgv/runArgvCapture against shell:true to prevent security bypass (finding #1) — throws if a caller attempts to re-enable shell interpretation. Added 2 tests. - Document the intentional bash -c exception in getOllamaWarmupCommand explaining why it's safe (finding NVIDIA#2). - Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3). - Remove unused shellQuote import from nim.ts (finding NVIDIA#4). - Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).

Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.

…3520) > **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will mark ready once batch self-review + CI complete. ## Summary Closes the two remaining output threads in #3456 after the core dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full sub-bug mapping in the [#3456 status comment](#3456 (comment)). - **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper. - **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`. ## Acceptance criteria mapping | Sub-bug | Resolution | Evidence | |---|---|---| | #1 dead loop | Already fixed on main (#3459) | out of scope | | #2 firewall diagnostic | Already fixed on main (#3459) | out of scope | | **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` | | **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` | | #5 uninstall residuals | Already fixed on main (#3483) | out of scope | ## Behavior matrix `gpuPassthroughRecoveryLines(names)`: | Input | Suggestion | |---|---| | `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` | | one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` | | many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` | ## Test plan ``` npm run typecheck:cli npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts ``` 22 tests pass (6 new + 16 existing). ## Notes for reviewers - This is the work [#3464 attempted](#3464); that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement. - `runOptional` extension is backwards-compatible — existing callers without `onSkip` get the original wording. Closes #3456 once merged. --------- Signed-off-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com>

Audit found ~20 issues across 8 docs; fixed in this commit: chad-devflow.md: - Removed stale chad-spawn-poll cron table row (moved to host launchd watchdog 2026-05-14; was double-listed in cron table + watchdogs section) - Added chad-experiment-night cron row - Added chad-webui / chad-webui-mcp / chad-experiment rows to the helper-binaries section - Noted chad-shim.py v0.2+ per-operator routing + the new chad-shim-watchdog supervision chad-autonomy.md: - Updated "five loops" → "six loops" (added experiment lifecycle as loop 6, fully cross-referenced to chad-experiments.md) - Frontmatter description updated chad-skills.md: - Renamed "Chad-managed skills (3)" → "Chad-managed skills, repo-shipped (3)" for clarity - Added new section "Runtime-synced chad-managed skills (2)" cataloging openwebui (60-cmd control surface) and chad-experiment (lifecycle methodology) - Updated total skill count from "≈40 + 3" → "~45" chad-readme.md § 7: - Updated "nine cron jobs" → "eleven openclaw cron + three host-side launchd watchdogs", restructured the table accordingly - Removed spawn-poll cron row (now a host watchdog) - Added experiment-night, chad-proposal-apply, chad-skill-watch, gbrain-prune rows that were missing - Added new "host-side launchd watchdogs" subsection (3 watchdogs) - Section-226 chad-spawn-poll reference now points at the watchdog with cost rationale log-locations.md: - Added agent-inbox.jsonl to Cron + queue section - New section "Experiments (autonomous lifecycle, 2026-05-14+)" with config/ledger/active/archive paths - New section "Host-side watchdogs" indexing the three host log files + per-launchd stdout/stderr workspace/workspace-files.md: - Added rows for identities/, bin/, skills/openwebui+chad-experiment, state/experiments/*, state/agent-inbox.jsonl to the runtime-state table resources/backup-policy.md: - New subsection 2.2b "Sandbox Runtime State" covering 9 paths added 2026-05-13/14 with priority + recreatability columns. Distinguishes CRITICAL (experiments/config, experiments/active — losing strands artifacts in OpenWebUI) from MEDIUM (bin/) from LOW (agent-inbox) Discoveries flagged for follow-up (not patched in this commit): - self-improve cron has 3 consecutive errors; should be unstuck by yesterday's idleTimeoutSeconds 180→600 fix but won't retry until Sun - memory-curator cron has 2 consecutive errors, same root cause - chad-skill-watch hit error today (9h ago); needs investigation - experiment-night fired manually but errors on the deferred NVIDIA#5 "delivery channel" issue — fires but reports as error Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…re pending steps Addresses PR Review Advisor finding #2: 'Required security probes and expected-failure checks still skip without failing live runs'. Same architectural failure mode as findings #5 (E2E_DRY_RUN) and #6 (phase-1-skeleton): the typed framework declared a contract, but a code path silently produced non-failing skips that contradicted this PR's 'one mode, no fake green' invariant. Three suites (security-shields, security-policy, security-injection) were wired as kind: 'probe' steps; the orchestrator marked them status: 'skipped' because the probe registry hasn't landed yet. The expected-failure side-effect validator (runtime.expected-failure.no- side-effects) was wired as kind: 'pending'. run.ts only sets process.exitCode = 1 on status === 'failed', so a live run could omit security shields, network policy, prompt-injection blocking, AND the negative-scenario forbidden-side-effect contract while still appearing non-failing. Fix follows the framework spirit: declare requirement in the typed metadata, enforce it once at the orchestrator boundary. Typed contract extension (types.ts): AssertionStep.required?: boolean When true, a probe/pending step that resolves as 'skipped' is reclassified as 'failed' by the phase orchestrator. Defaults to false (existing behavior preserved for diagnostics, docs-validation, and other non-security probes). Orchestrator (phase.ts): executeStep: probe and pending branches now check step.required and return status: 'failed' with a 'required <kind> not <state>' message when set. Non-required probes/pending steps continue to surface as visible skipped results so unrelated gaps stay diagnostic. Registry (assertions/registry.ts): - probeStep / pendingStep helpers grew an options arg with required?: boolean (matches the secretEnv pattern: options come in via typed objects, not positional args). - security-shields, security-policy, security-injection probes annotated required: true. These are the suites the run is not safe without; failing closed beats fake green. - runtime.expected-failure.no-side-effects pending step annotated required: true. Negative scenarios cannot honestly pass while their core side-effect contract is unimplemented. - diagnostics, docs-validation probes intentionally remain non-required: they are informational, not safety-critical, and will switch to required when the probe registry lands and they have real implementations. run.ts: no change needed. The new failed-status flows through the existing exit-code path (process.exitCode = 1 when any phase status is failed), so the workflow correctly reports red for missing required probes/pending steps. Tests (e2e-phase-orchestrators.test.ts, +5): - test_required_probe_step_that_is_unregistered_fails_the_phase - test_non_required_probe_step_continues_to_skip_visibly - test_required_pending_step_fails_closed - test_security_suite_groups_in_registry_mark_their_steps_as_required - test_expected_failure_no_side_effects_step_in_registry_is_required 318/318 e2e-scenario framework tests pass. Until the probe registry follow-up PR registers actual implementations for shieldsConfigProbe / networkPolicyProbe / injectionBlockedProbe / expectedFailureNoSideEffectsProbe, scenarios that include those suites/expected-failure groups will fail loudly with 'required probe not registered' or 'required pending step not implemented'. That is the correct intermediate state: visible red until real, instead of green-by-default. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

vincentkoc and others added 7 commits March 16, 2026 01:04

Bootstrap OpenClaw config inside sandbox

3deb7b9

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

Test launch bootstrap through cliLaunch

ab26163

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

Bootstrap: support headless sandbox gateway setup

d5fec8d

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

Setup: bootstrap OpenClaw inside new sandboxes

9d91690

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

refactor: replace sandbox systemd with pid-managed gateway

72babb6

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

feat: cache nemoclaw runtime prompt updates

15a41c1

(cherry picked from commit 0662c8cc12ca42c6791a771be5d3e5f4fe14bbd9)

Refactor: rework sandbox bootstrap around runtime context

b4c66bf

Signed-off-by: Vincent Koc <vincentkoc@ieee.org>

wscurran added enhancement: feature integration: openclaw OpenClaw integration behavior and removed OpenClaw Plugin labels Mar 20, 2026

ericksoa closed this Mar 22, 2026

jessesanford pushed a commit to jessesanford/NemoClaw that referenced this pull request Mar 24, 2026

Merge pull request NVIDIA#5 from vincentkoc/vincentkoc-code/nemoclaw-…

33f3e1c

…migrate-hooks Fix NemoClaw migrate state sync for hooks and local assets

zNeill mentioned this pull request Mar 26, 2026

[Linux] Policy edge case: npm install blocked (403) - registry.npmjs.org not GET-allowed in sandbox #993

Closed

jyaunches mentioned this pull request May 11, 2026

ci(e2e): integrate Brev ephemeral E2E into nightly + refresh test suites #3352

Open

20 tasks

This was referenced May 14, 2026

[All Platforms][Onboard] ./install.sh and nemoclaw onboard enter dead loop — sandbox can't reach gateway, "firewall" hint is misleading #3456

Closed

fix(onboard,uninstall): replace misleading recovery messages (#3456) #3520

Merged

wscurran added area: cli Command line interface, flags, terminal UX, or output feature PR adds or expands user-visible functionality and removed priority: medium labels Jun 3, 2026

cv mentioned this pull request Jun 5, 2026

feat(hermes): expose built-in dashboard #4811

Merged

12 tasks

This was referenced Jun 9, 2026

Vitest E2E coverage: assert NemoClaw's high-value behavioral catalog #4999

Open

test(e2e): add Vitest live coverage for onboard resume #5147

Merged

Epic: Migrate legacy bash E2E into the Vitest E2E system #5098

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bootstrap sandbox OpenClaw config after launch and migrate#5

Bootstrap sandbox OpenClaw config after launch and migrate#5
ericksoa wants to merge 7 commits into
mainfrom
vincentkoc-openclaw-json-fix

ericksoa commented Mar 16, 2026

Uh oh!

wscurran commented Mar 20, 2026

Uh oh!

cv commented Mar 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

ericksoa commented Mar 16, 2026

Summary

Testing

Uh oh!

wscurran commented Mar 20, 2026

Uh oh!

cv commented Mar 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants