Bootstrap sandbox OpenClaw config after launch and migrate#5
Closed
ericksoa wants to merge 7 commits into
Closed
Conversation
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
(cherry picked from commit 0662c8cc12ca42c6791a771be5d3e5f4fe14bbd9)
Signed-off-by: Vincent Koc <vincentkoc@ieee.org>
Contributor
|
Thanks for enhancing the sandbox setup process by running |
Collaborator
|
Hey @ericksoa, wanted to touch base on this PR as well. I see you've been active with several contributions and we really appreciate that. The codebase has changed quite a bit since this was opened — we've landed CI checks and a number of new features that may affect the sandbox bootstrap flow. When you have a moment, a rebase onto main would help us review this with the current state of things in mind. No rush, just whenever you get a chance! |
jessesanford
pushed a commit
to jessesanford/NemoClaw
that referenced
this pull request
Mar 24, 2026
…migrate-hooks Fix NemoClaw migrate state sync for hooks and local assets
jyaunches
pushed a commit
to jyaunches/NemoClaw
that referenced
this pull request
Apr 14, 2026
- Guard runArgv/runArgvCapture against shell:true to prevent security bypass (finding #1) — throws if a caller attempts to re-enable shell interpretation. Added 2 tests. - Document the intentional bash -c exception in getOllamaWarmupCommand explaining why it's safe (finding NVIDIA#2). - Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3). - Remove unused shellQuote import from nim.ts (finding NVIDIA#4). - Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).
This was referenced May 6, 2026
jyaunches
added a commit
to jyaunches/NemoClaw
that referenced
this pull request
May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.
20 tasks
cv
added a commit
that referenced
this pull request
May 14, 2026
…3520) > **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will mark ready once batch self-review + CI complete. ## Summary Closes the two remaining output threads in #3456 after the core dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full sub-bug mapping in the [#3456 status comment](#3456 (comment)). - **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper. - **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`. ## Acceptance criteria mapping | Sub-bug | Resolution | Evidence | |---|---|---| | #1 dead loop | Already fixed on main (#3459) | out of scope | | #2 firewall diagnostic | Already fixed on main (#3459) | out of scope | | **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` | | **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` | | #5 uninstall residuals | Already fixed on main (#3483) | out of scope | ## Behavior matrix `gpuPassthroughRecoveryLines(names)`: | Input | Suggestion | |---|---| | `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` | | one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` | | many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` | ## Test plan ``` npm run typecheck:cli npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts ``` 22 tests pass (6 new + 16 existing). ## Notes for reviewers - This is the work [#3464 attempted](#3464); that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement. - `runOptional` extension is backwards-compatible — existing callers without `onSkip` get the original wording. Closes #3456 once merged. --------- Signed-off-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com>
tantodefi
added a commit
to tantodefi/NemoClaw
that referenced
this pull request
May 14, 2026
Audit found ~20 issues across 8 docs; fixed in this commit:
chad-devflow.md:
- Removed stale chad-spawn-poll cron table row (moved to host launchd
watchdog 2026-05-14; was double-listed in cron table + watchdogs
section)
- Added chad-experiment-night cron row
- Added chad-webui / chad-webui-mcp / chad-experiment rows to the
helper-binaries section
- Noted chad-shim.py v0.2+ per-operator routing + the new
chad-shim-watchdog supervision
chad-autonomy.md:
- Updated "five loops" → "six loops" (added experiment lifecycle as
loop 6, fully cross-referenced to chad-experiments.md)
- Frontmatter description updated
chad-skills.md:
- Renamed "Chad-managed skills (3)" → "Chad-managed skills,
repo-shipped (3)" for clarity
- Added new section "Runtime-synced chad-managed skills (2)"
cataloging openwebui (60-cmd control surface) and chad-experiment
(lifecycle methodology)
- Updated total skill count from "≈40 + 3" → "~45"
chad-readme.md § 7:
- Updated "nine cron jobs" → "eleven openclaw cron + three host-side
launchd watchdogs", restructured the table accordingly
- Removed spawn-poll cron row (now a host watchdog)
- Added experiment-night, chad-proposal-apply, chad-skill-watch,
gbrain-prune rows that were missing
- Added new "host-side launchd watchdogs" subsection (3 watchdogs)
- Section-226 chad-spawn-poll reference now points at the watchdog
with cost rationale
log-locations.md:
- Added agent-inbox.jsonl to Cron + queue section
- New section "Experiments (autonomous lifecycle, 2026-05-14+)" with
config/ledger/active/archive paths
- New section "Host-side watchdogs" indexing the three host log files
+ per-launchd stdout/stderr
workspace/workspace-files.md:
- Added rows for identities/, bin/, skills/openwebui+chad-experiment,
state/experiments/*, state/agent-inbox.jsonl to the runtime-state
table
resources/backup-policy.md:
- New subsection 2.2b "Sandbox Runtime State" covering 9 paths added
2026-05-13/14 with priority + recreatability columns. Distinguishes
CRITICAL (experiments/config, experiments/active — losing strands
artifacts in OpenWebUI) from MEDIUM (bin/) from LOW (agent-inbox)
Discoveries flagged for follow-up (not patched in this commit):
- self-improve cron has 3 consecutive errors; should be unstuck by
yesterday's idleTimeoutSeconds 180→600 fix but won't retry until Sun
- memory-curator cron has 2 consecutive errors, same root cause
- chad-skill-watch hit error today (9h ago); needs investigation
- experiment-night fired manually but errors on the deferred NVIDIA#5
"delivery channel" issue — fires but reports as error
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jyaunches
added a commit
that referenced
this pull request
May 28, 2026
…re pending steps Addresses PR Review Advisor finding #2: 'Required security probes and expected-failure checks still skip without failing live runs'. Same architectural failure mode as findings #5 (E2E_DRY_RUN) and #6 (phase-1-skeleton): the typed framework declared a contract, but a code path silently produced non-failing skips that contradicted this PR's 'one mode, no fake green' invariant. Three suites (security-shields, security-policy, security-injection) were wired as kind: 'probe' steps; the orchestrator marked them status: 'skipped' because the probe registry hasn't landed yet. The expected-failure side-effect validator (runtime.expected-failure.no- side-effects) was wired as kind: 'pending'. run.ts only sets process.exitCode = 1 on status === 'failed', so a live run could omit security shields, network policy, prompt-injection blocking, AND the negative-scenario forbidden-side-effect contract while still appearing non-failing. Fix follows the framework spirit: declare requirement in the typed metadata, enforce it once at the orchestrator boundary. Typed contract extension (types.ts): AssertionStep.required?: boolean When true, a probe/pending step that resolves as 'skipped' is reclassified as 'failed' by the phase orchestrator. Defaults to false (existing behavior preserved for diagnostics, docs-validation, and other non-security probes). Orchestrator (phase.ts): executeStep: probe and pending branches now check step.required and return status: 'failed' with a 'required <kind> not <state>' message when set. Non-required probes/pending steps continue to surface as visible skipped results so unrelated gaps stay diagnostic. Registry (assertions/registry.ts): - probeStep / pendingStep helpers grew an options arg with required?: boolean (matches the secretEnv pattern: options come in via typed objects, not positional args). - security-shields, security-policy, security-injection probes annotated required: true. These are the suites the run is not safe without; failing closed beats fake green. - runtime.expected-failure.no-side-effects pending step annotated required: true. Negative scenarios cannot honestly pass while their core side-effect contract is unimplemented. - diagnostics, docs-validation probes intentionally remain non-required: they are informational, not safety-critical, and will switch to required when the probe registry lands and they have real implementations. run.ts: no change needed. The new failed-status flows through the existing exit-code path (process.exitCode = 1 when any phase status is failed), so the workflow correctly reports red for missing required probes/pending steps. Tests (e2e-phase-orchestrators.test.ts, +5): - test_required_probe_step_that_is_unregistered_fails_the_phase - test_non_required_probe_step_continues_to_skip_visibly - test_required_pending_step_fails_closed - test_security_suite_groups_in_registry_mark_their_steps_as_required - test_expected_failure_no_side_effects_step_in_registry_is_required 318/318 e2e-scenario framework tests pass. Until the probe registry follow-up PR registers actual implementations for shieldsConfigProbe / networkPolicyProbe / injectionBlockedProbe / expectedFailureNoSideEffectsProbe, scenarios that include those suites/expected-failure groups will fail loudly with 'required probe not registered' or 'required pending step not implemented'. That is the correct intermediate state: visible red until real, instead of green-by-default. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
12 tasks
This was referenced Jun 9, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Migrated from NVIDIA/openshell-openclaw-plugin#12 by @vincentkoc
Summary
openclaw setupinside the sandbox after a freshlaunchso NemoClaw createsopenclaw.json, the default workspace, and session directories before first usemigratecopies host state so config-less installs still get a valid sandbox configopenshellharness that proves the bootstrap path invokesopenclaw setupand materializesopenclaw.jsonTesting
cd nemoclaw && npm installcd nemoclaw && npm run buildcd nemoclaw && npx eslint src/commands/launch.ts src/commands/migrate.ts src/commands/sandbox-bootstrap.tsdocker build -f test/Dockerfile.sandbox -t nemoclaw-e2e .docker run --rm -v "$PWD/test/e2e-test.sh:/tmp/e2e-test.sh:ro" nemoclaw-e2e /tmp/e2e-test.sh