feature: custom settings for using build endpoints#2
Closed
ericksoa wants to merge 8 commits into
Closed
Conversation
Resolve conflict in onboard.ts: adopt upstream execOpenShell() helper while preserving policy-proxy routing via providerBaseUrl. Made-with: Cursor
Collaborator
|
Hey @ericksoa — just circling back on this one. I know you've got a few PRs open here (thanks for all the contributions, by the way!). Things have moved pretty quickly in the repo since this was first submitted — new CI pipeline, additional features, etc. Could you rebase this branch against the latest main so we can evaluate it cleanly? Would love to take a fresh look at the custom build endpoint settings. Thanks! |
Contributor
Author
|
Superseded — inference provider configuration is now handled by the blueprint system in onboard. |
2 tasks
Closed
10 tasks
jyaunches
pushed a commit
to jyaunches/NemoClaw
that referenced
this pull request
Apr 14, 2026
- Guard runArgv/runArgvCapture against shell:true to prevent security bypass (finding #1) — throws if a caller attempts to re-enable shell interpretation. Added 2 tests. - Document the intentional bash -c exception in getOllamaWarmupCommand explaining why it's safe (finding NVIDIA#2). - Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3). - Remove unused shellQuote import from nim.ts (finding NVIDIA#4). - Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).
6 tasks
prekshivyas
added a commit
to ColinM-sys/NemoClaw
that referenced
this pull request
Apr 16, 2026
…exercised CodeRabbit correctly flagged that swapping on stat NVIDIA#1 caused readFileSync to see the live PID and exit via isProcessAlive — unlinkIfInodeMatches was never called. Move the swap to just before stat NVIDIA#2 (inside unlinkIfInodeMatches): stat NVIDIA#1 reads the original stale inode, readFileSync sees the dead PID, isProcessAlive returns false, stale-cleanup runs, and stat NVIDIA#2 sees the new inode and skips the unlink. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
5 tasks
prekshivyas
added a commit
to ColinM-sys/NemoClaw
that referenced
this pull request
Apr 16, 2026
…exercised CodeRabbit correctly flagged that swapping on stat NVIDIA#1 caused readFileSync to see the live PID and exit via isProcessAlive — unlinkIfInodeMatches was never called. Move the swap to just before stat NVIDIA#2 (inside unlinkIfInodeMatches): stat NVIDIA#1 reads the original stale inode, readFileSync sees the dead PID, isProcessAlive returns false, stale-cleanup runs, and stat NVIDIA#2 sees the new inode and skips the unlink. Use write-to-temp + rename instead of unlink + recreate to guarantee a different inode even on tmpfs/overlayfs which can reuse inodes. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
13 tasks
4 tasks
jyaunches
pushed a commit
that referenced
this pull request
Apr 20, 2026
- Remove unused getForwardList() call from getActiveSandboxSessions — only pgrep/ps is needed for SSH session detection (warning #1) - Consolidate double-prompt in sandboxDestroy into single enriched confirmation prompt (warning #2) - Remove noisy cleanupGatewayAfterLastSandbox forward check that would always fire due to dashboard forward (warning #3) - Use word-boundary regex in parseSshProcesses to prevent false positives when sandbox names share prefixes (warning #4) - Export SessionClassification as named interface (suggestion #1) - Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS compatibility (suggestion #2) - Add forwardCount to SessionClassification for future consumers - Add tests for word-boundary matching edge cases
This was referenced Apr 23, 2026
latenighthackathon
referenced
this pull request
in latenighthackathon/NemoClaw
Apr 24, 2026
Parallels the bot-token check: adds appTokenFormat / appTokenFormatHint to ChannelDef, pins Slack's app token to ^xapp-[A-Za-z0-9_-]+$, and wires the check into setupMessagingChannels so a bogus xapp- prompt drops the channel from the enabled set and skips saveCredential for SLACK_APP_TOKEN. Note: if the bot token passes but the app token fails, the bot token stays persisted (it was already saved before the app-token prompt). That's acceptable — on a retry onboard the bot token lights up as "already configured" and only the app-token prompt fires again. Extends the existing regex test with xapp- valid/invalid cases and adds a second interactive integration test covering the bot-valid + app-invalid path. Per @jyaunches suggestion #2 on NVIDIA#2130. Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
6 tasks
latenighthackathon
referenced
this pull request
in latenighthackathon/NemoClaw
Apr 24, 2026
Parallels the bot-token check: adds appTokenFormat / appTokenFormatHint to ChannelDef, pins Slack's app token to ^xapp-[A-Za-z0-9_-]+$, and wires the check into setupMessagingChannels so a bogus xapp- prompt drops the channel from the enabled set and skips saveCredential for SLACK_APP_TOKEN. Note: if the bot token passes but the app token fails, the bot token stays persisted (it was already saved before the app-token prompt). That's acceptable — on a retry onboard the bot token lights up as "already configured" and only the app-token prompt fires again. Extends the existing regex test with xapp- valid/invalid cases and adds a second interactive integration test covering the bot-valid + app-invalid path. Per @jyaunches suggestion #2 on NVIDIA#2130. Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
6 tasks
This was referenced Apr 27, 2026
12 tasks
12 tasks
jyaunches
added a commit
to jyaunches/NemoClaw
that referenced
this pull request
May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.
cjagwani
pushed a commit
that referenced
this pull request
May 14, 2026
Resolve the two output threads in #3456 left after the core dead-loop fix landed via #3459 + #3434: Sub-bug #3 — `src/lib/onboard.ts` printed `nemoclaw <name> destroy --yes && nemoclaw onboard --gpu` with a literal `<name>` placeholder, and assumed at least one sandbox was registered. When the GPU-passthrough mismatch hit on the State B re-run path with an empty registry (the dead-loop case), the hint was not actionable. Replace with a registry-aware helper at `src/lib/onboard/gpu-recovery.ts` that renders the right shape: - empty registry → suggest `nemoclaw uninstall && nemoclaw onboard --gpu` - one sandbox → suggest destroy --yes --cleanup-gateway for that name - multiple sandboxes → list each, only the last gets --cleanup-gateway Sub-bug #4 — `src/lib/actions/uninstall/run-plan.ts` printed `Destroyed gateway 'nemoclaw' skipped` when the openshell destroy no-op'd (gateway already gone) — the "Destroyed … skipped" wording was self-contradictory. Extend `runOptional` with an `onSkip` option; route the gateway destroy to emit `Gateway 'nemoclaw' already removed or unreachable` on no-op. Tests: - `src/lib/onboard/gpu-recovery.test.ts` (6 tests): forbid literal `<name>` placeholder anywhere in the output; cover empty / single / multi-sandbox cases; defensive filter on whitespace names so a `nemoclaw destroy` rendering can never happen. - `src/lib/actions/uninstall/run-plan.test.ts`: assert the new "already removed or unreachable" wording and the absence of the "Destroyed gateway 'nemoclaw' skipped" string. The core dead loop itself (sub-bugs #1, #2 and State B GPU mismatch) is already addressed by #3459 + #3434 + #3483; #3456 will close once this lands. See the #3456 status comment for the full mapping. Refs #3456. Mirrors (and tightens) the approach in the closed PR #3464, which left the literal `<name>` placeholder in tests per CodeRabbit feedback that was never addressed. Signed-off-by: Charan Jagwani <charjags100@gmail.com>
cv
added a commit
that referenced
this pull request
May 14, 2026
…3520) > **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will mark ready once batch self-review + CI complete. ## Summary Closes the two remaining output threads in #3456 after the core dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full sub-bug mapping in the [#3456 status comment](#3456 (comment)). - **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint replaced with a registry-aware helper. - **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped` self-contradictory wording replaced with `Gateway 'nemoclaw' already removed or unreachable`. ## Acceptance criteria mapping | Sub-bug | Resolution | Evidence | |---|---|---| | #1 dead loop | Already fixed on main (#3459) | out of scope | | #2 firewall diagnostic | Already fixed on main (#3459) | out of scope | | **#3** literal `<name>` placeholder | **This PR** | `src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` | | **#4** misleading "skipped" wording | **This PR** | `src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` | | #5 uninstall residuals | Already fixed on main (#3483) | out of scope | ## Behavior matrix `gpuPassthroughRecoveryLines(names)`: | Input | Suggestion | |---|---| | `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` | | one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway && nemoclaw onboard --gpu` | | many sandboxes | each `destroy --yes`, only the last gets `--cleanup-gateway` | ## Test plan ``` npm run typecheck:cli npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts ``` 22 tests pass (6 new + 16 existing). ## Notes for reviewers - This is the work [#3464 attempted](#3464); that PR was closed without merging after CodeRabbit asked for the `<name>` placeholder to be forbidden in tests via negative assertion. This PR adopts that refinement. - `runOptional` extension is backwards-compatible — existing callers without `onSkip` get the original wording. Closes #3456 once merged. --------- Signed-off-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Charan Jagwani <charjags100@gmail.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com>
3 tasks
This was referenced May 21, 2026
ericksoa
added a commit
that referenced
this pull request
May 22, 2026
…#4020) ## Summary Adds `classifyGatewayFailure` and wires it into `showSandboxStatus`'s final fallback branch so `nemoclaw <name> status` prints a clearly-named failure layer header before the existing actionable hints. Closes the UX gap split out of #2666 / #3270. ## Related Issue Fixes #3271. Supersedes #3309 (kagura-agent), which implemented the same feature but missed the `docker ps -a` existence check that AC #2 explicitly requires (CodeRabbit major finding on that PR). ## Changes - `src/lib/actions/sandbox/gateway-failure-classifier.ts`: new module exposing `classifyGatewayFailure(sandboxName, { runners? })` with injectable runners (`dockerInfo`, `dockerIsRunning`, `dockerExists`, `portProbe`) plus `getLayerHeader(layer)`. - Layers: `docker_unreachable`, `container_missing` (new, distinct from `container_exited` per AC #2), `container_exited_port_conflict`, `container_exited`, `gateway_unreachable`. - Default runners go through `src/lib/adapters/docker` (`dockerInfo`, `dockerCapture`) to satisfy the docker-abstraction guard. - `src/lib/actions/sandbox/status.ts`: calls the classifier and prints the layer header before `printGatewayLifecycleHint` in the final fallback branch. ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [x] Unit tests in isolation: `npx vitest run test/gateway-failure-classifier.test.ts` → 8/8 pass (per-layer, including `container_missing` and short-circuit behavior). - [x] Subprocess test in isolation: `npx vitest run test/repro-2666-silent-list-status.test.ts` → 7/7 pass, including the new "`nemoclaw <name> status` prints the `container_exited_port_conflict` layer header (#3271)" test which spawns the real CLI against a fake docker stack + a real TCP listener holding the gateway port. - [x] `test/docker-abstraction-guard.test.ts` passes — no direct `execSync("docker …")` outside `src/lib/adapters/docker`. - [x] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes (status output is a UX polish, not a contract change) - [ ] `make docs` builds without warnings (no doc changes)⚠️ Committed with `--no-verify` (user-authorized): the pre-commit `Test (CLI)` hook (full vitest with v8 coverage) hits unrelated timeout flakes on this macOS workstation (Defender + Spotlight + iMessage indexer contention). The new tests in this PR pass cleanly in isolation. CI on Linux runners is the authoritative gate. ## Definition of Done (from #3271) - [x] `status` prints a clearly-named layer header in each classified state (5 layers, expanded from the original 4 to split `container_missing` from `container_exited`). - [x] Classifier has unit tests per layer. - [x] Repro subprocess test extended to assert the named layer for the container-stopped + foreign-port-holder scenario. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added smarter gateway failure diagnostics that identify unreachable Docker, missing or exited gateway containers, and port conflicts; includes clear failure headers. * **Bug Fixes** * Status command now shows the appropriate failure header before guidance and exits with a non-zero status when verification fails. * **Tests** * Added unit and end-to-end tests covering diagnostics, header ordering, and port-conflict scenarios. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4020?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Charan Jagwani <cjagwani@nvidia.com> Signed-off-by: Aaron Erickson <aerickson@nvidia.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
jyaunches
added a commit
that referenced
this pull request
May 28, 2026
…re pending steps Addresses PR Review Advisor finding #2: 'Required security probes and expected-failure checks still skip without failing live runs'. Same architectural failure mode as findings #5 (E2E_DRY_RUN) and #6 (phase-1-skeleton): the typed framework declared a contract, but a code path silently produced non-failing skips that contradicted this PR's 'one mode, no fake green' invariant. Three suites (security-shields, security-policy, security-injection) were wired as kind: 'probe' steps; the orchestrator marked them status: 'skipped' because the probe registry hasn't landed yet. The expected-failure side-effect validator (runtime.expected-failure.no- side-effects) was wired as kind: 'pending'. run.ts only sets process.exitCode = 1 on status === 'failed', so a live run could omit security shields, network policy, prompt-injection blocking, AND the negative-scenario forbidden-side-effect contract while still appearing non-failing. Fix follows the framework spirit: declare requirement in the typed metadata, enforce it once at the orchestrator boundary. Typed contract extension (types.ts): AssertionStep.required?: boolean When true, a probe/pending step that resolves as 'skipped' is reclassified as 'failed' by the phase orchestrator. Defaults to false (existing behavior preserved for diagnostics, docs-validation, and other non-security probes). Orchestrator (phase.ts): executeStep: probe and pending branches now check step.required and return status: 'failed' with a 'required <kind> not <state>' message when set. Non-required probes/pending steps continue to surface as visible skipped results so unrelated gaps stay diagnostic. Registry (assertions/registry.ts): - probeStep / pendingStep helpers grew an options arg with required?: boolean (matches the secretEnv pattern: options come in via typed objects, not positional args). - security-shields, security-policy, security-injection probes annotated required: true. These are the suites the run is not safe without; failing closed beats fake green. - runtime.expected-failure.no-side-effects pending step annotated required: true. Negative scenarios cannot honestly pass while their core side-effect contract is unimplemented. - diagnostics, docs-validation probes intentionally remain non-required: they are informational, not safety-critical, and will switch to required when the probe registry lands and they have real implementations. run.ts: no change needed. The new failed-status flows through the existing exit-code path (process.exitCode = 1 when any phase status is failed), so the workflow correctly reports red for missing required probes/pending steps. Tests (e2e-phase-orchestrators.test.ts, +5): - test_required_probe_step_that_is_unregistered_fails_the_phase - test_non_required_probe_step_continues_to_skip_visibly - test_required_pending_step_fails_closed - test_security_suite_groups_in_registry_mark_their_steps_as_required - test_expected_failure_no_side_effects_step_in_registry_is_required 318/318 e2e-scenario framework tests pass. Until the probe registry follow-up PR registers actual implementations for shieldsConfigProbe / networkPolicyProbe / injectionBlockedProbe / expectedFailureNoSideEffectsProbe, scenarios that include those suites/expected-failure groups will fail loudly with 'required probe not registered' or 'required pending step not implemented'. That is the correct intermediate state: visible red until real, instead of green-by-default. Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
4 tasks
12 tasks
12 tasks
This was referenced Jun 10, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Migrated from NVIDIA/openshell-openclaw-plugin#18 by @nv-kasikritc