Skip to content

feature: custom settings for using build endpoints#2

Closed
ericksoa wants to merge 8 commits into
mainfrom
feat/custom-settings-build-endpoint
Closed

feature: custom settings for using build endpoints#2
ericksoa wants to merge 8 commits into
mainfrom
feat/custom-settings-build-endpoint

Conversation

@ericksoa

Copy link
Copy Markdown
Contributor

Migrated from NVIDIA/openshell-openclaw-plugin#18 by @nv-kasikritc

@cv

cv commented Mar 21, 2026

Copy link
Copy Markdown
Collaborator

Hey @ericksoa — just circling back on this one. I know you've got a few PRs open here (thanks for all the contributions, by the way!). Things have moved pretty quickly in the repo since this was first submitted — new CI pipeline, additional features, etc. Could you rebase this branch against the latest main so we can evaluate it cleanly? Would love to take a fresh look at the custom build endpoint settings. Thanks!

@ericksoa

Copy link
Copy Markdown
Contributor Author

Superseded — inference provider configuration is now handled by the blueprint system in onboard.

@ericksoa ericksoa closed this Mar 22, 2026
jyaunches pushed a commit to jyaunches/NemoClaw that referenced this pull request Apr 14, 2026
- Guard runArgv/runArgvCapture against shell:true to prevent security
  bypass (finding #1) — throws if a caller attempts to re-enable shell
  interpretation. Added 2 tests.
- Document the intentional bash -c exception in getOllamaWarmupCommand
  explaining why it's safe (finding NVIDIA#2).
- Remove dead getOpenshellCommand() from policies.ts (finding NVIDIA#3).
- Remove unused shellQuote import from nim.ts (finding NVIDIA#4).
- Fix brittle indexOf assertion in onboard-readiness test (finding NVIDIA#5).
prekshivyas added a commit to ColinM-sys/NemoClaw that referenced this pull request Apr 16, 2026
…exercised

CodeRabbit correctly flagged that swapping on stat NVIDIA#1 caused
readFileSync to see the live PID and exit via isProcessAlive —
unlinkIfInodeMatches was never called. Move the swap to just before
stat NVIDIA#2 (inside unlinkIfInodeMatches): stat NVIDIA#1 reads the original
stale inode, readFileSync sees the dead PID, isProcessAlive returns
false, stale-cleanup runs, and stat NVIDIA#2 sees the new inode and skips
the unlink.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
prekshivyas added a commit to ColinM-sys/NemoClaw that referenced this pull request Apr 16, 2026
…exercised

CodeRabbit correctly flagged that swapping on stat NVIDIA#1 caused
readFileSync to see the live PID and exit via isProcessAlive —
unlinkIfInodeMatches was never called. Move the swap to just before
stat NVIDIA#2 (inside unlinkIfInodeMatches): stat NVIDIA#1 reads the original
stale inode, readFileSync sees the dead PID, isProcessAlive returns
false, stale-cleanup runs, and stat NVIDIA#2 sees the new inode and skips
the unlink.

Use write-to-temp + rename instead of unlink + recreate to guarantee
a different inode even on tmpfs/overlayfs which can reuse inodes.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
jyaunches pushed a commit that referenced this pull request Apr 20, 2026
- Remove unused getForwardList() call from getActiveSandboxSessions —
  only pgrep/ps is needed for SSH session detection (warning #1)
- Consolidate double-prompt in sandboxDestroy into single enriched
  confirmation prompt (warning #2)
- Remove noisy cleanupGatewayAfterLastSandbox forward check that would
  always fire due to dashboard forward (warning #3)
- Use word-boundary regex in parseSshProcesses to prevent false positives
  when sandbox names share prefixes (warning #4)
- Export SessionClassification as named interface (suggestion #1)
- Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS
  compatibility (suggestion #2)
- Add forwardCount to SessionClassification for future consumers
- Add tests for word-boundary matching edge cases
latenighthackathon referenced this pull request in latenighthackathon/NemoClaw Apr 24, 2026
Parallels the bot-token check: adds appTokenFormat / appTokenFormatHint
to ChannelDef, pins Slack's app token to ^xapp-[A-Za-z0-9_-]+$, and
wires the check into setupMessagingChannels so a bogus xapp- prompt
drops the channel from the enabled set and skips saveCredential for
SLACK_APP_TOKEN.

Note: if the bot token passes but the app token fails, the bot token
stays persisted (it was already saved before the app-token prompt).
That's acceptable — on a retry onboard the bot token lights up as
"already configured" and only the app-token prompt fires again.

Extends the existing regex test with xapp- valid/invalid cases and
adds a second interactive integration test covering the
bot-valid + app-invalid path.

Per @jyaunches suggestion #2 on NVIDIA#2130.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
latenighthackathon referenced this pull request in latenighthackathon/NemoClaw Apr 24, 2026
Parallels the bot-token check: adds appTokenFormat / appTokenFormatHint
to ChannelDef, pins Slack's app token to ^xapp-[A-Za-z0-9_-]+$, and
wires the check into setupMessagingChannels so a bogus xapp- prompt
drops the channel from the enabled set and skips saveCredential for
SLACK_APP_TOKEN.

Note: if the bot token passes but the app token fails, the bot token
stays persisted (it was already saved before the app-token prompt).
That's acceptable — on a retry onboard the bot token lights up as
"already configured" and only the app-token prompt fires again.

Extends the existing regex test with xapp- valid/invalid cases and
adds a second interactive integration test covering the
bot-valid + app-invalid path.

Per @jyaunches suggestion #2 on NVIDIA#2130.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
jyaunches added a commit to jyaunches/NemoClaw that referenced this pull request May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method
splits, conventions + lint, and the parity-compare CI harness that
unblock the per-wave migration phases (2\u201312).

Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1):

Fixtures (test/e2e/lib/fixtures/):
- fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation)
- fake-{telegram,discord,slack}.sh: messaging stubs via shared
  _fake-http-stub.sh harness
- older-base-image.sh: tagged ghcr base-image Dockerfile generator

Helpers (test/e2e/lib/):
- logging.sh: canonical e2e_{section,info,pass,fail} with stable
  PASS:/FAIL:/=== Phase markers (absorbs reuse category #1)
- sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting,
  exit-code propagation, and dry-run short-circuit (category NVIDIA#10)
- env.sh: auto-sources logging.sh so every consumer gets it for free

Assertion helpers (test/e2e/lib/assert/):
- inference-works.sh: chat-completion round-trip
- no-credentials-leaked.sh: credential-pattern scan
- policy-preset-applied.sh: gateway policy preset verification
- messaging-bridge-reachable.sh: L7 proxy / bridge reachability

Install dispatcher splits (test/e2e/lib/setup/):
- install-{repo,curl,ollama,launchable}.sh (four profiles)
- install.sh: dispatcher routes by profile/method name (category NVIDIA#5)

Runtime probe wiring:
- run-scenario.sh: adds --validate-only flag (probe-only, no setup)
- resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys
  with embedded underscores (e.g. security.policy_engine)

Convention lint + parity harness:
- scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite
  step scripts + requires parity-map.yaml entries for legacy scripts
- scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL
  via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4)
- test/e2e/parity-map.yaml: seeded with one entry per existing legacy
  script; migration phases 2\u201312 append assertion mappings
- .github/workflows/e2e-parity-compare.yaml: dispatches legacy script
  + migrated scenario on same runner and diffs outcomes

Tests (all passing, 41 total):
- test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E)
- test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H)
- test/e2e-expected-state-validator.test.ts: +2 tests (1.F)

No regressions: full cli Vitest project (3258 tests) still green.
cjagwani pushed a commit that referenced this pull request May 14, 2026
Resolve the two output threads in #3456 left after the core dead-loop fix
landed via #3459 + #3434:

Sub-bug #3 — `src/lib/onboard.ts` printed
  `nemoclaw <name> destroy --yes && nemoclaw onboard --gpu`
with a literal `<name>` placeholder, and assumed at least one sandbox
was registered. When the GPU-passthrough mismatch hit on the State B
re-run path with an empty registry (the dead-loop case), the hint was
not actionable. Replace with a registry-aware helper at
`src/lib/onboard/gpu-recovery.ts` that renders the right shape:
  - empty registry → suggest `nemoclaw uninstall && nemoclaw onboard --gpu`
  - one sandbox → suggest destroy --yes --cleanup-gateway for that name
  - multiple sandboxes → list each, only the last gets --cleanup-gateway

Sub-bug #4 — `src/lib/actions/uninstall/run-plan.ts` printed
  `Destroyed gateway 'nemoclaw' skipped`
when the openshell destroy no-op'd (gateway already gone) — the
"Destroyed … skipped" wording was self-contradictory. Extend
`runOptional` with an `onSkip` option; route the gateway destroy to
emit `Gateway 'nemoclaw' already removed or unreachable` on no-op.

Tests:
- `src/lib/onboard/gpu-recovery.test.ts` (6 tests): forbid literal
  `<name>` placeholder anywhere in the output; cover empty / single /
  multi-sandbox cases; defensive filter on whitespace names so a
  `nemoclaw  destroy` rendering can never happen.
- `src/lib/actions/uninstall/run-plan.test.ts`: assert the new
  "already removed or unreachable" wording and the absence of the
  "Destroyed gateway 'nemoclaw' skipped" string.

The core dead loop itself (sub-bugs #1, #2 and State B GPU mismatch)
is already addressed by #3459 + #3434 + #3483; #3456 will close once
this lands. See the #3456 status comment for the full mapping.

Refs #3456. Mirrors (and tightens) the approach in the closed PR #3464,
which left the literal `<name>` placeholder in tests per CodeRabbit
feedback that was never addressed.

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
cv added a commit that referenced this pull request May 14, 2026
…3520)

> **Draft for visibility.** Issue-autopilot Stages 4-5 of #3456. Will
mark ready once batch self-review + CI complete.

## Summary

Closes the two remaining output threads in #3456 after the core
dead-loop fix already landed on `main` (via #3459, #3434, #3483). Full
sub-bug mapping in the [#3456 status
comment](#3456 (comment)).

- **Sub-bug #3** — `nemoclaw <name> destroy --yes` recovery hint
replaced with a registry-aware helper.
- **Sub-bug #4** — `Destroyed gateway 'nemoclaw' skipped`
self-contradictory wording replaced with `Gateway 'nemoclaw' already
removed or unreachable`.

## Acceptance criteria mapping

| Sub-bug | Resolution | Evidence |
|---|---|---|
| #1 dead loop | Already fixed on main (#3459) | out of scope |
| #2 firewall diagnostic | Already fixed on main (#3459) | out of scope
|
| **#3** literal `<name>` placeholder | **This PR** |
`src/lib/onboard/gpu-recovery.ts` + `onboard.ts:10387-10405` |
| **#4** misleading "skipped" wording | **This PR** |
`src/lib/actions/uninstall/run-plan.ts:210-228, 407-414` |
| #5 uninstall residuals | Already fixed on main (#3483) | out of scope
|

## Behavior matrix

`gpuPassthroughRecoveryLines(names)`:

| Input | Suggestion |
|---|---|
| `null` / `[]` | `nemoclaw uninstall && nemoclaw onboard --gpu` |
| one sandbox | `nemoclaw <name> destroy --yes --cleanup-gateway &&
nemoclaw onboard --gpu` |
| many sandboxes | each `destroy --yes`, only the last gets
`--cleanup-gateway` |

## Test plan

```
npm run typecheck:cli
npx vitest run src/lib/onboard/gpu-recovery.test.ts src/lib/actions/uninstall/run-plan.test.ts
```

22 tests pass (6 new + 16 existing).

## Notes for reviewers

- This is the work [#3464
attempted](#3464); that PR was
closed without merging after CodeRabbit asked for the `<name>`
placeholder to be forbidden in tests via negative assertion. This PR
adopts that refinement.
- `runOptional` extension is backwards-compatible — existing callers
without `onSkip` get the original wording.

Closes #3456 once merged.

---------

Signed-off-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Charan Jagwani <charjags100@gmail.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
ericksoa added a commit that referenced this pull request May 22, 2026
…#4020)

## Summary
Adds `classifyGatewayFailure` and wires it into `showSandboxStatus`'s
final fallback branch so `nemoclaw <name> status` prints a clearly-named
failure layer header before the existing actionable hints. Closes the UX
gap split out of #2666 / #3270.

## Related Issue
Fixes #3271. Supersedes #3309 (kagura-agent), which implemented the same
feature but missed the `docker ps -a` existence check that AC #2
explicitly requires (CodeRabbit major finding on that PR).

## Changes
- `src/lib/actions/sandbox/gateway-failure-classifier.ts`: new module
exposing `classifyGatewayFailure(sandboxName, { runners? })` with
injectable runners (`dockerInfo`, `dockerIsRunning`, `dockerExists`,
`portProbe`) plus `getLayerHeader(layer)`.
- Layers: `docker_unreachable`, `container_missing` (new, distinct from
`container_exited` per AC #2), `container_exited_port_conflict`,
`container_exited`, `gateway_unreachable`.
- Default runners go through `src/lib/adapters/docker` (`dockerInfo`,
`dockerCapture`) to satisfy the docker-abstraction guard.
- `src/lib/actions/sandbox/status.ts`: calls the classifier and prints
the layer header before `printGatewayLifecycleHint` in the final
fallback branch.

## Type of Change
- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] Unit tests in isolation: `npx vitest run
test/gateway-failure-classifier.test.ts` → 8/8 pass (per-layer,
including `container_missing` and short-circuit behavior).
- [x] Subprocess test in isolation: `npx vitest run
test/repro-2666-silent-list-status.test.ts` → 7/7 pass, including the
new "`nemoclaw <name> status` prints the
`container_exited_port_conflict` layer header (#3271)" test which spawns
the real CLI against a fake docker stack + a real TCP listener holding
the gateway port.
- [x] `test/docker-abstraction-guard.test.ts` passes — no direct
`execSync("docker …")` outside `src/lib/adapters/docker`.
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes (status output is a
UX polish, not a contract change)
- [ ] `make docs` builds without warnings (no doc changes)

⚠️ Committed with `--no-verify` (user-authorized): the pre-commit `Test
(CLI)` hook (full vitest with v8 coverage) hits unrelated timeout flakes
on this macOS workstation (Defender + Spotlight + iMessage indexer
contention). The new tests in this PR pass cleanly in isolation. CI on
Linux runners is the authoritative gate.

## Definition of Done (from #3271)
- [x] `status` prints a clearly-named layer header in each classified
state (5 layers, expanded from the original 4 to split
`container_missing` from `container_exited`).
- [x] Classifier has unit tests per layer.
- [x] Repro subprocess test extended to assert the named layer for the
container-stopped + foreign-port-holder scenario.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added smarter gateway failure diagnostics that identify unreachable
Docker, missing or exited gateway containers, and port conflicts;
includes clear failure headers.

* **Bug Fixes**
* Status command now shows the appropriate failure header before
guidance and exits with a non-zero status when verification fails.

* **Tests**
* Added unit and end-to-end tests covering diagnostics, header ordering,
and port-conflict scenarios.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4020?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
jyaunches added a commit that referenced this pull request May 28, 2026
…re pending steps

Addresses PR Review Advisor finding #2: 'Required security probes and
expected-failure checks still skip without failing live runs'.

Same architectural failure mode as findings #5 (E2E_DRY_RUN) and #6
(phase-1-skeleton): the typed framework declared a contract, but a
code path silently produced non-failing skips that contradicted this
PR's 'one mode, no fake green' invariant.

Three suites (security-shields, security-policy, security-injection)
were wired as kind: 'probe' steps; the orchestrator marked them
status: 'skipped' because the probe registry hasn't landed yet. The
expected-failure side-effect validator (runtime.expected-failure.no-
side-effects) was wired as kind: 'pending'. run.ts only sets
process.exitCode = 1 on status === 'failed', so a live run could
omit security shields, network policy, prompt-injection blocking,
AND the negative-scenario forbidden-side-effect contract while still
appearing non-failing.

Fix follows the framework spirit: declare requirement in the typed
metadata, enforce it once at the orchestrator boundary.

Typed contract extension (types.ts):
  AssertionStep.required?: boolean
  When true, a probe/pending step that resolves as 'skipped' is
  reclassified as 'failed' by the phase orchestrator. Defaults to
  false (existing behavior preserved for diagnostics, docs-validation,
  and other non-security probes).

Orchestrator (phase.ts):
  executeStep: probe and pending branches now check step.required
  and return status: 'failed' with a 'required <kind> not <state>'
  message when set. Non-required probes/pending steps continue to
  surface as visible skipped results so unrelated gaps stay
  diagnostic.

Registry (assertions/registry.ts):
  - probeStep / pendingStep helpers grew an options arg with
    required?: boolean (matches the secretEnv pattern: options come
    in via typed objects, not positional args).
  - security-shields, security-policy, security-injection probes
    annotated required: true. These are the suites the run is not
    safe without; failing closed beats fake green.
  - runtime.expected-failure.no-side-effects pending step annotated
    required: true. Negative scenarios cannot honestly pass while
    their core side-effect contract is unimplemented.
  - diagnostics, docs-validation probes intentionally remain
    non-required: they are informational, not safety-critical, and
    will switch to required when the probe registry lands and they
    have real implementations.

run.ts: no change needed. The new failed-status flows through the
existing exit-code path (process.exitCode = 1 when any phase status
is failed), so the workflow correctly reports red for missing
required probes/pending steps.

Tests (e2e-phase-orchestrators.test.ts, +5):
  - test_required_probe_step_that_is_unregistered_fails_the_phase
  - test_non_required_probe_step_continues_to_skip_visibly
  - test_required_pending_step_fails_closed
  - test_security_suite_groups_in_registry_mark_their_steps_as_required
  - test_expected_failure_no_side_effects_step_in_registry_is_required

318/318 e2e-scenario framework tests pass.

Until the probe registry follow-up PR registers actual
implementations for shieldsConfigProbe / networkPolicyProbe /
injectionBlockedProbe / expectedFailureNoSideEffectsProbe, scenarios
that include those suites/expected-failure groups will fail loudly
with 'required probe not registered' or 'required pending step not
implemented'. That is the correct intermediate state: visible red
until real, instead of green-by-default.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
@wscurran wscurran added feature PR adds or expands user-visible functionality and removed enhancement: feature labels Jun 3, 2026
@cv cv mentioned this pull request Jun 5, 2026
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants