fix(health): add verifyDeployment() and fix false 'Health Offline' on 401 #3128
… 401
Introduces a post-deployment verification step (verifyDeployment) that runs between ensureDashboardForward() and printDashboard() in onboard.ts. This catches broken state before telling users 'YOUR AGENT IS LIVE'. On failure, users get actionable diagnostics instead of discovering broken state later.
Core fixes:
- isSandboxGatewayRunning() now uses HTTP status code extraction instead of curl -sf. 401 (device auth enabled) is correctly treated as 'alive'.
- getHealthProbeUrl() defaults to /health endpoint (returns 200 regardless of device auth state) instead of / (returns 401 with device auth).
- recoverSandboxProcesses() inline check updated to same pattern.
- Dashboard readiness wait in createSandbox() updated to probe /health and accept both 200 and 401.
New module: src/lib/verify-deployment.ts
- DeploymentVerification interface with gateway, inference, dashboard, messaging checks and access method detection
- verifyDeployment() function with dependency injection for testability
- formatVerificationDiagnostics() for terminal output
- 14 unit tests covering all verification paths
Fixes #2342
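The status-code pattern described in the core fixes can be sketched roughly as below. This is a minimal illustration and not the PR's actual code: `parseHttpCode` and `isGatewayAlive` are hypothetical names, and the input is assumed to be whatever `curl -so /dev/null -w '%{http_code}'` printed for the probe.

```typescript
// Hypothetical helpers; the names are illustrative and not taken from the PR.

/** Parse curl's %{http_code} output. Returns 0 when nothing usable came back. */
function parseHttpCode(raw: string): number {
  const code = Number.parseInt(raw.trim(), 10);
  return Number.isNaN(code) ? 0 : code;
}

/**
 * 200 means healthy. 401 means device auth rejected the request,
 * but something answered, so the gateway process is still alive.
 */
function isGatewayAlive(raw: string): boolean {
  const code = parseHttpCode(raw);
  return code === 200 || code === 401;
}
```

With `curl -sf`, a 401 produces a non-zero exit status and is indistinguishable from a dead gateway, which is why the sketch works from the numeric code instead.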
Phase 2: Add trigger-community-image-rebuild.yaml workflow that sends a repository_dispatch event to NVIDIA/OpenShell-Community on NemoClaw release (tag push or GitHub Release published). This ensures the community sandbox image (openclaw-nvidia:latest) is rebuilt with the latest OpenClaw whenever NemoClaw ships a new version. Requires COMMUNITY_DISPATCH_TOKEN secret.
Phase 3: Change brev-launchable-ci-cpu.sh default NEMOCLAW_REF from 'main' to 'stable'. Add resolve_stable_ref() function that resolves 'stable' or 'latest' to the newest v* tag via git ls-remote, with fallback to 'main' if resolution fails. Users of the public Brev launchable now get the latest stable release instead of tracking an unstable moving target.
Fixes #1242
Refs #2342
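The tag-resolution rule in Phase 3 (newest v* tag, fall back to 'main') can be illustrated with a small sketch. The real implementation is a shell function that relies on git's own version sort; the TypeScript below is only an assumed equivalent that works on captured `git ls-remote --tags` output, and `resolveStableRef`, `parseVersion` and `compareVersions` are hypothetical names.

```typescript
// Hypothetical sketch: pick the newest v* tag from `git ls-remote --tags` output.
// Each line is assumed to look like "<sha>\trefs/tags/<name>"; annotated tags
// additionally emit a peeled "<name>^{}" entry, which is skipped here.

/** "v0.0.36" -> [0, 0, 36]; anything else (including pre-release suffixes) -> null. */
function parseVersion(tag: string): number[] | null {
  const m = /^v(\d+(?:\.\d+)*)$/.exec(tag);
  return m ? m[1].split(".").map(Number) : null;
}

/** Numeric, component-wise comparison; missing components count as 0. */
function compareVersions(a: number[], b: number[]): number {
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

function resolveStableRef(lsRemoteOutput: string, fallback = "main"): string {
  let best: { tag: string; version: number[] } | null = null;
  for (const line of lsRemoteOutput.split("\n")) {
    const ref = line.trim().split(/\s+/)[1] ?? "";
    if (!ref.startsWith("refs/tags/") || ref.endsWith("^{}")) continue;
    const tag = ref.slice("refs/tags/".length);
    const version = parseVersion(tag);
    if (version === null) continue;
    if (best === null || compareVersions(version, best.version) > 0) {
      best = { tag, version };
    }
  }
  // No usable tag (empty output, network failure upstream): use the fallback.
  return best !== null ? best.tag : fallback;
}
```

Comparing components numerically avoids the lexical trap where v0.0.9 would sort above v0.0.36.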
Adds a new nightly E2E job that validates health probes work correctly when device auth is enabled (the default). Catches the false 'Health Offline' regression where curl -sf treated HTTP 401 as dead.
Phases:
1. Install & onboard with device auth ON
2. Probe /health (expect 200) and / (expect 401)
3. nemoclaw status must NOT report Offline
4. Host port forward liveness check
5. Gateway restart + recovery with new HTTP code pattern
6. Verify deployment diagnostics in onboard log
Triggerable via selective nightly dispatch:
gh workflow run nightly-e2e.yaml --ref <branch> -f jobs=device-auth-health-e2e
The first run timed out at 15 minutes during Docker image build (cold cache on ubuntu-latest). Bump workflow timeout to 30m and script internal timeout to 1200s to match other sandbox E2E jobs.
- Switch sandbox_exec to SSH (matches established E2E pattern in test-hermes-e2e.sh, test-sandbox-operations.sh); openshell sandbox exec returned 000 in CI
- Increase workflow timeout to 30m, script timeout to 1200s (cold Docker image build takes ~15m on ubuntu-latest)
- Make Phase 5 gateway recovery non-fatal (process supervisor may not be active in all environments)
- Phase 3 (core regression) already passes: status correctly shows Running, not Offline
Switch all 33 nightly E2E jobs from ubuntu-latest (GitHub-hosted, 2 vCPU) to linux-amd64-cpu4 (NVIDIA self-hosted, 4 vCPU). Meta jobs (notify-on-failure, report-to-pr, scorecard) stay on ubuntu-latest since they only make API calls. Motivation: full sandbox onboard E2E tests spend most of their time on Docker image builds. The NVIDIA runners have more CPU and should reduce per-job runtime significantly. The pr-self-hosted workflow already uses these runners successfully for image builds on every PR.
📝 Walkthrough
Adds a post-deployment verification module and integrates it into onboarding to run sandbox-exec health probes (treating HTTP 200 or 401 as alive). Updates health-probe semantics to prefer
Changes
Post-Deployment Verification
Test & CI / E2E
Sequence Diagram(s)
sequenceDiagram
participant Onboard as Onboard CLI
participant Sandbox as Sandbox (container)
participant Gateway as OpenClaw Gateway
participant Host as Host (dashboard / browser)
participant Messaging as Messaging/Provider
Onboard->>Sandbox: sandbox-exec curl http://127.0.0.1:{{DASHBOARD_PORT}}/health
Sandbox->>Gateway: internal gateway health endpoint
Gateway-->>Sandbox: HTTP 200 or 401 (alive)
Sandbox-->>Onboard: health code
Onboard->>Host: probe dashboard forward (host port)
Host-->>Onboard: HTTP response (200/401/502/000)
Onboard->>Gateway: probe inference route & fetch version
Gateway-->>Onboard: route response or error
Onboard->>Messaging: probe messaging bridges
Messaging-->>Onboard: bridge status
Onboard-->>Onboard: aggregate diagnostics -> format output
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
E2E Validation Status
Job:
Run 1 (25443434598) ❌ Timeout
Run 2 (25444564545) ❌ Partial pass
Phase 2 root cause: Test used
Run 3 (25445645261) 🔄 In progress
Fixes applied:
The fix itself is confirmed working —
Selective E2E Results: ❌ Some jobs failed
Run: 25445818979
install.sh runs the full onboard in non-interactive mode, so the script no longer calls nemoclaw onboard separately. After install, sources ~/.bashrc and adds ~/.local/bin to PATH (matching test-full-e2e.sh pattern). Also detects actual dashboard port from forward list in case the default was taken.
Selective E2E Results: ❌ Some jobs failed
Run: 25446693015
…hable-version-pin-and-gateway-token
Selective E2E Results: ❌ Some jobs failed
Run: 25447435353
The NVIDIA self-hosted runners may not have SSH ready immediately after install. Phase 2 (sandbox exec probes) now skips gracefully when SSH returns empty, since Phase 4 (host-side port forward probe) already validates the same /health endpoint from the host side. Also adds retry logic to SSH config setup (5 attempts with 3s backoff).
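The retry-then-skip behaviour described above might look like the following sketch. The actual logic lives in a shell test script; this TypeScript version only illustrates the control flow. `retryUntilNonEmpty` is a hypothetical name, the delay is assumed to be a fixed 3 seconds between attempts, and `sleep` is injected so the sketch stays synchronous and testable.

```typescript
// Hypothetical sketch of "N attempts with a fixed delay, then give up quietly".

function retryUntilNonEmpty(
  attempt: () => string,          // e.g. a wrapper that captures SSH config output
  maxAttempts: number,            // the commit message mentions 5
  delayMs: number,                // the commit message mentions 3s
  sleep: (ms: number) => void,    // injected; a real script would block here
): string {
  for (let i = 1; i <= maxAttempts; i++) {
    const output = attempt().trim();
    if (output !== "") return output;
    // No point sleeping after the final attempt.
    if (i < maxAttempts) sleep(delayMs);
  }
  // Empty string signals "not ready"; the caller can skip the phase
  // instead of failing the whole run.
  return "";
}
```

Returning an empty string rather than throwing matches the stated intent: Phase 2 is skipped when SSH is not ready, because Phase 4 covers the same endpoint from the host side.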
Selective E2E Results — ✅ All requested jobs passedRun: 25448326318
|
The onboard source-shape test enforces that all sandbox exec calls use the '-n sandboxName --' format. Update the dashboard readiness probe to match, and fix the remaining test mocks that still used the old curl -sf pattern.
Previous formatting used wrong shfmt flags. CI uses -i 2 -ci -bn which indents case statement bodies under the pattern.
The connect --probe-only tests used mock openshell scripts that matched 'curl -sf' in the sandbox exec command. Our health fix changed to 'curl -so /dev/null -w' for HTTP status code extraction. Update all 7 mock pattern matches accordingly.
cv left a comment
Automated PR review summary
Reviewed PR #3128: fix(health): add verifyDeployment() and fix false 'Health Offline' on 401
Recommendation
- Recommendation: Requires Changes
- Highest observed severity: medium
- Block merge: yes
- Why: The highest-value new behavior in this PR is post-deployment verification before telling users the agent is live. In the installed runtime, the verification chain's version probe appears incompatible: the gateway is healthy, host forwarding works, and inference routing responds, but the probe path used for version retrieval does not yield a usable version. That means the new verification feature can produce misleading diagnostics on healthy deployments, undermining the PR's claim of robust end-to-end confirmation.
- Reviewer summary: Reviewed PR #3128 against the installed NemoClaw/OpenShell environment using real sandbox probes. The core health-path change looks directionally correct, but the new verifyDeployment logic has a runtime mismatch: its gateway version check does not align with the installed OpenClaw behavior and can report degraded verification on a healthy deployment.
Installation and setup findings
- The local-source installer path succeeded through CLI install, OpenShell gateway setup, provider configuration, and sandbox creation. Post-timeout verification confirmed the NemoClaw-managed sandbox was Ready, accepted SSH commands, and could run an in-sandbox OpenClaw model query. The only caveat is the original onboarding process timed out while still in the policies step.
What was validated
- The PR revision was checked out in an isolated review environment.
- The local checkout was installed using the repository installer flow as closely as the environment allowed.
- Adversarial, PR-specific probes were then run against the installed environment and relevant repository context.
- Diff summary:
.github/workflows/nightly-e2e.yaml | 160 ++---
.../workflows/trigger-community-image-rebuild.yaml | 65 ++
Dockerfile | 7 +-
.../kimi-inference-compat/index.js | 226 ------
.../kimi-inference-compat/openclaw.plugin.json | 12 -
scripts/brev-launchable-ci-cpu.sh | 31 +-
scripts/generate-openclaw-config.py | 37 +-
src/lib/agent-runtime.ts | 10 +-
src/lib/inference-config.test.ts | 1 -
src/lib/inference-config.ts | 1 -
src/lib/model-prompts.test.ts | 2 +-
src/lib/onboard-inference-probes.test.ts | 33 -
src/lib/onboard-inference-probes.ts | 27 +-
src/lib/onboard-providers.ts | 17 -
src/lib/onboard.ts | 82 ++-
src/lib/sandbox-build-context.ts | 7 -
src/lib/sandbox-process-recovery-action.ts | 11 +-
src/lib/verify-deployment.test.ts | 183 +++++
src/lib/verify-deployment.ts | 308 +++++++++
test/e2e/test-device-auth-he
...[truncated]
Failing tests and unresolved impact
Failing test 1: verifyDeployment full-chain assumptions vs real sandbox
- What was tested: The new verifyDeployment logic matches the installed runtime so a healthy deployment yields healthy gateway, dashboard, inference, and version checks before success output.
- Why it matters: If false, onboard can print misleading verification diagnostics or classify a healthy deployment as partially broken.
- Observed result: Real sandbox/host probes showed `/health`=200, host forward on 18789=200, and `inference.local` responded 403 (routing alive). But the expected gateway version retrieval path did not produce a usable version signal in the installed runtime, indicating the new verification chain is partly mismatched to reality.
- Command: ssh ... 'curl /health /version inference.local'; openshell forward list; curl 127.0.0.1:18789/health
- Recommended follow-up coverage: Add a real-runtime integration/regression test for verifyDeployment against an installed OpenClaw image to ensure version detection uses a stable machine-readable signal or is treated as optional when unavailable.
Passing tests and why they mattered
Passing test 1: Real sandbox device-auth health endpoint behavior
- What was tested: The deployed gateway health path is live in the real sandbox and the PR’s switch away from `/` to `/health` is valid for readiness checks.
- Why it mattered: If false, status or recovery can still report false Offline or probe the wrong endpoint under device-auth deployments.
- Observed result: Inside the real sandbox, the gateway listened on :18789 and both `/health` and `/` returned HTTP 200 in this installed image. This confirms `/health` is a safe readiness target here, but this environment did not expose a live 401-on-root case.
- Command: ssh -F /tmp/sshcfg ... openshell-nemoclaw-local-install 'curl status for :18789 / and /health'
- Recommended follow-up coverage: Add or retain an end-to-end regression test with device auth explicitly enabled so `/` returns 401 while `/health` is still accepted as alive.
Passing test 2: Verification transport mismatch check (sandbox exec vs SSH)
- What was tested: The onboard verification path’s use of `openshell sandbox exec` is reliable enough for the new post-deployment checks in the reviewed environment.
- Why it mattered: If false, verifyDeployment can fail spuriously even when the sandbox is healthy, repeating the CI issue described in the PR comments.
- Observed result: Both `openshell sandbox exec` and SSH returned HTTP 200 for the same real sandbox health probe in this environment, so I could not reproduce the transport failure locally.
- Command: openshell sandbox exec -n nemoclaw-local-install -- sh -c 'curl ... /health'; ssh ... 'curl ... /health'
- Recommended follow-up coverage: Keep a nightly E2E/regression test covering both transport paths or prefer the transport proven most reliable across CI and launchable environments, since the issue appears environment-sensitive.
Bottom line
- Based on the install evidence and adversarial probes, this PR should not be approved as-is.
Address review feedback from @cv: the openclaw --version probe may not produce output on all OpenClaw installations. Instead of emitting a warning diagnostic when version is unavailable, simply skip it. The version is informational only — healthy status depends solely on gateway reachable + dashboard reachable.
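A rough sketch of the behaviour this commit describes: the version is reported only when present, and the healthy flag ignores it. The shapes below are simplified assumptions; `Diagnostic` and `summarizeVerification` are hypothetical and do not mirror the real interfaces in verify-deployment.ts.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Diagnostic {
  link: string;
  status: "ok" | "fail";
  detail: string;
}

function summarizeVerification(
  gatewayReachable: boolean,
  dashboardReachable: boolean,
  version: string | null, // null or blank when the version probe printed nothing
): { healthy: boolean; diagnostics: Diagnostic[] } {
  const diagnostics: Diagnostic[] = [
    { link: "gateway", status: gatewayReachable ? "ok" : "fail", detail: "in-sandbox health probe" },
    { link: "dashboard", status: dashboardReachable ? "ok" : "fail", detail: "host port forward probe" },
  ];

  // Version is informational: report it when present, stay silent otherwise.
  const trimmed = version === null ? "" : version.trim();
  if (trimmed !== "") {
    diagnostics.push({ link: "version", status: "ok", detail: trimmed });
  }

  // Healthy depends only on gateway and dashboard reachability.
  return { healthy: gatewayReachable && dashboardReachable, diagnostics };
}
```

The point of the design is that a missing version string never produces a warning or flips the health result.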
Actionable comments posted: 7
🧹 Nitpick comments (3)
scripts/brev-launchable-ci-cpu.sh (2)
83-99: ⚡ Quick win
Consider filtering out annotated tag dereference entries.
Annotated tags appear twice in `git ls-remote` output: once as `refs/tags/v1.0.0` and once as `refs/tags/v1.0.0^{}` (the dereferenced commit). The `--sort=-v:refname` option may not handle this predictably, and `^{}` entries could end up first.
♻️ Proposed fix to filter out ^{} entries
 tag=$(git ls-remote --tags --sort=-v:refname \
   "https://github.com/NVIDIA/NemoClaw.git" 'refs/tags/v*' 2>/dev/null \
-  | head -1 | sed 's|.*refs/tags/||')
+  | grep -v '\^{}' | head -1 | sed 's|.*refs/tags/||')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/brev-launchable-ci-cpu.sh` around lines 83 - 99, The git tag resolution in resolve_stable_ref can pick up annotated-tag dereference lines (refs/tags/vX...^{}) which may sort first; update the git ls-remote pipeline in resolve_stable_ref so it filters out any lines containing ^{} before selecting the top tag (e.g., pipe through a grep -v or similar), then proceed to set NEMOCLAW_REF and log as before using the tag variable.
264-266: 💤 Low value
git pull --ff-only will fail when NEMOCLAW_REF is a tag.
After resolving to a tag (e.g., `v0.0.36`), `git checkout "$NEMOCLAW_REF"` puts the repo in detached HEAD state, and `git pull --ff-only origin "$NEMOCLAW_REF"` will fail because you cannot pull into a detached HEAD.
The `|| true` at line 266 silently swallows this failure, which is acceptable for idempotency, but the error message may confuse users checking logs.
♻️ Optional: Skip pull when on a tag
 git -C "$NEMOCLAW_CLONE_DIR" fetch origin "$NEMOCLAW_REF"
 git -C "$NEMOCLAW_CLONE_DIR" checkout "$NEMOCLAW_REF"
- git -C "$NEMOCLAW_CLONE_DIR" pull --ff-only origin "$NEMOCLAW_REF" || true
+ # Only pull if on a branch (tags are immutable, pull would fail)
+ if git -C "$NEMOCLAW_CLONE_DIR" symbolic-ref -q HEAD >/dev/null 2>&1; then
+   git -C "$NEMOCLAW_CLONE_DIR" pull --ff-only origin "$NEMOCLAW_REF" || true
+ fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/brev-launchable-ci-cpu.sh` around lines 264 - 266, The git pull --ff-only step can fail when NEMOCLAW_REF is a tag because git checkout "$NEMOCLAW_REF" puts the repo in detached HEAD and pull into detached HEAD errors (currently swallowed by || true). Modify the script around the git checkout/pull sequence to detect detached HEAD (or detect that NEMOCLAW_REF resolves to a tag) and skip the git -C "$NEMOCLAW_CLONE_DIR" pull --ff-only origin "$NEMOCLAW_REF" step in that case; alternatively, create a local branch from the ref before pulling (e.g., git -C "$NEMOCLAW_CLONE_DIR" checkout -B temp-sync "$NEMOCLAW_REF") so pull can succeed. Ensure the logic references NEMOCLAW_REF, git -C "$NEMOCLAW_CLONE_DIR" checkout, and git -C "$NEMOCLAW_CLONE_DIR" pull --ff-only.
test/onboard.test.ts (1)
3422-3422: ⚡ Quick win
Cover the 401-is-healthy case in these probe stubs.
These updates lock the tests to `/health`, but every stub still returns only `"200"`. The regression this PR fixes is specifically that device-auth health probes can return `401` and should still be treated as alive, so these flows would still pass if the code regressed back to `200`-only handling. Please run at least one representative path with `"401"` as the stubbed status, or parameterize these cases over `["200", "401"]`.
Also applies to: 3559-3559, 3655-3655, 4018-4018, 5411-5411, 5806-5806, 5938-5938, 6555-6555
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/onboard.test.ts` at line 3422, The probe stub that matches _n(command).includes("sandbox exec") && _n(command).includes("http://localhost:18789/health") currently always returns "200" which misses the regression where a healthy device-auth probe may return "401"; update the test stub(s) around the `_n(command)` check (and the other identical occurrences) to either parameterize the returned status over ["200","401"] or add at least one representative test case/run that returns "401" so the code path treating 401 as healthy is exercised; ensure you modify the same conditional that matches "sandbox exec" and the /health URL so those tests cover both statuses.
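The parameterization suggested in this comment could take roughly the shape below. It is a framework-free sketch under stated assumptions: `gatewayLooksAlive` stands in for the real probe logic, `makeProbeStub` stands in for the test's command stub, and the exact command string is made up for illustration (only the `sandbox exec`, `-n sandboxName --` and `/health` fragments come from the discussion).

```typescript
// Hypothetical stand-ins for the code under test and its stub.
type RunCapture = (command: string) => string;

/** Simplified probe: alive when the stubbed status is 200 or 401. */
function gatewayLooksAlive(run: RunCapture, sandboxName: string): boolean {
  const code = run(
    "openshell sandbox exec -n " + sandboxName +
      " -- curl -so /dev/null -w %{http_code} http://localhost:18789/health",
  ).trim();
  return code === "200" || code === "401";
}

/** Stub that answers only the health probe, with a configurable status. */
function makeProbeStub(status: string): RunCapture {
  return (command) =>
    command.includes("sandbox exec") && command.includes("http://localhost:18789/health")
      ? status
      : "";
}

// One table drives every case, so a regression to 200-only handling
// fails the "401" row instead of slipping through.
const cases: Array<[string, boolean]> = [
  ["200", true],
  ["401", true],
  ["502", false],
];
const failures = cases.filter(
  ([status, expected]) => gatewayLooksAlive(makeProbeStub(status), "demo") !== expected,
);
```

In a real test file the same table would typically feed the test runner's parameterized-test helper; the plain loop here keeps the sketch independent of any framework.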
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 1422-1456: The new GitHub Actions job device-auth-health-e2e adds
the E2E test script test/e2e/test-device-auth-health.sh but there is no
corresponding path_instructions mapping in .coderabbit.yaml, which breaks
cross-repo E2E coverage validation; update .coderabbit.yaml to add a
path_instructions entry that maps the test file path
test/e2e/test-device-auth-health.sh (or a glob covering it) to the
device-auth-health-e2e job name so test/validate-e2e-coverage.test.ts will
detect and include this job in the coverage checks.
In @.github/workflows/trigger-community-image-rebuild.yaml:
- Around line 30-45: The workflow currently triggers on both release.published
and push tags (v*), which can cause duplicate dispatches; remove the push: tags
trigger and rely solely on the release: types: [published] event to make
official releases the single source of truth, and keep the existing if:
github.repository == 'NVIDIA/NemoClaw' guard; update the top-level on: section
to only include release: { types: [published] } and delete the push: tags block
so only published releases dispatch cross-repo rebuilds.
In `@src/lib/agent-runtime.ts`:
- Around line 42-52: The recovery scripts still probe `/` and rely on curl's
exit status, which treats a device-auth 401 as failure; update
buildOpenClawRecoveryScript and buildRecoveryScript to use
getHealthProbeUrl(agent) (instead of hardcoded `/`) and change the probe logic
to explicitly check for HTTP 200 (e.g., use curl to return the response code and
assert it equals 200) rather than using curl -sf exit semantics so a 401 won't
be treated as "down". Ensure both functions reference the shared
getHealthProbeUrl helper when building their curl checks.
In `@src/lib/onboard.ts`:
- Around line 9506-9539: The post-deployment verification currently runs after
session finalization, so a failed verification still marks onboarding complete;
change control flow so completeSession() and any legacy-credential cleanup and
the final dashboard success output are executed only when
verificationResult.healthy is true, and if verificationResult.healthy is false
log the diagnostics (using verifyDeploymentModule.formatVerificationDiagnostics)
and exit non‑zero (e.g., process.exit(1)) to prevent finalizing the onboarding;
locate and move the call to completeSession() (and related cleanup/final success
printing) to after the check of verificationResult.healthy and ensure
verifyDeploymentModule.verifyDeployment(...) and verificationResult are
evaluated before that decision.
In `@src/lib/verify-deployment.ts`:
- Around line 192-261: verifyDeployment currently returns a passive boolean and
diagnostics which callers (e.g., printDashboard) can ignore; change the API to
make blocking failures explicit by either (A) throwing a descriptive error from
verifyDeployment when the health gate fails (gatewayReachable or
dashboardReachable are false) so callers must catch/abort before calling
printDashboard, or (B) convert VerifyDeploymentResult into a discriminated union
(e.g., { status: "ok", verification: DeploymentVerification, diagnostics } | {
status: "fail", diagnostics }) and update callers to exhaustively branch on the
status before printing success; update references to verifyDeployment,
DeploymentVerification, VerifyDeploymentResult and the caller that invokes
printDashboard to use the new behavior.
- Around line 257-259: The healthy flag currently only checks gateway.reachable
and dashboard.reachable, which lets formatVerificationDiagnostics() take the
success path while inference or messaging are failing; update the healthy
calculation to also require inference.working and messaging.healthy (e.g.,
compute healthy = gateway.reachable && dashboard.reachable && inference.working
&& messaging.healthy) so that formatVerificationDiagnostics() will surface
inference/messaging warnings instead of reporting a verified deployment; adjust
any code paths that rely on the old healthy semantics if necessary.
In `@test/e2e/test-device-auth-health.sh`:
- Around line 90-117: The sandbox_exec function uses the raw SANDBOX_NAME as the
SSH target which can bypass the host stanza generated by openshell sandbox
ssh-config; change sandbox_exec to use the host alias produced by the config
(openshell-${SANDBOX_NAME}) as the SSH destination while still passing -F
"$SSH_CONFIG" and keep setup_ssh and SSH_CONFIG logic intact so the generated
config's Host entry is honored.
---
Nitpick comments:
In `@scripts/brev-launchable-ci-cpu.sh`:
- Around line 83-99: The git tag resolution in resolve_stable_ref can pick up
annotated-tag dereference lines (refs/tags/vX...^{}) which may sort first;
update the git ls-remote pipeline in resolve_stable_ref so it filters out any
lines containing ^{} before selecting the top tag (e.g., pipe through a grep -v
or similar), then proceed to set NEMOCLAW_REF and log as before using the tag
variable.
- Around line 264-266: The git pull --ff-only step can fail when NEMOCLAW_REF is
a tag because git checkout "$NEMOCLAW_REF" puts the repo in detached HEAD and
pull into detached HEAD errors (currently swallowed by || true). Modify the
script around the git checkout/pull sequence to detect detached HEAD (or detect
that NEMOCLAW_REF resolves to a tag) and skip the git -C "$NEMOCLAW_CLONE_DIR"
pull --ff-only origin "$NEMOCLAW_REF" step in that case; alternatively, create a
local branch from the ref before pulling (e.g., git -C "$NEMOCLAW_CLONE_DIR"
checkout -B temp-sync "$NEMOCLAW_REF") so pull can succeed. Ensure the logic
references NEMOCLAW_REF, git -C "$NEMOCLAW_CLONE_DIR" checkout, and git -C
"$NEMOCLAW_CLONE_DIR" pull --ff-only.
In `@test/onboard.test.ts`:
- Line 3422: The probe stub that matches _n(command).includes("sandbox exec") &&
_n(command).includes("http://localhost:18789/health") currently always returns
"200" which misses the regression where a healthy device-auth probe may return
"401"; update the test stub(s) around the `_n(command)` check (and the other
identical occurrences) to either parameterize the returned status over
["200","401"] or add at least one representative test case/run that returns
"401" so the code path treating 401 as healthy is exercised; ensure you modify
the same conditional that matches "sandbox exec" and the /health URL so those
tests cover both statuses.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8e922a5f-6ab3-44ea-8c15-6d825b0632f6
📒 Files selected for processing (12)
- .github/workflows/nightly-e2e.yaml
- .github/workflows/trigger-community-image-rebuild.yaml
- scripts/brev-launchable-ci-cpu.sh
- src/lib/actions/sandbox/process-recovery.ts
- src/lib/agent-runtime.ts
- src/lib/onboard.ts
- src/lib/verify-deployment.test.ts
- src/lib/verify-deployment.ts
- test/cli.test.ts
- test/e2e/test-device-auth-health.sh
- test/onboard.test.ts
- test/shellquote-sandbox.test.ts
// Post-deployment verification: confirm the full delivery chain is
// operational before telling the user "YOUR AGENT IS LIVE". Fixes #2342.
const verifyDeploymentModule: typeof import("./verify-deployment") = require("./verify-deployment");
const _verifyChatUiUrl = process.env.CHAT_UI_URL || `http://127.0.0.1:${DASHBOARD_PORT}`;
const verifyChain = buildChain({ chatUiUrl: _verifyChatUiUrl, isWsl: isWsl(), wslHostAddress: getWslHostAddress() });
const verificationResult = verifyDeploymentModule.verifyDeployment(
  sandboxName,
  verifyChain,
  {
    executeSandboxCommand: (name: string, script: string) => {
      return executeSandboxCommandForVerification(name, script);
    },
    probeHostPort: (port: number, probePath: string) => {
      const result = runCapture(
        ["curl", "-so", "/dev/null", "-w", "%{http_code}", "--max-time", "3",
          `http://127.0.0.1:${port}${probePath}`],
        { ignoreError: true },
      );
      return parseInt(result.trim(), 10) || 0;
    },
    captureForwardList: () => {
      const output = runCaptureOpenshell(["forward", "list"], { ignoreError: true });
      return output || null;
    },
    getMessagingChannels: (_name: string) => selectedMessagingChannels || [],
    providerExistsInGateway: (providerName: string) => providerExistsInGateway(providerName),
  },
);

// Print verification diagnostics
const diagLines = verifyDeploymentModule.formatVerificationDiagnostics(verificationResult);
for (const line of diagLines) {
  console.log(line);
}
Gate onboarding success on verificationResult.healthy.
Because Line 9483 already calls completeSession() and the legacy-credential cleanup runs before this block, a failed verification still leaves onboarding finalized and prints the normal post-onboard dashboard output. That makes the new verifier warning-only, which defeats the “verify before success” goal and leaves a broken deployment non-resumable. Move session completion / cleanup / final success output behind a healthy verification result, and exit non-zero when verification fails.
Suggested flow
- onboardSession.completeSession(toSessionUpdates({ sandboxName, provider, model }));
- completed = true;
- // Onboarding finished successfully. Delete the legacy plaintext
- // credentials.json only when every staged *value* was actually pushed
- // to the gateway in this run.
- const allStagedMigrated =
- stagedLegacyKeys.length > 0 && stagedLegacyKeys.every((k) => migratedLegacyKeys.has(k));
- if (allStagedMigrated) {
- removeLegacyCredentialsFile();
- } else if (stagedLegacyKeys.length > 0) {
- ...
- }
const verifyDeploymentModule: typeof import("./verify-deployment") = require("./verify-deployment");
...
const verificationResult = verifyDeploymentModule.verifyDeployment(
sandboxName,
verifyChain,
{ ... },
);
const diagLines = verifyDeploymentModule.formatVerificationDiagnostics(verificationResult);
for (const line of diagLines) {
console.log(line);
}
+
+ if (!verificationResult.healthy) {
+ process.exit(1);
+ }
+
+ onboardSession.completeSession(toSessionUpdates({ sandboxName, provider, model }));
+ completed = true;
+ const allStagedMigrated =
+ stagedLegacyKeys.length > 0 && stagedLegacyKeys.every((k) => migratedLegacyKeys.has(k));
+ if (allStagedMigrated) {
+ removeLegacyCredentialsFile();
+ } else if (stagedLegacyKeys.length > 0) {
+ ...
+ }
printDashboard(sandboxName, model, provider, nimContainer, agent);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/onboard.ts` around lines 9506 - 9539, The post-deployment
verification currently runs after session finalization, so a failed verification
still marks onboarding complete; change control flow so completeSession() and
any legacy-credential cleanup and the final dashboard success output are
executed only when verificationResult.healthy is true, and if
verificationResult.healthy is false log the diagnostics (using
verifyDeploymentModule.formatVerificationDiagnostics) and exit non‑zero (e.g.,
process.exit(1)) to prevent finalizing the onboarding; locate and move the call
to completeSession() (and related cleanup/final success printing) to after the
check of verificationResult.healthy and ensure
verifyDeploymentModule.verifyDeployment(...) and verificationResult are
evaluated before that decision.
Intentional design: verification is informational, not a gate. The rationale is that a partially-working deployment (e.g., inference route temporarily unavailable during cold start) is still better than no deployment — users can see the diagnostics and act on them. Blocking onboard completion on transient probe failures would cause frustrating retries for users whose sandboxes are actually functional.
The completeSession() call is correct here — the sandbox IS created and the gateway IS running. The verification step gives users visibility into what might not be fully ready yet, without forcing them to re-run onboard.
@jyaunches Got it — thanks for clarifying the intent. I’ll withdraw the gating suggestion. Keeping verification informational while finalizing the session makes sense here, given transient cold‑start conditions.
Optional low-friction tweaks you might consider (no behavior change by default):
- Add a single concluding line when verification isn’t healthy to make the “continuing without gating” explicit for operators:
  console.warn("Continuing despite failed verification; see diagnostics above for next steps.");
- Provide an opt-in strict mode for CI without changing defaults, e.g.:
  if (process.env.NEMOCLAW_STRICT_VERIFY === "1" && !verificationResult.healthy) { process.exit(1); }
If you want, I can open a follow-up issue proposing the opt-in strict flag and the final warning line.
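The opt-in strict mode floated above can be isolated into a pure function, which keeps the default (informational) behaviour easy to test. This is only a sketch of the reviewer's proposal: `NEMOCLAW_STRICT_VERIFY` is the variable name suggested in the comment, not an existing setting, and `verificationExitCode` is a hypothetical helper.

```typescript
// Hypothetical helper; NEMOCLAW_STRICT_VERIFY is a proposed, not existing, variable.

/**
 * Decide the process exit code after verification.
 * Default: always 0, because verification is informational.
 * Strict mode (env var set to "1"): 1 when verification is unhealthy.
 */
function verificationExitCode(
  healthy: boolean,
  env: Record<string, string | undefined>,
): number {
  const strict = env["NEMOCLAW_STRICT_VERIFY"] === "1";
  return strict && !healthy ? 1 : 0;
}
```

A caller would pass its environment map and exit only when the result is non-zero, so interactive users keep the current behaviour while CI can opt in.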
✏️ Learnings added
Learnt from: jyaunches
Repo: NVIDIA/NemoClaw PR: 3128
File: src/lib/onboard.ts:9509-9542
Timestamp: 2026-05-06T22:33:22.643Z
Learning: In NVIDIA/NemoClaw onboarding (src/lib/onboard.ts), post-deployment verification via verifyDeployment is intentionally informational and must not gate success: completeSession() should remain before/independent of verification, and onboarding must not exit non-zero on verification failures by default.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: jyaunches
Repo: NVIDIA/NemoClaw PR: 2495
File: src/lib/onboard-inference-probes.ts:135-323
Timestamp: 2026-04-27T13:06:57.456Z
Learning: In NVIDIA/NemoClaw PRs that extract code from `src/lib/onboard.ts` into new behavior-preserving modules (e.g., `src/lib/onboard-providers.ts`, `src/lib/onboard-ollama-proxy.ts`, `src/lib/onboard-inference-probes.ts`), reviewers should avoid requesting refactors/complexity/style improvements in the extracted modules as part of the same PR. Only check for correctness and behavior preservation; defer non-behavior-preserving cleanup to separate follow-up PRs.
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 2554
File: src/lib/onboard.ts:698-700
Timestamp: 2026-04-28T17:22:19.715Z
Learning: For NVIDIA/NemoClaw credential environment hydration/migration, implement fill-only-if-missing semantics across the onboarding and credentials modules: (1) In src/lib/credentials.ts, stageLegacyCredentialsToEnv() should only consider an allowlisted set of credential keys and should set process.env[key] only when getCredential(key) returns null (never override an explicitly provided env export). Treat blank/whitespace values as unset. (2) In src/lib/onboard.ts, hydrateCredentialEnv(envName) must first read getCredential(envName) and only call stageLegacyCredentialsToEnv() when that specific key is missing; after staging, re-read and then set process.env[envName]. This ensures stale credentials.json values cannot override explicitly provided environment credentials.
export function verifyDeployment(
  sandboxName: string,
  chain: DashboardDeliveryChain,
  deps: VerifyDeploymentDeps,
): VerifyDeploymentResult {
  const diagnostics: DeploymentDiagnostic[] = [];

  // 1. Gateway reachable inside sandbox
  const gateway = verifyGatewayInSandbox(sandboxName, chain, deps);
  diagnostics.push({
    link: "gateway",
    status: gateway.reachable ? "ok" : "fail",
    detail: gateway.detail,
    hint: gateway.reachable
      ? ""
      : "The gateway process may have crashed during startup. Check /tmp/gateway.log inside the sandbox.",
  });

  // 2. Gateway version (cosmetic — not a health signal)
  const gatewayVersion = gateway.reachable ? fetchGatewayVersion(sandboxName, deps) : null;

  // 3. Dashboard reachable from host (port forward)
  const dashboard = verifyDashboardFromHost(chain, deps);
  diagnostics.push({
    link: "dashboard",
    status: dashboard.reachable ? "ok" : "fail",
    detail: dashboard.detail,
    hint: dashboard.reachable
      ? ""
      : `Port forward on ${chain.port} is not working. Run: openshell forward start ${chain.forwardTarget} ${sandboxName}`,
  });

  // 4. Inference route
  const inference = verifyInferenceRoute(sandboxName, deps);
  diagnostics.push({
    link: "inference",
    status: inference.working ? "ok" : "warn",
    detail: inference.detail,
    hint: inference.working
      ? ""
      : "The inference proxy may not be ready yet. Try: nemoclaw <sandbox> status (it may take a few seconds after creation).",
  });

  // 5. Messaging bridges
  const messaging = verifyMessagingBridges(sandboxName, deps);
  if (!messaging.healthy) {
    diagnostics.push({
      link: "messaging",
      status: "warn",
      detail: messaging.detail,
      hint: "Some messaging providers are not attached to the gateway. Re-run onboard with the relevant channels enabled.",
    });
  }

  const accessMethod = detectAccessMethod(chain);

  const verification: DeploymentVerification = {
    gatewayReachable: gateway.reachable,
    gatewayVersion,
    inferenceRouteWorking: inference.working,
    dashboardReachable: dashboard.reachable,
    messagingBridgesHealthy: messaging.healthy,
    accessMethod,
  };

  // Healthy = gateway reachable AND dashboard reachable from host.
  // Inference and messaging are warn-level (non-blocking).
  const healthy = gateway.reachable && dashboard.reachable;

  return { healthy, verification, diagnostics };
Make blocking verification failures impossible for callers to ignore.
The module comments say this runs before printing the success/live message, but the only call site shown in src/lib/onboard.ts:9506-9541 logs diagnostics and still proceeds to printDashboard() unconditionally. Because this API only returns a passive boolean, the failure path is easy to bypass and the PR can still advertise success after a failed verification.
Consider making blocking failures explicit in this API — for example by throwing on unhealthy results or by returning a discriminated result that the caller must exhaustively branch on before printing success.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/verify-deployment.ts` around lines 192 - 261, verifyDeployment
currently returns a passive boolean and diagnostics which callers (e.g.,
printDashboard) can ignore; change the API to make blocking failures explicit by
either (A) throwing a descriptive error from verifyDeployment when the health
gate fails (gatewayReachable or dashboardReachable are false) so callers must
catch/abort before calling printDashboard, or (B) convert VerifyDeploymentResult
into a discriminated union (e.g., { status: "ok", verification:
DeploymentVerification, diagnostics } | { status: "fail", diagnostics }) and
update callers to exhaustively branch on the status before printing success;
update references to verifyDeployment, DeploymentVerification,
VerifyDeploymentResult and the caller that invokes printDashboard to use the new
behavior.
Same design rationale as the onboard gating comment above. The module is intentionally advisory — it runs probes and reports what it finds, but does not throw or block. The caller (onboard.ts) decides what to do with the result: currently it prints diagnostics and proceeds.
This keeps the module composable — other callers (like nemoclaw status or a future nemoclaw doctor) can use the same verification logic with different blocking semantics. Making the module itself enforce blocking would couple it to onboard's specific UX requirements.
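The composability argument can be illustrated with a short sketch in which one advisory result feeds several caller-side policies. The types and policy names here (`AdvisoryResult`, `GatingPolicy`, `informationalPolicy`, `strictPolicy`, `coreOnlyPolicy`) are hypothetical, and the behavior attributed to a doctor-style or status-style caller is an assumption, not something the PR defines.

```typescript
// Hypothetical sketch: the module reports, each caller decides how to gate.
type LinkStatus = "ok" | "warn" | "fail";

interface Diagnostic {
  link: string;
  status: LinkStatus;
}

interface AdvisoryResult {
  healthy: boolean;
  diagnostics: Diagnostic[];
}

// A policy maps the advisory result to a process exit code.
type GatingPolicy = (result: AdvisoryResult) => number;

// Onboard-style policy: verification is informational, so always proceed.
const informationalPolicy: GatingPolicy = () => 0;

// Doctor-style policy (assumed): any link that is not "ok" counts as a failure.
const strictPolicy: GatingPolicy = (result) =>
  result.diagnostics.every((d) => d.status === "ok") ? 0 : 1;

// Status-style policy (assumed): only the core `healthy` flag matters.
const coreOnlyPolicy: GatingPolicy = (result) => (result.healthy ? 0 : 1);
```

Because the policies live with the callers, the probe logic would not need to change if a new command wanted stricter or looser gating.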
@jyaunches, understood — the advisory/composable pattern makes sense here. Keeping the probe-and-report logic decoupled from any specific blocking semantics lets nemoclaw status, nemoclaw doctor, or future callers each apply their own gating policy without duplicating the probe logic. I'll withdraw the suggestion.
✏️ Learnings added
Learnt from: jyaunches
Repo: NVIDIA/NemoClaw PR: 3128
File: src/lib/verify-deployment.ts:192-261
Timestamp: 2026-05-06T22:31:23.279Z
Learning: In `src/lib/verify-deployment.ts`, `verifyDeployment()` is intentionally advisory (returns a passive result with a `healthy` boolean and diagnostics) rather than throwing on failure. This design is deliberate: the module is meant to be composable and reusable across different callers (`onboard.ts`, `nemoclaw status`, future `nemoclaw doctor`, etc.), each of which applies its own blocking/gating policy. Do not suggest converting this to a throwing or discriminated-union API that enforces blocking — that would couple the module to onboard's specific UX requirements.
// Healthy = gateway reachable AND dashboard reachable from host.
// Inference and messaging are warn-level (non-blocking).
const healthy = gateway.reachable && dashboard.reachable;
Don't report a deployment as healthy while inference or messaging is still broken.
healthy currently ignores inference.working and messaging.healthy, so formatVerificationDiagnostics() takes the green success path and suppresses those warnings whenever just the gateway and host probe are up. That still allows a false-positive "verified" result in the same class of failures this PR is meant to catch, especially for disconnected chat.
Suggested change
-  // Healthy = gateway reachable AND dashboard reachable from host.
-  // Inference and messaging are warn-level (non-blocking).
-  const healthy = gateway.reachable && dashboard.reachable;
+  // Healthy = end-to-end delivery chain is operational.
+  const healthy =
+    gateway.reachable &&
+    dashboard.reachable &&
+    inference.working &&
+    messaging.healthy;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/verify-deployment.ts` around lines 257 - 259, The healthy flag
currently only checks gateway.reachable and dashboard.reachable, which lets
formatVerificationDiagnostics() take the success path while inference or
messaging are failing; update the healthy calculation to also require
inference.working and messaging.healthy (e.g., compute healthy =
gateway.reachable && dashboard.reachable && inference.working &&
messaging.healthy) so that formatVerificationDiagnostics() will surface
inference/messaging warnings instead of reporting a verified deployment; adjust
any code paths that rely on the old healthy semantics if necessary.
Intentional: healthy means the core delivery chain is operational (gateway + dashboard reachable). Inference and messaging are warn-level because:
- Inference may be transiently unavailable during cold start (NVIDIA Endpoints can take 10-30s to warm up after first request)
- Messaging channels are optional — many deployments don't configure them
If we include these in the healthy flag, the success path (✓ Deployment verified) would rarely trigger on first onboard, making the verification step feel broken rather than helpful. The warnings are still printed so users know what to investigate.
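The split between blocking and warn-level signals can be sketched as follows. This is a simplified illustration, not the module's code: `ProbeOutcomes` and `summarize` are hypothetical names, and only the rule itself (gateway and dashboard decide `healthy`, inference and messaging produce warnings) comes from the discussion above.

```typescript
// Hypothetical sketch of the "core chain vs. warn-level" rule.
interface ProbeOutcomes {
  gatewayReachable: boolean;
  dashboardReachable: boolean;
  inferenceWorking: boolean;
  messagingHealthy: boolean;
}

interface Summary {
  healthy: boolean; // core delivery chain only
  warnings: string[]; // non-blocking links that still deserve a printed warning
}

function summarize(p: ProbeOutcomes): Summary {
  // Only the two signals expected to be stable at onboard time gate `healthy`.
  const healthy = p.gatewayReachable && p.dashboardReachable;

  // Inference may still be warming up and messaging is optional,
  // so both are surfaced as warnings rather than failures.
  const warnings: string[] = [];
  if (!p.inferenceWorking) warnings.push("inference");
  if (!p.messagingHealthy) warnings.push("messaging");

  return { healthy, warnings };
}
```

With this shape, a cold-start onboard where inference has not warmed up yet would still count as verified while listing `inference` as a warning.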
@jyaunches, that's a well-reasoned design decision — thanks for the context. The distinction between core delivery chain (gateway + dashboard, blocking) and ancillary services (inference warm-up, optional messaging, non-blocking) makes sense, especially given:
- NVIDIA Endpoint cold-start latency means `inference.working` would reliably be `false` on first probe, turning `✓ Deployment verified` into a false negative on virtually every clean onboard.
- Messaging channels being optional means their absence shouldn't gate the success path for deployments that don't configure them.
The current approach — gate healthy on the two signals that are both required and reliably stable at onboard time, and surface the rest as named warnings — is the right call. Withdrawing this suggestion.
✏️ Learnings added
Learnt from: jyaunches
Repo: NVIDIA/NemoClaw PR: 3128
File: src/lib/verify-deployment.ts:257-259
Timestamp: 2026-05-06T22:31:36.947Z
Learning: In `src/lib/verify-deployment.ts`, the `healthy` flag intentionally covers only `gateway.reachable && dashboard.reachable`. Inference (`inference.working`) and messaging (`messaging.healthy`) are deliberately warn-level / non-blocking because: (1) NVIDIA Endpoints can take 10–30 s to warm up after first request, so inference is transiently unavailable on cold start; (2) messaging channels are optional and many deployments don't configure them. Including them in `healthy` would prevent `✓ Deployment verified` from triggering on first onboard. Both warnings are still printed for user visibility.
Remove the community dispatch workflow, nightly runner migration, and launchable stable-ref pinning. These are tangential to the #2342 health fix and will be submitted as separate PRs per CodeRabbit feedback.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Line 1737: The scorecard job dependency list is missing the
device-auth-health-e2e job, so add "device-auth-health-e2e" to the
scorecard.needs array to ensure nightly pass/fail metrics include that path;
locate the scorecard job's needs block (reference symbol scorecard.needs) in the
workflow and append the device-auth-health-e2e entry consistent with the
existing comma-separated job names.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f44d7ee5-d09d-4681-bb08-95c79f3d4da6
📒 Files selected for processing (1)
.github/workflows/nightly-e2e.yaml
♻️ Duplicate comments (1)
src/lib/onboard.ts (1)
9506-9541: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Block final success when deployment verification fails.
Because Line 9483 already completes the session and the legacy-credential cleanup runs before this block, an unhealthy `verificationResult` still leaves onboarding finalized and then prints the normal dashboard output. That makes the new verifier warning-only. Move `completeSession()`, legacy cleanup, and `printDashboard()` behind a `verificationResult.healthy` check, and exit non-zero when verification fails.
Suggested flow
- onboardSession.completeSession(toSessionUpdates({ sandboxName, provider, model }));
- completed = true;
- const allStagedMigrated =
-   stagedLegacyKeys.length > 0 && stagedLegacyKeys.every((k) => migratedLegacyKeys.has(k));
- if (allStagedMigrated) {
-   removeLegacyCredentialsFile();
- } else if (stagedLegacyKeys.length > 0) {
-   ...
- }
  const verifyDeploymentModule: typeof import("./verify-deployment") = require("./verify-deployment");
  ...
  const verificationResult = verifyDeploymentModule.verifyDeployment(
    sandboxName,
    verifyChain,
    { ... },
  );
  const diagLines = verifyDeploymentModule.formatVerificationDiagnostics(verificationResult);
  for (const line of diagLines) {
    console.log(line);
  }
+
+ if (!verificationResult.healthy) {
+   process.exit(1);
+ }
+
+ onboardSession.completeSession(toSessionUpdates({ sandboxName, provider, model }));
+ completed = true;
+ const allStagedMigrated =
+   stagedLegacyKeys.length > 0 && stagedLegacyKeys.every((k) => migratedLegacyKeys.has(k));
+ if (allStagedMigrated) {
+   removeLegacyCredentialsFile();
+ } else if (stagedLegacyKeys.length > 0) {
+   ...
+ }
  printDashboard(sandboxName, model, provider, nimContainer, agent);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard.ts` around lines 9506 - 9541, The onboarding flow currently finalizes the session and prints the dashboard even when verification fails; update the logic around verifyDeploymentModule.verifyDeployment so that after obtaining verificationResult you check verificationResult.healthy and only then call completeSession(), run the legacy-credential cleanup, and call printDashboard(sandboxName, model, provider, nimContainer, agent); if verificationResult.healthy is false, log/print the diagnostics (using verifyDeploymentModule.formatVerificationDiagnostics) and exit with a non-zero status instead of completing the session. Ensure you reference the existing verificationResult, verifyDeploymentModule, completeSession(), and printDashboard(...) symbols and do not duplicate verification logic—just gate the finalization/cleanup/dashboard steps on verificationResult.healthy and exit non-zero on failure.
🧹 Nitpick comments (1)
test/onboard.test.ts (1)
3418-3418: ⚡ Quick win
Add a 401-ready health probe case.
These updated mocks lock in the `/health` path, but they still only return `200`. That means this file never exercises the device-auth case that caused #2342, so a regression back to treating `401` from `/health` as offline would still pass here. Please make at least one create/reuse readiness path return `401` and assert it is still treated as alive.
Also applies to: 3555-3555, 3651-3651, 4014-4014, 5407-5407, 5802-5802, 5934-5934, 6551-6551
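The rule such a test would pin down can be expressed as a tiny classifier over the status string that `curl -w '%{http_code}'` prints. This is a sketch only: `isGatewayAlive` is a hypothetical helper name, not a function in the repository, and it mirrors the `200|401` case pattern used by the shell guard in this PR.

```typescript
// Hypothetical helper: interpret the HTTP status string printed by
// `curl -so /dev/null -w '%{http_code}' ...`.
function isGatewayAlive(rawCode: string): boolean {
  const code = rawCode.trim();
  // 200: the endpoint answered normally.
  // 401: device auth rejected the request, which still proves the server is up.
  // Anything else, including "000" (no HTTP response received), is treated as not alive.
  return code === "200" || code === "401";
}
```

A regression test along the lines requested above would feed `"401"` through the readiness path and assert that the sandbox is still reported as alive.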
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/onboard.test.ts` at line 3418, The test's mock of the sandbox health probe currently always returns "200" (the branch in onboard.test.ts that checks for _n(command).includes("sandbox exec") && _n(command).includes("http://localhost:18789/health")), so it never exercises the device-auth 401 case; update the mock so at least one create/reuse readiness path returns "401" (e.g., make one health URL case return "401" and leave others "200") and add an assertion after the create/reuse flow that the instance is still considered alive (assert the same success/ready state the tests expect) to ensure 401 is treated as alive—locate and change the branch that builds the sandbox exec mock and add the corresponding assertion in the create/reuse test block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 9506-9541: The onboarding flow currently finalizes the session and
prints the dashboard even when verification fails; update the logic around
verifyDeploymentModule.verifyDeployment so that after obtaining
verificationResult you check verificationResult.healthy and only then call
completeSession(), run the legacy-credential cleanup, and call
printDashboard(sandboxName, model, provider, nimContainer, agent); if
verificationResult.healthy is false, log/print the diagnostics (using
verifyDeploymentModule.formatVerificationDiagnostics) and exit with a non-zero
status instead of completing the session. Ensure you reference the existing
verificationResult, verifyDeploymentModule, completeSession(), and
printDashboard(...) symbols and do not duplicate verification logic—just gate
the finalization/cleanup/dashboard steps on verificationResult.healthy and exit
non-zero on failure.
---
Nitpick comments:
In `@test/onboard.test.ts`:
- Line 3418: The test's mock of the sandbox health probe currently always
returns "200" (the branch in onboard.test.ts that checks for
_n(command).includes("sandbox exec") &&
_n(command).includes("http://localhost:18789/health")), so it never exercises
the device-auth 401 case; update the mock so at least one create/reuse readiness
path returns "401" (e.g., make one health URL case return "401" and leave others
"200") and add an assertion after the create/reuse flow that the instance is
still considered alive (assert the same success/ready state the tests expect) to
ensure 401 is treated as alive—locate and change the branch that builds the
sandbox exec mock and add the corresponding assertion in the create/reuse test
block.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: dab31bbb-957a-4aa0-be11-5e9b2bae3073
📒 Files selected for processing (6)
- src/lib/actions/sandbox/process-recovery.ts
- src/lib/agent-runtime.ts
- src/lib/onboard.ts
- test/cli.test.ts
- test/onboard.test.ts
- test/shellquote-sandbox.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/cli.test.ts
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/lib/onboard.ts (1)
9486-9544: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Only finalize onboarding after `verificationResult.healthy` is true.
This still completes the session, performs legacy-credential cleanup, and prints the normal post-onboard output before the verification result is enforced. A failed verification therefore leaves onboarding marked complete and non-resumable, which makes the new verifier warning-only instead of authoritative.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard.ts` around lines 9486 - 9544, The code currently marks onboarding complete and performs credential cleanup/printing before enforcing verification; change the control flow so you call verifyDeploymentModule.verifyDeployment and evaluate verificationResult.healthy first, and only when healthy === true call onboardSession.completeSession(toSessionUpdates(...)), perform the legacy-file removal logic (removeLegacyCredentialsFile / console.error fallback) and call printDashboard; if verificationResult.healthy is false, do not complete the session or remove the legacy file — instead surface the verification diagnostics and an actionable warning so the session remains resumable. Use the existing symbols verifyDeploymentModule.verifyDeployment, verificationResult.healthy, onboardSession.completeSession, stagedLegacyKeys/migratedLegacyKeys/removeLegacyCredentialsFile, and printDashboard to locate and reorder the logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 862-880: The current executeSandboxCommandForVerification function
uses openshell sandbox exec which can fail when the exec transport is
unavailable; change it to reuse the SSH-based probe pattern implemented by
executeSandboxCommand in process-recovery.ts: call openshell sandbox ssh-config
to obtain SSH connection args, build and run an ssh sh -c "<script>" probe (with
the same encoding/timeout/stdio handling and error-to-null behavior), and only
fall back to the original openshell sandbox exec approach if ssh-config or the
ssh probe fails; update the function to return the same {status, stdout, stderr}
shape and preserve the 15s timeout and null-on-error semantics while referencing
executeSandboxCommand and "sandbox ssh-config" in the change.
---
Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 9486-9544: The code currently marks onboarding complete and
performs credential cleanup/printing before enforcing verification; change the
control flow so you call verifyDeploymentModule.verifyDeployment and evaluate
verificationResult.healthy first, and only when healthy === true call
onboardSession.completeSession(toSessionUpdates(...)), perform the legacy-file
removal logic (removeLegacyCredentialsFile / console.error fallback) and call
printDashboard; if verificationResult.healthy is false, do not complete the
session or remove the legacy file — instead surface the verification diagnostics
and an actionable warning so the session remains resumable. Use the existing
symbols verifyDeploymentModule.verifyDeployment, verificationResult.healthy,
onboardSession.completeSession,
stagedLegacyKeys/migratedLegacyKeys/removeLegacyCredentialsFile, and
printDashboard to locate and reorder the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 18f0d678-0f57-4672-8fbe-6a95dd6c2b32
📒 Files selected for processing (2)
- src/lib/onboard.ts
- test/cli.test.ts
function executeSandboxCommandForVerification(
  sandboxName: string,
  script: string,
): { status: number; stdout: string; stderr: string } | null {
  try {
    const result = spawnSync(
      getOpenshellBinary(),
      ["sandbox", "exec", "-n", sandboxName, "--", "sh", "-c", script],
      { encoding: "utf-8", timeout: 15000, stdio: ["ignore", "pipe", "pipe"] },
    );
    if (result.error) return null;
    return {
      status: result.status ?? 1,
      stdout: (result.stdout || "").trim(),
      stderr: (result.stderr || "").trim(),
    };
  } catch {
    return null;
  }
🧩 Analysis chain
🏁 Script executed:
# First, check if the file and function exist at the cited lines
head -n 880 src/lib/onboard.ts | tail -n 50
Repository: NVIDIA/NemoClaw
Length of output: 1728
🏁 Script executed:
# Search for SSH-based sandbox probe implementations
rg "ssh|SSH" src/lib/onboard.ts -A 3 -B 3
Repository: NVIDIA/NemoClaw
Length of output: 4259
🏁 Script executed:
# Look for device-auth E2E references
fd -type f "*.ts" -path "**e2e**" | head -20
Repository: NVIDIA/NemoClaw
Length of output: 230
🏁 Script executed:
# Search for mentions of openshell sandbox exec reliability issues
rg "sandbox exec|false.*probe|000.*probe" . --type ts -A 2 -B 2
Repository: NVIDIA/NemoClaw
Length of output: 19657
🏁 Script executed:
# Look for verifyDeployment function and related verification logic
rg "verifyDeployment|executeSandboxCommandForVerification" src/lib/onboard.ts -A 5 -B 2
Repository: NVIDIA/NemoClaw
Length of output: 1578
🏁 Script executed:
# Check verify-deployment.ts for SSH fallback implementation
head -n 100 src/lib/verify-deployment.ts
Repository: NVIDIA/NemoClaw
Length of output: 3631
🏁 Script executed:
# Search for the executeSandboxCommand callback usage in verify-deployment.ts
rg "executeSandboxCommand" src/lib/verify-deployment.ts -A 5 -B 2
Repository: NVIDIA/NemoClaw
Length of output: 1710
🏁 Script executed:
# Look for SSH fallback in verify-deployment or related modules
rg "recoveredSsh|SSH|ssh" src/lib/verify-deployment.ts -A 3 -B 3
Repository: NVIDIA/NemoClaw
Length of output: 1248
🏁 Script executed:
# Search for dashboard readiness probe implementation
rg "dashboard.*readiness|readiness.*dashboard" src/lib/onboard.ts -A 5 -B 5
Repository: NVIDIA/NemoClaw
Length of output: 41
🏁 Script executed:
# Look for unbounded probes or dashboard readiness loop
rg "dashboardReady|dashboard.*health|18789|DASHBOARD_PORT" src/lib/onboard.ts -A 3 -B 3 | head -100
Repository: NVIDIA/NemoClaw
Length of output: 4510
🏁 Script executed:
# Look for SSH fallback implementation pattern
rg "execSync|spawnSync.*ssh|ssh.*config" src/lib -A 5 -B 2 | head -80
Repository: NVIDIA/NemoClaw
Length of output: 4782
🏁 Script executed:
# Search for the SSH-based probe mentioned in the review
rg "device-auth|SSH.*fallback|ssh-config" src/lib -A 3 -B 3
Repository: NVIDIA/NemoClaw
Length of output: 5917
🏁 Script executed:
# Look at process-recovery.ts for SSH fallback implementation
cat src/lib/actions/sandbox/process-recovery.ts | head -100
Repository: NVIDIA/NemoClaw
Length of output: 3141
🏁 Script executed:
# Check if there's a dashboard readiness loop in onboard.ts
rg "waitFor|ready|health.*dashboard|dashboard.*health" src/lib/onboard.ts -A 4 -B 2 | head -120
Repository: NVIDIA/NemoClaw
Length of output: 5757
🏁 Script executed:
# Search for E2E tests mentioning device-auth
fd -type f "*.ts" | xargs rg "device-auth" 2>/dev/null | head -20
Repository: NVIDIA/NemoClaw
Length of output: 658
🏁 Script executed:
# Check executeSandboxCommand in process-recovery.ts more completely
head -n 150 src/lib/actions/sandbox/process-recovery.ts | tail -n 80
Repository: NVIDIA/NemoClaw
Length of output: 2628
🏁 Script executed:
# Search for how verify-deployment is actually used and if there's a wrapper
rg "verifyDeployment|executeSandboxCommandForVerification" src/lib/onboard.ts -B 5 -A 15 | head -80
Repository: NVIDIA/NemoClaw
Length of output: 2891
🏁 Script executed:
# Look for comment or issue about sandbox exec being unreliable in CI
rg "false.*probe|000.*probe|sandbox exec.*flaky|sandbox exec.*unreliable" . --type ts -A 2 -B 2
Repository: NVIDIA/NemoClaw
Length of output: 868
🏁 Script executed:
# Check if there's a dashboard readiness loop that's unbounded
rg "dashboard.*ready|waitFor.*dashboard|health.*loop" src/lib/onboard.ts -A 8 -B 3 | head -100
Repository: NVIDIA/NemoClaw
Length of output: 933
Consider using the SSH-based probe pattern that already exists in executeSandboxCommand.
The executeSandboxCommand function in process-recovery.ts already implements a robust SSH fallback for sandbox communication. Rather than keeping executeSandboxCommandForVerification on openshell sandbox exec, reuse the same SSH + sandbox ssh-config pattern to improve reliability when the sandbox is unreachable via the exec transport.
The dashboard readiness loop (lines 5036–5047) is already bounded to 15 attempts, so no change needed there.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/onboard.ts` around lines 862 - 880, The current
executeSandboxCommandForVerification function uses openshell sandbox exec which
can fail when the exec transport is unavailable; change it to reuse the
SSH-based probe pattern implemented by executeSandboxCommand in
process-recovery.ts: call openshell sandbox ssh-config to obtain SSH connection
args, build and run an ssh sh -c "<script>" probe (with the same
encoding/timeout/stdio handling and error-to-null behavior), and only fall back
to the original openshell sandbox exec approach if ssh-config or the ssh probe
fails; update the function to return the same {status, stdout, stderr} shape and
preserve the 15s timeout and null-on-error semantics while referencing
executeSandboxCommand and "sandbox ssh-config" in the change.
Acknowledged — the SSH fallback is a valid hardening suggestion. However, executeSandboxCommandForVerification is only called during the post-onboard verification step when the sandbox has just been created and the exec transport is known to be working (onboard just used it successfully for the readiness loop). Adding an SSH fallback here would add complexity for a path that's already proven reliable at this point in the flow.
The E2E test already demonstrated that openshell sandbox exec can be unreliable in some CI environments (which is why Phase 2 skips gracefully), but the onboard runtime context is different — the exec transport was just exercised moments earlier. Will revisit if we see failures in production.
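If the fallback is revisited later, the existing null-on-error contract would make it fairly cheap to chain transports. The sketch below shows that general pattern with injected runners; `Transport`, `safeRun`, and `runWithFallback` are hypothetical names, and no real `openshell` or `ssh` invocation is modeled here.

```typescript
// Hypothetical sketch: try one transport, fall back to another, return null if both fail.
interface CommandResult {
  status: number;
  stdout: string;
  stderr: string;
}

// A transport runs a script in the sandbox and returns null when it cannot run at all.
type Transport = (script: string) => CommandResult | null;

// Convert thrown errors into the same null signal the transports already use.
function safeRun(transport: Transport, script: string): CommandResult | null {
  try {
    return transport(script);
  } catch {
    return null;
  }
}

function runWithFallback(
  primary: Transport,
  fallback: Transport,
  script: string,
): CommandResult | null {
  const first = safeRun(primary, script);
  // A non-null result, even with a non-zero status, means the transport itself worked.
  if (first !== null) return first;
  return safeRun(fallback, script);
}
```

Injecting the transports keeps the sketch testable without a sandbox, and it leaves the order of exec and SSH as a caller decision.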
@jyaunches Got it — agreed to keep executeSandboxCommandForVerification on openshell sandbox exec only for now given the just-proven exec path during readiness. No change requested.
If you’d like, I can open a low-priority follow-up issue to track a gated SSH fallback (opt-in via env) in case we ever see production flakiness. Otherwise I’ll consider this thread resolved on your decision.
✏️ Learnings added
Learnt from: jyaunches
Repo: NVIDIA/NemoClaw PR: 3128
File: src/lib/onboard.ts:862-880
Timestamp: 2026-05-06T23:11:17.624Z
Learning: NVIDIA/NemoClaw preference: In src/lib/onboard.ts, executeSandboxCommandForVerification should remain exec-only (uses `openshell sandbox exec`) because it runs immediately after onboard readiness where exec was proven reliable. Avoid adding SSH fallback unless production failures are observed.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lib/agent-runtime.ts (1)
169-169: ⚡ Quick win. Consider deduplicating the HTTP probe guard string.
The _GW_CODE + case 200|401 snippet is repeated in two places; extracting one helper reduces drift risk for future health-probe changes.
♻️ Suggested refactor
+function buildAliveProbeGuard(url: string): string {
+  return `_GW_CODE=$(curl -so /dev/null -w '%{http_code}' --max-time 3 ${shellQuote(url)} 2>/dev/null || echo 000); case "$_GW_CODE" in 200|401) echo ALREADY_RUNNING; exit 0 ;; esac;`;
+}
+
 export function buildOpenClawRecoveryScript(port: number): string {
+  const probeUrl = `http://127.0.0.1:${port}/health`;
   const staleGatewayPattern = "[o]penclaw([ -]gateway| gateway run|$)";
   return [
@@
-    `_GW_CODE=$(curl -so /dev/null -w '%{http_code}' --max-time 3 http://127.0.0.1:${port}/health 2>/dev/null || echo 000); case "$_GW_CODE" in 200|401) echo ALREADY_RUNNING; exit 0 ;; esac;`,
+    buildAliveProbeGuard(probeUrl),
@@
 export function buildRecoveryScript(agent: AgentDefinition | null, port: number): string | null {
@@
-    `_GW_CODE=$(curl -so /dev/null -w '%{http_code}' --max-time 3 ${shellQuote(probeUrl)} 2>/dev/null || echo 000); case "$_GW_CODE" in 200|401) echo ALREADY_RUNNING; exit 0 ;; esac;`,
+    buildAliveProbeGuard(probeUrl),
Also applies to: 234-234
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/agent-runtime.ts` at line 169, The HTTP probe guard string starting with `_GW_CODE=$(curl -so /dev/null -w '%{http_code}' --max-time 3 http://127.0.0.1:${port}/health 2>/dev/null || echo 000); case "$_GW_CODE" in 200|401) echo ALREADY_RUNNING; exit 0 ;; esac;` is duplicated; extract it into a single helper (e.g., a constant HEALTH_PROBE_CMD or function buildHealthProbe(port)) and replace both inline occurrences with that helper, updating any callers that interpolate ${port} to use the helper API; ensure the helper preserves the exact behavior and string formatting so both places (the `_GW_CODE` probe usages) reuse the single source of truth.
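The deduplication idea in this comment can also be sketched in plain shell, independent of the TypeScript string builders. This is a minimal sketch, not the code in src/lib/agent-runtime.ts: the function names `is_alive_code`, `probe_code`, and `gateway_alive` are hypothetical, and it assumes, as the PR description states, that both 200 and 401 mean the gateway is up.

```shell
# Hypothetical helpers (not from the PR): one place that decides which
# HTTP status codes count as "gateway alive".

# Succeed when the status code means the gateway answered.
# 200 = healthy, 401 = alive but the request was rejected by device auth.
is_alive_code() {
  case "$1" in
    200|401) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the HTTP status code for a URL. On any curl failure (connection
# refused, timeout, curl not installed) the value is assigned, not echoed,
# so the result stays a single three-digit code.
probe_code() {
  _pc_code=$(curl -so /dev/null -w '%{http_code}' --max-time 3 "$1" 2>/dev/null) || _pc_code=000
  printf '%s\n' "${_pc_code:-000}"
}

# Succeed only if the endpoint answered with an accepted code.
gateway_alive() {
  is_alive_code "$(probe_code "$1")"
}
```

A caller could then write something like `gateway_alive "http://127.0.0.1:$PORT/health" && echo ALREADY_RUNNING`, where `PORT` is whatever port the gateway listens on. Keeping the accepted codes in one function means a future change (for example, accepting another status) touches a single line.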
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 73f1fc8f-c3dd-483d-9b0a-14ff490a696a
📒 Files selected for processing (3)
- .github/workflows/nightly-e2e.yaml
- src/lib/agent-runtime.ts
- test/e2e/test-device-auth-health.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
fi

# Detect actual dashboard port (may differ from default if port was taken)
ACTUAL_PORT=$(openshell forward list 2>/dev/null | grep "$SANDBOX_NAME" | awk '{print $3}' | head -1)
Harden dashboard-port detection against regex name matching.
Line 180 uses grep "$SANDBOX_NAME" as a regex pattern; sandbox names with regex characters can produce incorrect matches and wrong port selection.
🔧 Minimal robustness fix
-ACTUAL_PORT=$(openshell forward list 2>/dev/null | grep "$SANDBOX_NAME" | awk '{print $3}' | head -1)
+ACTUAL_PORT=$(openshell forward list 2>/dev/null | grep -F -- "$SANDBOX_NAME" | awk 'NR==1 {print $3}')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/test-device-auth-health.sh` at line 180, The ACTUAL_PORT assignment
uses grep "$SANDBOX_NAME" which treats the sandbox name as a regex and can
mis-match names with regex chars; change the pipeline that sets ACTUAL_PORT (the
line using openshell forward list | grep ... | awk ...) to perform a
fixed-string or exact-field match instead—e.g., replace grep "$SANDBOX_NAME"
with grep -F -- "$SANDBOX_NAME" or adjust the awk to compare the exact
sandbox-name field (awk '$1 == ENV_SANDBOX {print $3}') while exporting/setting
ENV_SANDBOX from SANDBOX_NAME—so the openshell forward list selection picks the
correct port robustly.
Low risk — the sandbox name is hardcoded as e2e-health-auth (no regex-special characters). The grep pattern is safe for this specific use case. Would be a valid concern if sandbox names were user-supplied, but in E2E tests they're always controlled constants.
@jyaunches, that's a fair point — e2e-health-auth is indeed free of regex-special characters, so the current pattern is safe for the default case. Worth noting that since SANDBOX_NAME is also settable via NEMOCLAW_SANDBOX_NAME, an unusual value passed at runtime could theoretically trigger the issue, but I agree this is low risk in a controlled CI context. Happy to defer to your judgment here.
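To make the trade-off in this thread concrete, here is a small sketch comparing the three matching strategies on made-up data. The three-column layout (name, address, port) is an assumption for illustration only; the real output format of `openshell forward list` is not shown in this conversation, and the sandbox names other than e2e-health-auth are invented.

```shell
# Made-up sample rows: name, bind address, port.
# The real column layout of `openshell forward list` is an assumption here.
sample='e2e-health-auth 127.0.0.1 18789
e2eXhealth-auth 127.0.0.1 18790
e2e.health-auth 127.0.0.1 18791'

# A name containing a regex metacharacter (".").
name='e2e.health-auth'

# Regex match: "." matches any character, so all three rows match and
# head -1 returns the port of the first row, which is the wrong sandbox.
regex_port=$(printf '%s\n' "$sample" | grep "$name" | awk '{print $3}' | head -1)

# Fixed-string match: only the row containing the literal name matches.
fixed_port=$(printf '%s\n' "$sample" | grep -F -- "$name" | awk 'NR==1 {print $3}')

# Exact-field match: compares the whole first column, so a name that is a
# substring of another name cannot match either.
exact_port=$(printf '%s\n' "$sample" | awk -v n="$name" '$1 == n {print $3; exit}')

echo "regex=$regex_port fixed=$fixed_port exact=$exact_port"
# prints: regex=18789 fixed=18791 exact=18791
```

With the hardcoded default name e2e-health-auth all three variants agree, which is why the concern only matters when NEMOCLAW_SANDBOX_NAME is set to an unusual value.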
Switch all 33 nightly E2E jobs from `ubuntu-latest` (GitHub-hosted, 2 vCPU) to `linux-amd64-cpu4` (NVIDIA self-hosted, 4 vCPU). Meta jobs (notify-on-failure, report-to-pr, scorecard) stay on `ubuntu-latest` since they only make API calls.

**Motivation:** Full sandbox onboard E2E tests spend most of their time on Docker image builds. The NVIDIA runners have more CPU and should reduce per-job runtime. The `pr-self-hosted` workflow already uses these runners successfully for image builds on every PR.

**Validated:** The `device-auth-health-e2e` job was tested on `linux-amd64-cpu4` during PR #3128 development and completed in ~16 minutes (vs timing out at 15m on `ubuntu-latest`).

## Summary by CodeRabbit

* **Chores**
  * Nightly end-to-end test workflow updated to use the standardized Linux CPU runner (linux-amd64-cpu4) for most non-GPU jobs; GPU tests continue using dedicated GPU runners.
  * Reference for the launchable-smoke job updated to the new CPU runner.
  * Failure notification and scorecard jobs retain the same E2E job dependencies.
## Summary

- Bump docs metadata to 0.0.36 and refresh generated NemoClaw user skills.
- Document Model Router onboarding, validation retries, Ollama tool checks, Hermes policy behavior, and deployment verification updates.
- Remove suppressed experimental command references from public docs per `docs/.docs-skip`.

## Source summary

- #2202 -> `docs/get-started/quickstart.md`, `docs/inference/inference-options.md`, `docs/reference/architecture.md`: Document Model Router setup and routed inference architecture.
- #3128 -> `docs/get-started/quickstart.md`, `docs/reference/commands.md`: Document deployment verification and HTTP 401 health handling.
- #3104 -> `docs/inference/inference-options.md`: Document retry behavior for transient provider validation failures.
- #3121 -> `docs/reference/architecture.md`: Document agent-scoped model/provider compatibility manifests.
- #3046 -> `docs/reference/architecture.md`: Tie model-specific compatibility setup to known model/provider behavior.
- #3097 -> `docs/inference/use-local-inference.md`: Document Ollama tool-calling capability validation.
- #3082 -> `docs/reference/commands.md`: Document `NEMOCLAW_SANDBOX_NAME` as the interactive sandbox-name default.
- f586cc5, 3442adf -> `docs/get-started/quickstart-hermes.md`, `docs/reference/network-policies.md`: Document Hermes agent-specific baseline policy endpoints.

## Test plan

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user`
- `make docs`
- `npm run build:cli`
- `rg` skip-term scan for `docs/` and generated user skills

Made with [Cursor](https://cursor.com)

## Summary by CodeRabbit

* **New Features**
  * Model Router provider for complexity-based routed inference.
  * Ollama/local inference onboarding now validates tool-calling capability.
  * Added `local-inference` network policy preset.
* **Documentation**
  * New integration policy examples (Outlook, Telegram, Slack, Discord, GitHub, Jira, etc.).
  * Clarified config immutability workflow and sandbox writable paths.
  * Hermes baseline network policy documented.
* **Improvements**
  * Health checks treat device-auth responses as live; transient validation retries.
  * Installer performs pre-install reachability checks; CLI onboarding gained a --fresh option.
Summary
Fixes #2342: OpenClaw Gateway Dashboard shows "Health Offline" after successful Brev Launchable deployment because curl -sf treats HTTP 401 (device auth enabled) as failure.

Changes

Core Fix: Health Probe Pattern
- src/lib/agent-runtime.ts: getHealthProbeUrl() defaults to /health (returns 200 regardless of device auth) instead of / (returns 401 with device auth)
- src/lib/onboard.ts: Dashboard readiness probe uses HTTP status code extraction (curl -so /dev/null -w '%{http_code}') instead of curl -sf; accepts 200 or 401 as alive
- src/lib/sandbox-process-recovery-action.ts: isSandboxGatewayRunning() and isSandboxGatewayRunningForStatus() updated with same pattern

New: Post-Deployment Verification
- src/lib/verify-deployment.ts: New module with verifyDeployment() that confirms the full delivery chain (gateway, inference, dashboard, messaging) is operational before printing "YOUR AGENT IS LIVE"
- src/lib/verify-deployment.test.ts: 14 unit tests covering all verification paths

CI & Launchable
- .github/workflows/trigger-community-image-rebuild.yaml: Dispatches rebuild to OpenShell-Community on NemoClaw release
- scripts/brev-launchable-ci-cpu.sh: Defaults NEMOCLAW_REF to stable (latest v* tag) instead of main

E2E Regression Test
- test/e2e/test-device-auth-health.sh: New nightly job validating health probes work with device auth enabled
- nightly-e2e.yaml as device-auth-health-e2e

Test Plan
- device-auth-health-e2e: core regression assertion (Phase 3: status ≠ Offline) passing

Version Target
v0.0.36
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores
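For the Launchable change listed in the PR description (NEMOCLAW_REF defaulting to stable, meaning the latest v* tag), the tag selection can be sketched as follows. This is a hypothetical illustration and not the script's actual implementation: the function name `newest_v_tag` is invented, the fallback to `main` when no tag is found is taken from the commit message, and `sort -V` (version sort) is assumed to be available, as it is in GNU coreutils.

```shell
# Hypothetical sketch (not the code in scripts/brev-launchable-ci-cpu.sh):
# read `git ls-remote --tags` style lines on stdin and print the newest
# v* tag, or "main" when no such tag is present.
newest_v_tag() {
  # Keep only tag names that start with "v" plus a digit; lines for peeled
  # tags (ending in ^{}) are skipped because "^" is excluded from the name.
  _nv_tag=$(sed -n 's#.*refs/tags/\(v[0-9][^^]*\)$#\1#p' | sort -V | tail -n 1)
  printf '%s\n' "${_nv_tag:-main}"
}
```

In use, the input would come from something like `git ls-remote --tags "$REPO_URL" 'v*' | newest_v_tag`, where `REPO_URL` is a placeholder for the repository URL. Version sort matters here because a plain lexical sort would rank v0.0.9 above v0.0.36.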