Skip to content

feat(onboard): auto-allocate dashboard port for multi-sandbox#2411

Merged
jyaunches merged 12 commits into
mainfrom
fix/2174-auto-allocate-dashboard-port
Apr 28, 2026
Merged

feat(onboard): auto-allocate dashboard port for multi-sandbox#2411
jyaunches merged 12 commits into
mainfrom
fix/2174-auto-allocate-dashboard-port

Conversation

@ericksoa

@ericksoa ericksoa commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #2174. When onboarding a second sandbox while the first holds port 18789, the dashboard forward now auto-allocates the next free port instead of crashing.

Related Issue

Fixes #2174

Changes

  • --control-ui-port <N> CLI flag: Explicit port override (takes precedence over CHAT_UI_URL env var and defaults)
  • Auto-allocation: When the preferred port is taken by another sandbox, scans 18789-18799 for the next free port and warns the user
  • Registry persistence: dashboardPort stored in sandboxes.json so it survives across sessions and --resume flows
  • nemoclaw list: Shows dashboard: http://127.0.0.1:{port}/ per sandbox
  • nemoclaw status: Shows :port suffix per sandbox line
  • Range exhaustion: Throws with actionable error listing each port's owner and suggesting --control-ui-port with a port outside the range
  • Pre-existing test fix: Updated local-inference preset test to match current YAML (node binary was added)

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

  • npx prek run --all-files passes
  • npm test passes (CLI project: 1965/1965)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed

AI Disclosure

  • AI-assisted — tool: Claude Code

Signed-off-by: Aaron Erickson aerickson@nvidia.com

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --control-ui-port CLI flag to specify the dashboard port during sandbox onboarding
    • Dashboard port information now displayed in nemoclaw status and nemoclaw list commands
  • Improvements

    • Enhanced dashboard port allocation to automatically find the next available port instead of failing when the preferred port is occupied
    • Improved port management and persistence across sandbox operations

…2174)

When onboarding a second sandbox while the first holds port 18789,
the dashboard forward now auto-allocates the next free port in
range 18789-18799 instead of crashing with an unhandled error.

Changes:
- Add --control-ui-port <N> CLI flag (takes precedence over
  CHAT_UI_URL env var and defaults)
- Auto-allocate next free port when preferred port is taken,
  with warning message showing the reassigned port
- Persist dashboardPort in the sandbox registry so it survives
  across sessions and resume flows
- Display dashboard URL in `nemoclaw list` output
- Display dashboard port in `nemoclaw status` output
- Throw only when the entire 18789-18799 range is exhausted,
  with actionable error listing each port's owner
- Fix pre-existing local-inference preset test (node binary added)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces dashboard port auto-allocation to support multiple simultaneous sandboxes. Adds --control-ui-port CLI flag with validation, dashboardPort field to sandbox registry for persistence, automatic port conflict resolution with next-free-port allocation, and visibility of assigned ports in list/status output.

Changes

Cohort / File(s) Summary
Port Configuration
src/lib/ports.ts
Exports dashboard port range constants (DASHBOARD_PORT_RANGE_START and DASHBOARD_PORT_RANGE_END) to enable auto-allocation within a sane range (18789–18799).
CLI Flag Support
src/lib/onboard-command.ts, src/lib/onboard-command.test.ts
Adds --control-ui-port <N> CLI flag with integer validation (1024–65535), threading the value into OnboardCommandOptions, and comprehensive test coverage for parsing, error handling, and help text.
Sandbox Registry & Display
src/lib/registry.ts, src/lib/inventory-commands.ts
Adds optional dashboardPort field to SandboxEntry interface; updates nemoclaw list to print dashboard URLs and nemoclaw status to append port numbers for sandboxes with allocated ports.
Onboarding Flow
src/lib/onboard.ts
Threads controlUiPort through onboarding, pre-computes available forward ports, reworks ensureDashboardForward to auto-allocate next free port on conflict (instead of failing), and persists final allocated port to registry. Follows precedence: CLI flag → persisted port → forwarded port → default.
Test Updates
test/onboard.test.ts, test/policies.test.ts
Updates sandbox creation call assertion regex and adds /usr/bin/node assertion to local-inference policy test.
E2E Test Cleanup
test/e2e/test-messaging-providers.sh
Preserves existing EXIT trap handlers instead of unconditionally replacing them, allowing composite cleanup logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Ports were clashing, what a sight!
Sandboxes feuding through the night.
Till clever hopping found the way—
Auto-allocate! Hip hooray!
Now dashboards bloom on every port,
The rabbit celebrates, of course! 🎊

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat(onboard): auto-allocate dashboard port for multi-sandbox' clearly captures the main feature—auto-allocation of dashboard ports when onboarding multiple sandboxes.
Linked Issues check ✅ Passed The PR addresses all primary objectives from #2174: prevents onboarding failure on port conflicts, adds --control-ui-port CLI flag, auto-allocates next free port (18789–18799), persists dashboardPort, exposes ports via nemoclaw list/status, and emits actionable errors when range exhausted.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the linked issue objectives: core dashboard port logic, CLI parsing, registry persistence, port range constants, and test updates for new behavior. One minor update to an E2E test script's EXIT trap handling and a local-inference preset test assertion are supportive.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/2174-auto-allocate-dashboard-port

Comment @coderabbitai help to get the list of available commands and usage tips.

…et (#2402)

- The `messaging-providers-e2e` nightly test fails at **S1: Gateway is
not serving on port 18789** because the Slack SDK hangs or crashes
during initialization
- **Root cause**: the base network policy (`openclaw-sandbox.yaml`)
includes Telegram and Discord rules but **not Slack** — Slack access
requires the `slack.yaml` preset, which onboard applies in Step 8
*after* the sandbox container starts in Step 6. By the time the preset
is applied, the Slack SDK has already hung on a blocked connection or
crashed on a CONNECT 403
- **Fix (E2E test)**: pre-merge the Slack network policy rules into the
base policy BEFORE `install.sh` runs, so the sandbox is created with
Slack access from the start. The SDK gets a fast `invalid_auth`
rejection, the channel guard catches it, and the gateway serves normally
- **Fix (guard)**: add `slack.com` domain check to `isSlackRejection()`
so proxy CONNECT tunnel failures (e.g. `CONNECT tunnel to
api.slack.com:443 failed with status 403`) are also caught — these
errors lack `@slack/` in the stack trace because they originate from the
HTTP client, not the SDK

Fixes the nightly failure:
https://github.com/NVIDIA/NemoClaw/actions/runs/24866963508/job/72805036431

| File | What |
|------|------|
| `test/e2e/test-messaging-providers.sh` | Pre-merge Slack policy into
base policy before `install.sh`; remove broken Phase 1b restart
(`openshell sandbox restart` does not exist); fail fast on missing
policy files |
| `scripts/nemoclaw-start.sh` | Add `slack.com` domain check to
`isSlackRejection()` for proxy/network errors |
| `test/nemoclaw-start.test.ts` | Update guard detection test to cover
domain-based matching |
| `test/local-slack-auth-test.sh` | Add T3b scenario for CONNECT tunnel
failure to `api.slack.com` |

- [ ] Re-run the `nightly-e2e` workflow — S1 and S2 should now pass
- [ ] Verify the Slack policy pre-merge check correctly appends rules
(idempotent: skips if `api.slack.com` already present)
- [ ] Verify `npm test` passes (unit test updated for domain-based guard
detection)
- [ ] Run `test/local-slack-auth-test.sh` to verify T3b (CONNECT tunnel
failure) is caught by the guard

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

* **New Features**
* Injects a Slack network policy into the sandbox during setup when
missing.

* **Bug Fixes**
* Expanded safety net to classify/log Slack-related network errors
(domain-referenced rejections) without terminating the process.

* **Tests**
* Added/updated tests to verify sandbox policy injection, Slack
connection handling, and updated preset expectations to include
additional runtime tools (node, curl, python3).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/lib/onboard.ts (2)

4020-4043: ⚠️ Potential issue | 🟠 Major

Propagate the selected port into the current onboarding session.

actualDashboardPort is persisted, but the rest of this run still derives dashboard URLs from process.env.CHAT_UI_URL / CONTROL_UI_PORT. On the first auto-allocation, the post-onboard banner can still tell the user to open the wrong port.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 4020 - 4043, The onboard flow persists
actualDashboardPort but still builds post-onboard URLs from
process.env.CHAT_UI_URL / CONTROL_UI_PORT; update the session state after
computing actualDashboardPort so later URL generation uses the allocated port:
set the in-memory chatUiUrl/dashboardUrl variables (or mutate the onboarding
session object) to reflect actualDashboardPort and update
process.env.CONTROL_UI_PORT or process.env.CHAT_UI_URL equivalents so functions
that build the banner/URLs (where chatUiUrl and CONTROL_UI_PORT are used) pick
up the allocated port; locate the code around ensureDashboardForward,
actualDashboardPort, and registry.registerSandbox to perform this propagation.

6938-6950: ⚠️ Potential issue | 🟠 Major

Apply --control-ui-port before preflight, not after it.

The new flag only reaches createSandbox(). Preflight still hard-checks the default dashboard port, so nemoclaw onboard --control-ui-port <free-port> can still fail early if 18789 is occupied. That makes the override unable to bypass the conflict it was added for.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 6938 - 6950, Preflight is still validating
the default dashboard port (18789) before the CLI flag is applied, so move the
application of opts.controlUiPort to run before preflight: ensure the port
override (opts.controlUiPort) is assigned to the same variable or config that
preflight uses (or pass it into the preflight call) so the preflight check uses
the user-specified port instead of the hardcoded 18789; specifically, apply
opts.controlUiPort to the dashboard port variable/config prior to invoking the
preflight routine and keep passing it into createSandbox(...) as currently done.
🧹 Nitpick comments (3)
src/lib/onboard-command.test.ts (1)

223-290: Add one regression test for precedence over CHAT_UI_URL.

The new tests validate parsing, but not the contract that --control-ui-port should win over env-driven dashboard URL/port selection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard-command.test.ts` around lines 223 - 290, Add a regression
test asserting that CLI --control-ui-port takes precedence over the CHAT_UI_URL
env var: call parseOnboardArgs (or runOnboardCommand) with env containing
CHAT_UI_URL set to a URL with a different port and provide "--control-ui-port"
on the args; then assert the returned result.controlUiPort equals the CLI port
(not the port parsed from CHAT_UI_URL) and that runOnboardCommand's logged/used
port matches the CLI value; reference parseOnboardArgs and runOnboardCommand to
locate where to add the test.
src/lib/inventory-commands.ts (1)

127-129: Prefer explicit null checks for dashboardPort rendering.

Using truthiness works for current ports, but != null is clearer and avoids accidental suppression if semantics ever expand.

♻️ Proposed refactor
-    if (sb.dashboardPort) {
+    if (sb.dashboardPort != null) {
       log(`      dashboard: http://127.0.0.1:${sb.dashboardPort}/`);
     }
@@
-      const portSuffix = sb.dashboardPort ? ` :${sb.dashboardPort}` : "";
+      const portSuffix = sb.dashboardPort != null ? ` :${sb.dashboardPort}` : "";

Also applies to: 158-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/inventory-commands.ts` around lines 127 - 129, Replace the truthy
check for sb.dashboardPort with an explicit null/undefined check to avoid
accidental suppression of valid falsy port values; update the condition in the
block that logs the dashboard URL (currently using "if (sb.dashboardPort)") to
"if (sb.dashboardPort != null)" and apply the same change to the other similar
check around the later logging (the second occurrence at the block referenced by
lines ~158-159) so both use explicit != null checks on sb.dashboardPort.
src/lib/ports.ts (1)

43-46: Avoid duplicating the default dashboard port literal.

DASHBOARD_PORT_RANGE_START duplicates SANDBOX_DASHBOARD_PORT (18789). Deriving from the existing constant avoids future drift.

♻️ Proposed refactor
-export const DASHBOARD_PORT_RANGE_START = 18789;
+export const DASHBOARD_PORT_RANGE_START = SANDBOX_DASHBOARD_PORT;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/ports.ts` around lines 43 - 46, The DASHBOARD_PORT_RANGE_START
constant duplicates the SANDBOX_DASHBOARD_PORT numeric literal; change
DASHBOARD_PORT_RANGE_START to derive from the existing SANDBOX_DASHBOARD_PORT
constant (e.g., set DASHBOARD_PORT_RANGE_START = SANDBOX_DASHBOARD_PORT) so the
default port is defined in one place; update any exports or references to
continue using DASHBOARD_PORT_RANGE_START and ensure SANDBOX_DASHBOARD_PORT is
in scope (import or reference the same module) so there’s no duplicated literal.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard-command.ts`:
- Around line 92-109: The CLI flag parsing correctly sets controlUiPort but
downstream onboarding still lets process.env.CHAT_UI_URL override it; update the
chatUiUrl construction in onboard.ts so that when the parsed controlUiPort (from
onboard-command.ts) is non-null it takes precedence and is used to build the
chat UI URL, otherwise fall back to process.env.CHAT_UI_URL and then the
existing default; locate the chatUiUrl assignment and replace the current
`process.env.CHAT_UI_URL || ...` precedence with a conditional that uses
controlUiPort first (e.g., check controlUiPort !== null) so the CLI flag cannot
be bypassed by the environment variable.

In `@src/lib/onboard.ts`:
- Around line 6212-6224: Existing forwards owned by the same sandbox are not
cleaned up; after computing existingForwards (from
runCaptureOpenshell(["forward","list"])) and before starting the new forward,
iterate the parsed existing forwards to find entries whose owner equals
sandboxName and call runOpenshell(["forward","stop", port], { ignoreError: true
}) for each stale port (skip stopping the port that will be used as actualPort
if you already handle it separately). Update the block around
existingForwards/actualPort/getDashboardForwardTarget and the runOpenshell
stop/start calls so all forwards owned by sandboxName (except the desired active
port) are stopped before starting the new background forward. Ensure you use the
existing runOpenshell helper, respect ignoreError: true, and reference
sandboxName, existingForwards, actualPort, runOpenshell and runCaptureOpenshell
when locating the code to change.
- Around line 3368-3370: The port-selection logic currently uses CHAT_UI_URL
before the CLI flag and never reads a persisted dashboardPort; change it so
effectivePort follows the desired precedence: if controlUiPort is set use it;
else if process.env.CHAT_UI_URL is set use that URL as chatUiUrl and parse its
port into effectivePort; else if agent?.dashboardPort use that; else if
agent?.forwardPort use that; otherwise fallback to CONTROL_UI_PORT. Ensure
chatUiUrl is the raw CHAT_UI_URL when provided, otherwise build it from the
computed effectivePort. Update references to effectivePort, chatUiUrl,
controlUiPort, process.env.CHAT_UI_URL, agent.dashboardPort and
agent.forwardPort accordingly.

In `@src/lib/registry.ts`:
- Line 25: The registerSandbox routine is building and persisting a sandbox
object but omits the dashboardPort property so any provided dashboardPort is
lost; update the object constructed/returned in registerSandbox to include
dashboardPort (preserve null/undefined as appropriate) and ensure the persisted
payload written by registerSandbox includes the dashboardPort field (update the
create/update payload and any mappings) so the stored sandbox record contains
the dashboardPort value.

In `@test/policies.test.ts`:
- Around line 167-172: The test "local-inference preset restricts binaries to
expected set" in policies.test.ts is missing an assertion for the alternate Node
path; update the test that calls policies.loadPreset("local-inference") to also
assert that the returned content contains "/usr/bin/node" (in addition to the
existing checks for "/usr/local/bin/node", "/usr/local/bin/openclaw", and
"/usr/local/bin/claude") so the allowed apt-based Node binary path is covered.

---

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 4020-4043: The onboard flow persists actualDashboardPort but still
builds post-onboard URLs from process.env.CHAT_UI_URL / CONTROL_UI_PORT; update
the session state after computing actualDashboardPort so later URL generation
uses the allocated port: set the in-memory chatUiUrl/dashboardUrl variables (or
mutate the onboarding session object) to reflect actualDashboardPort and update
process.env.CONTROL_UI_PORT or process.env.CHAT_UI_URL equivalents so functions
that build the banner/URLs (where chatUiUrl and CONTROL_UI_PORT are used) pick
up the allocated port; locate the code around ensureDashboardForward,
actualDashboardPort, and registry.registerSandbox to perform this propagation.
- Around line 6938-6950: Preflight is still validating the default dashboard
port (18789) before the CLI flag is applied, so move the application of
opts.controlUiPort to run before preflight: ensure the port override
(opts.controlUiPort) is assigned to the same variable or config that preflight
uses (or pass it into the preflight call) so the preflight check uses the
user-specified port instead of the hardcoded 18789; specifically, apply
opts.controlUiPort to the dashboard port variable/config prior to invoking the
preflight routine and keep passing it into createSandbox(...) as currently done.

---

Nitpick comments:
In `@src/lib/inventory-commands.ts`:
- Around line 127-129: Replace the truthy check for sb.dashboardPort with an
explicit null/undefined check to avoid accidental suppression of valid falsy
port values; update the condition in the block that logs the dashboard URL
(currently using "if (sb.dashboardPort)") to "if (sb.dashboardPort != null)" and
apply the same change to the other similar check around the later logging (the
second occurrence at the block referenced by lines ~158-159) so both use
explicit != null checks on sb.dashboardPort.

In `@src/lib/onboard-command.test.ts`:
- Around line 223-290: Add a regression test asserting that CLI
--control-ui-port takes precedence over the CHAT_UI_URL env var: call
parseOnboardArgs (or runOnboardCommand) with env containing CHAT_UI_URL set to a
URL with a different port and provide "--control-ui-port" on the args; then
assert the returned result.controlUiPort equals the CLI port (not the port
parsed from CHAT_UI_URL) and that runOnboardCommand's logged/used port matches
the CLI value; reference parseOnboardArgs and runOnboardCommand to locate where
to add the test.

In `@src/lib/ports.ts`:
- Around line 43-46: The DASHBOARD_PORT_RANGE_START constant duplicates the
SANDBOX_DASHBOARD_PORT numeric literal; change DASHBOARD_PORT_RANGE_START to
derive from the existing SANDBOX_DASHBOARD_PORT constant (e.g., set
DASHBOARD_PORT_RANGE_START = SANDBOX_DASHBOARD_PORT) so the default port is
defined in one place; update any exports or references to continue using
DASHBOARD_PORT_RANGE_START and ensure SANDBOX_DASHBOARD_PORT is in scope (import
or reference the same module) so there’s no duplicated literal.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8b3e74e3-4d7b-462d-8158-a669dade4f64

📥 Commits

Reviewing files that changed from the base of the PR and between 3275f97 and 5627080.

📒 Files selected for processing (8)
  • src/lib/inventory-commands.ts
  • src/lib/onboard-command.test.ts
  • src/lib/onboard-command.ts
  • src/lib/onboard.ts
  • src/lib/ports.ts
  • src/lib/registry.ts
  • test/onboard.test.ts
  • test/policies.test.ts

Comment thread src/lib/onboard-command.ts
Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard.ts
Comment thread src/lib/registry.ts
Comment thread test/policies.test.ts Outdated
@ericksoa ericksoa self-assigned this Apr 24, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-messaging-providers.sh`:
- Line 217: The new trap replaces the existing EXIT handler (registered by
sandbox-teardown.sh) and prevents sandbox teardown; preserve and chain the
existing handler instead of overwriting it by reading the current EXIT trap
(trap -p EXIT) into a variable and installing a new EXIT trap that first invokes
the original handler and then performs the policy cleanup (using BASE_POLICY_BAK
and BASE_POLICY), or wrap the cleanup in a function that calls the saved
original handler before running cp "$BASE_POLICY_BAK" "$BASE_POLICY" && rm -f
"$BASE_POLICY_BAK".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: eec121bf-ed86-40f3-8f5d-c40632547d93

📥 Commits

Reviewing files that changed from the base of the PR and between 5627080 and 623c438.

📒 Files selected for processing (5)
  • scripts/nemoclaw-start.sh
  • test/e2e/test-messaging-providers.sh
  • test/local-slack-auth-test.sh
  • test/nemoclaw-start.test.ts
  • test/policies.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/policies.test.ts

Comment thread test/e2e/test-messaging-providers.sh Outdated
- Fix --control-ui-port precedence over CHAT_UI_URL env var
- Consult registry dashboardPort for resumed sandboxes
- Update chatUiUrl after auto-allocation for correct banner
- Pass controlUiPort to preflight port check
- Use explicit != null checks for dashboardPort rendering
- Derive DASHBOARD_PORT_RANGE_START from constant

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard.ts (1)

2842-2864: ⚠️ Potential issue | 🔴 Critical

Auto-allocation never gets a chance in preflight.

_preflightDashboardPort is documented as null => skip dashboard port check, but const dashboardPortToCheck = _preflightDashboardPort ?? DASHBOARD_PORT; turns that back into 18789. The default path still hard-fails in preflight when 18789 is busy, so onboarding never reaches ensureDashboardForward(). The hard-coded DASHBOARD_PORT checks below also miss non-default --control-ui-port values.

Suggested fix
-  const dashboardPortToCheck = _preflightDashboardPort ?? DASHBOARD_PORT;
+  const dashboardPortToCheck = _preflightDashboardPort;
   const requiredPorts = [
     { port: GATEWAY_PORT, label: "OpenShell gateway" },
     ...(dashboardPortToCheck !== null
       ? [{ port: dashboardPortToCheck, label: "NemoClaw dashboard" }]
       : []),
   ];
@@
-      if ((port === GATEWAY_PORT || port === DASHBOARD_PORT) && gatewayReuseState === "healthy") {
+      if (
+        (port === GATEWAY_PORT || port === dashboardPortToCheck) &&
+        gatewayReuseState === "healthy"
+      ) {
         console.log(`  ✓ Port ${port} already owned by healthy NemoClaw runtime (${label})`);
         continue;
       }
@@
-      if (port === DASHBOARD_PORT && portCheck.process === "ssh" && portCheck.pid) {
+      if (port === dashboardPortToCheck && portCheck.process === "ssh" && portCheck.pid) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2842 - 2864, The preflight logic forces a
dashboard port check by computing dashboardPortToCheck = _preflightDashboardPort
?? DASHBOARD_PORT which converts a documented null (meaning "skip check") into
the default DASHBOARD_PORT and prevents auto-allocation and non-default
--control-ui-port handling; change the logic so that when
_preflightDashboardPort is strictly null you do not include a dashboard entry in
requiredPorts (i.e. set dashboardPortToCheck to null when
_preflightDashboardPort === null, or build requiredPorts conditionally), and
ensure checks that compare against DASHBOARD_PORT/GATEWAY_PORT use the actual
requested port variable (dashboardPortToCheck or the explicit control-ui port)
when deciding the gatewayReuseState and SSH cleanup branches; update references
in this block around dashboardPortToCheck, requiredPorts, checkPortAvailable and
any comparisons to DASHBOARD_PORT so preflight will skip the dashboard check
when _preflightDashboardPort is null and honor non-default control-ui-port
values before calling ensureDashboardForward().
♻️ Duplicate comments (1)
src/lib/onboard.ts (1)

6228-6241: ⚠️ Potential issue | 🟠 Major

Stop this sandbox's old forwards before starting the new one.

If this sandbox already owns a different forwarded port, only stopping actualPort leaves the old forward behind. That leaks stale dashboard URLs and can eventually consume the whole 18789-18799 pool.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 6228 - 6241, The code currently only stops
the forward at actualPort, leaving any other forwards owned by this sandbox
running and leaking ports; modify the flow around runOpenshell/forward to first
parse existingForwards (the result of runCaptureOpenshell(["forward", "list"]))
and stop any forwards whose owner equals sandboxName and whose port !==
String(actualPort) before starting the new forward; reference
findAvailableDashboardPort, existingForwards, sandboxName, actualPort and use
runOpenshell(["forward","stop", <port>], { ignoreError: true }) for each stale
port to ensure all previous forwards for this sandbox are cleaned up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 6182-6219: The parser in getOccupiedPorts currently treats any
forward list line with a numeric port as occupied, causing stopped/stale
forwards to block ports; update getOccupiedPorts (or switch to the canonical
parser from sandbox-session-state.ts) to only include entries whose status is
"running" (i.e., filter out stopped entries) before mapping port → owner so
findAvailableDashboardPort only considers active forwards within
DASHBOARD_PORT_RANGE_START..END.
- Around line 4034-4037: The local rewrite of chatUiUrl (using
actualDashboardPort and getDashboardForwardPort) doesn't affect what
printDashboard() / getDashboardAccessInfo() output because they read
process.env.CHAT_UI_URL or the default port; update the shared source of truth
instead: after computing actualDashboardPort and chatUiUrl, set
process.env.CHAT_UI_URL (or the persisted dashboardPort variable used by
getDashboardAccessInfo/printDashboard) to reflect the new URL/port so subsequent
calls to getDashboardAccessInfo() and printDashboard() produce links with the
allocated port; reference chatUiUrl, actualDashboardPort,
getDashboardForwardPort, process.env.CHAT_UI_URL, getDashboardAccessInfo, and
printDashboard when making the change.
- Around line 3377-3383: The code picks effectivePort early but discovers
actualDashboardPort only after sandbox creation, causing the UI URL to point to
a port with no listener; before building the sandbox (and before calling
patchStagedDockerfile(), setting sandbox env, and the readiness probe), resolve
and lock the final dashboard port (compute actualDashboardPort using the same
logic that handles forwardPort conflicts and registry.getSandbox(sandboxName)
resolution), then set effectivePort/chatUiUrl to that resolved
actualDashboardPort so patchStagedDockerfile(), sandbox environment variables,
and readiness probe all use the final allocated port rather than a provisional
one; update references to persistedPort, effectivePort, chatUiUrl, and usages in
patchStagedDockerfile()/readiness probe to use this precomputed
actualDashboardPort.

---

Outside diff comments:
In `@src/lib/onboard.ts`:
- Around line 2842-2864: The preflight logic forces a dashboard port check by
computing dashboardPortToCheck = _preflightDashboardPort ?? DASHBOARD_PORT which
converts a documented null (meaning "skip check") into the default
DASHBOARD_PORT and prevents auto-allocation and non-default --control-ui-port
handling; change the logic so that when _preflightDashboardPort is strictly null
you do not include a dashboard entry in requiredPorts (i.e. set
dashboardPortToCheck to null when _preflightDashboardPort === null, or build
requiredPorts conditionally), and ensure checks that compare against
DASHBOARD_PORT/GATEWAY_PORT use the actual requested port variable
(dashboardPortToCheck or the explicit control-ui port) when deciding the
gatewayReuseState and SSH cleanup branches; update references in this block
around dashboardPortToCheck, requiredPorts, checkPortAvailable and any
comparisons to DASHBOARD_PORT so preflight will skip the dashboard check when
_preflightDashboardPort is null and honor non-default control-ui-port values
before calling ensureDashboardForward().

---

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 6228-6241: The code currently only stops the forward at
actualPort, leaving any other forwards owned by this sandbox running and leaking
ports; modify the flow around runOpenshell/forward to first parse
existingForwards (the result of runCaptureOpenshell(["forward", "list"])) and
stop any forwards whose owner equals sandboxName and whose port !==
String(actualPort) before starting the new forward; reference
findAvailableDashboardPort, existingForwards, sandboxName, actualPort and use
runOpenshell(["forward","stop", <port>], { ignoreError: true }) for each stale
port to ensure all previous forwards for this sandbox are cleaned up.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: af741e6b-030e-4647-932d-5dcef0fdf5f0

📥 Commits

Reviewing files that changed from the base of the PR and between 623c438 and 2cb1f28.

📒 Files selected for processing (3)
  • src/lib/inventory-commands.ts
  • src/lib/onboard.ts
  • src/lib/ports.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/inventory-commands.ts
  • src/lib/ports.ts

Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard.ts Outdated
Comment thread src/lib/onboard.ts Outdated
@wscurran wscurran added bug Something fails against expected or documented behavior fix labels Apr 24, 2026
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this pull request that proposes a way to fix a bug that prevents first-time config sets under unset namespaces.


Related open issues:

Merge conflicts resolved in src/lib/onboard.ts, test/onboard.test.ts,
and test/policies.test.ts. In each case, kept the feature branch's
auto-allocate additions while incorporating main's refactors
(requirePresetContent helper, regex update, ensureDashboardForward
refactoring).

CodeRabbit feedback addressed:

1. (Critical) Resolve dashboard port before building sandbox so
   CHAT_UI_URL baked into the Dockerfile, sandbox env, and readiness
   probe all use the final forwarded port. Previously effectivePort was
   provisional and actualDashboardPort was only discovered post-build,
   causing a host→container port mismatch on the conflict path.

2. Set process.env.CHAT_UI_URL after port allocation so printDashboard
   and getDashboardAccessInfo see the actual port instead of re-reading
   a stale env var or the default.

3. Clean up stale forwards owned by the same sandbox on other ports
   before starting the new forward, preventing forward leaks that
   exhaust the 18789-18799 range over time.

4. Persist dashboardPort in registerSandbox — the field existed on
   SandboxEntry but was silently dropped during serialization.

5. Add /usr/bin/node assertion to local-inference preset test.

6. Preserve existing EXIT trap handler in test-messaging-providers.sh
   so sandbox-teardown.sh cleanup is not silently discarded.

7. Filter stopped forwards in getOccupiedPorts so stale entries don't
   block port allocation or trigger false "range exhausted" errors.
   Also skips the header row and adds TypeScript type annotations.

8. Add TypeScript types to getOccupiedPorts, findAvailableDashboardPort,
   ensureDashboardForward, and controlUiPort parameter/OnboardOptions.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
src/lib/onboard.ts (3)

6794-6805: ⚠️ Potential issue | 🟠 Major

Treat only running forwards as occupied.

getOccupiedPorts() skips "stopped", but it still counts any other status or malformed row as occupied. That can falsely block allocation or trigger a bogus range-exhausted error. The canonical parser in src/lib/sandbox-session-state.ts:59-92 already models status; this helper should either reuse it or filter to status === "running" only.

Suggested fix
-    const status = (parts[4] || "").toLowerCase();
-    if (status === "stopped") continue;
+    const status = parts.slice(4).join(" ").toLowerCase();
+    if (status !== "running") continue;
     occupied.set(parts[2], parts[0]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 6794 - 6805, getOccupiedPorts currently
treats any non-"stopped" row as occupied and can falsely block ports; update
getOccupiedPorts to only mark ports occupied when the parsed status equals
"running" (i.e., replace the current check that skips "stopped" with an explicit
status === "running" guard), or alternatively reuse the canonical parser from
sandbox-session-state.ts (the parser that models status) to determine status and
only set occupied.set(parts[2], parts[0]) when status === "running".

3995-3997: ⚠️ Potential issue | 🟠 Major

Update the shared dashboard URL before these early returns.

These branches persist dashboardPort, but they never rewrite process.env.CHAT_UI_URL. printDashboard() / getDashboardAccessInfo() still read from that env var (or fall back to 18789), so reused sandboxes can still print the wrong URL after auto-allocation.

Suggested fix
   const reusedPort = ensureDashboardForward(sandboxName, chatUiUrl);
+  const parsedChatUiUrl = new URL(
+    chatUiUrl.includes("://") ? chatUiUrl : `http://${chatUiUrl}`,
+  );
+  parsedChatUiUrl.port = String(reusedPort);
+  process.env.CHAT_UI_URL = parsedChatUiUrl.toString();
   registry.updateSandbox(sandboxName, { dashboardPort: reusedPort });
   return sandboxName;

Apply the same rewrite in each early-return branch above.

Also applies to: 4025-4027, 4070-4072, 4087-4089

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3995 - 3997, The branches that call
ensureDashboardForward(sandboxName, chatUiUrl) and then
registry.updateSandbox(sandboxName, { dashboardPort: reusedPort }) return early
without updating the shared CHAT_UI_URL, so
printDashboard/getDashboardAccessInfo still read the old env var; fix by setting
process.env.CHAT_UI_URL to the new chatUiUrl (or to the computed host:port
value) immediately after registry.updateSandbox in each early-return branch
(e.g., the branches that call ensureDashboardForward and return sandboxName),
and apply the same change to the other similar early-return branches that update
dashboardPort so the environment always reflects the reused sandbox URL.

3830-3843: ⚠️ Potential issue | 🟠 Major

Honor CHAT_UI_URL when deriving the effective dashboard port.

This still ignores the env URL's port when computing preferredPort/effectivePort. If CHAT_UI_URL is http://127.0.0.1:19000, the Dockerfile/env get 19000 but the allocator and readiness probe still operate on the persisted/default port, so the sandbox can start on one port while onboarding probes another.

Suggested fix
-  const persistedPort = registry.getSandbox(sandboxName)?.dashboardPort ?? null;
-  const preferredPort = controlUiPort ?? persistedPort ?? (agent ? agent.forwardPort : CONTROL_UI_PORT);
+  const persistedPort = registry.getSandbox(sandboxName)?.dashboardPort ?? null;
+  const baseChatUiUrl =
+    process.env.CHAT_UI_URL ||
+    `http://127.0.0.1:${persistedPort ?? (agent ? agent.forwardPort : CONTROL_UI_PORT)}`;
+  const parsedChatUiUrl = new URL(
+    baseChatUiUrl.includes("://") ? baseChatUiUrl : `http://${baseChatUiUrl}`,
+  );
+  if (controlUiPort != null) {
+    parsedChatUiUrl.port = String(controlUiPort);
+  }
+  const preferredPort = Number(parsedChatUiUrl.port || CONTROL_UI_PORT);
   const earlyForwards = runCaptureOpenshell(["forward", "list"], { ignoreError: true });
   const effectivePort = findAvailableDashboardPort(sandboxName, preferredPort, earlyForwards);
   if (effectivePort !== preferredPort) {
     console.warn(`  ! Port ${preferredPort} is taken. Using port ${effectivePort} instead.`);
   }
-  let chatUiUrl =
-    controlUiPort != null
-      ? `http://127.0.0.1:${effectivePort}`
-      : process.env.CHAT_UI_URL || `http://127.0.0.1:${effectivePort}`;
+  parsedChatUiUrl.port = String(effectivePort);
+  let chatUiUrl = parsedChatUiUrl.toString();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3830 - 3843, The code ignores the port in
process.env.CHAT_UI_URL when computing preferredPort/effectivePort, causing
probes and allocator to use a different port than CHAT_UI_URL; update the logic
in onboard.ts so you parse the port from process.env.CHAT_UI_URL (e.g., new
URL(process.env.CHAT_UI_URL).port) into a variable (envPort), then include that
envPort in the preferredPort selection (change preferredPort = controlUiPort ??
envPort ?? persistedPort ?? (agent ? agent.forwardPort : CONTROL_UI_PORT)); keep
using findAvailableDashboardPort(sandboxName, preferredPort, earlyForwards) and
leave chatUiUrl assignment to use controlUiPort or process.env.CHAT_UI_URL as
before so effectivePort and the URL are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 3280-3289: The preflight uses DASHBOARD_PORT because
dashboardPortToCheck is set with nullish coalescing (_preflightDashboardPort ??
DASHBOARD_PORT), which forces a check even when no explicit port was provided;
change the assignment so dashboardPortToCheck is null when no explicit port was
supplied (e.g., set dashboardPortToCheck = _preflightDashboardPort === undefined
? null : _preflightDashboardPort) so the requiredPorts array will skip the
dashboard check in auto-allocation mode and allow
findAvailableDashboardPort()/ensureDashboardForward to pick a free port.

---

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 6794-6805: getOccupiedPorts currently treats any non-"stopped" row
as occupied and can falsely block ports; update getOccupiedPorts to only mark
ports occupied when the parsed status equals "running" (i.e., replace the
current check that skips "stopped" with an explicit status === "running" guard),
or alternatively reuse the canonical parser from sandbox-session-state.ts (the
parser that models status) to determine status and only set
occupied.set(parts[2], parts[0]) when status === "running".
- Around line 3995-3997: The branches that call
ensureDashboardForward(sandboxName, chatUiUrl) and then
registry.updateSandbox(sandboxName, { dashboardPort: reusedPort }) return early
without updating the shared CHAT_UI_URL, so
printDashboard/getDashboardAccessInfo still read the old env var; fix by setting
process.env.CHAT_UI_URL to the new chatUiUrl (or to the computed host:port
value) immediately after registry.updateSandbox in each early-return branch
(e.g., the branches that call ensureDashboardForward and return sandboxName),
and apply the same change to the other similar early-return branches that update
dashboardPort so the environment always reflects the reused sandbox URL.
- Around line 3830-3843: The code ignores the port in process.env.CHAT_UI_URL
when computing preferredPort/effectivePort, causing probes and allocator to use
a different port than CHAT_UI_URL; update the logic in onboard.ts so you parse
the port from process.env.CHAT_UI_URL (e.g., new
URL(process.env.CHAT_UI_URL).port) into a variable (envPort), then include that
envPort in the preferredPort selection (change preferredPort = controlUiPort ??
envPort ?? persistedPort ?? (agent ? agent.forwardPort : CONTROL_UI_PORT)); keep
using findAvailableDashboardPort(sandboxName, preferredPort, earlyForwards) and
leave chatUiUrl assignment to use controlUiPort or process.env.CHAT_UI_URL as
before so effectivePort and the URL are consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 082f3b3f-825d-4cc1-bcb4-cbc2e864e83c

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb1f28 and 3203407.

📒 Files selected for processing (7)
  • nemoclaw-blueprint/scripts/ws-proxy-fix.js
  • src/lib/onboard-command.test.ts
  • src/lib/onboard.ts
  • src/lib/registry.ts
  • test/e2e/test-messaging-providers.sh
  • test/onboard.test.ts
  • test/policies.test.ts
✅ Files skipped from review due to trivial changes (1)
  • nemoclaw-blueprint/scripts/ws-proxy-fix.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard-command.test.ts

Comment thread src/lib/onboard.ts
…ctive

When no explicit --control-ui-port is set, the preflight check was
still falling back to DASHBOARD_PORT (18789), which would fail if that
port was already taken by another sandbox — before findAvailableDashboardPort
ever got a chance to auto-allocate a free port. This reintroduced the
#2174 regression for the common multi-sandbox case.

Fix: default dashboardPortToCheck to null instead of DASHBOARD_PORT so
the preflight skips the dashboard port check entirely when auto-allocation
is active. The explicit --control-ui-port path still checks the
user-specified port.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/lib/onboard.ts (2)

3995-3997: ⚠️ Potential issue | 🟠 Major

Update the shared dashboard URL before these early returns.

These branches persist dashboardPort, but they never normalize chatUiUrl / process.env.CHAT_UI_URL to the port returned by ensureDashboardForward(). printDashboard() will still read the old env/default value and can announce the wrong URL for reuse and backup-abort flows.

Also applies to: 4025-4027, 4070-4072, 4087-4089

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3995 - 3997, The early-return branches call
ensureDashboardForward(sandboxName, chatUiUrl) and
registry.updateSandbox(sandboxName, { dashboardPort: reusedPort }) but never
normalize the shared CHAT_UI_URL, so printDashboard reads the old URL; after
calling ensureDashboardForward and before returning from functions that
reuse/backup-abort, update the normalized chat UI URL (e.g. reconstruct
chatUiUrl with the returned reusedPort and set process.env.CHAT_UI_URL or the
module-level shared variable) so printDashboard sees the correct URL; apply the
same change around the other reuse branches that call
ensureDashboardForward/registry.updateSandbox (the occurrences around the
reusedPort returns).

3830-3843: ⚠️ Potential issue | 🟠 Major

Keep effectivePort and CHAT_UI_URL on the same port.

When CHAT_UI_URL is set and --control-ui-port is not, Line 3843 keeps the env URL, but Lines 3834-3836 still resolve effectivePort from the registry/default path. The image is then built with one port while Line 4490 probes another, so onboarding can wait on a listener that will never exist.

Also applies to: 4488-4491

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 3830 - 3843, The code can end up building
the image with one port and using a different port for readiness because
preferredPort is chosen without considering process.env.CHAT_UI_URL; ensure
CHAT_UI_URL and effectivePort stay in sync by parsing a port from
process.env.CHAT_UI_URL when --control-ui-port is not provided and using that
port as part of preferredPort (so findAvailableDashboardPort(..., preferredPort,
...) uses the same port), or alternatively always construct chatUiUrl from the
computed effectivePort; update the logic around preferredPort,
findAvailableDashboardPort, effectivePort and chatUiUrl to pick the CHAT_UI_URL
port when present (and fall back to registry.getSandbox(...).dashboardPort,
agent.forwardPort, CONTROL_UI_PORT) so the built image and readiness probe use
the same final port.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 6815-6835: findAvailableDashboardPort currently trusts openshell
"forward list" (via getOccupiedPorts and forwardListOutput) and can return ports
already bound on the host; change the allocator to validate host-port
availability before selecting or returning a port. Specifically, in
findAvailableDashboardPort (and the related ensureDashboardForward usage), after
determining a candidate port (preferredPort or loop candidate between
DASHBOARD_PORT_RANGE_START and DASHBOARD_PORT_RANGE_END) perform a real host
check (e.g., attempt a short-lived bind/listen or OS-level port probe) and skip
any candidate that fails the host-availability check; only return or persist a
port that passes both the occupied-owner check and the host-port-availability
check, and include this validation when populating the owners/error message if
no usable port is found.

---

Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 3995-3997: The early-return branches call
ensureDashboardForward(sandboxName, chatUiUrl) and
registry.updateSandbox(sandboxName, { dashboardPort: reusedPort }) but never
normalize the shared CHAT_UI_URL, so printDashboard reads the old URL; after
calling ensureDashboardForward and before returning from functions that
reuse/backup-abort, update the normalized chat UI URL (e.g. reconstruct
chatUiUrl with the returned reusedPort and set process.env.CHAT_UI_URL or the
module-level shared variable) so printDashboard sees the correct URL; apply the
same change around the other reuse branches that call
ensureDashboardForward/registry.updateSandbox (the occurrences around the
reusedPort returns).
- Around line 3830-3843: The code can end up building the image with one port
and using a different port for readiness because preferredPort is chosen without
considering process.env.CHAT_UI_URL; ensure CHAT_UI_URL and effectivePort stay
in sync by parsing a port from process.env.CHAT_UI_URL when --control-ui-port is
not provided and using that port as part of preferredPort (so
findAvailableDashboardPort(..., preferredPort, ...) uses the same port), or
alternatively always construct chatUiUrl from the computed effectivePort; update
the logic around preferredPort, findAvailableDashboardPort, effectivePort and
chatUiUrl to pick the CHAT_UI_URL port when present (and fall back to
registry.getSandbox(...).dashboardPort, agent.forwardPort, CONTROL_UI_PORT) so
the built image and readiness probe use the same final port.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 33651536-aecb-4124-93c3-2ffed53955ab

📥 Commits

Reviewing files that changed from the base of the PR and between 3203407 and c4dd7d2.

📒 Files selected for processing (1)
  • src/lib/onboard.ts

Comment thread src/lib/onboard.ts
…rd port

- getOccupiedPorts: filter to status === "running" only, not just skip
  "stopped" — prevents unknown/malformed statuses from falsely blocking ports
- findAvailableDashboardPort: add host-port availability check via lsof so
  ports bound by non-OpenShell processes are skipped during auto-allocation
- Parse CHAT_UI_URL env port into preferredPort precedence chain so the
  allocator, Dockerfile, and readiness probe all use the same port
- Build chatUiUrl from effectivePort while preserving CHAT_UI_URL hostname
  so remote-access URLs stay correct after auto-allocation
- Set process.env.CHAT_UI_URL before returning from all 4 early-return
  branches (reuse, interactive-reuse, backup-abort, backup-throw) so
  printDashboard/getDashboardAccessInfo see the allocated port
- Add regression test: --control-ui-port parsed independently of CHAT_UI_URL
@ericksoa

Copy link
Copy Markdown
Contributor Author

Addressing remaining CodeRabbit review comments

Commit 619d182 resolves all open CodeRabbit findings from the latest review rounds. Here's the mapping:

Inline comment (new)

  • findAvailableDashboardPort host-port check (onboard.ts:6835): Added isPortBoundOnHost() using synchronous lsof -i :PORT -sTCP:LISTEN so ports bound by non-OpenShell processes are skipped during auto-allocation. Replied inline.

Duplicate comments (repeated across 3 review rounds)

  • getOccupiedPorts filter (onboard.ts:6794-6805): Changed if (status === "stopped") continueif (status !== "running") continue so only explicitly running forwards are treated as occupied.
  • Early-return branches missing process.env.CHAT_UI_URL (onboard.ts:3995, 4025, 4070, 4087): All 4 reuse/abort early-return paths now set process.env.CHAT_UI_URL before returning, so printDashboard()/getDashboardAccessInfo() display the correct port.
  • preferredPort ignoring CHAT_UI_URL port (onboard.ts:3830-3843): Now parses the port from CHAT_UI_URL into envPort and includes it in the precedence chain (controlUiPort > envPort > persistedPort > agent.forwardPort > default). chatUiUrl is built from effectivePort while preserving the hostname from CHAT_UI_URL.

Nitpick comments

  • Regression test for --control-ui-port over CHAT_UI_URL (onboard-command.test.ts): Added.
  • Explicit != null checks for dashboardPort (inventory-commands.ts): Already addressed in earlier commits (confirmed present in current code).
  • DASHBOARD_PORT_RANGE_START derived from SANDBOX_DASHBOARD_PORT (ports.ts): Already addressed in earlier commits (confirmed present in current code).

@jyaunches jyaunches self-assigned this Apr 25, 2026
@jyaunches

Copy link
Copy Markdown
Contributor

PR Review: feat(onboard): auto-allocate dashboard port for multi-sandbox

Files Changed: 9 | Lines: +304 / -43


🔴 Blockers (must fix before merge)

1. Duplicate PR #2454 — merge conflict guaranteed

PR #2454 by @cjagwani implements the identical feature (auto-allocate dashboard port, --control-ui-port flag, registry persistence, list/status display) touching nearly the same files (onboard.ts, onboard-command.ts, ports.ts, registry.ts, inventory-commands.ts). Both fix #2174. PR #2455 (E2E tests) depends explicitly on #2454.

Decision needed: Which PR to land? #2411 was opened first, but #2454 has more unit tests (6 port-allocation tests, 2 inventory tests) and extracts findFreeDashboardPort into ports.ts rather than adding to onboard.ts. They cannot co-exist.

2. Double port-conflict warning — src/lib/onboard.ts:3849 + src/lib/onboard.ts:6894

createSandbox calls findAvailableDashboardPort early (line 3847), prints "Port X is taken" (line 3849), then builds chatUiUrl with effectivePort. Later, ensureDashboardForward (line 4531) calls findAvailableDashboardPort again. In the happy path the second call returns the same port and no duplicate fires. But if port state changes between calls (rare race), the user sees the warning twice and the sandbox's baked NEMOCLAW_DASHBOARD_PORT env won't match the final forward port.

Suggested fix: Have ensureDashboardForward accept the already-resolved port directly (skip its internal findAvailableDashboardPort) when called from createSandbox, or fold the early allocation into ensureDashboardForward.


🟡 Warnings (should fix)

1. _preflightDashboardPort module-level mutable statesrc/lib/onboard.ts:330

This follows the existing NON_INTERACTIVE / RECREATE_SANDBOX pattern, but that pattern is recognized debt — new code should accept config as parameters. Since preflight() is only called from onboard(), the port could be passed as a parameter instead of using module scope.

2. opts.controlUiPort || null falsy-coercionsrc/lib/onboard.ts:7386 and :7790

|| null would coerce 0 to null. While parseOnboardArgs validates 1024–65535 so 0 can't actually reach here, using ?? null would be semantically precise and consistent with the rest of this PR.

3. Missing unit tests for port allocation logic

The onboard-command tests cover CLI flag parsing well (5 new tests), but getOccupiedPorts, isPortBoundOnHost, and findAvailableDashboardPort have no isolated unit tests. For comparison, PR #2454 includes 6 port-allocation unit tests. Coverage for: preferred port free, preferred port taken by another sandbox, range exhaustion, sandbox already owns the port.

4. test/e2e/test-messaging-providers.sh trap chaintrap -p EXIT | sed pattern-matching can break if the previous trap body contains single quotes. Worth noting as a known limitation.


🔵 Suggestions (nice to have)

1. Extract port helpers into src/lib/dashboard-ports.ts

getOccupiedPorts, isPortBoundOnHost, and findAvailableDashboardPort are pure/IO-boundary functions with clear contracts — ideal for extraction from onboard.ts (6,404+ lines). Makes unit testing straightforward without mocking the entire onboard context.

2. Status format :18790 in nemoclaw status is terse. Consider dashboard=18790 or [dashboard: 18790] to match the existing provider=XXX format.

3. Port range sizeDASHBOARD_PORT_RANGE_END = 18799 gives 11 ports (18789–18799). #2454 uses 10 ports (18789–18798). Neither is wrong, but the team should agree on the range.


✅ What's Good

  • --control-ui-port parsing — Clean validation (range, type, missing value), 5 thorough tests
  • Stale forward cleanup — Loop that stops stale forwards owned by this sandbox on other ports prevents forward leaks. Good defensive code.
  • Range exhaustion error — Actionable error listing per-port ownership and suggesting --control-ui-port. Excellent UX.
  • Safe executionisPortBoundOnHost uses runCapture(["lsof", ...]) argv-array form, no shell strings
  • Registry persistence — All 4 sandbox-reuse paths persist dashboardPort via registry.updateSandbox

@jyaunches jyaunches merged commit 8b216ea into main Apr 28, 2026
16 checks passed
@miyoungc miyoungc mentioned this pull request Apr 29, 2026
13 tasks
miyoungc added a commit that referenced this pull request Apr 29, 2026
## Summary
Refreshes the 0.0.29 documentation for user-facing changes merged in the
past 24 hours. Version metadata stays on `0.0.29`.

## Changes
- `docs/get-started/quickstart.md`, `docs/reference/commands.md`, and
`docs/reference/troubleshooting.md`: Document dashboard port
auto-allocation, `--control-ui-port`, and `nemoclaw list` dashboard URL
output from [#2411](#2411).
- `docs/inference/inference-options.md` and
`docs/inference/switch-inference-providers.md`: Document local Ollama
and local vLLM credential isolation from `OPENAI_API_KEY` from
[#2580](#2580).
- `docs/inference/inference-options.md`: Document Local NVIDIA NIM
validation behavior from
[#2505](#2505).
- `docs/reference/commands.md`: Document the cloud-only NIM status
display behavior from
[#2622](#2622).
- `docs/deployment/deploy-to-remote-gpu.md`: Clarify runtime propagation
for `NEMOCLAW_PROXY_HOST` and `NEMOCLAW_PROXY_PORT` from
[#2581](#2581).
- `docs/workspace/backup-restore.md`: Document snapshot restore symlink
handling for sandbox data paths from
[#2488](#2488).
- `docs/reference/commands.md`: Document `skill install --help` and
OpenClaw plugin-shaped directory guidance from
[#2585](#2585).

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure
- [x] AI-assisted — tool: Codex

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
  * Added `--control-ui-port` flag for explicit dashboard port control
* Implemented automatic port selection (18789–18799) when the default
port is occupied
* Clarified that local inference routes (Ollama, local vLLM) don't
require `OPENAI_API_KEY`
  * Improved dashboard URL display in list and status commands
  * Enhanced symlink handling in workspace backup restoration
  * Updated multi-sandbox quickstart and troubleshooting guidance

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
cv pushed a commit that referenced this pull request Apr 29, 2026
…2174 followup) (#2627)

## Summary

Followup to #2411 (merged). Closes the second half of #2174's title —
"leaks ghost sandbox" — that the merged fix didn't address.

When `findAvailableDashboardPort` exhausts the port range it `throw`s.
By the time `ensureDashboardForward` runs (`src/lib/onboard.ts:3830`),
the openshell sandbox has already been streamed into existence
(`streamSandboxCreate` at line 3734, ready-wait at lines 3789–3803). The
throw escapes unhandled, leaving the sandbox in `openshell sandbox list`
with no NemoClaw registry entry — exactly the drift reported in #2174.

Issue comment with the analysis:
#2174 (comment)

## Related Issue

Followup to #2174 (closed by #2411). #2411 fixed the crash; this PR
fixes the leak.

## Changes

- `src/lib/onboard.ts` — `ensureDashboardForward` takes a new `{
rollbackSandboxOnFailure?: boolean }` option. On unrecoverable
port-allocation failure (range exhaustion), the just-created openshell
sandbox is deleted, an actionable error is printed, and the process
exits with code 1 — mirroring the not-ready rollback pattern already in
place at lines 3789–3803.
- The post-create call site (line 3830) opts in
(`rollbackSandboxOnFailure: true`).
- The four reuse-path call sites (lines 3295, 3326, 3372, 3390) keep the
default (`false`) — they operate on already-existing sandboxes where
deleting on alloc failure would destroy user state.

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `npm run build:cli` + `npm run typecheck:cli` green
- [x] `npx vitest run --project cli` — 2250/2250 passed
- [x] `npx vitest run --project plugin` — 410/410 passed
- [x] `npx prek run --files src/lib/onboard.ts` — green
- [x] No new tests added: the rollback path mirrors the existing
not-ready rollback (3789–3803), which also has no dedicated unit test.
Adding test infrastructure here would exceed the minimum scope of this
followup.
- [x] No secrets, API keys, or credentials committed

## Notes for reviewer

- Diff is ~35 lines on a single file. Surgical fix matching an existing
pattern in the same function.
- Doc comment on `ensureDashboardForward` updated to describe the new
option and reference the pattern it mirrors.

## AI Disclosure

- [x] AI-assisted — tool: Claude Code

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Sandbox initialization now attempts automatic rollback if dashboard
port allocation fails, and exits safely when rollback succeeds.
* When automatic cleanup fails, users are shown clear manual-cleanup
instructions and the exact command to remove the orphaned sandbox.

* **Tests**
* Added runtime tests verifying rollback messaging covers both
successful and failed cleanup scenarios and accurate sandbox naming in
instructions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
cv pushed a commit that referenced this pull request Apr 29, 2026
)

## Summary

Adds two regression assertions to Phase 4 of `test-double-onboard.sh`
that verify the #2174 auto-allocation fix works end-to-end:

1. **Log check** — Second-sandbox onboard output contains `"is taken.
Using port"` (confirms `ensureDashboardForward` auto-allocated)
2. **List check** — `nemoclaw list` shows two distinct `dashboard:
http://127.0.0.1:<port>` entries for both sandboxes

## Related

- Guards against regression of #2174 (fix landed via #2411 + #2627)
- Supersedes #2455 (which had a stale assertion string `"allocated
port"` that doesn't match any runtime output, and conflicting workflow
changes already on main)

## Changes

- `test/e2e/test-double-onboard.sh` — 21 lines added between the
existing #849 regression check and Phase 5

## Why this supersedes #2455

| Issue | #2455 | This PR |
|-------|-------|---------|
| Assertion string | `"allocated port"` (doesn't exist in codebase) |
`"is taken. Using port"` (matches `ensureDashboardForward` line ~6600) |
| Workflow changes | Adds a `double-onboard-e2e` job that already exists
on main → merge conflict | None needed — job already runs this script |
| Branch freshness | 95 commits behind main | Based on current main |

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `bash -n test/e2e/test-double-onboard.sh` — syntax check passes
- [x] shellcheck clean
- [x] No workflow changes needed — `double-onboard-e2e` nightly job
already runs this script
- [x] No secrets, API keys, or credentials committed

## AI Disclosure
- [x] AI-assisted — tool: Claude (pi agent)

---
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end test assertions for multi-sandbox onboarding
scenarios
* Validates dashboard port auto-allocation prevents conflicts and
maintains isolation across concurrent sandbox environments

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…#2411)

## Summary

Fixes NVIDIA#2174. When onboarding a second sandbox while the first holds port
18789, the dashboard forward now auto-allocates the next free port
instead of crashing.

## Related Issue

Fixes NVIDIA#2174

## Changes

- **`--control-ui-port <N>` CLI flag**: Explicit port override (takes
precedence over `CHAT_UI_URL` env var and defaults)
- **Auto-allocation**: When the preferred port is taken by another
sandbox, scans 18789-18799 for the next free port and warns the user
- **Registry persistence**: `dashboardPort` stored in `sandboxes.json`
so it survives across sessions and `--resume` flows
- **`nemoclaw list`**: Shows `dashboard: http://127.0.0.1:{port}/` per
sandbox
- **`nemoclaw status`**: Shows `:port` suffix per sandbox line
- **Range exhaustion**: Throws with actionable error listing each port's
owner and suggesting `--control-ui-port` with a port outside the range
- **Pre-existing test fix**: Updated `local-inference` preset test to
match current YAML (node binary was added)

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `npx prek run --all-files` passes
- [x] `npm test` passes (CLI project: 1965/1965)
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed

## AI Disclosure

- [x] AI-assisted — tool: Claude Code

---
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added `--control-ui-port` CLI flag to specify the dashboard port
during sandbox onboarding
* Dashboard port information now displayed in `nemoclaw status` and
`nemoclaw list` commands

* **Improvements**
* Enhanced dashboard port allocation to automatically find the next
available port instead of failing when the preferred port is occupied
  * Improved port management and persistence across sandbox operations
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Julie Yaunches <jyaunches@nvidia.com>
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
## Summary
Refreshes the 0.0.29 documentation for user-facing changes merged in the
past 24 hours. Version metadata stays on `0.0.29`.

## Changes
- `docs/get-started/quickstart.md`, `docs/reference/commands.md`, and
`docs/reference/troubleshooting.md`: Document dashboard port
auto-allocation, `--control-ui-port`, and `nemoclaw list` dashboard URL
output from [NVIDIA#2411](NVIDIA#2411).
- `docs/inference/inference-options.md` and
`docs/inference/switch-inference-providers.md`: Document local Ollama
and local vLLM credential isolation from `OPENAI_API_KEY` from
[NVIDIA#2580](NVIDIA#2580).
- `docs/inference/inference-options.md`: Document Local NVIDIA NIM
validation behavior from
[NVIDIA#2505](NVIDIA#2505).
- `docs/reference/commands.md`: Document the cloud-only NIM status
display behavior from
[NVIDIA#2622](NVIDIA#2622).
- `docs/deployment/deploy-to-remote-gpu.md`: Clarify runtime propagation
for `NEMOCLAW_PROXY_HOST` and `NEMOCLAW_PROXY_PORT` from
[NVIDIA#2581](NVIDIA#2581).
- `docs/workspace/backup-restore.md`: Document snapshot restore symlink
handling for sandbox data paths from
[NVIDIA#2488](NVIDIA#2488).
- `docs/reference/commands.md`: Document `skill install --help` and
OpenClaw plugin-shaped directory guidance from
[NVIDIA#2585](NVIDIA#2585).

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure
- [x] AI-assisted — tool: Codex

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
  * Added `--control-ui-port` flag for explicit dashboard port control
* Implemented automatic port selection (18789–18799) when the default
port is occupied
* Clarified that local inference routes (Ollama, local vLLM) don't
require `OPENAI_API_KEY`
  * Improved dashboard URL display in list and status commands
  * Enhanced symlink handling in workspace backup restoration
  * Updated multi-sandbox quickstart and troubleshooting guidance

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…VIDIA#2174 followup) (NVIDIA#2627)

## Summary

Followup to NVIDIA#2411 (merged). Closes the second half of NVIDIA#2174's title —
"leaks ghost sandbox" — that the merged fix didn't address.

When `findAvailableDashboardPort` exhausts the port range it `throw`s.
By the time `ensureDashboardForward` runs (`src/lib/onboard.ts:3830`),
the openshell sandbox has already been streamed into existence
(`streamSandboxCreate` at line 3734, ready-wait at lines 3789–3803). The
throw escapes unhandled, leaving the sandbox in `openshell sandbox list`
with no NemoClaw registry entry — exactly the drift reported in NVIDIA#2174.

Issue comment with the analysis:
NVIDIA#2174 (comment)

## Related Issue

Followup to NVIDIA#2174 (closed by NVIDIA#2411). NVIDIA#2411 fixed the crash; this PR
fixes the leak.

## Changes

- `src/lib/onboard.ts` — `ensureDashboardForward` takes a new `{
rollbackSandboxOnFailure?: boolean }` option. On unrecoverable
port-allocation failure (range exhaustion), the just-created openshell
sandbox is deleted, an actionable error is printed, and the process
exits with code 1 — mirroring the not-ready rollback pattern already in
place at lines 3789–3803.
- The post-create call site (line 3830) opts in
(`rollbackSandboxOnFailure: true`).
- The four reuse-path call sites (lines 3295, 3326, 3372, 3390) keep the
default (`false`) — they operate on already-existing sandboxes where
deleting on alloc failure would destroy user state.

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `npm run build:cli` + `npm run typecheck:cli` green
- [x] `npx vitest run --project cli` — 2250/2250 passed
- [x] `npx vitest run --project plugin` — 410/410 passed
- [x] `npx prek run --files src/lib/onboard.ts` — green
- [x] No new tests added: the rollback path mirrors the existing
not-ready rollback (3789–3803), which also has no dedicated unit test.
Adding test infrastructure here would exceed the minimum scope of this
followup.
- [x] No secrets, API keys, or credentials committed

## Notes for reviewer

- Diff is ~35 lines on a single file. Surgical fix matching an existing
pattern in the same function.
- Doc comment on `ensureDashboardForward` updated to describe the new
option and reference the pattern it mirrors.

## AI Disclosure

- [x] AI-assisted — tool: Claude Code

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Sandbox initialization now attempts automatic rollback if dashboard
port allocation fails, and exits safely when rollback succeeds.
* When automatic cleanup fails, users are shown clear manual-cleanup
instructions and the exact command to remove the orphaned sandbox.

* **Tests**
* Added runtime tests verifying rollback messaging covers both
successful and failed cleanup scenarios and accurate sandbox naming in
instructions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…rd (NVIDIA#2709)

## Summary

Adds two regression assertions to Phase 4 of `test-double-onboard.sh`
that verify the NVIDIA#2174 auto-allocation fix works end-to-end:

1. **Log check** — Second-sandbox onboard output contains `"is taken.
Using port"` (confirms `ensureDashboardForward` auto-allocated)
2. **List check** — `nemoclaw list` shows two distinct `dashboard:
http://127.0.0.1:<port>` entries for both sandboxes

## Related

- Guards against regression of NVIDIA#2174 (fix landed via NVIDIA#2411 + NVIDIA#2627)
- Supersedes NVIDIA#2455 (which had a stale assertion string `"allocated
port"` that doesn't match any runtime output, and conflicting workflow
changes already on main)

## Changes

- `test/e2e/test-double-onboard.sh` — 21 lines added between the
existing NVIDIA#849 regression check and Phase 5

## Why this supersedes NVIDIA#2455

| Issue | NVIDIA#2455 | This PR |
|-------|-------|---------|
| Assertion string | `"allocated port"` (doesn't exist in codebase) |
`"is taken. Using port"` (matches `ensureDashboardForward` line ~6600) |
| Workflow changes | Adds a `double-onboard-e2e` job that already exists
on main → merge conflict | None needed — job already runs this script |
| Branch freshness | 95 commits behind main | Based on current main |

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `bash -n test/e2e/test-double-onboard.sh` — syntax check passes
- [x] shellcheck clean
- [x] No workflow changes needed — `double-onboard-e2e` nightly job
already runs this script
- [x] No secrets, API keys, or credentials committed

## AI Disclosure
- [x] AI-assisted — tool: Claude (pi agent)

---
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end test assertions for multi-sandbox onboarding
scenarios
* Validates dashboard port auto-allocation prevents conflicts and
maintains isolation across concurrent sandbox environments

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
@wscurran wscurran added area: cli Command line interface, flags, terminal UX, or output bug-fix PR fixes a bug or regression and removed NemoClaw CLI bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: cli Command line interface, flags, terminal UX, or output bug-fix PR fixes a bug or regression

Projects

None yet

4 participants