Skip to content

test(e2e): cover dashboard port auto-alloc in nightly#2455

Closed
cjagwani wants to merge 1 commit into
mainfrom
test/e2e-double-onboard-port-alloc
Closed

test(e2e): cover dashboard port auto-alloc in nightly#2455
cjagwani wants to merge 1 commit into
mainfrom
test/e2e-double-onboard-port-alloc

Conversation

@cjagwani

@cjagwani cjagwani commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds three #2174 regression assertions to the existing test-double-onboard.sh and wires it into nightly-e2e.yaml as a new double-onboard-e2e job. Without these, a regression of the auto-alloc path (crashing second onboards or silently stealing the first sandbox's forward) would ship to main with only unit tests as a guard.

Related

Depends on #2454 (auto-alloc behavior must be on main before the assertions can pass).

Changes

  • test/e2e/test-double-onboard.sh — three new assertions added to the existing Phase 4 (second-sandbox onboard):
    • Second onboard's stdout must include allocated port (confirms auto-alloc fired)
    • nemoclaw list must show two distinct dashboard: http://... lines for the two sandboxes
    • Existing "first sandbox still alive" + "port 18789 conflict not detected" assertions preserved
  • .github/workflows/nightly-e2e.yaml — new double-onboard-e2e job that runs the script. Uses the script's built-in fake OpenAI-compatible endpoint so no NVIDIA_API_KEY is needed. Wired into notify-on-failure needs: list so a regression auto-opens a GitHub issue.

Type of Change

  • Code change (feature, bug fix, or refactor)

Verification

Notes for reviewer

AI Disclosure

  • AI-assisted — tool: Claude Code

Signed-off-by: Charan Jagwani cjagwani@nvidia.com

Summary by CodeRabbit

  • Tests

    • Added validation for concurrent sandbox onboarding with dashboard port collision detection.
  • Chores

    • Enhanced nightly automated testing with improved failure tracking and artifact collection.

Extends test/e2e/test-double-onboard.sh with three #2174 regression checks:
- Second-sandbox onboard output must log "allocated port" (auto-alloc fired)
- nemoclaw list must show two distinct dashboard ports for A and B

The existing script already exercises the two-sandbox flow (Phase 4) but
didn't assert anything port-specific — a silent regression of the auto-alloc
path would have passed. Also wires the script into nightly-e2e.yaml as a
new double-onboard-e2e job. Uses the script's fake OpenAI endpoint so no
NVIDIA_API_KEY is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7fca9582-2bbc-4ccc-b7d0-5361487290c1

📥 Commits

Reviewing files that changed from the base of the PR and between 260b237 and 65b9b36.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-double-onboard.sh

📝 Walkthrough

Walkthrough

A new nightly E2E test job is added to validate concurrent sandbox onboarding without dashboard port collisions, integrated into the existing CI failure notification mechanism using a dedicated test script and local fake OpenAI-compatible endpoint.

Changes

Cohort / File(s) Summary
CI Workflow Enhancement
.github/workflows/nightly-e2e.yaml
New double-onboard-e2e GitHub Actions job that executes concurrent onboarding test, produces artifact logs on failure, and connects to existing failure notification pipeline via needs dependency.
Test Implementation
test/e2e/test-double-onboard.sh
New bash test script that validates two sandboxes onboard simultaneously with distinct dashboard ports, asserting allocation messages and deduplicating port extraction from nemoclaw list output.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Two sandboxes hop and play,
Each claiming ports their own way,
Concurrent dance, no collisions here,
The ports stay unique, crystal clear! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test(e2e): cover dashboard port auto-alloc in nightly' accurately describes the main change: adding E2E test coverage for dashboard port auto-allocation in the nightly workflow.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch test/e2e-double-onboard-port-alloc

Comment @coderabbitai help to get the list of available commands and usage tips.

@jyaunches

Copy link
Copy Markdown
Contributor

Superseded by #2709.

This PR had two issues that made reworking it more effort than a fresh branch:

  1. Assertion string mismatch"allocated port" doesn't appear anywhere in the codebase's runtime output. The actual message from ensureDashboardForward is "is taken. Using port".
  2. Workflow changes superseded — The double-onboard-e2e job already exists on main (wired in via other PRs, fixed by fix(e2e): fix double-onboard assertion strings, CI timeout, and per-phase diagnostics #2658/fix(e2e): increase double-onboard script-level timeout from 900s to 4800s #2676), making these changes a merge conflict with no value.

#2709 carries forward the two test assertions (with the corrected grep string) on a fresh branch from current main. Thanks for the original work identifying the coverage gap, @cjagwani!

@jyaunches jyaunches closed this Apr 29, 2026
cv pushed a commit that referenced this pull request Apr 29, 2026
)

## Summary

Adds two regression assertions to Phase 4 of `test-double-onboard.sh`
that verify the #2174 auto-allocation fix works end-to-end:

1. **Log check** — Second-sandbox onboard output contains `"is taken.
Using port"` (confirms `ensureDashboardForward` auto-allocated)
2. **List check** — `nemoclaw list` shows two distinct `dashboard:
http://127.0.0.1:<port>` entries for both sandboxes

## Related

- Guards against regression of #2174 (fix landed via #2411 + #2627)
- Supersedes #2455 (which had a stale assertion string `"allocated
port"` that doesn't match any runtime output, and conflicting workflow
changes already on main)

## Changes

- `test/e2e/test-double-onboard.sh` — 21 lines added between the
existing #849 regression check and Phase 5

## Why this supersedes #2455

| Issue | #2455 | This PR |
|-------|-------|---------|
| Assertion string | `"allocated port"` (doesn't exist in codebase) |
`"is taken. Using port"` (matches `ensureDashboardForward` line ~6600) |
| Workflow changes | Adds a `double-onboard-e2e` job that already exists
on main → merge conflict | None needed — job already runs this script |
| Branch freshness | 95 commits behind main | Based on current main |

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `bash -n test/e2e/test-double-onboard.sh` — syntax check passes
- [x] shellcheck clean
- [x] No workflow changes needed — `double-onboard-e2e` nightly job
already runs this script
- [x] No secrets, API keys, or credentials committed

## AI Disclosure
- [x] AI-assisted — tool: Claude (pi agent)

---
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end test assertions for multi-sandbox onboarding
scenarios
* Validates dashboard port auto-allocation prevents conflicts and
maintains isolation across concurrent sandbox environments

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
DemianHeyGen pushed a commit to DemianHeyGen/NemoClaw that referenced this pull request Apr 30, 2026
…rd (NVIDIA#2709)

## Summary

Adds two regression assertions to Phase 4 of `test-double-onboard.sh`
that verify the NVIDIA#2174 auto-allocation fix works end-to-end:

1. **Log check** — Second-sandbox onboard output contains `"is taken.
Using port"` (confirms `ensureDashboardForward` auto-allocated)
2. **List check** — `nemoclaw list` shows two distinct `dashboard:
http://127.0.0.1:<port>` entries for both sandboxes

## Related

- Guards against regression of NVIDIA#2174 (fix landed via NVIDIA#2411 + NVIDIA#2627)
- Supersedes NVIDIA#2455 (which had a stale assertion string `"allocated
port"` that doesn't match any runtime output, and conflicting workflow
changes already on main)

## Changes

- `test/e2e/test-double-onboard.sh` — 21 lines added between the
existing NVIDIA#849 regression check and Phase 5

## Why this supersedes NVIDIA#2455

| Issue | NVIDIA#2455 | This PR |
|-------|-------|---------|
| Assertion string | `"allocated port"` (doesn't exist in codebase) |
`"is taken. Using port"` (matches `ensureDashboardForward` line ~6600) |
| Workflow changes | Adds a `double-onboard-e2e` job that already exists
on main → merge conflict | None needed — job already runs this script |
| Branch freshness | 95 commits behind main | Based on current main |

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `bash -n test/e2e/test-double-onboard.sh` — syntax check passes
- [x] shellcheck clean
- [x] No workflow changes needed — `double-onboard-e2e` nightly job
already runs this script
- [x] No secrets, API keys, or credentials committed

## AI Disclosure
- [x] AI-assisted — tool: Claude (pi agent)

---
Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Added end-to-end test assertions for multi-sandbox onboarding
scenarios
* Validates dashboard port auto-allocation prevents conflicts and
maintains isolation across concurrent sandbox environments

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>
@wscurran wscurran added area: ci CI workflows, checks, release automation, or GitHub Actions area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance feature PR adds or expands user-visible functionality and removed CI/CD feature PR adds or expands user-visible functionality labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: ci CI workflows, checks, release automation, or GitHub Actions area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants