ci(nightly): restore Brev E2E workflow #3401

Merged
cv merged 13 commits into main from ci/restore-brev-nightly-e2e
May 15, 2026
@jyaunches jyaunches commented May 12, 2026

Contributor

Summary

Restores the Brev nightly E2E workflow wiring, which was reverted because the upstream repository was missing the required Brev credentials. The required BREV_API_KEY, BREV_ORG_ID, and NVIDIA_API_KEY secrets are now present in NVIDIA/NemoClaw, so the reusable workflow can run without failing at nightly startup.

Related Issue

Fixes #3350

Changes

  • Reverts the revert of ci(nightly): enable brev-e2e job with long-lived BREV_API_TOKEN #3350 to restore the Brev reusable workflow and nightly brev-e2e matrix.
  • Restores long-lived BREV_API_KEY/BREV_ORG_ID authentication for Brev CI validation.
  • Restores branch-aware checkout, CLI build, and Brev E2E harness updates for the all, messaging-providers, and full suites.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Notes:

  • Confirmed upstream repository secrets exist with gh secret list --repo NVIDIA/NemoClaw: BREV_API_KEY, BREV_ORG_ID, and NVIDIA_API_KEY.
  • git diff --check origin/main...HEAD passes.
  • npx prek run --all-files and npm test were attempted locally but did not pass because this worktree is missing generated/build artifacts and plugin dependencies (dist/, nemoclaw/dist/, json5 under plugin install); failures were unrelated to this workflow-only revert.

Signed-off-by: Julie Yaunches <jyaunches@nvidia.com>

Summary by CodeRabbit

  • Improvements

    • Enhanced E2E pipeline reliability with improved error handling and retry logic
    • Improved instance provisioning with published launchable support and fallback mechanisms
    • Upgraded authentication system
  • New Features

    • Added explicit branch selection for manual workflow dispatch

@jyaunches jyaunches self-assigned this May 12, 2026
coderabbitai Bot commented May 12, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Updated Brev E2E workflow and test harness to use long-lived API-key authentication, added published-launchable provisioning support with startup-script fallback, and refactored instance provisioning and readiness checks to be mode-aware across workflow dispatch, workflow calls, and E2E tests.

Changes

Brev E2E Workflow and Testing

| Layer | File(s) | Summary |
|---|---|---|
| Auth migration: refresh-token to API-key credentials | .github/workflows/e2e-branch-validation.yaml, vitest.config.ts | Replaced BREV_API_TOKEN with BREV_API_KEY + BREV_ORG_ID; updated workflow secrets, Brev CLI bootstrap to brev login --api-key --org-id, added retry logic for transient failures, and Vitest gating to accept either API-key variant. |
| Workflow inputs for branch and launchable selection | .github/workflows/e2e-branch-validation.yaml | Added explicit branch dispatch input, deprecated use_launchable (always true), and introduced use_published_launchable and launchable_id inputs for both workflow_dispatch and workflow_call to control published-launchable provisioning. |
| Job gating and branch resolution | .github/workflows/e2e-branch-validation.yaml | Restricted execution to upstream and known CI fork, added inputs.test_suite to concurrency grouping, and updated checkout ref precedence to prefer explicit branch input over defaulting to main. |
| E2E test provisioning configuration | test/e2e/brev-e2e.test.ts | Introduced launchable configuration constants and mode flags (USE_PUBLISHED_LAUNCHABLE, BREV_LAUNCHABLE_ID, DEFAULT_SETUP_SCRIPT_PATH) to support published-launchable and startup-script provisioning modes. |
| Instance listing hardening | test/e2e/brev-e2e.test.ts | Refactored listBrevInstances() to defensively parse brev ls --json across multiple CLI JSON shapes, return empty list on parse/shape mismatches instead of throwing. |
| Instance provisioning dispatch and mode-specific helpers | test/e2e/brev-e2e.test.ts | Refactored createBrevInstance() to dispatch to createPublishedLaunchableInstance() or createStartupScriptInstance() based on USE_PUBLISHED_LAUNCHABLE; switched error-recovery checks to spawnSync("brev", ["ls"]). |
| Mode-aware readiness checking and SSH configuration | test/e2e/brev-e2e.test.ts | Updated sshEnv() to export NEMOCLAW_RECREATE_SANDBOX=1; adjusted waitForSsh() retry counts per mode; rewrote waitForLaunchableReady() to probe published mode (CLI presence + repo) vs startup-script mode (sentinel file); improved error hints. |
| Published-launchable bootstrap and ownership repair | test/e2e/brev-e2e.test.ts | Added bootstrapLaunchable() step to remove stale build output and repair remote ownership via chown (preserving node_modules/.git); for full test suite, stops pre-baked openshell gateway before proceeding. |
| Instance naming and debug-bundle wiring | .github/workflows/e2e-branch-validation.yaml | Updated INSTANCE identifier in PR comments and debug-bundle collection to include inputs.test_suite for suite-specific tracking. |
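The "Instance listing hardening" row describes parsing `brev ls --json` defensively and returning an empty list rather than throwing. A minimal sketch of that idea follows. It is not the code from test/e2e/brev-e2e.test.ts: the function name `parseBrevInstances`, the `BrevInstance` fields, and the wrapper keys `instances` and `workspaces` are all assumptions made for illustration, since the PR does not show the actual CLI JSON shapes.

```typescript
// Hypothetical instance record; the real `brev ls --json` fields are not shown in this PR.
interface BrevInstance {
  name: string;
  status?: string;
}

// Assumed wrapper keys for the "object around an array" shape; adjust to the real CLI output.
const WRAPPER_KEYS = ["instances", "workspaces"];

// Type guard: accept only objects that carry a string `name`.
function isInstance(value: unknown): value is BrevInstance {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

// Parse CLI output without ever throwing: any parse failure or
// unrecognized shape yields an empty list.
function parseBrevInstances(raw: string): BrevInstance[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  // Shape 1: a bare array of instances.
  if (Array.isArray(parsed)) {
    return parsed.filter(isInstance);
  }
  // Shape 2: an object wrapping the array under one of the assumed keys.
  if (typeof parsed === "object" && parsed !== null) {
    const record = parsed as Record<string, unknown>;
    for (const key of WRAPPER_KEYS) {
      const candidate = record[key];
      if (Array.isArray(candidate)) {
        return candidate.filter(isInstance);
      }
    }
  }
  return [];
}
```

Returning an empty list keeps cleanup and polling loops running when the CLI output changes shape, at the cost of hiding a real parse problem, so a harness built this way would probably want to log the raw output when the fallback path is taken.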

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3350: Contains the same core changes — auth migration to BREV_API_KEY + BREV_ORG_ID, workflow inputs for published-launchable control, and E2E test refactoring for mode-aware provisioning.

Suggested labels

fix

Suggested reviewers

  • ericksoa

Poem

🐰 A launchable now in the clouds so bright,
API keys keep our CI running right,
Mode-aware readiness, ownership repair,
Bootstrap and ownership — handled with care! 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'ci(nightly): restore Brev E2E workflow' accurately reflects the main change, restoring the Brev E2E workflow and related configuration updates in the CI pipeline. |
| Linked Issues check | ✅ Passed | The PR successfully implements all coding requirements from #3350: long-lived BREV_API_KEY/BREV_ORG_ID authentication, API-key auth via brev login --api-key --org-id, repo gating for NVIDIA/NemoClaw and jyaunches/NemoClaw fork, published-launchable support with fallback, suite-specific concurrency isolation, and comprehensive E2E test harness updates. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to Brev E2E workflow restoration and related configuration: e2e-branch-validation.yaml (new inputs/secrets/auth), brev-e2e.test.ts (launchable provisioning modes, instance readiness checks), and vitest.config.ts (test project gating). No unrelated or out-of-scope modifications detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci/restore-brev-nightly-e2e

Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions Bot commented May 12, 2026

Contributor

E2E Advisor Recommendation

Required E2E: e2e-branch-validation/full, launchable-smoke-e2e, cloud-onboard-e2e
Optional E2E: e2e-branch-validation/all, e2e-branch-validation/messaging-providers, shields-config-e2e, dashboard-remote-bind-e2e

Dispatch hint: launchable-smoke-e2e,cloud-onboard-e2e

Auto-dispatched E2E: launchable-smoke-e2e, cloud-onboard-e2e via nightly-e2e.yaml at 3babd15b5e030ede3f23f064534aea530bef0a26 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required E2E

  • e2e-branch-validation/full (Brev CPU instance plus NVIDIA endpoint usage; roughly medium/high CI cost): Validates the changed Vitest e2e-branch-validation project configuration with Brev credentials and exercises the full install → onboard → sandbox verify → live inference → CLI path on a fresh Brev instance, covering the changed sandbox ENTRYPOINT startup path.
  • launchable-smoke-e2e (Medium; live install/onboard/inference on ubuntu-latest): Covers the launchable install-flow smoke path that the new Brev nightly workflow is meant to protect, including bootstrap, onboard, sandbox health, gateway startup, and inference through the sandbox.
  • cloud-onboard-e2e (Medium; live cloud onboarding and NVIDIA endpoint usage): Provides a direct install/onboard/sandbox security and inference.local check against a live OpenClaw sandbox, which is the highest-signal existing coverage for regressions in scripts/nemoclaw-start.sh startup behavior.

Optional E2E

  • e2e-branch-validation/all (Brev CPU instance plus live sandbox; medium/high CI cost): Useful to mirror one additional matrix leg from the new Brev nightly workflow and confirm the credential-sanitization plus telegram-injection suite still runs under the updated Brev/Vitest configuration.
  • e2e-branch-validation/messaging-providers (Brev CPU instance; medium/high CI cost): Useful to mirror the messaging-providers matrix leg from the new Brev nightly workflow and validate provider/L7 proxy behavior under the Brev launchable branch-validation path.
  • shields-config-e2e (Medium; live sandbox lifecycle): Adjacent confidence for the config permission normalization code in scripts/nemoclaw-start.sh, especially mutable-default config ownership and shields up/down behavior.
  • dashboard-remote-bind-e2e (Brev CPU instance; medium/high CI cost): Optional targeted dashboard-forward confidence because the touched ENTRYPOINT code controls dashboard port selection and gateway URL exports.

New E2E recommendations

  • Brev nightly reusable-workflow contract (high): The new brev-nightly-e2e workflow passes launchable-oriented inputs into e2e-branch-validation. Add a lightweight contract or smoke test that validates the caller/callee workflow_call input set and executes one matrix leg from a PR branch before relying on the nightly schedule.
    • Suggested test: Add a Brev nightly workflow-call contract smoke that dispatches or dry-runs one brev-nightly-e2e matrix leg against the PR branch and fails on unknown reusable-workflow inputs.
  • dashboard port boundary validation (medium): Existing E2E covers dashboard forwarding and sandbox startup, but there is no clearly targeted live E2E assertion for NEMOCLAW_DASHBOARD_PORT boundary values and fail-fast behavior in scripts/nemoclaw-start.sh.
    • Suggested test: Add an E2E scenario that onboards with explicit valid custom NEMOCLAW_DASHBOARD_PORT values and asserts invalid values below 1024 or above 65535 fail before gateway startup.
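The dashboard port recommendation above implies a valid range of 1024 to 65535 inclusive, with out-of-range values failing before gateway startup. The sketch below shows the boundary check such a test could assert against. It is illustrative only: the real validation lives in scripts/nemoclaw-start.sh and is not shown in this PR, and the function name `parseDashboardPort`, the `fallback` parameter, and the treatment of an unset value are assumptions.

```typescript
// Inclusive bounds taken from the advisor text ("below 1024 or above 65535" is invalid).
const MIN_DASHBOARD_PORT = 1024;
const MAX_DASHBOARD_PORT = 65535;

// Hypothetical helper: validate a raw NEMOCLAW_DASHBOARD_PORT value.
// `fallback` stands in for whatever default the start script uses (not stated in the PR).
function parseDashboardPort(raw: string | undefined, fallback: number): number {
  // Assumption: an unset or empty variable means "use the default".
  if (raw === undefined || raw === "") {
    return fallback;
  }
  // Reject anything that is not a plain run of decimal digits (signs, spaces, suffixes).
  if (!/^\d+$/.test(raw)) {
    throw new Error(`NEMOCLAW_DASHBOARD_PORT must be an integer, got "${raw}"`);
  }
  const port = Number(raw);
  if (port < MIN_DASHBOARD_PORT || port > MAX_DASHBOARD_PORT) {
    throw new Error(
      `NEMOCLAW_DASHBOARD_PORT must be between ${MIN_DASHBOARD_PORT} and ${MAX_DASHBOARD_PORT}, got ${port}`
    );
  }
  return port;
}
```

An E2E scenario could exercise the two boundary values (1024 and 65535) as accepted and their immediate neighbours (1023 and 65536) as rejected, which is the smallest set that pins down both edges.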

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: launchable-smoke-e2e,cloud-onboard-e2e

@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25736756033
Branch: ci/restore-brev-nightly-e2e
Requested jobs: brev-e2e
Summary: 0 passed, 0 failed, 0 skipped

Job Result
brev-e2e ⚠️ cancelled

@jyaunches

Contributor Author

Triggered selective nightly dispatch from the upstream repository for this PR branch: https://github.com/NVIDIA/NemoClaw/actions/runs/25736792266

Requested jobs: brev-e2e only.

Note: an earlier dispatch on commit 6296796 proved startup now works after adding parent workflow permissions, but exposed that the Brev matrix legs shared one reusable-workflow concurrency group and cancelled each other. Commit 0b88418 isolates concurrency by test suite; this run validates that path.
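The cancellation described in this comment comes down to how the concurrency group key is built: runs that resolve to the same key compete for one slot, so matrix legs need the suite name in the key to run side by side. The sketch below models that in plain TypeScript. The key format and the function name `concurrencyGroup` are hypothetical and are not copied from e2e-branch-validation.yaml.

```typescript
// Hypothetical model of a concurrency group key. In the real workflow this would be
// an expression in the `concurrency.group` field; the exact format is not shown in the PR.
function concurrencyGroup(workflow: string, ref: string, suite?: string): string {
  const parts = [workflow, ref];
  // Including the suite gives each matrix leg its own group.
  if (suite !== undefined && suite !== "") {
    parts.push(suite);
  }
  return parts.join("-");
}

// The three suites named in the PR description.
const suites = ["all", "messaging-providers", "full"];
const ref = "ci/restore-brev-nightly-e2e";

// Without the suite, every leg maps to one key, so the legs displace each other.
const sharedKeys = new Set(suites.map(() => concurrencyGroup("e2e-branch-validation", ref)));

// With the suite, each leg gets a distinct key and the legs can run concurrently.
const isolatedKeys = new Set(
  suites.map((suite) => concurrencyGroup("e2e-branch-validation", ref, suite))
);

console.log(`shared groups: ${sharedKeys.size}, isolated groups: ${isolatedKeys.size}`);
```

The same reasoning applies to instance names: if two legs derive the same Brev instance name, they would contend for one machine, which is presumably why later commits also made instance names suite-specific.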

@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 25736792266
Branch: ci/restore-brev-nightly-e2e
Requested jobs: brev-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
brev-e2e ❌ failure

Failed jobs: brev-e2e. Check run artifacts for logs.

@github-actions

Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 25742958066
Branch: ci/restore-brev-nightly-e2e
Requested jobs: brev-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
brev-e2e ❌ failure

Failed jobs: brev-e2e. Check run artifacts for logs.

@jyaunches

Contributor Author

Updated this PR to decouple Brev from the aggregate nightly E2E workflow.

What changed:

  • Removed brev-e2e from .github/workflows/nightly-e2e.yaml, including dispatch job list, job definition, report needs, failure issue needs, and scorecard needs.
  • Kept e2e-branch-validation.yaml as the manual/on-demand Brev validation path.
  • Kept harness hardening already found during validation: suite-specific instance names, suite-specific concurrency, launchable checkout ownership repair, and matching PR/debug instance names.

Validation run on the workflow files/tests before pushing:

  • git diff --check
  • actionlint -ignore 'label .*linux-amd64.* is unknown' .github/workflows/nightly-e2e.yaml .github/workflows/e2e-branch-validation.yaml
  • npm test -- --run test/e2e/brev-e2e.test.ts

Next step can be a follow-up PR to add a dedicated brev-e2e.yaml nightly once the launchable/gateway state issues are stable.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 94-97: The workflow-level permissions currently grant checks:
write and pull-requests: write to all jobs; remove those from the top-level
permissions block (leave only minimal like contents: read) and instead add a
job-scoped permissions block on the specific job(s) that actually need elevated
rights (e.g., the e2e job(s)). Concretely, remove checks: write and
pull-requests: write from the global permissions and add a permissions: {
checks: write, pull-requests: write } under the specific job definition(s) (the
job name(s) that perform PR/check operations) so only those jobs receive
elevated tokens.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fed6ad71-c614-44cb-84a6-d631c2def2e3

📥 Commits

Reviewing files that changed from the base of the PR and between 5e3e539 and fab3a11.

📒 Files selected for processing (2)
  • .github/workflows/e2e-branch-validation.yaml
  • .github/workflows/nightly-e2e.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e-branch-validation.yaml

Comment thread .github/workflows/nightly-e2e.yaml Outdated
@wscurran wscurran added platform: brev (Affects Brev hosted development environments) and CI/CD labels May 12, 2026
@wscurran

Contributor

@github-actions

Contributor

Brev E2E (all): FAILED on branch ci/restore-brev-nightly-e2e. See logs

@github-actions

Contributor

Brev E2E (full): FAILED on branch ci/restore-brev-nightly-e2e. See logs

@github-actions

Contributor

Brev E2E (all): FAILED on branch ci/restore-brev-nightly-e2e. See logs

@github-actions

Contributor

Brev E2E (full): FAILED on branch ci/restore-brev-nightly-e2e. See logs

Comment thread test/e2e/brev-e2e.test.ts Fixed
@jyaunches jyaunches requested a review from cv May 13, 2026 13:16
@cv cv added v0.0.42 and removed v0.0.41 labels May 14, 2026
…ly-e2e

# Conflicts:
#	.github/workflows/e2e-branch-validation.yaml
#	test/e2e/brev-e2e.test.ts
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25836210298
Target ref: e7b4a5af380201a61f8790916550de67c331b1b9
Workflow ref: main
Requested jobs: cloud-onboard-e2e,sandbox-survival-e2e,double-onboard-e2e,launchable-smoke-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
double-onboard-e2e ✅ success
launchable-smoke-e2e ✅ success
sandbox-survival-e2e ✅ success

@cv cv added v0.0.43 and removed v0.0.42 labels May 14, 2026
@github-actions

Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 25925151243
Target ref: 3babd15b5e030ede3f23f064534aea530bef0a26
Workflow ref: main
Requested jobs: launchable-smoke-e2e,cloud-onboard-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
launchable-smoke-e2e ✅ success

@cv cv merged commit c596a01 into main May 15, 2026
24 checks passed
@wscurran wscurran added area: ci (CI workflows, checks, release automation, or GitHub Actions), area: e2e (End-to-end tests, nightly failures, or validation infrastructure), and chore (Build, CI, dependency, or tooling maintenance) and removed CI/CD labels Jun 3, 2026
@jyaunches jyaunches deleted the ci/restore-brev-nightly-e2e branch June 12, 2026 13:53
Labels

  • area: ci (CI workflows, checks, release automation, or GitHub Actions)
  • area: e2e (End-to-end tests, nightly failures, or validation infrastructure)
  • chore (Build, CI, dependency, or tooling maintenance)
  • platform: brev (Affects Brev hosted development environments)

4 participants