ci(nightly): enable brev-e2e job with long-lived BREV_API_TOKEN#3350
Conversation
Wires the existing e2e-branch-validation.yaml reusable workflow into nightly-e2e.yaml as a matrix job covering the 'all' (credential- sanitization + telegram-injection), 'messaging-providers', and 'full' suites. Each matrix cell provisions its own ephemeral Brev CPU instance via the CI launchable (scripts/brev-launchable-ci-cpu.sh), runs the suite, and tears down. Previously this workflow was dispatch-only because short-lived Brev refresh tokens would silently expire and the nightly would skip all Brev tests with no signal (see NVIDIA#1638 history). A long-lived BREV_API_TOKEN now removes that failure mode. The e2e-branch-validation.yaml repo gate is re-enabled, scoped to NVIDIA/NemoClaw plus the CI fork (jyaunches/NemoClaw) for validation before the upstream secret is rotated. Launchable path: use_launchable: true (default) — the test harness uses scripts/brev-launchable-ci-cpu.sh as the startup script, so the cloud instance is pre-baked with Docker, Node 22, OpenShell CLI, and the NemoClaw repo before the E2E harness begins. Cost: ~$0.10 per Brev instance x 3 suites ~= $9/month nightly. Validation path: 1. Paste long-lived BREV_API_TOKEN into jyaunches/NemoClaw repo secrets + NVIDIA_API_KEY. 2. Dispatch nightly-e2e.yaml on the fork with jobs=brev-e2e to confirm the matrix works end-to-end. 3. Rotate secret into NVIDIA/NemoClaw, merge, monitor ~3 nightlies.
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
📝 WalkthroughWalkthroughThis PR migrates the E2E branch validation system from Brev refresh-token authentication to long-lived API keys and introduces support for published NemoClaw launchables as the default instance provisioning method, with backward-compatible startup-script fallback. The test provisioning logic is refactored to support both modes. ChangesBrev E2E Provisioning Migration
Sequence Diagram(s)sequenceDiagram
participant GitHub as GitHub Actions
participant Workflow as e2e-branch-validation job
participant BrevCLI as Brev CLI
participant BrevAPI as Brev API service
participant VM as Provisioned VM
GitHub->>Workflow: trigger e2e-branch-validation (inputs + secrets)
Workflow->>BrevCLI: install pinned brev version
Workflow->>BrevCLI: brev login --api-key <BREV_API_KEY> --org-id <BREV_ORG_ID>
BrevCLI->>BrevAPI: authenticate using API key
BrevCLI->>Workflow: brev ls (retry) -> validate credentials
Workflow->>BrevCLI: brev create --launchable <id> OR brev create --startup-script `@file`
BrevCLI->>BrevAPI: create instance request
BrevAPI->>VM: provision VM
VM->>Workflow: SSH readiness + mode-specific probes (nemoclaw/openshell or sentinel)
Workflow->>Workflow: run E2E suites
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…G_ID) The Brev CI/CD team issued a long-lived API key (bak-...) that is authenticated via 'brev login --api-key <key> --org-id <org>'. This replaces the old short-lived refresh-token-in-credentials.json hack that was added as a workaround for the brev login removal in NVIDIA#1470. Changes: - Bump Brev CLI pin v0.6.322 -> v0.6.324 (first release with --api-key / --org-id flags on brev login). - Rename secret BREV_API_TOKEN -> BREV_API_KEY (old name was a misnomer: refresh_token != api_key). Add BREV_ORG_ID secret. - Replace the manual 'printf > credentials.json' block with a proper 'brev login --api-key ... --org-id ...' invocation. - Rename dispatch override input brev_token -> brev_api_key and add brev_org_id. - Update nightly-e2e.yaml secrets passthrough to match. No test-harness changes: the Brev CLI still populates the same credentials.json the test reads from, just with an additional api_key + api_key_org_id pair alongside the access_token pair.
Now that auth works with the long-lived API key, two latent bugs in
the Brev E2E harness surface immediately:
1. 'Cannot find module ../dist/nemoclaw' when the test-harness
invokes bin/nemoclaw.js for pre-flight sandbox-name validation.
The workflow runs 'npm install --ignore-scripts' (to avoid the
git-hook installer) which skips the 'prepare' script that would
otherwise build dist/. Add an explicit 'npm run build:cli' step
after install.
2. 'listBrevInstances(...).some is not a function' inside the
cleanupLeftoverInstance beforeAll hook. The v0.6.324 'brev ls
--json' output is no longer a bare array (v0.6.322 was) \u2014
it now returns either null or an object wrapper. Make the
parser defensive about array/{workspaces}/empty shapes so
CLI version drift never breaks the harness again.
Both were latent because the workflow has been dispatch-only with
an expired refresh-token for weeks; no run has made it past auth to
exercise these paths.
Previous defensive fix (98e680e) still produced '.some is not a function' on the fork validation run, which means brev ls --json is returning a shape my guards didn't cover. Two changes: 1. vitest.config.ts: the e2e-branch-validation project was still gated on BREV_API_TOKEN (old secret name). Gate on either the new BREV_API_KEY or legacy BREV_API_TOKEN so the project is explicitly enabled under the new auth. 2. test/e2e/brev-e2e.test.ts: make listBrevInstances() fully diagnostic. Separate the brev-call and JSON.parse try blocks, log the raw output + parsed shape when we can't find an array, and check a wider set of wrapper keys (workspaces, instances, data, result, items). Guarantees an array return + tells us what the real v0.6.324 shape is for a clean future fix.
The last 3 fork validation runs silently tested main instead of the
branch the dispatch targeted. The checkout step used:
ref: ${{ env.RESOLVED_BRANCH || inputs.branch || 'main' }}
But 'branch' is only defined on workflow_call, not workflow_dispatch.
For workflow_dispatch without a PR number, both fallbacks were empty
and it used the literal 'main', ignoring --ref. All branch-level test
fixes were invisible to the runs.
Two changes:
- Add an explicit 'branch' input to workflow_dispatch so the target
branch can be named in the UI / CLI dispatch.
- Change the final fallback from 'main' to github.ref_name. That's
the ref the dispatch was triggered against (what 'gh workflow run
--ref X' passes), which is what every caller actually meant.
Missed the 3rd aggregate job. nightly-e2e.yaml has three fan-in jobs that each depend on every real E2E job via 'needs:': - notify-on-failure (updated in c9d48ac) - report-to-pr (updated in c9d48ac) - scorecard (missed) The repo-level test/validate-e2e-coverage.test.ts lints this invariant and was failing on all hosted-runner jobs (wsl-e2e, macos-e2e, regular cli project) with: Nightly E2E aggregate jobs missing real E2E jobs in needs: scorecard -> brev-e2e. Add brev-e2e to scorecard's needs list to restore the invariant.
The whole point of Brev E2E coverage is to validate the image that
customers actually get. Until now the harness provisioned a bare
Ubuntu VM and ran scripts/brev-launchable-ci-cpu.sh \u2014 a parallel
reconstruction of what the image is supposed to look like. That
bypasses the image build pipeline entirely, which means:
- Regressions in brevdev/nemoclaw-image (containerd pre-import,
systemd services, submodule SHA) are invisible to us.
- Drift between the published launchable and what customers get
is undetectable.
- The 2-min fast-boot property the image exists to deliver is
untested.
Switch to 'brev create --launchable env-3Azt0aYgVNFEuz7opyx3gscmowS'
(the public NemoClaw launchable surfaced in docs/deployment/brev-
web-ui.md) by default. The startup-script path is retained as an
opt-out fallback (USE_PUBLISHED_LAUNCHABLE=0) for validating the
setup script itself without touching the image.
Changes:
- test/e2e/brev-e2e.test.ts: new createBrevInstance() branch that
uses --launchable; new readiness probe (nemoclaw + openshell
on PATH + ~/NemoClaw present) instead of the sentinel file the
startup-script wrote but the published image does not.
- workflows/e2e-branch-validation.yaml: new use_published_launchable
+ launchable_id inputs (workflow_dispatch + workflow_call),
defaulting to the published image.
- workflows/nightly-e2e.yaml: brev-e2e matrix now explicitly
requests the published launchable.
Expected impact on run time: provisioning drops from ~5 min
(bare VM + full bootstrap) to ~2 min (pre-baked image boot).
…unchable Run NVIDIA#6 failed at refreshAndWaitForSsh with 'SSH not ready after 40 attempts' but produced NO diagnostic output between env-dump and failure. Two fixes: 1. vitest.config.ts: override silent:isCi for the e2e-branch-validation project. The whole suite is one describe block so there's no test chatter to suppress, and diagnostic output from createBrevInstance / waitForSsh / waitForLaunchableReady is essential for Brev timing debugging. Local runs already show this; CI should too. 2. waitForSsh: default attempts by mode. Bare-VM startup-script path keeps 40 attempts (~5 min). Published launchable gets 96 attempts (~12 min) because the pre-baked image boot is fast (~2 min) but Brev platform SSH-proxy registration for launchable-provisioned instances appears to lag several more minutes before brev refresh picks them up. Empirical from run NVIDIA#6: instance created but SSH never came up in the 5-min window.
Run NVIDIA#7 failed at the 'brev ls' sanity check in the Install Brev CLI step with: API key saved for org *** rpc error: code = Internal desc = context deadline exceeded Confirmed from local testing: the Brev platform's workspaces/list endpoint is currently returning 'context deadline exceeded' across MULTIPLE orgs (not just Nemoclaw CI/CD), even with working credentials. Other endpoints (brev org ls, brev login) work fine. This is a transient platform issue, not a credential or code issue. Add a 5-attempt retry loop with 15s backoff around the brev ls sanity check. On final failure, emit a ::warning:: and proceed \u2014 the test harness's own waitForSsh / refreshAndWaitForSsh calls retry brev refresh on their own, so downstream steps will still validate auth works. We only want to fail here if auth is genuinely broken (e.g., wrong key), not because Brev's backend is having a bad hour.
|
✨ Related open issues: |
# Conflicts: # .github/workflows/nightly-e2e.yaml
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/e2e-branch-validation.yaml:
- Around line 109-116: Remove the plaintext workflow_dispatch inputs
brev_api_key and brev_org_id (the input keys declared in the inputs block) and
update any steps that reference inputs.brev_api_key / inputs.brev_org_id
(including the step that runs the brev login command) to read credentials from
repository secrets (e.g., secrets.BREV_API_KEY and secrets.BREV_ORG_ID) instead;
ensure the brev login invocation no longer interpolates or echoes the inputs so
secrets are passed only via environment or --token-style secret handling, and
also remove the duplicate input declarations noted later in the file.
In @.github/workflows/nightly-e2e.yaml:
- Around line 1753-1755: Replace the hardcoded branch value in the reusable job
input (the with: branch: main) so the job runs against the dispatched ref;
update the branch input to use the workflow ref expression (e.g. ${ { github.ref
} }) instead of "main" so the reusable job uses the selected ref when invoked;
keep the test_suite input as-is.
In `@test/e2e/brev-e2e.test.ts`:
- Around line 452-460: The test uses execSync with interpolated shell command
strings built by buildLaunchableCreateCmd and buildStartupScriptCreateCmd which
allows command injection via workflow inputs (launchable_id/setup_script_url);
replace those execSync calls with execFileSync and pass the command and
arguments as an array (no shell) for the create flow and the other two execSync
usages referenced, e.g., call execFileSync(executable, [arg1, arg2], { encoding,
timeout, stdio: ... }) instead of execSync(createCmd, ...), and update
buildLaunchableCreateCmd/buildStartupScriptCreateCmd to return the executable
and args separately (or construct the arg arrays inline) while keeping
CAPTURE_STDIO/PIPE_INPUT_STDIO and other options unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: af39c084-b172-4ee7-be29-9f83d48c7764
📒 Files selected for processing (4)
.github/workflows/e2e-branch-validation.yaml.github/workflows/nightly-e2e.yamltest/e2e/brev-e2e.test.tsvitest.config.ts
Address CodeRabbit's three actionable review comments: 1. Remove plaintext workflow_dispatch credential inputs for Brev auth. GitHub Actions inputs are run metadata and are not secret-safe; the reusable workflow now reads BREV_API_KEY and BREV_ORG_ID only from repository/org secrets. 2. Pass github.ref_name from nightly-e2e into the reusable Brev workflow instead of hardcoding main. Scheduled runs still execute on main, but manual dispatches from a branch now validate the selected ref. 3. Replace workflow-input-influenced shell command strings in the Brev provisioning path with execFileSync/spawnSync argument arrays. This covers --launchable, setup_script_url downloads, the startup-script fallback, and the create-error brev ls check.
|
Addressed the three CodeRabbit findings in d7d4a07:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e/brev-e2e.test.ts (1)
571-577: 💤 Low valueConsider using
execFileSyncfor rsync to match hardenedbrev createpattern.While
REPO_DIRandINSTANCE_NAMEare controlled values (filesystem-derived and workflow-set numeric respectively), theexecSyncshell interpolation pattern differs from the hardenedexecFileSyncapproach now used increateBrevInstance. For consistency and defense in depth:♻️ Suggested refactor
- execSync( - `rsync -az --delete --exclude node_modules --exclude .git --exclude dist --exclude .venv "${REPO_DIR}/" "${INSTANCE_NAME}:${resolvedRemoteDir}/"`, - { encoding: "utf-8", timeout: 120_000 }, - ); + execFileSync( + "rsync", + [ + "-az", + "--delete", + "--exclude", "node_modules", + "--exclude", ".git", + "--exclude", "dist", + "--exclude", ".venv", + `${REPO_DIR}/`, + `${INSTANCE_NAME}:${resolvedRemoteDir}/`, + ], + { encoding: "utf-8", timeout: 120_000 }, + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/brev-e2e.test.ts` around lines 571 - 577, Replace the shell-interpolated execSync rsync call with a hardened execFileSync invocation: stop passing a single shell string and instead call execFileSync("rsync", [...args]) using the variables REPO_DIR, INSTANCE_NAME and resolvedRemoteDir to build the argument list (include -az, --delete and the --exclude entries), and preserve the same encoding and timeout options; this mirrors the safer pattern used in createBrevInstance and avoids shell interpolation while keeping the same behavior and logging around the sync step.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/e2e/brev-e2e.test.ts`:
- Around line 571-577: Replace the shell-interpolated execSync rsync call with a
hardened execFileSync invocation: stop passing a single shell string and instead
call execFileSync("rsync", [...args]) using the variables REPO_DIR,
INSTANCE_NAME and resolvedRemoteDir to build the argument list (include -az,
--delete and the --exclude entries), and preserve the same encoding and timeout
options; this mirrors the safer pattern used in createBrevInstance and avoids
shell interpolation while keeping the same behavior and logging around the sync
step.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 2572998d-c02e-4dae-8106-5d53d23f0c5a
📒 Files selected for processing (3)
.github/workflows/e2e-branch-validation.yaml.github/workflows/nightly-e2e.yamltest/e2e/brev-e2e.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
ericksoa
left a comment
There was a problem hiding this comment.
Approved. This is CI-only Brev/nightly wiring; product runtime risk is low. CodeRabbit's actionable security findings were addressed, normal PR checks are green, and the remaining Brev validation risk is the existing/known Brev platform path rather than NemoClaw product behavior.
…EN (#3350)" (#3373) ## Summary Reverts #3350 because the new `brev-e2e` reusable workflow call breaks `nightly-e2e` at startup on `main`. GitHub rejects `nightly-e2e.yaml` before jobs are created because the caller only grants `contents: read`, while the called workflow requests `checks: write` and `pull-requests: write`. Failing runs: - https://github.com/NVIDIA/NemoClaw/actions/runs/25705100581 - https://github.com/NVIDIA/NemoClaw/actions/runs/25705179290 ## Verification - `git diff --check origin/main..HEAD` - Confirmed `nightly-e2e.yaml` no longer contains the new `brev-e2e` job wiring. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **Chores** * Updated E2E testing authentication and provisioning mechanisms in CI workflows * Removed brev-e2e from nightly test pipeline * Simplified E2E test setup and instance creation logic [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3373) <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Nightly E2E started failing at workflow startup after NVIDIA#3350 merged: the runs have conclusion=startup_failure and zero jobs/logs, which means GitHub could not instantiate the workflow graph rather than tests failing. The only workflow-graph change from NVIDIA#3350 was the new brev-e2e reusable workflow call. Make that call more conservative: - pass branch: main again for nightly-e2e. Scheduled nightly runs validate main; branch-specific Brev validation should use e2e-branch-validation.yaml directly. - use secrets: inherit for the same-repository reusable workflow call instead of mapping individual secrets in the caller. The callee still declares the required BREV_API_KEY, BREV_ORG_ID, and NVIDIA_API_KEY secrets. This keeps the Brev job wired into nightly while removing the likely startup-time expression/secret mapping issue that prevented any nightly jobs from being created.
## Summary Restores the Brev nightly E2E workflow wiring that was reverted after the upstream repository was missing the required Brev credentials. The required `BREV_API_KEY`, `BREV_ORG_ID`, and `NVIDIA_API_KEY` secrets are now present in `NVIDIA/NemoClaw`, so the reusable workflow can run without failing nightly startup. ## Related Issue Fixes #3350 ## Changes - Reverts the revert of #3350 to restore the Brev reusable workflow and nightly `brev-e2e` matrix. - Restores long-lived `BREV_API_KEY`/`BREV_ORG_ID` authentication for Brev CI validation. - Restores branch-aware checkout, CLI build, and Brev E2E harness updates for the `all`, `messaging-providers`, and `full` suites. ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes - [ ] `make docs` builds without warnings (doc changes only) - [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Notes: - Confirmed upstream repository secrets exist with `gh secret list --repo NVIDIA/NemoClaw`: `BREV_API_KEY`, `BREV_ORG_ID`, and `NVIDIA_API_KEY`. - `git diff --check origin/main...HEAD` passes. - `npx prek run --all-files` and `npm test` were attempted locally but did not pass because this worktree is missing generated/build artifacts and plugin dependencies (`dist/`, `nemoclaw/dist/`, `json5` under plugin install); failures were unrelated to this workflow-only revert. --- Signed-off-by: Julie Yaunches <jyaunches@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Improvements** * Enhanced E2E pipeline reliability with improved error handling and retry logic * Improved instance provisioning with published launchable support and fallback mechanisms * Upgraded authentication system * **New Features** * Added explicit branch selection for manual workflow dispatch [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3401) <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Wires the existing
e2e-branch-validation.yamlreusable workflow into the nightly E2E pipeline as a newbrev-e2ematrix job. This turns on end-to-end coverage of the real Brev cloud provisioning + launchable bootstrap path, which today is only exercised on manualworkflow_dispatch.Why now
Previously this couldn't run unattended because short-lived Brev refresh tokens would expire silently — the nightly would happily `1 skipped, 0 passed` every night until someone noticed (see the history in #1638). A long-lived `BREV_API_TOKEN` (CI service-account-style refresh token) is now available, which removes that failure mode.
What changes
`.github/workflows/e2e-branch-validation.yaml`
```yaml
if: >-
github.repository == 'NVIDIA/NemoClaw' ||
github.repository == 'jyaunches/NemoClaw'
```
The fork is temporarily whitelisted so the long-lived token can be validated end-to-end before rotation into the upstream secret.
`.github/workflows/nightly-e2e.yaml`
Launchable coverage
`use_launchable: true` (default) — the test harness uses `scripts/brev-launchable-ci-cpu.sh` as the Brev startup script. The cloud instance comes up pre-baked with Docker, Node 22, OpenShell CLI, and the NemoClaw repo; the E2E harness then rsyncs the target branch over the launchable's clone and runs the suite.
This is distinct from the existing `launchable-smoke-e2e` job, which validates the launchable script on `ubuntu-latest` without any Brev cloud involvement.
Cost
$0.10 per Brev CPU instance × 3 suites × 30 nights ≈ **$9/month**.Validation plan (before merge)
Related
Update (2026-05-11): switch to long-lived
--api-keyauthBrev CI/CD team issued a proper long-lived API key. Commit
07687a6adswaps the auth mechanism:brev login --api-key/--org-idflags)BREV_API_TOKEN→BREV_API_KEY(old name was misleading —refresh_token≠api_key) + newBREV_ORG_IDsecretbrev login --api-key "$BREV_API_KEY" --org-id "$BREV_ORG_ID"instead of hand-writing~/.brev/credentials.jsonwith a refresh_token (original workaround for brev login removal in refactor(installer): unify host preflight and thin deploy compatibility #1470 is now obsolete)workflow_dispatchoverride inputbrev_token→brev_api_key+ newbrev_org_idNo test-harness changes required — the CLI still populates
credentials.jsonat the same path, just withapi_key+api_key_org_idfields alongside the existingaccess_token/refresh_token.Secret checklist (updated)
BREV_API_KEYBREV_ORG_IDNVIDIA_API_KEYBREV_API_TOKENSummary by CodeRabbit
Chores
Tests