ci(nightly): migrate E2E jobs to NVIDIA self-hosted runners #3144
Switch all 33 nightly E2E jobs from ubuntu-latest (GitHub-hosted, 2 vCPU) to linux-amd64-cpu4 (NVIDIA self-hosted, 4 vCPU). Meta jobs (notify-on-failure, report-to-pr, scorecard) stay on ubuntu-latest since they only make API calls. Motivation: full sandbox onboard E2E tests spend most of their time on Docker image builds. The NVIDIA runners have more CPU and should reduce per-job runtime. The pr-self-hosted workflow already uses these runners successfully for image builds on every PR.
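The runner split described above can be sketched as a small Python function. This is only an illustration of the rule, not the workflow change itself: the runner labels and meta job names come from this PR, while the `assign_runners` helper and its list-of-names input are assumptions made for the example.

```python
# Hypothetical helper: the labels and meta job names are taken from the PR
# description; the function and its input shape are assumptions for illustration.
META_JOBS = {"notify-on-failure", "report-to-pr", "scorecard"}
GITHUB_HOSTED = "ubuntu-latest"
SELF_HOSTED = "linux-amd64-cpu4"


def assign_runners(job_names):
    """Map each job name to the runner label it should use after the migration."""
    return {
        name: GITHUB_HOSTED if name in META_JOBS else SELF_HOSTED
        for name in job_names
    }


if __name__ == "__main__":
    jobs = ["cloud-e2e", "launchable-smoke-e2e", "notify-on-failure"]
    for name, runner in assign_runners(jobs).items():
        print(f"{name}: runs-on {runner}")
```

In the real workflow this decision is expressed per job through `runs-on`; the function only makes the "meta jobs stay, E2E jobs move" rule explicit.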
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
The nightly-e2e GitHub Actions workflow updates the runner for 33 CPU-based E2E jobs.
Changes: Nightly E2E Runner Migration
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/nightly-e2e.yaml (1)
6-7: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Update stale runner descriptions in top-of-file comments.
Lines 6 and 29 still describe cloud-e2e and launchable-smoke-e2e as running on ubuntu-latest, but both now run on linux-amd64-cpu4.
✏️ Suggested comment-only fix
-# cloud-e2e Cloud inference (NVIDIA Endpoint API) on ubuntu-latest.
+# cloud-e2e Cloud inference (NVIDIA Endpoint API) on linux-amd64-cpu4.
...
-# launchable-smoke-e2e Community install path (brev-launchable-ci-cpu.sh) on ubuntu-latest.
+# launchable-smoke-e2e Community install path (brev-launchable-ci-cpu.sh) on linux-amd64-cpu4.
Also applies to: 29-30
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/nightly-e2e.yaml around lines 6 - 7, Update the stale top-of-file comments that describe runner platforms: change the descriptions for "cloud-e2e" and "launchable-smoke-e2e" (the comment lines mentioning cloud-e2e and launchable-smoke-e2e) to reflect they run on "linux-amd64-cpu4" instead of "ubuntu-latest" so the header comments match current runner configurations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Line 87: Update the header comments that still say "ubuntu-latest" to match
the actual runner labels used: change the comment entries that reference
ubuntu-latest (the header comments around the workflow name and the reusable job
descriptions) so they reflect that "cloud-e2e" and "launchable-smoke-e2e" are
running on "linux-amd64-cpu4" instead of ubuntu-latest; search for comment text
containing "ubuntu-latest" and replace or reword them to mention
"linux-amd64-cpu4" and the specific job names "cloud-e2e" and
"launchable-smoke-e2e" so the comments accurately describe the runner
configuration.
---
Outside diff comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 6-7: Update the stale top-of-file comments that describe runner
platforms: change the descriptions for "cloud-e2e" and "launchable-smoke-e2e"
(the comment lines mentioning cloud-e2e and launchable-smoke-e2e) to reflect
they run on "linux-amd64-cpu4" instead of "ubuntu-latest" so the header comments
match current runner configurations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f5b4acee-e7ae-448b-b4f8-e3151a34fed6
📒 Files selected for processing (1)
.github/workflows/nightly-e2e.yaml
CodeRabbit flagged that the header comments still referenced ubuntu-latest for cloud-e2e and launchable-smoke-e2e.
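A check for this kind of stale header comment could look like the sketch below. It is a minimal example under stated assumptions: the `find_stale_comments` helper is hypothetical, it only inspects full-line `#` comments (not trailing comments), and the sample text is made up for the demonstration rather than copied from the workflow.

```python
def find_stale_comments(workflow_text, stale_label="ubuntu-latest"):
    """Return (line_number, line) pairs for full-line comments mentioning stale_label."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.lstrip()
        # Only full-line comments are checked; real `runs-on:` keys are ignored.
        if stripped.startswith("#") and stale_label in stripped:
            hits.append((number, stripped))
    return hits


if __name__ == "__main__":
    # Illustrative sample, not the actual contents of nightly-e2e.yaml.
    sample = (
        "# cloud-e2e  Cloud inference on ubuntu-latest.\n"
        "jobs:\n"
        "  cloud-e2e:\n"
        "    runs-on: linux-amd64-cpu4\n"
    )
    for number, line in find_stale_comments(sample):
        print(f"line {number}: {line}")
```

Because the meta jobs still run on ubuntu-latest, a hit is only a candidate: each flagged comment would still need to be compared against the job it describes.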
Admin merge of the #3144 revert after PR checks passed and the branch nightly showed broad green signal on the reverted runner config.
Switch all 33 nightly E2E jobs from ubuntu-latest (GitHub-hosted, 2 vCPU) to linux-amd64-cpu4 (NVIDIA self-hosted, 4 vCPU). Meta jobs (notify-on-failure, report-to-pr, scorecard) stay on ubuntu-latest since they only make API calls.
Motivation: Full sandbox onboard E2E tests spend most of their time on Docker image builds. The NVIDIA runners have more CPU and should reduce per-job runtime. The pr-self-hosted workflow already uses these runners successfully for image builds on every PR.
Validated: The device-auth-health-e2e job was tested on linux-amd64-cpu4 during PR #3128 development and completed in ~16 minutes (vs timing out at 15m on ubuntu-latest).