Problem
When a PR touches sensitive code paths (entrypoint scripts, Dockerfile, proxy rewrite, gateway auth), there is no signal telling the reviewer which nightly E2E jobs are relevant. Reviewers must either know the full test matrix by heart or dispatch the entire nightly suite (16+ jobs, ~8 hours of runner time) on faith.
The weekend of Apr 25–27 showed the cost: Bug 2 (gateway token externalization) would have been caught by cloud-experimental-e2e Phase 5e (TUI smoke), but that job was disabled — and no review comment flagged that the PR touched the token flow and needed TUI validation.
Proposal: Two-Part Solution
Part 1: CodeRabbit path_instructions for E2E recommendations
Add path_instructions entries to .coderabbit.yaml that map file-change patterns to recommended nightly E2E jobs. CodeRabbit will surface these as review comments on every PR that touches a mapped path.
File → E2E mapping:
| File Pattern | Recommended E2E Jobs | Rationale |
|---|---|---|
| scripts/nemoclaw-start.sh, scripts/lib/sandbox-init.sh | cloud-experimental-e2e, sandbox-survival-e2e, sandbox-operations-e2e | Entrypoint changes affect every sandbox boot. Landlock/non-root execution is invisible to unit tests. |
| Dockerfile, Dockerfile.base | cloud-e2e, sandbox-survival-e2e, hermes-e2e, rebuild-openclaw-e2e | Layer ordering, permissions, baked config affect image behavior. |
| nemoclaw-blueprint/scripts/http-proxy-fix.js | cloud-e2e, inference-routing-e2e | Proxy rewrite affects all inference routing. FORWARD-mode path needs manual validation until forward-proxy-e2e exists. |
| src/lib/onboard.ts | cloud-e2e, sandbox-operations-e2e, rebuild-openclaw-e2e | Core onboarding logic. |
| src/nemoclaw.ts (status/recovery/connect functions) | sandbox-survival-e2e, sandbox-operations-e2e, skip-permissions-e2e | CLI dispatch and gateway recovery. These are the exact jobs that caught the #2398 hang. |
| src/lib/cluster-image-patch.ts, src/lib/preflight.ts | overlayfs-autofix-e2e | Docker 26+ compatibility. |
| src/lib/deploy.ts | deployment-services-e2e | Deployment lifecycle. |
| src/lib/sandbox-state.ts | snapshot-commands-e2e, rebuild-openclaw-e2e | Backup/restore/rebuild. |
| src/lib/shields*.ts | shields-config-e2e | Config mutability. |
| agents/hermes/** | hermes-e2e, rebuild-hermes-e2e | Hermes agent. |
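As a rough sketch of how two rows of this mapping might look in .coderabbit.yaml: the `reviews.path_instructions` list with `path` and `instructions` keys follows CodeRabbit's configuration schema, but the instruction wording below is illustrative only and has not been tuned against real reviews. The remaining rows would follow the same pattern.

```yaml
# .coderabbit.yaml (sketch; only two of the mapped paths are shown)
reviews:
  path_instructions:
    - path: "scripts/nemoclaw-start.sh"
      # Instruction text is an assumption, not a tested prompt.
      instructions: |
        This file is the sandbox entrypoint. When it changes, post an
        "E2E Test Recommendation" comment that recommends running these
        nightly E2E jobs before merge:
        cloud-experimental-e2e, sandbox-survival-e2e, sandbox-operations-e2e.
        Include the selective dispatch command:
        gh workflow run nightly-e2e.yaml --ref <branch> -f jobs=<comma-separated jobs>
    - path: "agents/hermes/**"
      instructions: |
        Recommend running these nightly E2E jobs before merge:
        hermes-e2e, rebuild-hermes-e2e.
```

Keeping the job names in the instruction text identical to the job IDs in nightly-e2e.yaml means the recommended command can be pasted without edits.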
CodeRabbit would generate a comment like:
🧪 E2E Test Recommendation
This PR modifies scripts/nemoclaw-start.sh. Consider running these nightly E2E jobs before merge:
sandbox-survival-e2e — gateway restart recovery
sandbox-operations-e2e — process recovery after gateway kill
cloud-experimental-e2e — Landlock + security checks
To run selectively: gh workflow run nightly-e2e.yaml --ref <branch> -f jobs=sandbox-survival-e2e,sandbox-operations-e2e
Part 2: Selective job dispatch via workflow_dispatch input
Add a jobs input to nightly-e2e.yaml that lets maintainers run a subset of nightly jobs on any branch:
on:
  schedule:
    - cron: "0 0 * * *"
  workflow_dispatch:
    inputs:
      jobs:
        description: "Comma-separated job names to run (empty = all)"
        required: false
        type: string
      ref_override:
        description: "Override ref (e.g. PR branch). Empty = triggering ref."
        required: false
        type: string
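The proposal does not show where ref_override is consumed. One plausible wiring, offered as an assumption rather than the intended design, is to pass it to the checkout step of each job and fall back to the triggering ref when it is empty. The action version shown is also an assumption.

```yaml
# Sketch of a checkout step inside a nightly job.
steps:
  - uses: actions/checkout@v4
    with:
      # inputs.ref_override is empty on scheduled runs and when the
      # input is left blank, so the triggering ref is used instead.
      ref: ${{ inputs.ref_override || github.ref }}
```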
Each job gets a conditional:
cloud-e2e:
  if: >-
    github.repository == 'NVIDIA/NemoClaw' &&
    (github.event_name != 'workflow_dispatch' ||
    inputs.jobs == '' ||
    contains(inputs.jobs, 'cloud-e2e'))
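One caveat with this conditional: when its first argument is a string, `contains` does a substring match. In the mapping above, hermes-e2e is a substring of rebuild-hermes-e2e, so dispatching with jobs=rebuild-hermes-e2e would also start hermes-e2e. A possible workaround, sketched here as a suggestion, is to wrap both sides in commas so that only whole names match:

```yaml
hermes-e2e:
  if: >-
    github.repository == 'NVIDIA/NemoClaw' &&
    (github.event_name != 'workflow_dispatch' ||
    inputs.jobs == '' ||
    contains(format(',{0},', inputs.jobs), ',hermes-e2e,'))
```

This assumes the list is passed without spaces around the commas, as in the gh workflow run examples in this issue. A value such as "a, b" would not match and would need to be normalized first.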
Maintainer workflow:
- CodeRabbit comments on a PR: "recommend running sandbox-survival-e2e, sandbox-operations-e2e"
- Maintainer runs:
gh workflow run nightly-e2e.yaml --ref pull-request/2500 -f jobs=sandbox-survival-e2e,sandbox-operations-e2e
- Only those 2 jobs run (~10 min instead of ~8 hours)
- Results visible in the Actions tab, linked back to the PR branch
Future: Part 3 (stretch) — Automated dispatch from CodeRabbit comment
A GitHub Action triggered by issue_comment that parses a /run-e2e <jobs> command from maintainers:
/run-e2e sandbox-survival-e2e,sandbox-operations-e2e
The Action would then invoke gh workflow run with the selected jobs automatically. This is lower priority since the CLI command is already fast.
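A minimal sketch of what such an Action could look like follows. The workflow name, the author_association allow-list, and the job-list validation regex are all assumptions. The pull-request/<number> ref follows the branch convention used in the maintainer example above.

```yaml
# .github/workflows/run-e2e-command.yaml (hypothetical file name)
name: run-e2e-command
on:
  issue_comment:
    types: [created]

permissions:
  actions: write   # needed to dispatch nightly-e2e.yaml

jobs:
  dispatch:
    # React only to PR comments from trusted authors that start with the command.
    if: >-
      github.event.issue.pull_request &&
      startsWith(github.event.comment.body, '/run-e2e ') &&
      contains(fromJSON('["OWNER","MEMBER","COLLABORATOR"]'), github.event.comment.author_association)
    runs-on: ubuntu-latest
    steps:
      - name: Dispatch selected nightly jobs
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
          # Passed via env rather than interpolated into the script,
          # so comment text is never evaluated as shell code.
          COMMENT_BODY: ${{ github.event.comment.body }}
          PR_NUMBER: ${{ github.event.issue.number }}
        run: |
          # Use only the first line of the comment and drop the command prefix.
          first_line="$(printf '%s\n' "$COMMENT_BODY" | head -n 1 | tr -d '\r')"
          jobs="${first_line#/run-e2e }"
          # Reject anything that is not a plain comma-separated job list.
          if ! [[ "$jobs" =~ ^[a-z0-9-]+(,[a-z0-9-]+)*$ ]]; then
            echo "Unrecognized job list: $jobs" >&2
            exit 1
          fi
          gh workflow run nightly-e2e.yaml \
            --ref "pull-request/${PR_NUMBER}" \
            -f jobs="$jobs"
```

Restricting the trigger by author_association keeps arbitrary commenters from consuming runner time. Whether that allow-list matches the project's maintainer policy would need to be confirmed.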
Implementation Steps
- Add path_instructions to .coderabbit.yaml with the file→E2E mapping
- Add inputs.jobs to nightly-e2e.yaml workflow_dispatch with per-job conditionals
- Include the gh workflow run command in recommendations
- (Stretch) /run-e2e comment-triggered Action
Context
/skill:nemoclaw-e2e-strategy