
ci(e2e): auto-dispatch advised E2E jobs #3426

Merged
cv merged 15 commits into main from ci/e2e-advisor-auto-dispatch on May 13, 2026

Conversation

@cv

@cv cv commented May 12, 2026

Collaborator

Summary

Adds automatic selective E2E dispatch from the Pi E2E advisor for eligible internal NVIDIA PRs. The dispatcher keeps the trusted-code boundary by running nightly-e2e.yaml from main, passes the PR head SHA as target_ref, and derives dispatchable jobs from the target workflow instead of a hardcoded allowlist.

Changes

  • Add tools/e2e-advisor/dispatch.mts to gate auto-dispatch by repository, same-repo PR, non-draft status, OWNER/MEMBER author association, advisor confidence, and target workflow dispatchability.
  • Wire .github/workflows/e2e-advisor.yaml to run the TypeScript dispatcher with node --experimental-strip-types and include dispatch status in the sticky PR comment.
  • Extend .github/workflows/nightly-e2e.yaml with target_ref and pr_number inputs so trusted workflow definitions can test PR head SHAs and report results back to the PR.
  • Update advisor comments/docs and add unit coverage for dispatcher planning and dynamic job extraction.
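
The gates listed in the first bullet can be sketched as a pure planning function. This is a minimal illustration under stated assumptions, not the code in tools/e2e-advisor/dispatch.mts: the type names, the `checkEligibility` function, the reason strings, and the rule that only `high` confidence qualifies are all hypothetical.

```typescript
// Hypothetical shapes; the dispatcher's real types are not shown on this page.
type AuthorAssociation = "OWNER" | "MEMBER" | "COLLABORATOR" | "CONTRIBUTOR" | "NONE";

interface PullRequestFacts {
  baseRepo: string; // repository receiving the PR, as "owner/name"
  headRepo: string; // repository that holds the PR branch
  draft: boolean;
  authorAssociation: AuthorAssociation;
}

interface AdvisorFacts {
  confidence: "low" | "medium" | "high";
  requiredJobs: string[];
}

interface Eligibility {
  eligible: boolean;
  reasons: string[]; // every failed gate, so a skip can be explained in the PR comment
  jobs: string[]; // recommended jobs that the target workflow can actually dispatch
}

function checkEligibility(
  trustedRepo: string,
  pr: PullRequestFacts,
  advisor: AdvisorFacts,
  dispatchableJobs: ReadonlySet<string>,
): Eligibility {
  const reasons: string[] = [];
  if (pr.baseRepo !== trustedRepo) reasons.push("unexpected repository");
  if (pr.headRepo !== pr.baseRepo) reasons.push("fork PR");
  if (pr.draft) reasons.push("draft PR");
  if (pr.authorAssociation !== "OWNER" && pr.authorAssociation !== "MEMBER") {
    reasons.push("author is not OWNER/MEMBER");
  }
  // Assumption: only high-confidence advice is dispatched automatically.
  if (advisor.confidence !== "high") reasons.push("advisor confidence too low");

  // Jobs the target workflow cannot dispatch are dropped rather than trusted.
  const jobs = advisor.requiredJobs.filter((job) => dispatchableJobs.has(job));
  if (jobs.length === 0) reasons.push("no dispatchable jobs recommended");

  const eligible = reasons.length === 0;
  return { eligible, reasons, jobs: eligible ? jobs : [] };
}
```

Collecting every failed gate, instead of returning at the first one, keeps the skip explanation complete when several conditions fail at once.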

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional verification run:

  • npm run typecheck passes
  • npm run typecheck:cli passes
  • npx vitest run --project cli test/e2e-advisor-dispatch.test.ts passes
  • node --experimental-strip-types tools/e2e-advisor/dispatch.mts --result /tmp/does-not-exist --out-dir /tmp/e2e-advisor-dispatch-smoke smoke-tested strip-types execution
  • node --experimental-strip-types tools/e2e-advisor/comment.mts smoke-tested strip-types execution
  • Parsed .github/workflows/e2e-advisor.yaml and .github/workflows/nightly-e2e.yaml with yaml successfully

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • New Features

    • Automatic dispatch of selective E2E jobs for eligible pull requests; produces dispatch result and markdown summary artifacts.
  • Improvements

    • Longer advisor timeouts and expanded workflow dispatch permissions; PR comments include dispatch results and prefer newer comment tooling with safe fallbacks.
    • Nightly workflow can target specific refs/PRs; PR reporting resolves target PR/ref more reliably.
    • Improved advisor runner with streaming, heartbeat, and safer timeout handling.
  • Documentation

    • Expanded guidance on advisor behavior, outputs, and token/permission usage.
  • Tests

    • Added end-to-end tests for auto-dispatch planning and job selection.


@cv cv self-assigned this May 12, 2026
@copy-pr-bot

copy-pr-bot Bot commented May 12, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 12, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds an auto-dispatcher that plans and optionally triggers selective nightly E2E workflow runs for eligible PRs; integrates dispatch artifacts into the advisor PR comment tool; updates nightly workflow to accept explicit target refs and optional PR-number reporting.

Changes

E2E Advisor auto-dispatch

| Layer / File(s) | Summary |
| --- | --- |
| Dispatch planner core logic / tools/e2e-advisor/dispatch.mts | Implements CLI/bootstrap, planAutoDispatch, extractDispatchableJobs, collectRecommendedJobs, input validation, dispatchWorkflow, and dispatch summary rendering; produces JSON and Markdown dispatch artifacts. |
| PR comment dispatch result rendering / tools/e2e-advisor/comment.mts | Converts comment tool to TypeScript, adds --dispatch support, reads typed result and dispatch JSON, renders dispatchHint and Auto-dispatched E2E sections, tightens error typing, and generalizes typed GitHub request helper. |
| Dispatch planning test suite / test/e2e-advisor-dispatch.test.ts | Adds Vitest tests for extracting dispatchable jobs and for planAutoDispatch eligibility across member/collaborator/draft scenarios, ignored non-dispatchable recommendations, and target-workflow filtering. |
| Nightly workflow selective targeting / .github/workflows/nightly-e2e.yaml | Adds workflow_dispatch inputs target_ref and pr_number; updates many job checkouts to use inputs.target_ref when provided; tightens failure-reporting conditions and resolves PR number for reporting with issues: write permissions. |
| E2E advisor dispatcher wiring / .github/workflows/e2e-advisor.yaml | Grants actions: write, enumerates trusted advisor files fetched from main, adds a PR-only Auto-dispatch required E2E jobs step that runs dispatch.mts (or writes skipped artifacts), and passes dispatch output into the comment step. |
| Documentation and configuration / tools/e2e-advisor/README.md, tsconfig.cli.json | Documents auto-dispatch behavior and safety model, clarifies token usage and artifacts, and includes tools/**/*.ts/.mts in CLI tsconfig includes. |
| Pi runner async/heartbeat improvements / tools/e2e-advisor/pi-analyze.mjs | Replaces synchronous Pi invocation with an async runPi that streams stdout/stderr, emits heartbeat logs, enforces timeouts (SIGTERM→SIGKILL), and returns structured execution results; adds progress logging. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI as dispatch.mts (runner)
  participant Planner as planAutoDispatch
  participant Workflow as nightly-e2e.yaml (target)
  participant GitHub as GitHub Actions API
  CLI->>Planner: read advisorResult + event + workflowText
  Planner->>Workflow: extractDispatchableJobs(workflowText)
  Planner->>Planner: collectRecommendedJobs(advisorResult)
  Planner->>GitHub: validate inputs
  alt eligible
    Planner->>GitHub: POST /actions/workflows/{workflow}/dispatches (inputs/jobs/target_ref)
    GitHub-->>Planner: 2xx
    Planner-->>CLI: write dispatched result + summary
  else skipped/failed
    Planner-->>CLI: write skipped/failed result + summary
  end
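
The POST step in the diagram can be sketched as a request builder. The endpoint path, the `Accept` and `X-GitHub-Api-Version` headers, and the `{ ref, inputs }` body follow GitHub's documented workflow-dispatch REST call; the `buildDispatchRequest` name, the options shape, and the comma-joined `jobs` value are assumptions for illustration, not the code in dispatch.mts.

```typescript
interface DispatchOptions {
  repo: string; // "owner/name"
  workflowFile: string; // e.g. "nightly-e2e.yaml"
  trustedRef: string; // ref whose workflow definition runs, e.g. "main"
  token: string;
  jobs: string[];
  targetRef: string; // PR head SHA to test
  prNumber: number;
}

interface DispatchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds, but does not send, a workflow_dispatch request.
function buildDispatchRequest(options: DispatchOptions): DispatchRequest {
  const workflow = encodeURIComponent(options.workflowFile);
  return {
    url: `https://api.github.com/repos/${options.repo}/actions/workflows/${workflow}/dispatches`,
    method: "POST",
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${options.token}`,
      "X-GitHub-Api-Version": "2022-11-28",
    },
    // `ref` picks the trusted workflow definition; the PR code is named only
    // through inputs, which are sent here as strings.
    body: JSON.stringify({
      ref: options.trustedRef,
      inputs: {
        jobs: options.jobs.join(","), // assumed comma-separated format
        target_ref: options.targetRef,
        pr_number: String(options.prNumber),
      },
    }),
  };
}
```

Keeping `ref` separate from `target_ref` is what preserves the trusted-code boundary described in the summary: the workflow file always comes from the trusted ref, while the checkout inside each job uses the PR head SHA.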

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A rabbit hops through workflows with glee,
It reads the advisor and nods, "Let tests be,"
With ref and jobs checked, it sends the dispatch call,
Leaves summaries and comments tacked upon the wall,
Safe hops, small runs — hooray, selective E2E!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: auto-dispatching advised E2E jobs based on the PR advisor results, which is the primary objective across all modified files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread tools/e2e-advisor/dispatch.mts Fixed
Comment thread tools/e2e-advisor/dispatch.mjs Fixed
Comment thread tools/e2e-advisor/dispatch.mts Fixed
@github-actions

github-actions Bot commented May 12, 2026

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: docs-validation-e2e, network-policy-e2e

Workflow run

Full advisor summary

Pi Semantic E2E Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. This PR only modifies CI advisor tooling (e2e-advisor workflow, dispatch.mts, comment.mts, pi-analyze.mjs, README, tsconfig.cli.json) and adds backward-compatible workflow_dispatch inputs (target_ref, pr_number) plus per-job ref: overrides to nightly-e2e.yaml. No product source under src/, no installer, no sandbox/onboard/inference/credential code paths, and no E2E test scripts are touched. Existing nightly schedule and manual dispatch flows fall back to github.ref when the new inputs are empty, so user-visible behavior of every E2E job is unchanged. Security-critical pieces of the new auto-dispatch (author_association allowlist, ref/input validation, derivation of dispatchable jobs from the workflow's own predicates) are covered by the new vitest unit test test/e2e-advisor-dispatch.test.ts, which runs in normal CI. No merge-blocking E2E coverage is warranted.

Optional E2E

  • docs-validation-e2e (low): Cheapest nightly-e2e job (15min, no NVIDIA_API_KEY-heavy sandbox lifecycle). Useful as a one-shot smoke that the new target_ref/pr_number workflow_dispatch inputs and the per-job ref: ${{ inputs.target_ref || github.ref }} checkout still resolve correctly via selective dispatch. Not merge-blocking because the change is a backward-compatible default and the dispatcher itself is unit-tested.
  • network-policy-e2e (medium): Representative selective-dispatch target used as the canonical example in the new unit tests. Running it once via the new auto-dispatch path validates the end-to-end advisor → trusted-main dispatch → PR-head-SHA checkout flow without exercising secret-bearing inference paths. Optional confidence check, not required.

New E2E recommendations

  • ci-tooling/auto-dispatch-integration (low): The advisor's auto-dispatch path (advisor job → workflow_dispatch nightly-e2e.yaml@main with target_ref=PR head SHA) is currently only covered by unit tests against parsed YAML and a synthetic event. There is no integration check that the produced dispatch payload actually triggers a runnable nightly-e2e selective dispatch end-to-end (e.g., that the predicate format stays in sync, that actions: write is sufficient, and that report-to-pr posts back to the originating PR via pr_number). A scheduled or manually-triggered self-test that dispatches a single cheap job (e.g., docs-validation-e2e) against a known SHA and verifies report-to-pr commented on a fixture PR/issue would close this gap.
    • Suggested test: e2e-advisor-self-test-e2e: scheduled workflow that calls dispatch.mts in dry-run-style mode and then performs one real workflow_dispatch of nightly-e2e.yaml with jobs=docs-validation-e2e against main, asserting the run was created and report-to-pr ran with the supplied pr_number.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: ``

Comment thread tools/e2e-advisor/dispatch.mts Fixed
Comment thread tools/e2e-advisor/dispatch.mts Fixed
Comment thread tools/e2e-advisor/dispatch.mts Fixed
@cv cv marked this pull request as ready for review May 13, 2026 00:07

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 194-198: Guard the node invocation of e2e-advisor/dispatch.mts by
first checking whether the file e2e-advisor/dispatch.mts exists and, if not,
skip execution while writing a deterministic "skipped" artifact to the same
result path (artifacts/e2e-advisor/e2e-advisor-final-result.json) so downstream
steps see a valid JSON result; update the GitHub Actions step that runs node
--experimental-strip-types "$ADVISOR_DIR/tools/e2e-advisor/dispatch.mts" to
perform an existence check and either run dispatch.mts (with the same --result,
--workflow, --workflow-path, --out-dir args) or create the skipped JSON artifact
(containing a stable "skipped" status) and echo/log a clear message so PR runs
do not hard-fail when main lacks dispatch.mts.
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 242e8124-3425-40c0-b81a-c0ed703e1890

📥 Commits

Reviewing files that changed from the base of the PR and between 4439b58 and e758b60.

📒 Files selected for processing (7)
  • .github/workflows/e2e-advisor.yaml
  • .github/workflows/nightly-e2e.yaml
  • test/e2e-advisor-dispatch.test.ts
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/dispatch.mts
  • tsconfig.cli.json

Comment thread .github/workflows/e2e-advisor.yaml Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/e2e-advisor/comment.mts`:
- Around line 99-107: The parseArgs function currently always consumes argv[i+1]
as a flag value which causes the next flag to be swallowed if no value is
provided; update parseArgs (inside the loop handling arg, key, and parsed) to
check that argv[i+1] exists and does not startWith("--") before assigning
parsed[key] = argv[i+1] and incrementing i; if the next token is missing or is
another flag, set parsed[key] to a boolean true (or undefined per project
convention) and do not increment i so flags are not consumed incorrectly.
- Around line 182-184: The function findExistingComment only requests the first
page of comments (per_page=100) so it can miss the marker on busy PRs; modify
findExistingComment to paginate through GitHub comments by adding a page
parameter (e.g., page=1,2,...) to the github call and loop requesting pages
until either a comment containing marker is found or the returned page is empty
(or fewer than per_page entries), returning the found comment immediately to
avoid extra requests; ensure you keep using the same token and per_page=100 for
efficiency and stop the loop when no more pages remain.
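
The parseArgs fix described in the first finding might look like the sketch below. The `ParsedArgs` type and the choice of `true` for a valueless flag are assumptions; the finding itself allows either a boolean or `undefined`, depending on project convention.

```typescript
type ParsedArgs = Record<string, string | true>;

// Parses `--key value` pairs. A flag whose next token is missing, or is itself
// a flag, is recorded as `true` instead of consuming that token.
function parseArgs(argv: string[]): ParsedArgs {
  const parsed: ParsedArgs = {};
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === undefined || !arg.startsWith("--")) continue;
    const key = arg.slice(2);
    const next = argv[i + 1];
    if (next !== undefined && !next.startsWith("--")) {
      parsed[key] = next;
      i += 1; // consume the value token
    } else {
      parsed[key] = true; // leave the next flag for the following iteration
    }
  }
  return parsed;
}
```

With this shape, `parseArgs(["--result", "--out-dir", "/tmp/x"])` keeps `--out-dir` as its own flag rather than treating it as the value of `--result`.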

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 751ba5b1-a923-49a6-af95-615e53f3e2fe

📥 Commits

Reviewing files that changed from the base of the PR and between e758b60 and 736d32b.

📒 Files selected for processing (2)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/comment.mts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e-advisor.yaml

Comment thread tools/e2e-advisor/comment.mts
Comment thread tools/e2e-advisor/comment.mts Outdated
@cv cv added the v0.0.40 label May 13, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/e2e-advisor/pi-analyze.mjs`:
- Around line 25-27: The environment parsing for PI_E2E_ADVISOR_TIMEOUT_MS and
PI_E2E_ADVISOR_HEARTBEAT_MS is brittle: validate and sanitize values before
using them with setTimeout/setInterval by parsing into integers, ensuring they
are finite and non-negative, and falling back to safe defaults when invalid;
update the parsing logic that produces timeoutMs and heartbeatMs (and any
similar parsing later in the file around the other timer usage) to clamp
negative values to a minimum (e.g., 0 or a sane minimum heartbeat), use
Number.isFinite to reject NaN/Infinity, and then pass only the validated numbers
to setTimeout/setInterval to avoid immediate timeouts or busy loops.
- Around line 126-157: The subprocess stdout/stderr buffers are unbounded
(variables stdout, stderr) and can exhaust memory on noisy runs; limit each
buffer by introducing a MAX_BUFFER_BYTES constant and when appending in
child.stdout.on("data")/child.stderr.on("data") only keep the most recent bytes
up to that limit (e.g., append chunk then, if buffer.length > MAX_BUFFER_BYTES,
truncate to the tail or drop the oldest bytes) and optionally track a counter of
dropped bytes to log later; apply the same logic for both stdout and stderr and
ensure any log or error messages include the fact that output was truncated.
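
Both findings can be sketched with two small helpers. The names `parseTimerMs` and `appendCapped`, the fallback-on-invalid policy, and the character-based cap are assumptions for illustration; the findings above speak of bytes and leave the exact clamping policy open.

```typescript
// Node treats timer delays above this value as 1 ms, so larger values are clamped.
const MAX_TIMER_MS = 2_147_483_647;

// Parses a millisecond setting from an environment-style string. Anything that
// is not a plain non-negative integer of at least minMs yields the fallback.
function parseTimerMs(raw: string | undefined, fallbackMs: number, minMs: number): number {
  if (raw === undefined) return fallbackMs;
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) return fallbackMs; // rejects "", "-5", "1e3", "abc"
  const value = Number(trimmed);
  if (!Number.isSafeInteger(value) || value < minMs) return fallbackMs;
  return Math.min(value, MAX_TIMER_MS);
}

interface CappedText {
  text: string; // most recent output, at most maxChars long
  dropped: number; // how many characters were discarded so far
}

// Appends a chunk and keeps only the newest maxChars characters. Lengths are
// counted in UTF-16 code units, not bytes.
function appendCapped(buffer: CappedText, chunk: string, maxChars: number): CappedText {
  const combined = buffer.text + chunk;
  if (combined.length <= maxChars) return { text: combined, dropped: buffer.dropped };
  const overflow = combined.length - maxChars;
  return { text: combined.slice(overflow), dropped: buffer.dropped + overflow };
}
```

Tracking `dropped` lets a later log line state that output was truncated and by how much, which the second finding asks for.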

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 41a1c41a-504a-4111-8fc3-0dbee058e858

📥 Commits

Reviewing files that changed from the base of the PR and between 86ce923 and fae29d3.

📒 Files selected for processing (2)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/pi-analyze.mjs
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e-advisor.yaml

Comment thread tools/e2e-advisor/pi-analyze.mjs Outdated
Comment thread tools/e2e-advisor/pi-analyze.mjs
@ericksoa ericksoa added v0.0.41 and removed v0.0.40 labels May 13, 2026

@ericksoa ericksoa left a comment

Contributor

Thanks for the PR. I am requesting changes before this lands because this is CI automation with workflow dispatch/write-token behavior, and there are a few small hardening gaps that should be fixed first.

Required before merge:

  • Harden tools/e2e-advisor/comment.mts CLI flag parsing so a missing flag value cannot swallow the next flag. The same parser pattern also appears in tools/e2e-advisor/dispatch.mts, so please make that consistent too.
  • Paginate findExistingComment in tools/e2e-advisor/comment.mts; the current first-100-comments lookup can miss the sticky marker on busy PRs and create duplicate advisor comments.
  • Sanitize PI_E2E_ADVISOR_TIMEOUT_MS and PI_E2E_ADVISOR_HEARTBEAT_MS in tools/e2e-advisor/pi-analyze.mjs before using them with timers, falling back to safe defaults for invalid or negative values.
  • Cap captured Pi stdout/stderr in tools/e2e-advisor/pi-analyze.mjs so a noisy advisor run cannot grow memory unbounded before failure handling.
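
The pagination asked for in the second bullet could be sketched as follows. The page fetcher is injected so the loop can be shown without a network call; `FetchCommentPage`, `IssueComment`, and the `maxPages` safety limit are hypothetical and not taken from comment.mts.

```typescript
interface IssueComment {
  id: number;
  body: string;
}

// Hypothetical fetcher: returns one 1-based page of PR comments.
type FetchCommentPage = (page: number, perPage: number) => Promise<IssueComment[]>;

// Walks comment pages until the marker is found or the last page is reached.
async function findExistingComment(
  fetchPage: FetchCommentPage,
  marker: string,
  perPage = 100,
  maxPages = 50, // safety limit in case the fetcher never returns a short page
): Promise<IssueComment | undefined> {
  for (let page = 1; page <= maxPages; page += 1) {
    const comments = await fetchPage(page, perPage);
    const match = comments.find((comment) => comment.body.includes(marker));
    if (match !== undefined) return match; // stop early, no further requests
    if (comments.length < perPage) break; // short page means no more comments
  }
  return undefined;
}
```

Returning on the first match keeps the common case at one request, while busy PRs with more than one page of comments no longer produce a duplicate sticky comment.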

I did verify that the branch merges cleanly with current main; this is not a merge-conflict blocker. I am not asking for broader E2E here, just the targeted CI/advisor hardening plus the existing focused validation for these tools.

@cjagwani cjagwani left a comment

Contributor

took a pass on this independent of the CodeRabbit + ericksoa review. agree with the existing changes-requested — all 4 hardening asks are still present at head sha fae29d3 (parseArgs swallow, comment pagination, env timer NaN-on-malformed, unbounded child buffers). worth fixing parseArgs in both comment.mts AND dispatch.mts:1152-1163 since the pattern got copy-pasted into the new file.

a few additional things worth surfacing before this lands:

  1. cost guardrail: E2E_ADVISOR_AUTO_DISPATCH_MAX_JOBS exists in dispatch.mts:1274 but isn't set in e2e-advisor.yaml. default 0 = no cap. a flaky Pi recommendation could fan out 20+ E2E jobs per PR sync. suggest setting something like MAX_JOBS: "6" in the workflow env until we have some dispatch history.

  2. concurrency: nightly-e2e.yaml:109 keys its concurrency group on github.ref only. for advisor-triggered runs that's always main, so rapid PR pushes cancel each other instead of queueing per-PR. second push wins — can mask first-push flakes. consider folding inputs.pr_number into the group.

  3. validators have no direct tests. validateGitRef/validateDispatchInputs (dispatch.mts:1414-1443) are the actual security boundary that the 6 CodeQL flags ride on, but the vitest suite only exercises planAutoDispatch and extractDispatchableJobs. one happy-path test plus rejection cases (.., backticks, newlines, 200+ char ref) would lock the boundary down.

  4. extractDispatchableJobs does a yaml string-scrape looking for inputs.jobs + ,${job},. if anyone changes the selective-dispatch predicate shape in nightly-e2e.yaml, the dispatcher silently stops dispatching that job and the PR comment doesn't surface that. either pin the predicate format in a comment in nightly-e2e.yaml or add a drift test.
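
for point 3, a ref validator covering the rejection cases named there can be sketched like this. the rules are an assumption modeled loosely on git's ref-name restrictions, and this is not the validateGitRef in dispatch.mts; the 200-character limit is inferred from the "200+ char ref" case.

```typescript
const MAX_REF_LENGTH = 200; // assumed limit

// Returns the ref unchanged when it passes every check, otherwise throws.
function validateGitRef(ref: string): string {
  if (ref.length === 0 || ref.length > MAX_REF_LENGTH) {
    throw new Error("ref length out of range");
  }
  // Allowlist of characters: this rejects backticks, whitespace, newlines,
  // and shell metacharacters without naming each one.
  if (!/^[A-Za-z0-9._\/-]+$/.test(ref)) {
    throw new Error("ref contains unsafe characters");
  }
  if (ref.includes("..") || ref.includes("//")) {
    throw new Error("ref contains '..' or '//'");
  }
  if (ref.startsWith("/") || ref.endsWith("/") || ref.startsWith("-") || ref.endsWith(".")) {
    throw new Error("ref has an unsafe prefix or suffix");
  }
  // Each path component is checked, not only the final one.
  const badComponent = ref
    .split("/")
    .some((part) => part.startsWith(".") || part.endsWith(".lock"));
  if (badComponent) throw new Error("ref has an unsafe path component");
  return ref;
}
```

an allowlist is used rather than a blocklist so that characters nobody thought to ban are rejected by default.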

security shape itself looks right — on: pull_request not _target, fork PRs blocked twice (workflow if: + plan gate), trusted main checkout for code execution, OWNER/MEMBER author gate, hardcoded repo+workflow in validators, GitHub API error bodies not landing in artifacts. nice work on that.

not approving — leaving for the existing change requests to be addressed first.

@cv

cv commented May 13, 2026

Collaborator Author

@ericksoa I pushed 4b457c7 with the remaining cjagwani review follow-ups:

  • set E2E_ADVISOR_AUTO_DISPATCH_MAX_JOBS: "6" in the advisor workflow
  • keyed nightly-e2e workflow-dispatch concurrency by inputs.pr_number to avoid cross-PR cancellation
  • exported and directly tested validateGitRef / validateDispatchInputs happy paths and unsafe refs/inputs (.., //, trailing slash, .lock, backticks, newlines, 201-char refs, unsafe jobs/pr numbers)
  • pinned the selective-dispatch predicate contract in nightly-e2e.yaml and added a drift assertion against the real workflow predicates
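
A drift check of the kind described in the last bullet might be sketched as below. The predicate shape in the comment is an assumption based on the review's description of matching `inputs.jobs` against `,${job},`; the real predicate in nightly-e2e.yaml is not shown on this page, and `findUndispatchableJobs` is a hypothetical helper.

```typescript
// Assumed predicate shape (not confirmed by the source):
//   if: inputs.jobs == '' || contains(format(',{0},', inputs.jobs), ',<job>,')
const PREDICATE =
  /contains\(\s*format\(\s*',\{0\},'\s*,\s*inputs\.jobs\s*\)\s*,\s*',([A-Za-z0-9_-]+),'\s*\)/g;

// Scrapes job names out of selective-dispatch predicates in the workflow text.
function extractDispatchableJobs(workflowText: string): string[] {
  const pattern = new RegExp(PREDICATE.source, "g"); // fresh lastIndex per call
  const jobs = new Set<string>();
  let match: RegExpExecArray | null = pattern.exec(workflowText);
  while (match !== null) {
    const job = match[1];
    if (job !== undefined) jobs.add(job);
    match = pattern.exec(workflowText);
  }
  return Array.from(jobs).sort();
}

// Lists jobs that are expected to be dispatchable but were not extracted,
// which is what a changed predicate shape would silently cause.
function findUndispatchableJobs(expected: string[], workflowText: string): string[] {
  const dispatchable = new Set(extractDispatchableJobs(workflowText));
  return expected.filter((job) => !dispatchable.has(job));
}
```

A test that asserts `findUndispatchableJobs` returns an empty list for the real workflow file would fail loudly as soon as someone rewrites a predicate in a shape the scraper no longer recognizes.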

Validation:

  • npm test -- test/e2e-advisor-dispatch.test.ts
  • push hooks, including CLI typecheck and CLI tests, passed

Could you please re-review when you have a chance?

@cv cv requested a review from ericksoa May 13, 2026 06:22
"X-GitHub-Api-Version": "2022-11-28",
"User-Agent": "nemoclaw-e2e-advisor-dispatcher",
},
body: JSON.stringify({ ref: safeRef, inputs: safeInputs }),

Labels

  • area: ci (CI workflows, checks, release automation, or GitHub Actions)
  • area: e2e (End-to-end tests, nightly failures, or validation infrastructure)
  • chore (Build, CI, dependency, or tooling maintenance)


6 participants