Skip to content

ci: add E2E Advisor recommendations#3289

Merged
ericksoa merged 23 commits into
NVIDIA:mainfrom
jyaunches:ci/e2e-advisor-prototype
May 11, 2026
Merged

ci: add E2E Advisor recommendations#3289
ericksoa merged 23 commits into
NVIDIA:mainfrom
jyaunches:ci/e2e-advisor-prototype

Conversation

@jyaunches

@jyaunches jyaunches commented May 8, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add an E2E Advisor workflow that recommends required/optional E2E tests for PRs.
  • Add a semantic Pi analysis layer using the NVIDIA inference-style Pi config template.
  • Add sticky PR comments with E2E recommendations and dispatch hints.
  • Add E2E manifest/rules/schema/config validation for advisor metadata.

Safety model

  • Static analysis only; does not execute PR-provided scripts/tests/package managers.
  • Pi runs with read-only tools only: read, grep, find, ls.
  • Pi runs with no session, extensions, skills, prompt templates, or context files.
  • Generated Pi credential config is written under /tmp, outside uploaded artifacts.
  • PR commenting is best-effort and can use E2E_ADVISOR_GITHUB_TOKEN if github.token lacks write permissions.

Secrets

  • PI_E2E_ADVISOR_API_KEY: required for Pi semantic analysis; otherwise semantic analysis skips and the deterministic baseline is used.
  • E2E_ADVISOR_GITHUB_TOKEN: optional repo-write token for PR comments when github.token cannot comment.

Validation

Follow-up

  • Add an E2E Recommendation Gate required check that verifies recommended E2E jobs passed for the same PR head SHA before merge.

Summary by CodeRabbit

  • New Features

    • Added an E2E Advisor workflow that analyzes PR changes with a Pi-powered advisor, produces structured recommendations, and can post a sticky PR comment with test suggestions.
  • Documentation

    • Added a guide describing workflow usage, required secrets, outputs, and manual invocation.
  • Chores

    • Emits stable artifacts (prompt, raw output, parsed JSON, final result, markdown summary) and a machine-readable schema for advisor results.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f97fdf42-72ee-4b2c-b110-4da79c1fb578

📥 Commits

Reviewing files that changed from the base of the PR and between 5118e9a and 8f27e29.

📒 Files selected for processing (3)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/pi-analyze.mjs
🚧 Files skipped from review as they are similar to previous changes (2)
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/pi-analyze.mjs

📝 Walkthrough

Walkthrough

Adds a Pi-powered E2E Advisor system that analyzes PR diffs, classifies changed files into risk domains, and recommends required/optional E2E test coverage. It includes a schema contract, analysis CLI, GitHub comment integration, and Actions workflow orchestration.

Changes

E2E Advisor System

Layer / File(s) Summary
Data Schema & Output Contract
tools/e2e-advisor/schema.json
JSON Schema (draft 2020-12) defines E2E Advisor result structure with version, base/head refs, changed files, classified domains with confidence, required/optional test recommendations, and optional dispatch hint for workflow execution.
Pi Model Provider Configuration
tools/e2e-advisor/pi-models.template.json
Template JSON defining Anthropic and OpenAI provider settings with API endpoints, model entries, reasoning capabilities, context windows, and token limits for Pi agent configuration.
Pi Analysis: Initialization & Prompting
tools/e2e-advisor/pi-analyze.mjs
Loads schema and resolves base/head refs; computes changed files via git diff --name-only; generates truncated diff with rename/copy detection; builds JSON-only prompt embedding schema, files list, and diff context; exits early if Pi disabled or credentials unavailable.
Pi Analysis: Execution & Parsing
tools/e2e-advisor/pi-analyze.mjs
Selects provider/model from environment variables; prepares temporary Pi agent config directory with auth, settings, and models (token-substituted); executes pi binary via spawnSync with timeout/maxBuffer limits; writes combined stdout/stderr; extracts JSON via trimmed, fenced, tagged, or balanced-object parsing heuristics.
Pi Analysis: Normalization & Output
tools/e2e-advisor/pi-analyze.mjs
Normalizes parsed Pi JSON to fixed result contract; sanitizes and constrains domain/test/recommendation fields and confidence to low|medium|high; conditionally includes dispatchHint; renders markdown summary with base/head, confidence, required/optional sections, new recommendations, and optional dispatch details; writes JSON and markdown artifacts.
GitHub Comment Integration
tools/e2e-advisor/comment.mjs
Loads markdown summary (with fallback paths) and optional JSON result; parses CLI args with env fallbacks; builds marked comment body with test lists and optional dispatch hint; fetches up to 100 PR comments via GitHub REST API; searches for existing <!-- nemoclaw-e2e-advisor --> marker and updates via PATCH or creates new comment via POST; skips on missing credentials or permission errors.
GitHub Actions Workflow
.github/workflows/e2e-advisor.yaml
Triggered on PR events (opened, synchronize, reopened, ready_for_review) and manual dispatch; checks out trusted advisor code and PR/workdir with full history; sets up Node.js 22; installs Pi agent; optionally targets cross-repo PR by creating temporary git repo; runs pi-analyze.mjs with provider/model and API-key env wiring and run gate; appends summary to job summary; posts PR comment via comment.mjs; uploads artifacts/e2e-advisor/; restricted to NVIDIA/NemoClaw with per-PR/ref concurrency cancellation.
Documentation & Usage Guide
tools/e2e-advisor/README.md
Describes workflow triggers, fork-skip/internal-repo scoping, Pi installation, analysis step, artifacts output, sticky PR comment marker, Pi safety model (static analysis, read-only tools, no PR script execution, /tmp credential storage), required/optional secrets, expected artifacts, manual CLI invocation with base/head/schema/out-dir args, and the output contract defined by schema.json.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble diffs and hum a tune,
I chase each changed file under the moon,
I whisper tests that ought to run,
Post findings neat—one by one.
Hop on, dear PR; let coverage bloom.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and concisely summarizes the main change: introducing an E2E Advisor workflow that provides test recommendations for pull requests.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 76-95: The workflow currently hard-codes fetching and diffing
against target/main; add and use the PR’s actual base branch (e.g.
inputs.target_base) instead: when inputs.target_repo and inputs.target_pr are
provided, replace the git fetch of "target main" with fetching "target
${inputs.target_base}" (change the git -C "$TARGET_DIR" fetch --no-tags target
main line) and change the BASE_REF branch selection to use
"target/${inputs.target_base}" instead of "target/main" (update the BASE_REF env
expression in the Run deterministic E2E advisor baseline step); keep HEAD_REF
logic the same.
- Around line 132-136: The fallback printf contains stray shell tokens "},{"
after the redirect which breaks the step when the summary is missing; edit the
conditional fallback (the printf call inside the e2e-advisor summary step) to
remove the extraneous "},{" so the line becomes a plain printf redirecting to
"$GITHUB_STEP_SUMMARY" and ensure the surrounding if/else/fi block remains
properly balanced.

In `@scripts/check-e2e-advisor-config.ts`:
- Around line 20-33: The manifest validation currently only ensures a workflow
field exists and doesn't detect duplicate test ids or verify the workflow
actually exists; update the loop that iterates manifest.tests (and the testIds
Set logic) to (1) detect duplicate test IDs by checking if an id is already in
testIds and push a failure like "duplicate test id X" when found, (2) verify the
referenced workflow id exists in manifest.workflows (e.g., check
manifest.workflows.some(w => w.id === test.workflow)) and push a failure if
missing, and (3) check that the workflow's mapped file exists on disk (use the
same fs.existsSync(path.join(root, <workflow>.file)) pattern) and push an
appropriate failure; keep using requireField for presence checks and append
failures to the existing failures array.

In `@tools/e2e-advisor/advisor.mjs`:
- Around line 131-145: The getChangedFiles function currently swallows failures
from execFileSync and returns [] (making callers think "no changes"); change
this to fail closed and surface degraded-state by throwing a descriptive Error
or returning a distinct sentinel (e.g., null) when both diff variants fail so
callers can emit a degraded-analysis result instead of "no changed files"; also
increase execFileSync's maxBuffer in the options used in getChangedFiles (where
execFileSync is called and the candidates array is iterated) to a larger value
(e.g., { encoding: "utf8", maxBuffer: 10 * 1024 * 1024 }) to avoid silent
truncation on large diffs.

In `@tools/e2e-advisor/comment.mjs`:
- Around line 36-48: When calling the GitHub API to create or patch the PR
comment (the github(...) calls inside the block that checks existing from
findExistingComment), wrap those calls in try/catch and detect permission-denied
errors (HTTP 403 with message like "Resource not accessible by integration" or
similar). If such an error is caught, log a clear info/debug message (e.g.,
"Skipping E2E advisor comment due to permission error") and return/exit the
function without throwing so the step is treated as best-effort; rethrow or
propagate any other unexpected errors. Ensure both the PATCH path (when
existing) and the POST path (when creating) use the same try/catch behavior.

In `@tools/e2e-advisor/pi-analyze.mjs`:
- Around line 236-258: The normalizePiResult function currently copies nested
arrays/objects from the model result into fields like classifiedDomains,
requiredTests, optionalTests, newE2eRecommendations, and dispatchHint without
validating their inner structure; update normalizePiResult to validate and
sanitize each of those fields (e.g., ensure classifiedDomains is an array of
objects with expected keys, requiredTests/optionalTests are arrays of
well-formed test descriptors, newE2eRecommendations are arrays of
strings/objects as expected, and dispatchHint is an object with allowed
properties) and if any field fails validation fallback to
baseline.<function>normalizePiResult</function> Apply the same validation
pattern used for confidence and top-level presence checks: check types, check
element shapes, filter out invalid entries, and only assign
normalized.dispatchHint/result fields when they meet the schema; otherwise
assign baseline.<function>normalized</function> fields should never be assigned
raw model payloads unless they pass these validations.

In `@tools/e2e-advisor/README.md`:
- Line 1: The file begins with the H1 "# E2E Advisor" but is missing the
repo-required SPDX license header; add the SPDX header line (e.g.,
"SPDX-License-Identifier: <LICENSE-ID>") as the very first line of the file,
above the "# E2E Advisor" heading so the SPDX identifier appears at the top of
the file per the coding guidelines.

In `@tools/e2e-advisor/schema.json`:
- Around line 67-74: The dispatchHint schema currently allows an empty object
and must require both properties; update the "dispatchHint" object schema so
that "workflow" and "jobsInput" are listed under "required" (ensuring they are
present when dispatchHint exists) while keeping "additionalProperties": false;
target the "dispatchHint" schema and add "required": ["workflow","jobsInput"] so
summary renderers can safely access dispatchHint.workflow and
dispatchHint.jobsInput.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a74cd759-4b20-4613-8a03-9b36249c6550

📥 Commits

Reviewing files that changed from the base of the PR and between f1568f6 and f3202a7.

📒 Files selected for processing (11)
  • .github/workflows/e2e-advisor.yaml
  • scripts/check-e2e-advisor-config.ts
  • scripts/checks/run.ts
  • test/e2e/e2e-manifest.yaml
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/advisor.mjs
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/pi-analyze.mjs
  • tools/e2e-advisor/pi-models.template.json
  • tools/e2e-advisor/rules.yaml
  • tools/e2e-advisor/schema.json

Comment thread .github/workflows/e2e-advisor.yaml Outdated
Comment thread .github/workflows/e2e-advisor.yaml Outdated
Comment thread scripts/check-e2e-advisor-config.ts Outdated
Comment thread tools/e2e-advisor/advisor.mjs Outdated
Comment thread tools/e2e-advisor/comment.mjs Outdated
Comment thread tools/e2e-advisor/pi-analyze.mjs Outdated
Comment thread tools/e2e-advisor/README.md
Comment thread tools/e2e-advisor/schema.json

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
scripts/check-e2e-advisor-config.ts (1)

20-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add manifest cross-validation for duplicate test IDs and workflow references.

testIds is built as a set of mapped IDs, so duplicate IDs are silently collapsed, and test.workflow is only checked for presence—not whether it maps to a declared workflow (and file). This can let invalid configs pass CI.

Suggested minimal fix
-const testIds = new Set((manifest.tests ?? []).map((test: { id: string }) => test.id));
+const testIds = new Set<string>();
+const workflowById = new Map(
+  (manifest.workflows ?? []).map((workflow: { id?: string; file?: string }) => [workflow.id, workflow]),
+);

-if (testIds.size === 0) {
-  failures.push("test/e2e/e2e-manifest.yaml must define at least one test");
-}
-
 for (const test of manifest.tests ?? []) {
   requireField(test, "id", `manifest test ${test.id ?? "<unknown>"}`);
   requireField(test, "workflow", `manifest test ${test.id ?? "<unknown>"}`);
   requireField(test, "job", `manifest test ${test.id ?? "<unknown>"}`);
+  if (test.id) {
+    if (testIds.has(test.id)) {
+      failures.push(`duplicate test id ${test.id}`);
+    } else {
+      testIds.add(test.id);
+    }
+  }
+  if (test.workflow) {
+    const workflow = workflowById.get(test.workflow);
+    if (!workflow) {
+      failures.push(`manifest test ${test.id ?? "<unknown>"} references unknown workflow ${test.workflow}`);
+    } else if (!workflow.file || !fs.existsSync(path.join(root, workflow.file))) {
+      failures.push(`workflow ${test.workflow} references missing file ${workflow.file ?? "<unknown>"}`);
+    }
+  }
   if (test.script && !fs.existsSync(path.join(root, test.script))) {
     failures.push(`manifest test ${test.id} references missing script ${test.script}`);
   }
 }
+
+if (testIds.size === 0) {
+  failures.push("test/e2e/e2e-manifest.yaml must define at least one test");
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check-e2e-advisor-config.ts` around lines 20 - 33, The manifest
validation silently ignores duplicate test IDs and doesn't validate that each
test.workflow points to a declared workflow file; update the loop that iterates
manifest.tests to detect duplicates by tracking seen IDs (instead of only using
testIds Set created from map) and push a failure when an ID is repeated
(referencing testIds and manifest.tests), and also validate that test.workflow
maps to an existing workflow entry/file by checking the workflows list and/or
fs.existsSync(path.join(root, workflowPath)) and pushing to failures when
missing (referencing requireField, failures, root, fs, and path).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/check-e2e-advisor-config.ts`:
- Around line 20-33: The manifest validation silently ignores duplicate test IDs
and doesn't validate that each test.workflow points to a declared workflow file;
update the loop that iterates manifest.tests to detect duplicates by tracking
seen IDs (instead of only using testIds Set created from map) and push a failure
when an ID is repeated (referencing testIds and manifest.tests), and also
validate that test.workflow maps to an existing workflow entry/file by checking
the workflows list and/or fs.existsSync(path.join(root, workflowPath)) and
pushing to failures when missing (referencing requireField, failures, root, fs,
and path).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d99c4b6d-0131-4651-8006-6023c20f585f

📥 Commits

Reviewing files that changed from the base of the PR and between f3202a7 and 88ce405.

📒 Files selected for processing (4)
  • scripts/check-e2e-advisor-config.ts
  • tools/e2e-advisor/advisor.mjs
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/pi-analyze.mjs
🚧 Files skipped from review as they are similar to previous changes (3)
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/pi-analyze.mjs
  • tools/e2e-advisor/advisor.mjs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
tools/e2e-advisor/comment.mjs (1)

36-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle permission errors from comment lookup too (Line 36).

findExistingComment(...) runs before the try, so a 403 on the list-comments call still throws and marks the step failed instead of best-effort skipping.

Suggested minimal fix
-const existing = await findExistingComment(repo, pr, token, marker);
 try {
+  const existing = await findExistingComment(repo, pr, token, marker);
   if (existing) {
     await github(`repos/${repo}/issues/comments/${existing.id}`, token, {
       method: "PATCH",
       body: { body },
     });
     console.log(`Updated E2E advisor comment on ${repo}#${pr}`);
   } else {
     await github(`repos/${repo}/issues/${pr}/comments`, token, {
       method: "POST",
       body: { body },
     });
     console.log(`Created E2E advisor comment on ${repo}#${pr}`);
   }
 } catch (error) {
   if (isPermissionError(error)) {
-    console.log(`Skipping E2E advisor comment due to permission error: ${error.message}`);
+    const message = error instanceof Error ? error.message : String(error);
+    console.log(`Skipping E2E advisor comment due to permission error: ${message}`);
   } else {
     throw error;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-advisor/comment.mjs` around lines 36 - 57, The call to
findExistingComment(repo, pr, token, marker) can throw a permission error before
the surrounding try/catch runs; wrap that lookup in the same try block (or add a
small try/catch around findExistingComment) and on permission errors (using
isPermissionError) skip creating/updating the comment the same way you do for
the github POST/PATCH calls—keep the existing variable name existing and then
proceed to the existing logic for PATCH/POST only if lookup succeeded, otherwise
log and return.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tools/e2e-advisor/comment.mjs`:
- Around line 36-57: The call to findExistingComment(repo, pr, token, marker)
can throw a permission error before the surrounding try/catch runs; wrap that
lookup in the same try block (or add a small try/catch around
findExistingComment) and on permission errors (using isPermissionError) skip
creating/updating the comment the same way you do for the github POST/PATCH
calls—keep the existing variable name existing and then proceed to the existing
logic for PATCH/POST only if lookup succeeded, otherwise log and return.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 15c06975-df92-4386-885a-3641d121d74a

📥 Commits

Reviewing files that changed from the base of the PR and between 88ce405 and 509214d.

📒 Files selected for processing (6)
  • .github/workflows/e2e-advisor.yaml
  • scripts/check-e2e-advisor-config.ts
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/advisor.mjs
  • tools/e2e-advisor/comment.mjs
  • tools/e2e-advisor/schema.json
🚧 Files skipped from review as they are similar to previous changes (5)
  • tools/e2e-advisor/schema.json
  • tools/e2e-advisor/README.md
  • scripts/check-e2e-advisor-config.ts
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/advisor.mjs

@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this detailed PR about adding E2E Advisor recommendations. This change aims to improve the testing infrastructure by introducing a workflow that suggests required and optional E2E tests for PRs, enhancing the overall development and review process.


Related open PRs:

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tools/e2e-advisor/pi-analyze.mjs (1)

226-246: ⚡ Quick win

Add a guard for non-object Pi results.

The sanitization functions address the previous review concern about validating nested structures. However, if extractJson returns null (from JSON.parse("null")), accessing result.classifiedDomains will throw a confusing error. Consider adding an early guard.

♻️ Proposed guard
 function normalizePiResult(result, metadata) {
+  if (!result || typeof result !== "object" || Array.isArray(result)) {
+    throw new Error("Pi returned a non-object result");
+  }
+
   const normalized = {
     version: 1,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-advisor/pi-analyze.mjs` around lines 226 - 246, normalizePiResult
currently assumes `result` is an object and will throw if `result` is null or
not an object (e.g., JSON.parse("null")). Add an early guard in
normalizePiResult that checks whether `result` is a non-null object (typeof
result === "object" && result !== null); if not, return a normalized object with
the same top-level fields (version, baseRef/headRef from metadata, changedFiles,
and sensible defaults or empty values for classifiedDomains, requiredTests,
optionalTests, newE2eRecommendations, noE2eReason, confidence) or otherwise set
those fields to null/empty via the existing sanitize* helpers, then continue
with the existing dispatchHint handling. Ensure you reference the sanitize
functions (sanitizeDomains, sanitizeTests, sanitizeNewRecommendations,
sanitizeDispatchHint) and the metadata fields (baseRef, headRef, changedFiles)
when building the fallback normalized result.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 99-100: The workflow step named "Install Pi" currently installs
`@mariozechner/pi-coding-agent` without a version; change the npm install command
in that step to pin the package to a specific known-good semver (e.g., append
@<version>) so the action installs a fixed release instead of the latest, and
add/update repository Dependabot/Renovate configuration to track and safely
propose upgrades for `@mariozechner/pi-coding-agent`.

---

Nitpick comments:
In `@tools/e2e-advisor/pi-analyze.mjs`:
- Around line 226-246: normalizePiResult currently assumes `result` is an object
and will throw if `result` is null or not an object (e.g., JSON.parse("null")).
Add an early guard in normalizePiResult that checks whether `result` is a
non-null object (typeof result === "object" && result !== null); if not, return
a normalized object with the same top-level fields (version, baseRef/headRef
from metadata, changedFiles, and sensible defaults or empty values for
classifiedDomains, requiredTests, optionalTests, newE2eRecommendations,
noE2eReason, confidence) or otherwise set those fields to null/empty via the
existing sanitize* helpers, then continue with the existing dispatchHint
handling. Ensure you reference the sanitize functions (sanitizeDomains,
sanitizeTests, sanitizeNewRecommendations, sanitizeDispatchHint) and the
metadata fields (baseRef, headRef, changedFiles) when building the fallback
normalized result.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b5a7ba02-9072-45ed-892c-920137126574

📥 Commits

Reviewing files that changed from the base of the PR and between 509214d and 5118e9a.

📒 Files selected for processing (3)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/pi-analyze.mjs

Comment thread .github/workflows/e2e-advisor.yaml Outdated
@jyaunches jyaunches requested a review from ericksoa May 11, 2026 14:51

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. I think this needs changes before merge.

The blocking issue is the trusted-code boundary in the automatic pull_request path. The workflow checks out the PR workspace and then runs the advisor implementation from that same workspace (tools/e2e-advisor/pi-analyze.mjs / comment.mjs) while exposing the Pi/provider API key environment and an issue-comment token later in the job. That does not match the documented safety model: a future same-repo PR can modify the advisor scripts themselves and execute arbitrary Node in a secrets-bearing workflow. Please split this so the workflow runs advisor code and installs dependencies from trusted main or another pinned trusted ref, while checking out the target PR only as the read-only analysis workdir.

A second issue is the unpinned npm install -g @mariozechner/pi-coding-agent. Since this job is intended to run automatically and may carry API credentials, please pin this to a known-good version and use the normal dependency-update path for upgrades.

Two smaller fixes are also worth folding in while touching this: move findExistingComment(...) in tools/e2e-advisor/comment.mjs inside the permission-error try/catch, and add a clear guard in normalizePiResult for non-object Pi output such as JSON null.

I validated the current head locally with git diff --check, node --check tools/e2e-advisor/pi-analyze.mjs, node --check tools/e2e-advisor/comment.mjs, and the no-credential pi-analyze.mjs path. The no-credential degradation works, but the trusted-boundary issue still blocks approval.

jyaunches added 2 commits May 11, 2026 13:51
Addresses the two minor items from the PR NVIDIA#3289 review:

- tools/e2e-advisor/comment.mjs: move findExistingComment() inside the
  try/catch so a 403 from listing existing comments is treated as a
  permission error and skipped, matching the PATCH/POST error handling.
  Also coerce non-Error throwables to strings before logging.

- tools/e2e-advisor/pi-analyze.mjs: reject non-object Pi output (null,
  arrays, primitives) explicitly in normalizePiResult instead of letting
  property access on a non-object silently produce an empty result.

Refs: PR NVIDIA#3289 review by @ericksoa
Restructures the e2e-advisor workflow to fix the trusted-code boundary
violation flagged in PR NVIDIA#3289 review.

Before: the workflow checked out the PR workspace and then invoked
tools/e2e-advisor/pi-analyze.mjs and comment.mjs *from that same
workspace* while API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY,
PI_E2E_ADVISOR_API_KEY, etc.) and an issues-write token were in scope.
A future same-repo PR could have modified the advisor scripts
themselves and executed arbitrary Node in a secrets-bearing job.

After:

- Advisor code (pi-analyze.mjs, comment.mjs, schema.json,
  pi-models.template.json) is always checked out from NVIDIA/NemoClaw
  main into $GITHUB_WORKSPACE/advisor and is the only code executed
  in the job. Updates to the advisor flow through main review.

- PR content (or dispatch ref, or cloned target PR) is checked out as
  inert analysis data into $GITHUB_WORKSPACE/pr-workdir. The advisor
  scripts read it via git diff / fs.readFileSync; nothing from this
  directory is executed as code.

- Both checkouts use persist-credentials: false.

- @mariozechner/pi-coding-agent is pinned via PI_AGENT_VERSION (0.73.1)
  so a compromised upstream release cannot execute automatically;
  upgrades go through a normal code-review change to this file.

- Removed the top-level 'npm ci' install step — the advisor scripts
  use only Node built-ins, so the previous install brought the full
  repo devDependency tree into the secret-bearing job for no benefit.

Refs: PR NVIDIA#3289 review by @ericksoa; CodeRabbit comment on pin.
@jyaunches

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review, @ericksoa — all four items addressed. New HEAD: 8f27e2943.

1. 🔴 Trusted-code boundary (blocking) — fixed

Commit 8f27e2943. The workflow now does a two-stage checkout:

  • advisor/NVIDIA/NemoClaw@main — the only code executed in the job (pi-analyze.mjs, comment.mjs, schema.json, pi-models.template.json). Changes to the advisor flow through main review.
  • pr-workdir/github.event.pull_request.head.shainert analysis data only. The trusted advisor scripts read it via git diff / fs.readFileSync; nothing from this directory is invoked as code.

Both checkouts use persist-credentials: false. The advisor is invoked as node "$ADVISOR_DIR/tools/e2e-advisor/pi-analyze.mjs" with cwd=$ADVISOR_WORKDIR (the PR workdir) so git diff still targets the PR content. comment.mjs is likewise invoked from $ADVISOR_DIR.

A future PR that modifies tools/e2e-advisor/*.mjs in the head ref can no longer execute its version in the secrets-bearing job.

2. 🔴 Unpinned pi-coding-agent (blocking) — fixed

Commit 8f27e2943. Pinned via a job-level env var:

env:
  PI_AGENT_VERSION: "0.73.1"
# ...
- name: Install Pi
  run: npm install -g "@mariozechner/pi-coding-agent@${PI_AGENT_VERSION}"

Version bumps are now a reviewed code change to this file. (Also resolves the CodeRabbit thread on line 100.)

3. 🟡 findExistingComment inside try/catch — fixed

Commit 2e7c5a5bc. The GET issues/{pr}/comments call now lives inside the try block, so a 403 from listing is treated as a permission error and skipped consistently with the PATCH/POST paths. Also coerced non-Error throwables to strings in the log line.

4. 🟡 normalizePiResult guard — fixed

Commit 2e7c5a5bc. Added an explicit check at the top of normalizePiResult:

if (!result || typeof result !== "object" || Array.isArray(result)) {
  throw new Error("Pi returned a non-object result");
}

null, arrays, and primitives now fail loudly via the existing writeFailure path instead of silently producing an empty result.

Incidental

  • Dropped the top-level npm ci step — the advisor scripts use only Node built-ins, so the previous install was pulling the full repo devDependency tree into a secrets-bearing job for no benefit.

Local re-validation

Re-ran the same checks you did against the new head:

  • git diff --check — clean
  • node --check tools/e2e-advisor/pi-analyze.mjs — ok
  • node --check tools/e2e-advisor/comment.mjs — ok
  • PI_E2E_ADVISOR_RUN_PI=0 node tools/e2e-advisor/pi-analyze.mjs --base origin/main --head HEAD --schema tools/e2e-advisor/schema.json --out-dir /tmp/verify — no-credential path produces the four expected artifacts (e2e-advisor-final-result.json, e2e-advisor-pi-prompt.md, e2e-advisor-pi-result.json, e2e-advisor-pi-summary.md).

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prior blocker is addressed: the secret-bearing advisor job now runs trusted advisor code from main, treats PR contents as data, pins the Pi agent, and handles comment/write and malformed-output failure cases cleanly. Local smoke validation and live CI look good; the remaining risk is limited to the accepted same-repo secret-bearing workflow posture.

@ericksoa ericksoa merged commit 6899eb1 into NVIDIA:main May 11, 2026
20 checks passed
cv pushed a commit that referenced this pull request May 11, 2026
…3366)

## Summary

Adds `pull-requests: write` to the `e2e-advisor` workflow so the advisor
can actually post its recommendation comment on PRs.

## What's broken on `main` today

Since PR #3289 merged ~2 hours ago, the advisor workflow has been
running successfully on same-repo PRs — but the `Post E2E advisor PR
comment` step silently skips every time:

```
403 {"message":"Resource not accessible by integration"}
```

Concrete evidence on PR #3364:

| Run | Job | Comment posted? |
|---|---|---|
|
[25697796461](https://github.com/NVIDIA/NemoClaw/actions/runs/25697796461)
| ✅ success | ❌ no |
|
[25699366979](https://github.com/NVIDIA/NemoClaw/actions/runs/25699366979)
| ✅ success | ❌ no |

The runs look green because `comment.mjs` catches the 403 as a
permission error (a defensive path added during the #3289 review) and
keeps the job passing, but the user-visible artifact — the comment on
the PR — never materializes.

## Root cause

The `permissions:` block on `main` currently declares:

```yaml
permissions:
  contents: read
  pull-requests: read
  issues: write
```

The token effective permissions confirm this takes effect (from the `Set
up job` log group):

```
GITHUB_TOKEN Permissions
  Contents: read
  Issues: write
  Metadata: read
  PullRequests: read
```

Even so, `POST /repos//issues/3364/comments` returns `403
Resource not accessible by integration`.

The reason is a well-documented GitHub Actions permission-name trap: for
`GITHUB_TOKEN`, **PR comments are gated by `pull-requests: write`, not
`issues: write`** — even though the REST endpoint lives under
`/issues/:n/comments`. The `issues` permission only applies to true
issues; PRs fall under `pull-requests`. This is tracked in [community
discussion #56632](https://github.com/orgs/community/discussions/56632)
and has bitten many workflows.

## Fix

Flip `pull-requests` from `read` to `write`, and drop an inline comment
on the permissions block so the next reviewer doesn't have to rediscover
this.

`issues: write` is kept for now — it's unused on the PR-comment path but
it's dirt-cheap to leave in case we ever want the advisor to post on
true issues.

## Defense in depth

The graceful permission-error handling in
`tools/e2e-advisor/comment.mjs` (added in #3289 per @ericksoa's review)
stays in place unchanged. This PR just ensures that for a
correctly-configured repo the 403 path is no longer hit on normal
same-repo PRs.

## Testing

- `python3 -c "import yaml;
yaml.safe_load(open('.github/workflows/e2e-advisor.yaml'))"` — workflow
still parses.
- Non-code change, so local unit tests are unaffected.
- The real verification is the next same-repo PR after merge — expected
behavior is an advisor comment appearing within a few minutes.

## Out of scope for this PR

There's a second gap visible on upstream: `PI_E2E_ADVISOR_API_KEY` is
not set on `NVIDIA/NemoClaw`, so the analysis step currently degrades
to:

> `Skipped: No Pi provider credential was available in this workflow
environment`

That's an admin configuration task (Settings → Secrets and variables →
Actions → New repository secret), not a code change, so handling it
separately.

## Related

- PR #3289 (merged): introduced the advisor + the trust-boundary
restructure.
- Observed on PR #3364 (same-repo, member-authored, fresh push — exactly
the case the advisor is designed for).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Updated pull request workflow permissions to enable advisor to post
and update comments on pull requests.

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3366)

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran added area: ci CI workflows, checks, release automation, or GitHub Actions area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance feature PR adds or expands user-visible functionality and removed CI/CD feature PR adds or expands user-visible functionality labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: ci CI workflows, checks, release automation, or GitHub Actions area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants