ci: add E2E Advisor recommendations by jyaunches · Pull Request #3289 · NVIDIA/NemoClaw

jyaunches · 2026-05-08T20:48:14Z

Summary

Add an E2E Advisor workflow that recommends required/optional E2E tests for PRs.
Add a semantic Pi analysis layer using the NVIDIA inference-style Pi config template.
Add sticky PR comments with E2E recommendations and dispatch hints.
Add E2E manifest/rules/schema/config validation for advisor metadata.

Safety model

Static analysis only; does not execute PR-provided scripts/tests/package managers.
Pi runs with read-only tools only: read, grep, find, ls.
Pi runs with no session, extensions, skills, prompt templates, or context files.
Generated Pi credential config is written under /tmp, outside uploaded artifacts.
PR commenting is best-effort and can use E2E_ADVISOR_GITHUB_TOKEN if github.token lacks write permissions.

Secrets

PI_E2E_ADVISOR_API_KEY: required for Pi semantic analysis; otherwise semantic analysis skips and the deterministic baseline is used.
E2E_ADVISOR_GITHUB_TOKEN: optional repo-write token for PR comments when github.token cannot comment.

Validation

npm run checks
Manually ran advisor workflow from fork against fix(inference): suppress kimi reasoning output #3244 and received a successful semantic recommendation.

Follow-up

Add an E2E Recommendation Gate required check that verifies recommended E2E jobs passed for the same PR head SHA before merge.

Summary by CodeRabbit

New Features
- Added an E2E Advisor workflow that analyzes PR changes with a Pi-powered advisor, produces structured recommendations, and can post a sticky PR comment with test suggestions.
Documentation
- Added a guide describing workflow usage, required secrets, outputs, and manual invocation.
Chores
- Emits stable artifacts (prompt, raw output, parsed JSON, final result, markdown summary) and a machine-readable schema for advisor results.

coderabbitai · 2026-05-08T20:48:26Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f97fdf42-72ee-4b2c-b110-4da79c1fb578

📥 Commits

Reviewing files that changed from the base of the PR and between 5118e9a and 8f27e29.

📒 Files selected for processing (3)

.github/workflows/e2e-advisor.yaml
tools/e2e-advisor/comment.mjs
tools/e2e-advisor/pi-analyze.mjs

🚧 Files skipped from review as they are similar to previous changes (2)

tools/e2e-advisor/comment.mjs
tools/e2e-advisor/pi-analyze.mjs

📝 Walkthrough

Walkthrough

Adds a Pi-powered E2E Advisor system that analyzes PR diffs, classifies changed files into risk domains, and recommends required/optional E2E test coverage. It includes a schema contract, analysis CLI, GitHub comment integration, and Actions workflow orchestration.

Changes

E2E Advisor System

Layer / File(s)	Summary
Data Schema & Output Contract `tools/e2e-advisor/schema.json`	JSON Schema (draft 2020-12) defines E2E Advisor result structure with version, base/head refs, changed files, classified domains with confidence, required/optional test recommendations, and optional dispatch hint for workflow execution.
Pi Model Provider Configuration `tools/e2e-advisor/pi-models.template.json`	Template JSON defining Anthropic and OpenAI provider settings with API endpoints, model entries, reasoning capabilities, context windows, and token limits for Pi agent configuration.
Pi Analysis: Initialization & Prompting `tools/e2e-advisor/pi-analyze.mjs`	Loads schema and resolves base/head refs; computes changed files via `git diff --name-only`; generates truncated diff with rename/copy detection; builds JSON-only prompt embedding schema, files list, and diff context; exits early if Pi disabled or credentials unavailable.
Pi Analysis: Execution & Parsing `tools/e2e-advisor/pi-analyze.mjs`	Selects provider/model from environment variables; prepares temporary Pi agent config directory with auth, settings, and models (token-substituted); executes `pi` binary via spawnSync with timeout/maxBuffer limits; writes combined stdout/stderr; extracts JSON via trimmed, fenced, tagged, or balanced-object parsing heuristics.
Pi Analysis: Normalization & Output `tools/e2e-advisor/pi-analyze.mjs`	Normalizes parsed Pi JSON to fixed result contract; sanitizes and constrains domain/test/recommendation fields and confidence to low\|medium\|high; conditionally includes dispatchHint; renders markdown summary with base/head, confidence, required/optional sections, new recommendations, and optional dispatch details; writes JSON and markdown artifacts.
GitHub Comment Integration `tools/e2e-advisor/comment.mjs`	Loads markdown summary (with fallback paths) and optional JSON result; parses CLI args with env fallbacks; builds marked comment body with test lists and optional dispatch hint; fetches up to 100 PR comments via GitHub REST API; searches for existing `<!-- nemoclaw-e2e-advisor -->` marker and updates via PATCH or creates new comment via POST; skips on missing credentials or permission errors.
GitHub Actions Workflow `.github/workflows/e2e-advisor.yaml`	Triggered on PR events (opened, synchronize, reopened, ready_for_review) and manual dispatch; checks out trusted advisor code and PR/workdir with full history; sets up Node.js 22; installs Pi agent; optionally targets cross-repo PR by creating temporary git repo; runs `pi-analyze.mjs` with provider/model and API-key env wiring and run gate; appends summary to job summary; posts PR comment via `comment.mjs`; uploads `artifacts/e2e-advisor/`; restricted to NVIDIA/NemoClaw with per-PR/ref concurrency cancellation.
Documentation & Usage Guide `tools/e2e-advisor/README.md`	Describes workflow triggers, fork-skip/internal-repo scoping, Pi installation, analysis step, artifacts output, sticky PR comment marker, Pi safety model (static analysis, read-only tools, no PR script execution, `/tmp` credential storage), required/optional secrets, expected artifacts, manual CLI invocation with base/head/schema/out-dir args, and the output contract defined by `schema.json`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble diffs and hum a tune,
I chase each changed file under the moon,
I whisper tests that ought to run,
Post findings neat—one by one.
Hop on, dear PR; let coverage bloom.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly and concisely summarizes the main change: introducing an E2E Advisor workflow that provides test recommendations for pull requests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 76-95: The workflow currently hard-codes fetching and diffing
against target/main; add and use the PR’s actual base branch (e.g.
inputs.target_base) instead: when inputs.target_repo and inputs.target_pr are
provided, replace the git fetch of "target main" with fetching "target
${inputs.target_base}" (change the git -C "$TARGET_DIR" fetch --no-tags target
main line) and change the BASE_REF branch selection to use
"target/${inputs.target_base}" instead of "target/main" (update the BASE_REF env
expression in the Run deterministic E2E advisor baseline step); keep HEAD_REF
logic the same.
- Around line 132-136: The fallback printf contains stray shell tokens "},{"
after the redirect which breaks the step when the summary is missing; edit the
conditional fallback (the printf call inside the e2e-advisor summary step) to
remove the extraneous "},{" so the line becomes a plain printf redirecting to
"$GITHUB_STEP_SUMMARY" and ensure the surrounding if/else/fi block remains
properly balanced.

In `@scripts/check-e2e-advisor-config.ts`:
- Around line 20-33: The manifest validation currently only ensures a workflow
field exists and doesn't detect duplicate test ids or verify the workflow
actually exists; update the loop that iterates manifest.tests (and the testIds
Set logic) to (1) detect duplicate test IDs by checking if an id is already in
testIds and push a failure like "duplicate test id X" when found, (2) verify the
referenced workflow id exists in manifest.workflows (e.g., check
manifest.workflows.some(w => w.id === test.workflow)) and push a failure if
missing, and (3) check that the workflow's mapped file exists on disk (use the
same fs.existsSync(path.join(root, <workflow>.file)) pattern) and push an
appropriate failure; keep using requireField for presence checks and append
failures to the existing failures array.

In `@tools/e2e-advisor/advisor.mjs`:
- Around line 131-145: The getChangedFiles function currently swallows failures
from execFileSync and returns [] (making callers think "no changes"); change
this to fail closed and surface degraded-state by throwing a descriptive Error
or returning a distinct sentinel (e.g., null) when both diff variants fail so
callers can emit a degraded-analysis result instead of "no changed files"; also
increase execFileSync's maxBuffer in the options used in getChangedFiles (where
execFileSync is called and the candidates array is iterated) to a larger value
(e.g., { encoding: "utf8", maxBuffer: 10 * 1024 * 1024 }) to avoid silent
truncation on large diffs.

In `@tools/e2e-advisor/comment.mjs`:
- Around line 36-48: When calling the GitHub API to create or patch the PR
comment (the github(...) calls inside the block that checks existing from
findExistingComment), wrap those calls in try/catch and detect permission-denied
errors (HTTP 403 with message like "Resource not accessible by integration" or
similar). If such an error is caught, log a clear info/debug message (e.g.,
"Skipping E2E advisor comment due to permission error") and return/exit the
function without throwing so the step is treated as best-effort; rethrow or
propagate any other unexpected errors. Ensure both the PATCH path (when
existing) and the POST path (when creating) use the same try/catch behavior.

In `@tools/e2e-advisor/pi-analyze.mjs`:
- Around line 236-258: The normalizePiResult function currently copies nested
arrays/objects from the model result into fields like classifiedDomains,
requiredTests, optionalTests, newE2eRecommendations, and dispatchHint without
validating their inner structure; update normalizePiResult to validate and
sanitize each of those fields (e.g., ensure classifiedDomains is an array of
objects with expected keys, requiredTests/optionalTests are arrays of
well-formed test descriptors, newE2eRecommendations are arrays of
strings/objects as expected, and dispatchHint is an object with allowed
properties) and if any field fails validation fallback to
baseline.<function>normalizePiResult</function> Apply the same validation
pattern used for confidence and top-level presence checks: check types, check
element shapes, filter out invalid entries, and only assign
normalized.dispatchHint/result fields when they meet the schema; otherwise
assign baseline.<function>normalized</function> fields should never be assigned
raw model payloads unless they pass these validations.

In `@tools/e2e-advisor/README.md`:
- Line 1: The file begins with the H1 "# E2E Advisor" but is missing the
repo-required SPDX license header; add the SPDX header line (e.g.,
"SPDX-License-Identifier: <LICENSE-ID>") as the very first line of the file,
above the "# E2E Advisor" heading so the SPDX identifier appears at the top of
the file per the coding guidelines.

In `@tools/e2e-advisor/schema.json`:
- Around line 67-74: The dispatchHint schema currently allows an empty object
and must require both properties; update the "dispatchHint" object schema so
that "workflow" and "jobsInput" are listed under "required" (ensuring they are
present when dispatchHint exists) while keeping "additionalProperties": false;
target the "dispatchHint" schema and add "required": ["workflow","jobsInput"] so
summary renderers can safely access dispatchHint.workflow and
dispatchHint.jobsInput.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a74cd759-4b20-4613-8a03-9b36249c6550

📥 Commits

Reviewing files that changed from the base of the PR and between f1568f6 and f3202a7.

📒 Files selected for processing (11)

.github/workflows/e2e-advisor.yaml
scripts/check-e2e-advisor-config.ts
scripts/checks/run.ts
test/e2e/e2e-manifest.yaml
tools/e2e-advisor/README.md
tools/e2e-advisor/advisor.mjs
tools/e2e-advisor/comment.mjs
tools/e2e-advisor/pi-analyze.mjs
tools/e2e-advisor/pi-models.template.json
tools/e2e-advisor/rules.yaml
tools/e2e-advisor/schema.json

coderabbitai

♻️ Duplicate comments (1)

scripts/check-e2e-advisor-config.ts (1)

20-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add manifest cross-validation for duplicate test IDs and workflow references.

testIds is built as a set of mapped IDs, so duplicate IDs are silently collapsed, and test.workflow is only checked for presence—not whether it maps to a declared workflow (and file). This can let invalid configs pass CI.

Suggested minimal fix

-const testIds = new Set((manifest.tests ?? []).map((test: { id: string }) => test.id));
+const testIds = new Set<string>();
+const workflowById = new Map(
+  (manifest.workflows ?? []).map((workflow: { id?: string; file?: string }) => [workflow.id, workflow]),
+);

-if (testIds.size === 0) {
-  failures.push("test/e2e/e2e-manifest.yaml must define at least one test");
-}
-
 for (const test of manifest.tests ?? []) {
   requireField(test, "id", `manifest test ${test.id ?? "<unknown>"}`);
   requireField(test, "workflow", `manifest test ${test.id ?? "<unknown>"}`);
   requireField(test, "job", `manifest test ${test.id ?? "<unknown>"}`);
+  if (test.id) {
+    if (testIds.has(test.id)) {
+      failures.push(`duplicate test id ${test.id}`);
+    } else {
+      testIds.add(test.id);
+    }
+  }
+  if (test.workflow) {
+    const workflow = workflowById.get(test.workflow);
+    if (!workflow) {
+      failures.push(`manifest test ${test.id ?? "<unknown>"} references unknown workflow ${test.workflow}`);
+    } else if (!workflow.file || !fs.existsSync(path.join(root, workflow.file))) {
+      failures.push(`workflow ${test.workflow} references missing file ${workflow.file ?? "<unknown>"}`);
+    }
+  }
   if (test.script && !fs.existsSync(path.join(root, test.script))) {
     failures.push(`manifest test ${test.id} references missing script ${test.script}`);
   }
 }
+
+if (testIds.size === 0) {
+  failures.push("test/e2e/e2e-manifest.yaml must define at least one test");
+}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/check-e2e-advisor-config.ts` around lines 20 - 33, The manifest
validation silently ignores duplicate test IDs and doesn't validate that each
test.workflow points to a declared workflow file; update the loop that iterates
manifest.tests to detect duplicates by tracking seen IDs (instead of only using
testIds Set created from map) and push a failure when an ID is repeated
(referencing testIds and manifest.tests), and also validate that test.workflow
maps to an existing workflow entry/file by checking the workflows list and/or
fs.existsSync(path.join(root, workflowPath)) and pushing to failures when
missing (referencing requireField, failures, root, fs, and path).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/check-e2e-advisor-config.ts`:
- Around line 20-33: The manifest validation silently ignores duplicate test IDs
and doesn't validate that each test.workflow points to a declared workflow file;
update the loop that iterates manifest.tests to detect duplicates by tracking
seen IDs (instead of only using testIds Set created from map) and push a failure
when an ID is repeated (referencing testIds and manifest.tests), and also
validate that test.workflow maps to an existing workflow entry/file by checking
the workflows list and/or fs.existsSync(path.join(root, workflowPath)) and
pushing to failures when missing (referencing requireField, failures, root, fs,
and path).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d99c4b6d-0131-4651-8006-6023c20f585f

📥 Commits

Reviewing files that changed from the base of the PR and between f3202a7 and 88ce405.

📒 Files selected for processing (4)

scripts/check-e2e-advisor-config.ts
tools/e2e-advisor/advisor.mjs
tools/e2e-advisor/comment.mjs
tools/e2e-advisor/pi-analyze.mjs

🚧 Files skipped from review as they are similar to previous changes (3)

tools/e2e-advisor/comment.mjs
tools/e2e-advisor/pi-analyze.mjs
tools/e2e-advisor/advisor.mjs

coderabbitai

♻️ Duplicate comments (1)

tools/e2e-advisor/comment.mjs (1)

36-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle permission errors from comment lookup too (Line 36).

findExistingComment(...) runs before the try, so a 403 on the list-comments call still throws and marks the step failed instead of best-effort skipping.

Suggested minimal fix

-const existing = await findExistingComment(repo, pr, token, marker);
 try {
+  const existing = await findExistingComment(repo, pr, token, marker);
   if (existing) {
     await github(`repos/${repo}/issues/comments/${existing.id}`, token, {
       method: "PATCH",
       body: { body },
     });
     console.log(`Updated E2E advisor comment on ${repo}#${pr}`);
   } else {
     await github(`repos/${repo}/issues/${pr}/comments`, token, {
       method: "POST",
       body: { body },
     });
     console.log(`Created E2E advisor comment on ${repo}#${pr}`);
   }
 } catch (error) {
   if (isPermissionError(error)) {
-    console.log(`Skipping E2E advisor comment due to permission error: ${error.message}`);
+    const message = error instanceof Error ? error.message : String(error);
+    console.log(`Skipping E2E advisor comment due to permission error: ${message}`);
   } else {
     throw error;
   }
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-advisor/comment.mjs` around lines 36 - 57, The call to
findExistingComment(repo, pr, token, marker) can throw a permission error before
the surrounding try/catch runs; wrap that lookup in the same try block (or add a
small try/catch around findExistingComment) and on permission errors (using
isPermissionError) skip creating/updating the comment the same way you do for
the github POST/PATCH calls—keep the existing variable name existing and then
proceed to the existing logic for PATCH/POST only if lookup succeeded, otherwise
log and return.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tools/e2e-advisor/comment.mjs`:
- Around line 36-57: The call to findExistingComment(repo, pr, token, marker)
can throw a permission error before the surrounding try/catch runs; wrap that
lookup in the same try block (or add a small try/catch around
findExistingComment) and on permission errors (using isPermissionError) skip
creating/updating the comment the same way you do for the github POST/PATCH
calls—keep the existing variable name existing and then proceed to the existing
logic for PATCH/POST only if lookup succeeded, otherwise log and return.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 15c06975-df92-4386-885a-3641d121d74a

📥 Commits

Reviewing files that changed from the base of the PR and between 88ce405 and 509214d.

📒 Files selected for processing (6)

.github/workflows/e2e-advisor.yaml
scripts/check-e2e-advisor-config.ts
tools/e2e-advisor/README.md
tools/e2e-advisor/advisor.mjs
tools/e2e-advisor/comment.mjs
tools/e2e-advisor/schema.json

🚧 Files skipped from review as they are similar to previous changes (5)

tools/e2e-advisor/schema.json
tools/e2e-advisor/README.md
scripts/check-e2e-advisor-config.ts
.github/workflows/e2e-advisor.yaml
tools/e2e-advisor/advisor.mjs

wscurran · 2026-05-11T14:41:24Z

✨ Thanks for submitting this detailed PR about adding E2E Advisor recommendations. This change aims to improve the testing infrastructure by introducing a workflow that suggests required and optional E2E tests for PRs, enhancing the overall development and review process.

Related open PRs:

#3244 fix(inference): suppress kimi reasoning output

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tools/e2e-advisor/pi-analyze.mjs (1)

226-246: ⚡ Quick win

Add a guard for non-object Pi results.

The sanitization functions address the previous review concern about validating nested structures. However, if extractJson returns null (from JSON.parse("null")), accessing result.classifiedDomains will throw a confusing error. Consider adding an early guard.

♻️ Proposed guard

 function normalizePiResult(result, metadata) {
+  if (!result || typeof result !== "object" || Array.isArray(result)) {
+    throw new Error("Pi returned a non-object result");
+  }
+
   const normalized = {
     version: 1,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-advisor/pi-analyze.mjs` around lines 226 - 246, normalizePiResult
currently assumes `result` is an object and will throw if `result` is null or
not an object (e.g., JSON.parse("null")). Add an early guard in
normalizePiResult that checks whether `result` is a non-null object (typeof
result === "object" && result !== null); if not, return a normalized object with
the same top-level fields (version, baseRef/headRef from metadata, changedFiles,
and sensible defaults or empty values for classifiedDomains, requiredTests,
optionalTests, newE2eRecommendations, noE2eReason, confidence) or otherwise set
those fields to null/empty via the existing sanitize* helpers, then continue
with the existing dispatchHint handling. Ensure you reference the sanitize
functions (sanitizeDomains, sanitizeTests, sanitizeNewRecommendations,
sanitizeDispatchHint) and the metadata fields (baseRef, headRef, changedFiles)
when building the fallback normalized result.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 99-100: The workflow step named "Install Pi" currently installs
`@mariozechner/pi-coding-agent` without a version; change the npm install command
in that step to pin the package to a specific known-good semver (e.g., append
@<version>) so the action installs a fixed release instead of the latest, and
add/update repository Dependabot/Renovate configuration to track and safely
propose upgrades for `@mariozechner/pi-coding-agent`.

---

Nitpick comments:
In `@tools/e2e-advisor/pi-analyze.mjs`:
- Around line 226-246: normalizePiResult currently assumes `result` is an object
and will throw if `result` is null or not an object (e.g., JSON.parse("null")).
Add an early guard in normalizePiResult that checks whether `result` is a
non-null object (typeof result === "object" && result !== null); if not, return
a normalized object with the same top-level fields (version, baseRef/headRef
from metadata, changedFiles, and sensible defaults or empty values for
classifiedDomains, requiredTests, optionalTests, newE2eRecommendations,
noE2eReason, confidence) or otherwise set those fields to null/empty via the
existing sanitize* helpers, then continue with the existing dispatchHint
handling. Ensure you reference the sanitize functions (sanitizeDomains,
sanitizeTests, sanitizeNewRecommendations, sanitizeDispatchHint) and the
metadata fields (baseRef, headRef, changedFiles) when building the fallback
normalized result.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b5a7ba02-9072-45ed-892c-920137126574

📥 Commits

Reviewing files that changed from the base of the PR and between 509214d and 5118e9a.

📒 Files selected for processing (3)

.github/workflows/e2e-advisor.yaml
tools/e2e-advisor/README.md
tools/e2e-advisor/pi-analyze.mjs

ericksoa

Thanks for putting this together. I think this needs changes before merge.

The blocking issue is the trusted-code boundary in the automatic pull_request path. The workflow checks out the PR workspace and then runs the advisor implementation from that same workspace (tools/e2e-advisor/pi-analyze.mjs / comment.mjs) while exposing the Pi/provider API key environment and an issue-comment token later in the job. That does not match the documented safety model: a future same-repo PR can modify the advisor scripts themselves and execute arbitrary Node in a secrets-bearing workflow. Please split this so the workflow runs advisor code and installs dependencies from trusted main or another pinned trusted ref, while checking out the target PR only as the read-only analysis workdir.

A second issue is the unpinned npm install -g @mariozechner/pi-coding-agent. Since this job is intended to run automatically and may carry API credentials, please pin this to a known-good version and use the normal dependency-update path for upgrades.

Two smaller fixes are also worth folding in while touching this: move findExistingComment(...) in tools/e2e-advisor/comment.mjs inside the permission-error try/catch, and add a clear guard in normalizePiResult for non-object Pi output such as JSON null.

I validated the current head locally with git diff --check, node --check tools/e2e-advisor/pi-analyze.mjs, node --check tools/e2e-advisor/comment.mjs, and the no-credential pi-analyze.mjs path. The no-credential degradation works, but the trusted-boundary issue still blocks approval.

@ericksoa

Addresses the two minor items from the PR NVIDIA#3289 review: - tools/e2e-advisor/comment.mjs: move findExistingComment() inside the try/catch so a 403 from listing existing comments is treated as a permission error and skipped, matching the PATCH/POST error handling. Also coerce non-Error throwables to strings before logging. - tools/e2e-advisor/pi-analyze.mjs: reject non-object Pi output (null, arrays, primitives) explicitly in normalizePiResult instead of letting property access on a non-object silently produce an empty result. Refs: PR NVIDIA#3289 review by @ericksoa

@ericksoa

Restructures the e2e-advisor workflow to fix the trusted-code boundary violation flagged in PR NVIDIA#3289 review. Before: the workflow checked out the PR workspace and then invoked tools/e2e-advisor/pi-analyze.mjs and comment.mjs *from that same workspace* while API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, PI_E2E_ADVISOR_API_KEY, etc.) and an issues-write token were in scope. A future same-repo PR could have modified the advisor scripts themselves and executed arbitrary Node in a secrets-bearing job. After: - Advisor code (pi-analyze.mjs, comment.mjs, schema.json, pi-models.template.json) is always checked out from NVIDIA/NemoClaw main into $GITHUB_WORKSPACE/advisor and is the only code executed in the job. Updates to the advisor flow through main review. - PR content (or dispatch ref, or cloned target PR) is checked out as inert analysis data into $GITHUB_WORKSPACE/pr-workdir. The advisor scripts read it via git diff / fs.readFileSync; nothing from this directory is executed as code. - Both checkouts use persist-credentials: false. - @mariozechner/pi-coding-agent is pinned via PI_AGENT_VERSION (0.73.1) so a compromised upstream release cannot execute automatically; upgrades go through a normal code-review change to this file. - Removed the top-level 'npm ci' install step — the advisor scripts use only Node built-ins, so the previous install brought the full repo devDependency tree into the secret-bearing job for no benefit. Refs: PR NVIDIA#3289 review by @ericksoa; CodeRabbit comment on pin.

jyaunches · 2026-05-11T17:58:46Z

Thanks for the thorough review, @ericksoa — all four items addressed. New HEAD: 8f27e2943.

1. 🔴 Trusted-code boundary (blocking) — fixed

Commit 8f27e2943. The workflow now does a two-stage checkout:

advisor/ ← NVIDIA/NemoClaw@main — the only code executed in the job (pi-analyze.mjs, comment.mjs, schema.json, pi-models.template.json). Changes to the advisor flow through main review.
pr-workdir/ ← github.event.pull_request.head.sha — inert analysis data only. The trusted advisor scripts read it via git diff / fs.readFileSync; nothing from this directory is invoked as code.

Both checkouts use persist-credentials: false. The advisor is invoked as node "$ADVISOR_DIR/tools/e2e-advisor/pi-analyze.mjs" with cwd=$ADVISOR_WORKDIR (the PR workdir) so git diff still targets the PR content. comment.mjs is likewise invoked from $ADVISOR_DIR.

A future PR that modifies tools/e2e-advisor/*.mjs in the head ref can no longer execute its version in the secrets-bearing job.

2. 🔴 Unpinned `pi-coding-agent` (blocking) — fixed

Commit 8f27e2943. Pinned via a job-level env var:

env:
  PI_AGENT_VERSION: "0.73.1"
# ...
- name: Install Pi
  run: npm install -g "@mariozechner/pi-coding-agent@${PI_AGENT_VERSION}"

Version bumps are now a reviewed code change to this file. (Also resolves the CodeRabbit thread on line 100.)

3. 🟡 `findExistingComment` inside try/catch — fixed

Commit 2e7c5a5bc. The GET issues/{pr}/comments call now lives inside the try block, so a 403 from listing is treated as a permission error and skipped consistently with the PATCH/POST paths. Also coerced non-Error throwables to strings in the log line.

4. 🟡 `normalizePiResult` guard — fixed

Commit 2e7c5a5bc. Added an explicit check at the top of normalizePiResult:

if (!result || typeof result !== "object" || Array.isArray(result)) {
  throw new Error("Pi returned a non-object result");
}

null, arrays, and primitives now fail loudly via the existing writeFailure path instead of silently producing an empty result.

Incidental

Dropped the top-level npm ci step — the advisor scripts use only Node built-ins, so the previous install was pulling the full repo devDependency tree into a secrets-bearing job for no benefit.

Local re-validation

Re-ran the same checks you did against the new head:

git diff --check — clean
node --check tools/e2e-advisor/pi-analyze.mjs — ok
node --check tools/e2e-advisor/comment.mjs — ok
PI_E2E_ADVISOR_RUN_PI=0 node tools/e2e-advisor/pi-analyze.mjs --base origin/main --head HEAD --schema tools/e2e-advisor/schema.json --out-dir /tmp/verify — no-credential path produces the four expected artifacts (e2e-advisor-final-result.json, e2e-advisor-pi-prompt.md, e2e-advisor-pi-result.json, e2e-advisor-pi-summary.md).

ericksoa

The prior blocker is addressed: the secret-bearing advisor job now runs trusted advisor code from main, treats PR contents as data, pins the Pi agent, and handles comment/write and malformed-output failure cases cleanly. Local smoke validation and live CI look good; the remaining risk is limited to the accepted same-repo secret-bearing workflow posture.

@ericksoa

…3366) ## Summary Adds `pull-requests: write` to the `e2e-advisor` workflow so the advisor can actually post its recommendation comment on PRs. ## What's broken on `main` today Since PR #3289 merged ~2 hours ago, the advisor workflow has been running successfully on same-repo PRs — but the `Post E2E advisor PR comment` step silently skips every time: ``` 403 {"message":"Resource not accessible by integration"} ``` Concrete evidence on PR #3364: | Run | Job | Comment posted? | |---|---|---| | [25697796461](https://github.com/NVIDIA/NemoClaw/actions/runs/25697796461) | ✅ success | ❌ no | | [25699366979](https://github.com/NVIDIA/NemoClaw/actions/runs/25699366979) | ✅ success | ❌ no | The runs look green because `comment.mjs` catches the 403 as a permission error (a defensive path added during the #3289 review) and keeps the job passing, but the user-visible artifact — the comment on the PR — never materializes. ## Root cause The `permissions:` block on `main` currently declares: ```yaml permissions: contents: read pull-requests: read issues: write ``` The token effective permissions confirm this takes effect (from the `Set up job` log group): ``` GITHUB_TOKEN Permissions Contents: read Issues: write Metadata: read PullRequests: read ``` Even so, `POST /repos//issues/3364/comments` returns `403 Resource not accessible by integration`. The reason is a well-documented GitHub Actions permission-name trap: for `GITHUB_TOKEN`, **PR comments are gated by `pull-requests: write`, not `issues: write`** — even though the REST endpoint lives under `/issues/:n/comments`. The `issues` permission only applies to true issues; PRs fall under `pull-requests`. This is tracked in [community discussion #56632](https://github.com/orgs/community/discussions/56632) and has bitten many workflows. ## Fix Flip `pull-requests` from `read` to `write`, and drop an inline comment on the permissions block so the next reviewer doesn't have to rediscover this. `issues: write` is kept for now — it's unused on the PR-comment path but it's dirt-cheap to leave in case we ever want the advisor to post on true issues. ## Defense in depth The graceful permission-error handling in `tools/e2e-advisor/comment.mjs` (added in #3289 per @ericksoa's review) stays in place unchanged. This PR just ensures that for a correctly-configured repo the 403 path is no longer hit on normal same-repo PRs. ## Testing - `python3 -c "import yaml; yaml.safe_load(open('.github/workflows/e2e-advisor.yaml'))"` — workflow still parses. - Non-code change, so local unit tests are unaffected. - The real verification is the next same-repo PR after merge — expected behavior is an advisor comment appearing within a few minutes. ## Out of scope for this PR There's a second gap visible on upstream: `PI_E2E_ADVISOR_API_KEY` is not set on `NVIDIA/NemoClaw`, so the analysis step currently degrades to: > `Skipped: No Pi provider credential was available in this workflow environment` That's an admin configuration task (Settings → Secrets and variables → Actions → New repository secret), not a code change, so handling it separately. ## Related - PR #3289 (merged): introduced the advisor + the trust-boundary restructure. - Observed on PR #3364 (same-repo, member-authored, fresh push — exactly the case the advisor is designed for).  ## Summary by CodeRabbit * **Chores** * Updated pull request workflow permissions to enable advisor to post and update comments on pull requests. [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3366)

jyaunches added 18 commits May 8, 2026 14:01

ci: prototype e2e advisor

f019094

ci: add pi semantic e2e analysis

18d91c5

ci: pass pi advisor prompt via stdin

c0d6f89

ci: skip pi analysis when credentials are unavailable

116fc53

ci: support generic pi advisor api key

f176032

ci: default pi advisor key to openrouter provider

118cf78

ci: preserve pi advisor artifacts on failure

c7af258

ci: configure pi advisor for nvidia inference key

01c4818

ci: keep generated pi config out of artifacts

49b8919

ci: resolve pi model template relative to advisor script

2db6d27

ci: allow advisor workflow to analyze target PRs

69a4af8

ci: stage advisor assets before target PR checkout

f27dc39

ci: resolve advisor dependencies from staged workspace

d221891

ci: analyze target PR in separate checkout

4a7c3b9

ci: comment e2e advisor recommendations on PRs

37d0bda

ci: allow advisor comment to use repo write token

1cb57fb

ci: brand workflow as e2e advisor

03863fb

test: add e2e advisor config validation

f3202a7

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

ci: satisfy e2e advisor pre-push hooks

88ce405

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

ci: address e2e advisor review feedback

509214d

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

wscurran added CI/CD labels May 11, 2026

ci: make e2e advisor pi-first

5118e9a

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread .github/workflows/e2e-advisor.yaml Outdated

jyaunches added the v0.0.39 label May 11, 2026

jyaunches requested a review from ericksoa May 11, 2026 14:51

ericksoa requested changes May 11, 2026

View reviewed changes

jyaunches added 2 commits May 11, 2026 13:51

ericksoa approved these changes May 11, 2026

View reviewed changes

ericksoa merged commit 6899eb1 into NVIDIA:main May 11, 2026
20 checks passed

jyaunches mentioned this pull request May 11, 2026

fix(ci): grant pull-requests:write so advisor can post PR comments #3366

Merged

Conversation

jyaunches commented May 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Safety model

Secrets

Validation

Follow-up

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

wscurran commented May 11, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

jyaunches commented May 11, 2026

1. 🔴 Trusted-code boundary (blocking) — fixed

2. 🔴 Unpinned pi-coding-agent (blocking) — fixed

3. 🟡 findExistingComment inside try/catch — fixed

4. 🟡 normalizePiResult guard — fixed

Incidental

Local re-validation

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jyaunches commented May 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 8, 2026 •

edited

Loading

2. 🔴 Unpinned `pi-coding-agent` (blocking) — fixed

3. 🟡 `findExistingComment` inside try/catch — fixed

4. 🟡 `normalizePiResult` guard — fixed