Skip to content

refactor(ci): embed e2e advisor with Pi SDK#3504

Merged
jyaunches merged 14 commits into
mainfrom
refactor/e2e-advisor-sdk
May 14, 2026
Merged

refactor(ci): embed e2e advisor with Pi SDK#3504
jyaunches merged 14 commits into
mainfrom
refactor/e2e-advisor-sdk

Conversation

@cv

@cv cv commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

Refactors the E2E advisor from a spawned Pi CLI process into an embedded Pi SDK session. This keeps the recommendation behavior and artifact contract while simplifying process handling and adding an exported HTML session artifact.

Changes

  • Replace tools/e2e-advisor/pi-analyze.mjs with TypeScript tools/e2e-advisor/analyze.mts using the Pi SDK directly.
  • Update the E2E advisor workflow to install @earendil-works/pi-coding-agent, run the TypeScript analyzer, and upload renamed advisor artifacts including e2e-advisor-session.html.
  • Rename advisor reports/artifacts away from pi-* wording while keeping the existing PI_E2E_ADVISOR_API_KEY GitHub secret mapped into the analyzer environment.
  • Update E2E advisor docs and PR comment defaults for the new artifact names.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Note: npm test was run, but the existing installer-integration test skips onboarding when shared host preflight detects Docker is missing fails locally with status 1 instead of 0; targeted rerun reproduces the same unrelated failure.


Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Chores

    • Updated E2E Advisor workflow to an SDK-backed analysis flow: renamed run toggle and job title, removed legacy provider/model inputs, reduced environment surface for API keys, isolated SDK install, and consolidated summary artifact used for PR comments.
    • Replaced the prior analysis script with a new CI-friendly analyzer and added the SDK dev dependency.
  • Documentation

    • Revised E2E Advisor docs and manual-run instructions to reflect the SDK-powered reviewer, artifacts, and safety model.

@cv cv self-assigned this May 14, 2026
@coderabbitai

coderabbitai Bot commented May 14, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c86ab936-3672-4f8c-abab-ffbcd5103e26

📥 Commits

Reviewing files that changed from the base of the PR and between 576f34b and a615374.

📒 Files selected for processing (3)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/analyze.mts
✅ Files skipped from review due to trivial changes (1)
  • tools/e2e-advisor/README.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/analyze.mts

📝 Walkthrough

Walkthrough

Migrate the E2E advisor from a Pi-specific CLI to an SDK-driven analyzer: add an SDK devDependency, add a TypeScript analyzer CLI that builds schema-driven prompts, normalizes JSON results, and writes artifacts; update the workflow to install and run the SDK-based analyzer; refresh docs, comment tooling, and artifact/summary paths.

Changes

E2E Advisor SDK Migration

Layer / File(s) Summary
SDK dependency
package.json
Add @earendil-works/pi-coding-agent@0.74.0 to devDependencies.
Workflow inputs & wiring
.github/workflows/e2e-advisor.yaml
Replace run_pi input with run_analysis (default true); remove shown Pi-specific provider/model inputs; rename job display to “E2E recommendation”; replace pinned env var PI_AGENT_VERSIONPI_SDK_VERSION; update artifact/PR comment --summary path to artifacts/e2e-advisor/e2e-advisor-summary.md.
Trusted install & execution
.github/workflows/e2e-advisor.yaml
Stop global install + pi-analyze.mjs. Instead install @earendil-works/pi-coding-agent@${PI_SDK_VERSION} into an isolated npm prefix (--ignore-scripts), symlink its node_modules into the trusted checkout, and run tools/e2e-advisor/analyze.mts via node --experimental-strip-types. Switch analyzer toggle to E2E_ADVISOR_RUN_ANALYSIS and reduce provider-env wiring to E2E_ADVISOR_API_KEY and OPENAI_API_KEY.
Analyzer CLI entry & orchestration
tools/e2e-advisor/analyze.mts
New TypeScript CLI: parse args/env (base/head/schema/outDir), compute changed files and truncated git diff, write prompt artifact, gate on E2E_ADVISOR_RUN_ANALYSIS, prepare in-memory SDK auth/config when E2E_ADVISOR_API_KEY/OPENAI_API_KEY present, run read-only agent session with timeout/heartbeat and capped capture buffers, export session HTML best-effort, extract JSON via multiple strategies, normalize to validated AdvisorResult, render markdown summary, write artifacts (prompt/raw/result/final/summary), and set exit codes (0 for skip, 1 for failure).
Capture, parsing & normalization internals
tools/e2e-advisor/analyze.mts (CappedBuffer, extractJson, normalizeAdvisorResult)
Byte-accurate UTF-8-capped buffer with dropped-byte accounting; multi-strategy JSON extraction (raw, fenced, tagged, balanced substring); sanitization/filtering of domains/tests/recommendations; confidence coercion; optional dispatchHint validation; standardized unavailable/failed artifact outputs.
Comment tool & docs, legacy removal
tools/e2e-advisor/comment.mts, tools/e2e-advisor/README.md, (deleted) tools/e2e-advisor/pi-analyze.mjs
Change default summary path to artifacts/e2e-advisor/e2e-advisor-summary.md; update README to describe SDK-based analyzer, manual run node --experimental-strip-types tools/e2e-advisor/analyze.mts, new artifact names, and revised safety/credentials wording; remove legacy pi-analyze.mjs.
sequenceDiagram
    participant GH as GitHub Actions
    participant Analyzer as analyze.mts
    participant Git as Git
    participant SDK as Advisor SDK
    participant Provider as AI Provider
    participant Artifacts as Artifacts

    GH->>Analyzer: workflow dispatch / env
    Analyzer->>Git: compute changed files & diff
    Git-->>Analyzer: files + diff
    Analyzer->>Artifacts: write prompt artifact
    alt skip (E2E_ADVISOR_RUN_ANALYSIS=0 or no creds)
        Analyzer->>Artifacts: write unavailable summary/result
        Analyzer->>GH: exit 0
    else run analysis
        Analyzer->>SDK: prepare in-memory config, select model
        SDK->>Provider: send prompt, enforce read-only tools
        Provider-->>SDK: response + tool events
        SDK-->>Analyzer: captured output (capped, timeout-bounded)
        Analyzer->>Analyzer: extract JSON, normalize AdvisorResult
        Analyzer->>Artifacts: write raw/result/final/summary
        Analyzer->>GH: publish summary, post PR comment
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3426: Shares E2E advisor CI/commenting changes and depends on advisor artifacts consumed by downstream automation.

Suggested labels

CI/CD, E2E

Suggested reviewers

  • ericksoa
  • jyaunches

"I hopped through diffs and built a prompt,
Trimmed the bytes and kept it prompt-perfect,
SDK-powered hops, no scripts allowed,
Summary tucked in artifacts proud. 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'refactor(ci): embed e2e advisor with Pi SDK' directly and clearly summarizes the main change: refactoring the CI workflow to embed the E2E advisor using the Pi SDK instead of spawning an external process.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/e2e-advisor-sdk

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread tools/e2e-advisor/analyze.mts Fixed

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/e2e-advisor.yaml (1)

185-189: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard trusted analyzer entrypoint existence before execution.

Line 185 invokes analyze.mts unconditionally from the trusted main checkout, which is causing MODULE_NOT_FOUND in CI and cascading artifact/comment failures.

💡 Suggested fix
       - name: Run E2E recommendation advisor
         id: analysis
         continue-on-error: true
         env:
@@
         run: |
           cd "$ADVISOR_WORKDIR"
+          if [ ! -f "$ADVISOR_DIR/tools/e2e-advisor/analyze.mts" ]; then
+            mkdir -p "$GITHUB_WORKSPACE/artifacts/e2e-advisor"
+            printf '# E2E Recommendation Advisor\n\nSkipped: trusted main checkout does not yet contain tools/e2e-advisor/analyze.mts.\n' > "$GITHUB_WORKSPACE/artifacts/e2e-advisor/e2e-advisor-summary.md"
+            exit 0
+          fi
           node --experimental-strip-types "$ADVISOR_DIR/tools/e2e-advisor/analyze.mts" \
             --base "$BASE_REF" \
             --head "$HEAD_REF" \
             --schema "$ADVISOR_DIR/tools/e2e-advisor/schema.json" \
             --out-dir "$GITHUB_WORKSPACE/artifacts/e2e-advisor"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-advisor.yaml around lines 185 - 189, Guard the trusted
analyzer entrypoint before invoking node: check that
"$ADVISOR_DIR/tools/e2e-advisor/analyze.mts" exists and is readable and only run
the node command if it does; if the file is missing, log a clear message and
skip or exit gracefully so CI doesn't show MODULE_NOT_FOUND and cascade
failures. Update the step that runs analyze.mts to perform this existence check
(refer to analyze.mts and the ADVISOR_DIR variable and the node invocation that
writes to "$GITHUB_WORKSPACE/artifacts/e2e-advisor") and ensure downstream steps
handle the skipped case appropriately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 185-189: Guard the trusted analyzer entrypoint before invoking
node: check that "$ADVISOR_DIR/tools/e2e-advisor/analyze.mts" exists and is
readable and only run the node command if it does; if the file is missing, log a
clear message and skip or exit gracefully so CI doesn't show MODULE_NOT_FOUND
and cascade failures. Update the step that runs analyze.mts to perform this
existence check (refer to analyze.mts and the ADVISOR_DIR variable and the node
invocation that writes to "$GITHUB_WORKSPACE/artifacts/e2e-advisor") and ensure
downstream steps handle the skipped case appropriately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c00fde76-cd5d-4365-acca-ee660299208b

📥 Commits

Reviewing files that changed from the base of the PR and between a78ea16 and 746e7f2.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • .github/workflows/e2e-advisor.yaml
  • package.json
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/analyze.mts
  • tools/e2e-advisor/comment.mts
  • tools/e2e-advisor/models.template.json
  • tools/e2e-advisor/pi-analyze.mjs
💤 Files with no reviewable changes (1)
  • tools/e2e-advisor/pi-analyze.mjs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/e2e-advisor/analyze.mts`:
- Around line 349-354: The truncation banner can be lost because
raw.prepend(...) may be immediately trimmed away if raw is already full; change
the pattern to ensure space for the banner before inserting it (e.g., compute
banner = `<...truncated...>` then call a method that trims the buffer from the
opposite end or explicitly remove enough bytes from the end/front to make room,
and only then prepend), or alternatively append the banner to the end of the
buffer instead of prepending so trimming won't remove it. Update the occurrences
that use raw.prepend(...) (and the related text.droppedBytes logic) at the
locations shown (the prepend calls around the variables raw and text) to first
reserve room for the banner or append the banner so the truncation marker cannot
be immediately discarded.
- Around line 104-105: Temporary advisor configDir (variable configDir) and
files like auth.json/models.json and persisted session data must be removed
after the run to avoid leaving API keys on disk; add deterministic cleanup that
removes configDir (use fs.promises.rm or fs.rm with recursive:true, force:true)
in all exit scenarios by invoking deletion in a finally block around the main
async flow and/or registering handlers for process exit signals (exit, SIGINT,
uncaughtException) so configDir is always deleted whether the run succeeds or
errors; ensure deletion occurs after any code that reads/writes those files (the
blocks that write auth.json/models.json and persist session data) to avoid race
conditions.
- Around line 131-137: The credential check (hasLikelyAdvisorCredential()) and
the model materialization (prepareAdvisorConfig()) are inconsistent: preflight
allows provider-specific creds (ANTHROPIC_API_KEY, OPENAI_API_KEY, Bedrock etc.)
but prepareAdvisorConfig() only writes models.json when E2E_ADVISOR_API_KEY is
present, causing "Could not find configured advisor model" later. Fix by making
the flows consistent: either (A) tighten hasLikelyAdvisorCredential() to require
E2E_ADVISOR_API_KEY only, or (B) update prepareAdvisorConfig() to detect
provider-specific credentials (use the same detection logic as
hasLikelyAdvisorCredential()), resolve provider with E2E_ADVISOR_PROVIDER ||
defaultProvider detection, call defaultModelForProvider(provider) and write a
non-empty models.json for that provider even when E2E_ADVISOR_API_KEY is not
set; ensure code paths that reference modelPattern (E2E_ADVISOR_MODEL ||
defaultModelForProvider(provider)) match this logic so prepareAdvisorConfig()
materializes the same model names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8f0c5f45-3b31-40d0-b730-9eb1a4fa5f94

📥 Commits

Reviewing files that changed from the base of the PR and between 746e7f2 and 2d23d07.

📒 Files selected for processing (1)
  • tools/e2e-advisor/analyze.mts

Comment thread tools/e2e-advisor/analyze.mts
Comment thread tools/e2e-advisor/analyze.mts Outdated
Comment thread tools/e2e-advisor/analyze.mts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/e2e-advisor.yaml (1)

88-93: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't switch the trusted entrypoint to analyze.mts yet.

This job executes the analyzer from the separate ref: main checkout, not from the PR branch. main does not contain tools/e2e-advisor/analyze.mts yet, so the step hard-fails with MODULE_NOT_FOUND before any advisor artifacts are produced. Please keep the trusted run path on the existing analyzer until main has the new file, or add the same .mts/legacy fallback pattern you already use for dispatch.mts and comment.mts.

Also applies to: 157-175

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-advisor.yaml around lines 88 - 93, The workflow step
that checks out the trusted advisor code and then runs analyze.mts is switching
the trusted run to a file that doesn't exist on main and causes
MODULE_NOT_FOUND; restore the existing trusted analyzer path (keep running the
analyzer from the PR checkout) or add the same fallback logic used for
dispatch.mts and comment.mts so analyze resolution tries the legacy path first
then analyze.mts (mirror the pattern used around dispatch.mts/comment.mts);
update the checkout/step that references analyze.mts and the corresponding
resolution logic in the workflow (also apply the same change to the other
occurrence referenced in lines 157-175) so the job does not hard-fail when main
lacks tools/e2e-advisor/analyze.mts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.github/workflows/e2e-advisor.yaml:
- Around line 88-93: The workflow step that checks out the trusted advisor code
and then runs analyze.mts is switching the trusted run to a file that doesn't
exist on main and causes MODULE_NOT_FOUND; restore the existing trusted analyzer
path (keep running the analyzer from the PR checkout) or add the same fallback
logic used for dispatch.mts and comment.mts so analyze resolution tries the
legacy path first then analyze.mts (mirror the pattern used around
dispatch.mts/comment.mts); update the checkout/step that references analyze.mts
and the corresponding resolution logic in the workflow (also apply the same
change to the other occurrence referenced in lines 157-175) so the job does not
hard-fail when main lacks tools/e2e-advisor/analyze.mts.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1c50e94f-63ad-43d8-94eb-7ec3a930cfc0

📥 Commits

Reviewing files that changed from the base of the PR and between 576f34b and 60f0aa0.

📒 Files selected for processing (3)
  • .github/workflows/e2e-advisor.yaml
  • tools/e2e-advisor/README.md
  • tools/e2e-advisor/analyze.mts
✅ Files skipped from review due to trivial changes (1)
  • tools/e2e-advisor/README.md

@cv cv added the v0.0.42 label May 14, 2026
@wscurran wscurran added CI/CD refactor PR restructures code without intended behavior change labels May 14, 2026
@jyaunches jyaunches merged commit 85c6d7f into main May 14, 2026
26 checks passed
@cv cv deleted the refactor/e2e-advisor-sdk branch May 27, 2026 21:17
@wscurran wscurran added the area: ci CI workflows, checks, release automation, or GitHub Actions label Jun 3, 2026
@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance and removed CI/CD chore Build, CI, dependency, or tooling maintenance labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: ci CI workflows, checks, release automation, or GitHub Actions area: e2e End-to-end tests, nightly failures, or validation infrastructure refactor PR restructures code without intended behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants