ci: prototype E2E advisor by jyaunches · Pull Request #1 · jyaunches/NemoClaw

jyaunches · 2026-05-08T17:25:25Z

Summary

add a Phase 1 static E2E advisor prototype workflow
add an E2E manifest that maps jobs/scripts to semantic risk domains
add deterministic rules that classify changed files into domains and emit JSON/Markdown artifacts

Safety model

dry-run only: uploads artifacts and writes the job summary
read-only permissions: contents and pull requests only
static analysis only: reads git diffs/metadata and does not execute PR-provided repo scripts
no PR comments, labels, workflow dispatch, model credentials, or privileged tokens

Outputs

artifacts/e2e-advisor/e2e-advisor-input.json
artifacts/e2e-advisor/e2e-advisor-result.json
artifacts/e2e-advisor/e2e-advisor-summary.md

Test

ran locally: node tools/e2e-advisor/advisor.mjs --base origin/main --head HEAD --out-dir /tmp/e2e-advisor-smoke

github-actions · 2026-05-08T17:26:24Z

🚀 Docs preview ready!

https://NVIDIA.github.io/NemoClaw/pr-preview/pr-1/

@ericksoa

Addresses the two minor items from the PR NVIDIA#3289 review: - tools/e2e-advisor/comment.mjs: move findExistingComment() inside the try/catch so a 403 from listing existing comments is treated as a permission error and skipped, matching the PATCH/POST error handling. Also coerce non-Error throwables to strings before logging. - tools/e2e-advisor/pi-analyze.mjs: reject non-object Pi output (null, arrays, primitives) explicitly in normalizePiResult instead of letting property access on a non-object silently produce an empty result. Refs: PR NVIDIA#3289 review by @ericksoa

@ericksoa

Restructures the e2e-advisor workflow to fix the trusted-code boundary violation flagged in PR NVIDIA#3289 review. Before: the workflow checked out the PR workspace and then invoked tools/e2e-advisor/pi-analyze.mjs and comment.mjs *from that same workspace* while API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, PI_E2E_ADVISOR_API_KEY, etc.) and an issues-write token were in scope. A future same-repo PR could have modified the advisor scripts themselves and executed arbitrary Node in a secrets-bearing job. After: - Advisor code (pi-analyze.mjs, comment.mjs, schema.json, pi-models.template.json) is always checked out from NVIDIA/NemoClaw main into $GITHUB_WORKSPACE/advisor and is the only code executed in the job. Updates to the advisor flow through main review. - PR content (or dispatch ref, or cloned target PR) is checked out as inert analysis data into $GITHUB_WORKSPACE/pr-workdir. The advisor scripts read it via git diff / fs.readFileSync; nothing from this directory is executed as code. - Both checkouts use persist-credentials: false. - @mariozechner/pi-coding-agent is pinned via PI_AGENT_VERSION (0.73.1) so a compromised upstream release cannot execute automatically; upgrades go through a normal code-review change to this file. - Removed the top-level 'npm ci' install step — the advisor scripts use only Node built-ins, so the previous install brought the full repo devDependency tree into the secret-bearing job for no benefit. Refs: PR NVIDIA#3289 review by @ericksoa; CodeRabbit comment on pin.

Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.

jyaunches added 8 commits May 8, 2026 14:01

ci: prototype e2e advisor

f019094

ci: add pi semantic e2e analysis

18d91c5

ci: pass pi advisor prompt via stdin

c0d6f89

ci: skip pi analysis when credentials are unavailable

116fc53

ci: support generic pi advisor api key

f176032

ci: default pi advisor key to openrouter provider

118cf78

ci: preserve pi advisor artifacts on failure

c7af258

ci: configure pi advisor for nvidia inference key

01c4818

jyaunches force-pushed the ci/e2e-advisor-prototype branch from 8432f66 to 01c4818 Compare May 8, 2026 18:01

jyaunches added 15 commits May 8, 2026 14:08

ci: keep generated pi config out of artifacts

49b8919

ci: resolve pi model template relative to advisor script

2db6d27

ci: allow advisor workflow to analyze target PRs

69a4af8

ci: stage advisor assets before target PR checkout

f27dc39

ci: resolve advisor dependencies from staged workspace

d221891

ci: analyze target PR in separate checkout

4a7c3b9

ci: comment e2e advisor recommendations on PRs

37d0bda

ci: allow advisor comment to use repo write token

1cb57fb

ci: brand workflow as e2e advisor

03863fb

test: add e2e advisor config validation

f3202a7

ci: satisfy e2e advisor pre-push hooks

88ce405

ci: address e2e advisor review feedback

509214d

ci: make e2e advisor pi-first

5118e9a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: prototype E2E advisor#1

ci: prototype E2E advisor#1
jyaunches wants to merge 23 commits into
mainfrom
ci/e2e-advisor-prototype

jyaunches commented May 8, 2026

Uh oh!

github-actions Bot commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jyaunches commented May 8, 2026

Summary

Safety model

Outputs

Test

Uh oh!

github-actions Bot commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant