ci: prototype E2E advisor#1
Open
jyaunches wants to merge 23 commits into
Open
Conversation
|
🚀 Docs preview ready! |
8432f66 to
01c4818
Compare
Addresses the two minor items from the PR NVIDIA#3289 review: - tools/e2e-advisor/comment.mjs: move findExistingComment() inside the try/catch so a 403 from listing existing comments is treated as a permission error and skipped, matching the PATCH/POST error handling. Also coerce non-Error throwables to strings before logging. - tools/e2e-advisor/pi-analyze.mjs: reject non-object Pi output (null, arrays, primitives) explicitly in normalizePiResult instead of letting property access on a non-object silently produce an empty result. Refs: PR NVIDIA#3289 review by @ericksoa
Restructures the e2e-advisor workflow to fix the trusted-code boundary violation flagged in PR NVIDIA#3289 review. Before: the workflow checked out the PR workspace and then invoked tools/e2e-advisor/pi-analyze.mjs and comment.mjs *from that same workspace* while API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, PI_E2E_ADVISOR_API_KEY, etc.) and an issues-write token were in scope. A future same-repo PR could have modified the advisor scripts themselves and executed arbitrary Node in a secrets-bearing job. After: - Advisor code (pi-analyze.mjs, comment.mjs, schema.json, pi-models.template.json) is always checked out from NVIDIA/NemoClaw main into $GITHUB_WORKSPACE/advisor and is the only code executed in the job. Updates to the advisor flow through main review. - PR content (or dispatch ref, or cloned target PR) is checked out as inert analysis data into $GITHUB_WORKSPACE/pr-workdir. The advisor scripts read it via git diff / fs.readFileSync; nothing from this directory is executed as code. - Both checkouts use persist-credentials: false. - @mariozechner/pi-coding-agent is pinned via PI_AGENT_VERSION (0.73.1) so a compromised upstream release cannot execute automatically; upgrades go through a normal code-review change to this file. - Removed the top-level 'npm ci' install step — the advisor scripts use only Node built-ins, so the previous install brought the full repo devDependency tree into the secret-bearing job for no benefit. Refs: PR NVIDIA#3289 review by @ericksoa; CodeRabbit comment on pin.
jyaunches
added a commit
that referenced
this pull request
May 11, 2026
Lands shared fixtures, helpers, assertion helpers, install-method splits, conventions + lint, and the parity-compare CI harness that unblock the per-wave migration phases (2\u201312). Deliverables (per specs/2026-05-11_e2e-test-migration/spec.md Phase 1): Fixtures (test/e2e/lib/fixtures/): - fake-openai.sh: local OpenAI-compatible endpoint (Risk NVIDIA#2 mitigation) - fake-{telegram,discord,slack}.sh: messaging stubs via shared _fake-http-stub.sh harness - older-base-image.sh: tagged ghcr base-image Dockerfile generator Helpers (test/e2e/lib/): - logging.sh: canonical e2e_{section,info,pass,fail} with stable PASS:/FAIL:/=== Phase markers (absorbs reuse category #1) - sandbox-exec.sh: canonical nemoclaw-shell wrapper with safe quoting, exit-code propagation, and dry-run short-circuit (category NVIDIA#10) - env.sh: auto-sources logging.sh so every consumer gets it for free Assertion helpers (test/e2e/lib/assert/): - inference-works.sh: chat-completion round-trip - no-credentials-leaked.sh: credential-pattern scan - policy-preset-applied.sh: gateway policy preset verification - messaging-bridge-reachable.sh: L7 proxy / bridge reachability Install dispatcher splits (test/e2e/lib/setup/): - install-{repo,curl,ollama,launchable}.sh (four profiles) - install.sh: dispatcher routes by profile/method name (category NVIDIA#5) Runtime probe wiring: - run-scenario.sh: adds --validate-only flag (probe-only, no setup) - resolver/index.ts: E2E_PROBE_OVERRIDES_JSON escape hatch for keys with embedded underscores (e.g. security.policy_engine) Convention lint + parity harness: - scripts/e2e/lint-conventions.ts: enforces 6 conventions on suite step scripts + requires parity-map.yaml entries for legacy scripts - scripts/e2e/compare-parity.sh: diffs legacy vs scenario PASS/FAIL via parity-map.yaml; flaky: true marker supported (Risk NVIDIA#4) - test/e2e/parity-map.yaml: seeded with one entry per existing legacy script; migration phases 2\u201312 append assertion mappings - .github/workflows/e2e-parity-compare.yaml: dispatches legacy script + migrated scenario on same runner and diffs outcomes Tests (all passing, 41 total): - test/e2e-lib-helpers.test.ts: +18 tests (1.A\u20131.E) - test/e2e-convention-lint.test.ts: +11 tests (1.G\u20131.H) - test/e2e-expected-state-validator.test.ts: +2 tests (1.F) No regressions: full cli Vitest project (3258 tests) still green.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Safety model
Outputs
Test