
test(e2e): add Hermes live Vitest migration [ANCHOR-5] #5256

Merged
cv merged 1 commit into main from e2e-migrate/test-hermes-e2e-simple-signed
Jun 11, 2026

@jyaunches jyaunches commented Jun 11, 2026

Contributor

Supersedes #5227 due to an unsigned historical commit that cannot be rewritten under branch rules. This branch has the identical final diff with verified signed history.

Summary

Migrate test/e2e/test-hermes-e2e.sh to simple live Vitest coverage.

Related Issues

Refs #5098

Contract mapping

  • Legacy assertion: non-interactive install selects and onboards Hermes via NEMOCLAW_AGENT=hermes.
    • Replacement: test/e2e-scenario/live/hermes-e2e.test.ts runs bash install.sh --non-interactive with Hermes env.
    • Boundary preserved: real installer shell, Docker/OpenShell, host PATH/install side effects.
  • Legacy assertion: Hermes sandbox exists, status works, session records agent=hermes, inference provider and policy are configured.
    • Replacement: live Vitest nemoclaw list/status, session JSON, openshell inference get, and openshell policy get --full assertions.
    • Boundary preserved: real nemoclaw and openshell commands.
  • Legacy assertion: Hermes health, binary, config/state directory, and optional dashboard respond from the sandbox/host.
    • Replacement: sandbox exec health/version/config probes plus optional dashboard registry/forward/HTTP checks.
    • Boundary preserved: real sandbox exec, HTTP, OpenShell forwards.
  • Legacy assertion: live NVIDIA Endpoints and sandbox inference.local chat return PONG.
    • Replacement: direct provider curl and sandbox curl https://inference.local/v1/chat/completions assertions.
    • Boundary preserved: real external provider call and sandbox routing path.
  • Legacy assertion: CLI logs and agent manifest loading still work.
    • Replacement: nemoclaw <sandbox> logs and bin/lib/agent-defs manifest checks.
    • Boundary preserved: real CLI and built repo module load.
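
The PONG assertions in the inference mapping above could be expressed with a small parsing helper. The following is a minimal sketch, not code from hermes-e2e.test.ts: the names firstChoiceContent and repliedPong are hypothetical, the response shape is assumed to be OpenAI-compatible because the path is /v1/chat/completions, and the case-insensitive match is a choice made for this sketch.

```typescript
// Hypothetical helpers; not taken from hermes-e2e.test.ts.
// Assumption: the body is OpenAI-compatible, i.e. { choices: [{ message: { content } }] }.
interface ChatCompletion {
  choices?: Array<{ message?: { content?: string | null } }>;
}

/** Returns the first choice's text, or null when the body is not a usable completion. */
function firstChoiceContent(rawBody: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return null; // e.g. curl captured an HTML error page or an empty body
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const content = (parsed as ChatCompletion).choices?.[0]?.message?.content;
  return typeof content === "string" ? content : null;
}

/** True when the reply text contains the word PONG (case-insensitive by assumption). */
function repliedPong(rawBody: string): boolean {
  const content = firstChoiceContent(rawBody);
  return content !== null && /\bPONG\b/i.test(content);
}
```

Applying the same helper to the direct provider response and to the sandbox inference.local response would keep the two paths comparable, since both are described as returning PONG.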

Simplicity check

  • Test shape: simple live Vitest test.
  • New shared helpers: none.
  • New framework/registry/ledger: none.
  • Workflow changes: adds a selective hermes-e2e free-standing Vitest job in e2e-vitest-scenarios.yaml. Deleting the legacy shell script and retiring the nightly shell run are deferred to Phase 11 of the epic "Migrate legacy bash E2E into the Vitest E2E system" (#5098).

Verification

  • npm ci --ignore-scripts
  • npm run build:cli
  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts test/e2e-scenario/support-tests/e2e-scenario-matrix.test.ts --silent=false --reporter=default
  • env -u NVIDIA_API_KEY NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/hermes-e2e.test.ts --silent=false --reporter=default
  • npx biome check test/e2e-scenario/live/hermes-e2e.test.ts .github/workflows/e2e-vitest-scenarios.yaml
  • git diff --check

Live validation

Selective hermes-e2e dispatch in E2E / Vitest Scenarios is pending after PR creation.

Summary by CodeRabbit

  • Tests

    • Added comprehensive end-to-end live test scenario for Hermes, including CLI validation, sandbox management, health checks, inference verification, and artifact collection.
  • Chores

    • Extended CI/CD workflow to support new Hermes E2E test job lane with proper job validation, matrix generation, and PR reporting integration.

coderabbitai Bot commented Jun 11, 2026

Contributor

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR adds a new hermes-e2e-vitest free-standing job to the E2E workflow that validates Hermes installation, sandbox health, dual-path inference (NVIDIA API and local endpoint), and agent manifest integrity. The workflow's matrix generation is extended with a hermes_selected output to gate the job, and workflow boundary validation is updated to enforce Hermes-specific environment, secret, and action pinning requirements.
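
The hermes_selected gate described above can be pictured as a simple membership test over the dispatch inputs. This is a rough sketch under stated assumptions, not the workflow's actual matrix-generation code: the function names are hypothetical, both selector names are accepted in either input, and Hermes is treated as selected only when named explicitly (the E2E advisor comment later in this thread states that Hermes should not be selected unless requested).

```typescript
// Hypothetical sketch; the real rule lives in the workflow's matrix-generation step.
const HERMES_SELECTORS: readonly string[] = ["hermes-e2e", "hermes-e2e-vitest"];

/** Splits a comma-separated workflow input into trimmed, non-empty names. */
function parseSelectorList(input: string): string[] {
  return input
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

/**
 * Computes the hermes_selected flag from the jobs and scenarios inputs.
 * Assumption: an empty selection never implies Hermes.
 */
function hermesSelected(jobsInput: string, scenariosInput: string): boolean {
  const requested = [...parseSelectorList(jobsInput), ...parseSelectorList(scenariosInput)];
  return requested.some((name) => HERMES_SELECTORS.includes(name));
}
```

Matching against a fixed allow-list, rather than interpolating the raw input into shell or expression text, also keeps the selection step away from the kind of unsafe interpolation the support tests guard against.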

Changes

Hermes E2E Vitest scenario

| Layer / File(s) | Summary |
| --- | --- |
| Hermes live E2E test implementation (test/e2e-scenario/live/hermes-e2e.test.ts) | Vitest scenario that installs Hermes, validates CLI tools and sandbox creation, checks health/config/logs, performs dual-path inference validation (NVIDIA API and inference.local), validates agent manifest display names, and handles optional cleanup with post-destroy registry verification. |
| Workflow matrix and job configuration (.github/workflows/e2e-vitest-scenarios.yaml) | Matrix generation detects hermes-e2e or hermes-e2e-vitest inputs and outputs a hermes_selected flag; the new hermes-e2e-vitest job is gated on that flag and runs the Hermes test with required environment variables (HERMES_MODEL, HERMES_AGENT, timeouts, artifact directory, CLI binary path) and secrets (NVIDIA_API_KEY); report-to-pr is updated to include the Hermes job in its dependencies. |
| Workflow boundary validator: Hermes validation rules (tools/e2e-scenarios/workflow-boundary.mts) | Adds hermes-e2e-vitest to the free-standing job registry; extends the validate-jobs step to enforce its presence; validates that matrix generation exposes the hermes_selected output and correctly handles Hermes selection logic; adds validateHermesE2EVitestJob to assert job wiring (runner, selector, needs), environment variables, forbidden secret exposure in job env, pinned actions (checkout/setup-node/upload), a Vitest run using NVIDIA_API_KEY from secrets, and artifact upload configuration. |
| Test coverage: Matrix dispatch and Hermes validation (test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts) | Adds test helpers to load and parse the workflow YAML and execute matrix generation with injected JOBS/SCENARIOS inputs; extends dispatch coverage with Hermes-specific assertions for hermes-e2e scenario selection; verifies that the hermes_selected output toggles correctly; updates boundary validator expectations to require Hermes job presence and the report-to-pr dependency; adds a regression test for unsafe variable interpolation in matrix output. |
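
Two of the boundary rules summarized above, no secrets in job-level env and pinned actions, lend themselves to a short illustration. The sketch below is hypothetical and is not the API of workflow-boundary.mts: the WorkflowJob and WorkflowStep shapes and the validateJobBoundary name are invented for the example, and "pinned" is assumed to mean a full 40-character commit SHA after the "@".

```typescript
// Hypothetical shapes and names; not the actual API of workflow-boundary.mts.
interface WorkflowStep {
  uses?: string;
  run?: string;
  env?: Record<string, string>;
}

interface WorkflowJob {
  env?: Record<string, string>;
  steps: WorkflowStep[];
}

// Assumption: "pinned" means the ref after "@" is a full 40-character commit SHA.
const FULL_SHA = /^[0-9a-f]{40}$/;

/** Returns a list of human-readable violations; an empty list means the job passes. */
function validateJobBoundary(job: WorkflowJob): string[] {
  const errors: string[] = [];

  // Secrets should be scoped to the step that needs them, not to the whole job.
  for (const [name, value] of Object.entries(job.env ?? {})) {
    if (value.includes("secrets.")) {
      errors.push(`job env ${name} exposes a secret`);
    }
  }

  for (const step of job.steps) {
    if (step.uses === undefined) continue; // plain run steps have no action ref
    if (step.uses.startsWith("./")) continue; // local actions carry no ref to pin
    const ref = step.uses.split("@")[1] ?? "";
    if (!FULL_SHA.test(ref)) {
      errors.push(`action ${step.uses} is not pinned to a commit SHA`);
    }
  }
  return errors;
}
```

Returning a list of violations instead of throwing on the first one lets a validator report every problem in a single run, which tends to be friendlier in CI logs.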

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Introduces the jobs input selector framework in e2e-vitest-scenarios.yaml and validate-jobs validation that this PR extends with the hermes-e2e-vitest free-standing job.
  • NVIDIA/NemoClaw#5236: Adds a similar free-standing Vitest job (token-rotation-vitest) to the same E2E workflow matrix and boundary validation infrastructure.
  • NVIDIA/NemoClaw#5152: Adds another free-standing Vitest job (onboard-negative-paths-vitest) using the same workflow boundary validator and test coverage patterns.

Suggested labels

area: ci, area: e2e

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 A Hermes hops through tests so grand,
With inference checks across the land,
The sandbox runs, the APIs call,
One workflow to validate them all!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: migrating a legacy shell E2E test to a Vitest test for Hermes functionality. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions

Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.
Consider writing more tests for:
  • Runtime validation: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts.

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Contributor

E2E Advisor Recommendation

Required E2E: hermes-e2e-vitest, openshell-version-pin-vitest
Optional E2E: network-policy-vitest, onboard-negative-paths-vitest

Dispatch hint: hermes-e2e-vitest,openshell-version-pin-vitest

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • hermes-e2e-vitest (high; live install/onboard/sandbox plus hosted inference, timeout 75 minutes): Required because this PR adds and wires the Hermes live E2E job. It validates the changed installer/onboarding, sandbox lifecycle, Hermes health, live NVIDIA endpoint access, and inference.local routing path introduced by the PR.
  • openshell-version-pin-vitest (low; hermetic installer-script behavioral test): Required as a low-cost workflow-dispatch boundary check alongside the Hermes selector changes: dispatching a non-Hermes free-standing job verifies jobs-only selection still works and that Hermes is not selected unless explicitly requested.

Optional E2E

  • network-policy-vitest (high; live OpenShell install/onboard and network allow/deny probes, timeout 90 minutes): Optional adjacent confidence because Hermes E2E relies on sandbox network policy and inference.local routing. Run if reviewers want a broader live security-boundary check after the workflow selector changes.
  • onboard-negative-paths-vitest (medium; CLI build and invalid-key onboarding path, timeout 15 minutes): Optional confidence for onboarding CLI error handling because this PR changes the free-standing job selector list and adds a new installer/onboard live path, but does not modify runtime onboarding source directly.

New E2E recommendations

  • hermes-dashboard-forwarding (medium): The new Hermes test contains dashboard assertions behind NEMOCLAW_E2E_HERMES_DASHBOARD/NEMOCLAW_HERMES_DASHBOARD, but the added CI job does not enable those variables. A dedicated dashboard-enabled Hermes E2E variant would cover dashboard port-forwarding and host/sandbox dashboard probes.
    • Suggested test: Add a Hermes dashboard-enabled Vitest E2E job or scenario that runs hermes-e2e.test.ts with NEMOCLAW_E2E_HERMES_DASHBOARD=1 and validates dashboard forwarding.
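
The dashboard gating mentioned in this recommendation might look like the following sketch. It is illustrative only: the helper names are hypothetical, and the accepted values are an assumption (the comment above only shows NEMOCLAW_E2E_HERMES_DASHBOARD=1, so treating "true" as enabled is a guess).

```typescript
// Hypothetical helpers; the real test may read these variables differently.
// Assumption: "1" and "true" (case-insensitive) switch a flag on.
function isEnabled(value: string | undefined): boolean {
  if (value === undefined) return false;
  const normalized = value.trim().toLowerCase();
  return normalized === "1" || normalized === "true";
}

/** True when either dashboard variable named in the recommendation is switched on. */
function hermesDashboardEnabled(env: Record<string, string | undefined>): boolean {
  return (
    isEnabled(env["NEMOCLAW_E2E_HERMES_DASHBOARD"]) ||
    isEnabled(env["NEMOCLAW_HERMES_DASHBOARD"])
  );
}
```

In a Vitest file, a predicate like this could feed test.skipIf(!hermesDashboardEnabled(process.env)) so the dashboard assertions only run in a job that sets one of the variables.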

Dispatch hint

  • Workflow: E2E / Vitest Scenarios
  • jobs input: hermes-e2e-vitest,openshell-version-pin-vitest
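
The dispatch hint above can be turned into a gh invocation. A minimal sketch, assuming the workflow exposes a string input named jobs (as the hint suggests) and that it lives in e2e-vitest-scenarios.yaml; buildDispatchArgs is a hypothetical helper, and -f is the gh workflow run flag for passing a raw string input.

```typescript
// Hypothetical helper that builds the argument list for `gh workflow run`.
// Assumption: the workflow accepts a comma-separated string input named "jobs".
function buildDispatchArgs(workflowFile: string, ref: string, jobs: string[]): string[] {
  if (jobs.length === 0) {
    throw new Error("at least one job name is required");
  }
  // Passing argv as an array avoids shell quoting issues with the comma-separated value.
  return ["workflow", "run", workflowFile, "--ref", ref, "-f", `jobs=${jobs.join(",")}`];
}

// Example: the two required jobs from the advisor, on a placeholder ref.
const exampleArgs = buildDispatchArgs("e2e-vitest-scenarios.yaml", "<pr-head-ref>", [
  "hermes-e2e-vitest",
  "openshell-version-pin-vitest",
]);
```

The resulting array could be handed to execFileSync("gh", exampleArgs) from node:child_process once the placeholder ref is replaced with the PR head branch.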

@github-actions

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: e2e-scenarios-all
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • e2e-scenarios-all: The PR changes the canonical Vitest scenario workflow, matrix/selector behavior, a live Vitest test entry, and workflow boundary support tests/tooling. Policy requires the all-scenarios fan-out for workflow and matrix emission changes.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref>

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/hermes-e2e.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

Labels

v0.0.64 Release target

3 participants