Skip to content

test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest [IND-2]#5221

Merged
jyaunches merged 7 commits into
mainfrom
e2e-migrate/test-model-router-provider-routed-simple
Jun 12, 2026
Merged

test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest [IND-2]#5221
jyaunches merged 7 commits into
mainfrom
e2e-migrate/test-model-router-provider-routed-simple

Conversation

@jyaunches

@jyaunches jyaunches commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrate test/e2e/test-model-router-provider-routed-inference.sh with focused live Vitest coverage.

Related Issues

Refs #5098
Refs #3255

Contract mapping

  • Legacy assertion: Docker is available before provider-routed onboarding.
    • Replacement: test/e2e-scenario/live/model-router-provider-routed-inference.test.ts checks docker info.
    • Boundary preserved: real host Docker CLI.
  • Legacy assertion: NVIDIA_API_KEY is present and nvapi--prefixed.
    • Replacement: live Vitest secret gate plus prefix assertion.
    • Boundary preserved: workflow secret/env boundary with redacted artifacts.
  • Legacy assertion: nemoclaw onboard --fresh succeeds with NEMOCLAW_PROVIDER=routed, NEMOCLAW_PROVIDER_KEY, and open policy tier.
    • Replacement: focused live Vitest test spawns node bin/nemoclaw.js onboard --fresh --non-interactive --yes-i-accept-third-party-software with the same routed env.
    • Boundary preserved: real CLI onboarding, OpenShell/Docker sandbox creation, cleanup via real nemoclaw <sandbox> destroy.
  • Legacy assertion: host Model Router health reports a healthy endpoint.
    • Replacement: polling curl http://127.0.0.1:4000/health and asserting healthy_count > 0.
    • Boundary preserved: real host HTTP model-router health endpoint.
  • Legacy assertion: sandbox inference.local returns a routed Model Router PONG completion.
    • Replacement: openshell sandbox exec ... curl https://inference.local/v1/chat/completions asserts model starts with nvidia-routed and content includes PONG.
    • Boundary preserved: real OpenShell sandbox exec, HTTPS inference.local route, provider-routed completion response.

Simplicity check

  • Test shape: focused live Vitest test.
  • New shared helpers: none; parsing/polling helpers are local to the test.
  • New framework/registry/ledger: none.
  • Workflow changes: add one opt-in e2e-vitest-scenarios.yaml job for this free-standing live test; legacy bash script and regression-e2e.yaml lane deletion are deferred to Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11.

Verification

  • NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/model-router-provider-routed-inference.test.ts --typecheck.enabled false (local skip: missing live prerequisites/secrets)
  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts --typecheck.enabled false
  • npm run build:cli
  • npm run typecheck:cli
  • npx @biomejs/biome lint test/e2e-scenario/live/model-router-provider-routed-inference.test.ts
  • git diff --check

Summary by CodeRabbit

  • Tests

    • Added a live end-to-end Vitest that validates a routed model-router inference flow (health checks, retries, response verification, diagnostic artifacts) and extended workflow-boundary tests for the new scenario ↔ job mapping and dispatch-matrix behavior.
  • Chores

    • CI updated with a new selectable E2E Vitest job (runs by default when applicable), optional retryable Docker Hub auth, isolated Docker config, artifact uploads with retention, and inclusion in PR results and workflow validations.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a Vitest live end-to-end test for provider-routed model-router inference, updates workflow selector/matrix/validator logic to register the new scenario/job, adds a gated CI job that optionally logs into Docker Hub (with retries), runs the targeted Vitest test, and includes the job in PR result reporting.

Changes

Provider-routed inference end-to-end scenario

Layer / File(s) Summary
Test constants, interfaces, and validation helpers
test/e2e-scenario/live/model-router-provider-routed-inference.test.ts
Defines repository/CLI/sandbox constants, timing/attempt limits, TypeScript interfaces, JSON parsing helpers, health readiness detection (healthy_count), and routed completion validation (model id and PONG).
Provider-routed environment and onboarding helper
test/e2e-scenario/live/model-router-provider-routed-inference.test.ts
Adds withProviderRoutedEnv to assemble NVIDIA_API_KEY and non-interactive/policy/provider environment variables for nemoclaw onboard --fresh.
Live e2e test implementation
test/e2e-scenario/live/model-router-provider-routed-inference.test.ts
Gated Vitest scenario: CLI/sandbox prerequisites, artifact recording, sandbox teardown, provider-routed onboarding, health polling (up to 20 attempts), routed inference completion retries (up to 3), JSON parsing and assertions (model prefix and PONG), early-exit handling, and scenario-result.json output.
Support tests: selector and matrix expectations
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
Adds selector evaluation and matrix-generation test cases covering model-router-provider-routed-inference mapping to the *-vitest job and asserting hermes_selected: "false" with an empty generated matrix.
Workflow inputs, matrix wiring, and validation
.github/workflows/e2e-vitest-scenarios.yaml
Extends the workflow allowlist and matrix generation to include model-router-provider-routed-inference/model-router-provider-routed-inference-vitest and updates validation of inputs.jobs.
CI job: model-router-provider-routed-inference-vitest
.github/workflows/e2e-vitest-scenarios.yaml
Adds a new job gated by selectors that conditionally logs into Docker Hub (3 attempts when creds exist), installs deps, builds the CLI, runs model-router-provider-routed-inference.test.ts, and uploads artifacts with 14-day retention; also added to report-to-pr.needs.
Workflow-boundary validator updates
tools/e2e-scenarios/workflow-boundary.mts
Registers the new scenario/job in FREE_STANDING_SCENARIO_JOBS, updates validate-jobs/generate-matrix script checks, validates the free-standing job selector, and adds the job to the report-to-pr dependency list.

Sequence Diagram(s)

sequenceDiagram
  participant Test
  participant CLI as "nemoclaw CLI"
  participant Sandbox
  participant HealthAPI as "127.0.0.1:4000/health"
  participant InferenceAPI as "sandbox /v1/chat/completions"

  Test->>CLI: verify CLI exists
  Test->>Sandbox: destroy existing sandbox
  Test->>CLI: nemoclaw onboard --fresh (NEMOCLAW_PROVIDER=routed)
  Test->>HealthAPI: poll /health (up to 20)
  HealthAPI->>Test: healthy_count >= 1
  Test->>InferenceAPI: POST completion (up to 3 attempts)
  InferenceAPI->>Test: completion JSON (model, choices)
  Test->>Test: assert model prefix and PONG in choices
  Test->>Test: write scenario-result.json
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

area: e2e

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 A routed inference hop and spring,
Onboard commands make tiny hearts sing,
Health checks counted, PONG returns true,
CI bundles artifacts for me and you,
Rabbit hops on code — tests dancing through!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and specifically describes the main change: migrating a legacy bash E2E test to Vitest format for the model-router provider-routed inference scenario.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch e2e-migrate/test-model-router-provider-routed-simple

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: model-router-provider-routed-inference-vitest
Optional E2E: inference-routing-vitest

Dispatch hint: model-router-provider-routed-inference-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • model-router-provider-routed-inference-vitest (medium-high; live Docker sandbox plus NVIDIA API inference, 45 minute timeout): Validates the newly added workflow job and live test for provider-routed Model Router inference, including CLI build, onboard --fresh, Docker-backed sandbox lifecycle, Model Router health, and inference.local PONG completion.

Optional E2E

  • inference-routing-vitest (medium; live Vitest inference routing job): Adjacent confidence check for existing inference-routing behavior, useful if reviewers want to compare the new provider-routed Model Router lane against the established routing E2E path.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: model-router-provider-routed-inference-vitest

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: model-router-provider-routed-inference-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=model-router-provider-routed-inference-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • model-router-provider-routed-inference-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/model-router-provider-routed-inference.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=model-router-provider-routed-inference-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/model-router-provider-routed-inference.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.github/workflows/e2e-vitest-scenarios.yaml (1)

266-290: 💤 Low value

Optional: Consider extracting Docker Hub authentication if pattern grows.

The Docker Hub authentication block (lines 266-290) is identical to the one in openclaw-tui-chat-correlation-vitest (lines 344-368). While GitHub Actions lacks native step-reuse between jobs, if more e2e jobs adopt this authentication pattern, consider extracting it to a composite action under .github/actions/ to reduce maintenance burden.

For now, with only two call sites, the duplication is acceptable.

[optional_refactor]

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 266 - 290, The
repeated "Authenticate to Docker Hub" step that uses DOCKERHUB_USERNAME and
DOCKERHUB_TOKEN is duplicated across workflows; extract this shell block (the
for-loop login with timeout and retry logic) into a reusable composite GitHub
Action under .github/actions/docker-hub-auth (or similar) and replace the inline
step in the workflows with a single call to that composite action, passing
secrets.DOCKERHUB_USERNAME and secrets.DOCKERHUB_TOKEN as inputs so both jobs
(e.g., the step named "Authenticate to Docker Hub" in this workflow and the
identical step in openclaw-tui-chat-correlation-vitest) reuse the same
implementation and reduce duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 266-290: The repeated "Authenticate to Docker Hub" step that uses
DOCKERHUB_USERNAME and DOCKERHUB_TOKEN is duplicated across workflows; extract
this shell block (the for-loop login with timeout and retry logic) into a
reusable composite GitHub Action under .github/actions/docker-hub-auth (or
similar) and replace the inline step in the workflows with a single call to that
composite action, passing secrets.DOCKERHUB_USERNAME and secrets.DOCKERHUB_TOKEN
as inputs so both jobs (e.g., the step named "Authenticate to Docker Hub" in
this workflow and the identical step in openclaw-tui-chat-correlation-vitest)
reuse the same implementation and reduce duplication.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 129bb8ae-1fa4-42f4-b6d2-74b15e8c79c4

📥 Commits

Reviewing files that changed from the base of the PR and between 6622476 and 45eaea8.

📒 Files selected for processing (2)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/model-router-provider-routed-inference.test.ts

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 0 needs attention, 5 worth checking, 0 nice ideas
Since last review: 3 prior items resolved, 3 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: Docker Hub login retry/fallback in model-router-provider-routed-inference-vitest: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The Authenticate to Docker Hub step retries docker login three times, continues with anonymous pulls on failure, writes auth under DOCKER_CONFIG, and cleans up after the test.
  • Source-of-truth review needed: Detailed workflow boundary source-of-truth for model-router-provider-routed-inference-vitest: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: validateModelRouterProviderRoutedInferenceVitestJob enforces the intended job shape, but support tests added in the diff only verify selector and matrix behavior for the new job.
  • sandbox.exec still loses OPENSHELL_GATEWAY through the fixture env filter (test/e2e-scenario/live/model-router-provider-routed-inference.test.ts:203): The workflow now sets OPENSHELL_GATEWAY=nemoclaw, but this sandbox exec call passes env: buildAvailabilityProbeEnv(). That helper builds a filtered child env and does not allowlist OPENSHELL_GATEWAY, while SandboxClient.exec does not inject the openshellProbeEnv default used by specialized recovery helpers. The test can still fail with the same no-active-gateway path identified in the previous review.
    • Recommendation: Preserve OPENSHELL_GATEWAY for this exec path, either by routing SandboxClient.exec through the same default OpenShell env helper or by explicitly passing an env that includes OPENSHELL_GATEWAY=nemoclaw after fixture filtering. Add a support test that proves sandbox.exec receives the gateway env.
    • Evidence: The new test calls sandbox.exec(..., { env: buildAvailabilityProbeEnv(), ... }); buildAvailabilityProbeEnv only allows Docker-related extra env keys, and fixtures/clients/sandbox.ts only injects OPENSHELL_GATEWAY in specialized helpers, not in exec().
  • Docker Hub auth remains reachable by branch-controlled build and test code (.github/workflows/e2e-vitest-scenarios.yaml:1131): The new job improves the previous Docker credential handling by isolating DOCKER_CONFIG under runner.temp and cleaning it up in an always() step. However, DOCKER_CONFIG is job-level and docker login happens before npm ci, npm run build:cli, and the checked-out Vitest test. Branch-controlled code in those later steps can still read or use the Docker credential file until cleanup. The retry/fallback behavior also continues with anonymous pulls without documenting the invalid state, source boundary, source-fix constraint, regression test, or removal condition.
    • Recommendation: Minimize the lifetime and scope of Docker Hub auth as much as the scenario allows: use a pull-only credential, move login as close as possible to the operation that needs it, unset/remove DOCKER_CONFIG before any step that no longer needs it, and add boundary tests that require the isolation and cleanup policy. Also document the fallback source-of-truth: why anonymous-pull fallback is necessary, when it can be removed, and what regression test covers it.
    • Evidence: The job sets DOCKER_CONFIG in job env, logs in during Authenticate to Docker Hub, then runs Install root dependencies, Build CLI, and Run Model Router provider-routed inference live test before Clean up Docker auth.
  • Model Router workflow boundary validator lacks focused mutation tests (test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts:69): A dedicated validateModelRouterProviderRoutedInferenceVitestJob function now exists, which addresses the previous generic-validator gap at the implementation level. The support tests added in this PR cover selector and matrix behavior, but I did not find focused negative tests mutating the new secret-bearing job to prove the validator rejects drift in action pinning, credential placement, Docker auth isolation/cleanup, artifact policy, or exact build/test commands.
    • Recommendation: Add targeted workflow-boundary tests that mutate model-router-provider-routed-inference-vitest and assert the validator rejects each high-risk drift case: unpinned actions, checkout persist-credentials true, NVIDIA_API_KEY outside the Vitest step, Docker Hub secrets outside auth, missing DOCKER_CONFIG isolation or cleanup, missing npm ci --ignore-scripts/build, changed test path, and broad artifact uploads.
    • Evidence: The diff adds positive selector/matrix assertions for model-router-provider-routed-inference, while the existing targeted negative tests in this file cover runtime-overrides and Hermes root-entrypoint rather than the new model-router job.

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation** — workflow-boundary rejects model-router-provider-routed-inference-vitest when checkout, setup-node, or upload-artifact is not pinned to a full SHA or checkout persist-credentials is true. The live scenario itself exercises real CLI, Docker, Model Router health, and sandbox inference boundaries, but the high-risk workflow enforcement changes need deterministic negative validation beyond positive selector tests.
  • **Runtime validation** — workflow-boundary rejects model-router-provider-routed-inference-vitest when NVIDIA_API_KEY is set at job level or any step except Run Model Router provider-routed inference live test. The live scenario itself exercises real CLI, Docker, Model Router health, and sandbox inference boundaries, but the high-risk workflow enforcement changes need deterministic negative validation beyond positive selector tests.
  • **Runtime validation** — workflow-boundary rejects model-router-provider-routed-inference-vitest when Docker Hub secrets are exposed outside Authenticate to Docker Hub. The live scenario itself exercises real CLI, Docker, Model Router health, and sandbox inference boundaries, but the high-risk workflow enforcement changes need deterministic negative validation beyond positive selector tests.
  • **Runtime validation** — workflow-boundary rejects model-router-provider-routed-inference-vitest when DOCKER_CONFIG isolation or always() Docker auth cleanup is missing. The live scenario itself exercises real CLI, Docker, Model Router health, and sandbox inference boundaries, but the high-risk workflow enforcement changes need deterministic negative validation beyond positive selector tests.
  • **Runtime validation** — workflow-boundary rejects model-router-provider-routed-inference-vitest when npm ci --ignore-scripts, npm run build:cli, or the exact model-router Vitest test path is changed. The live scenario itself exercises real CLI, Docker, Model Router health, and sandbox inference boundaries, but the high-risk workflow enforcement changes need deterministic negative validation beyond positive selector tests.
  • **Model Router workflow boundary validator lacks focused mutation tests** — Add targeted workflow-boundary tests that mutate model-router-provider-routed-inference-vitest and assert the validator rejects each high-risk drift case: unpinned actions, checkout persist-credentials true, NVIDIA_API_KEY outside the Vitest step, Docker Hub secrets outside auth, missing DOCKER_CONFIG isolation or cleanup, missing npm ci --ignore-scripts/build, changed test path, and broad artifact uploads.
  • **Acceptance clause:** Refs Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 — add test evidence or identify existing coverage. No literal issue Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 clauses or comments were available in deterministic context. The diff is additive and retains the legacy bash script, matching the PR body's statement that deletion is deferred.
  • **Acceptance clause:** Refs [Brev][Onboard] Model Router (Provider Routed) inference broken — TUI returns HTTP 503 after successful onboard #3255 — add test evidence or identify existing coverage. The new Vitest test asserts Model Router health and routed inference.local completion, and failure messages reference expected [Brev][Onboard] Model Router (Provider Routed) inference broken — TUI returns HTTP 503 after successful onboard #3255 main-equivalent failures. No literal issue [Brev][Onboard] Model Router (Provider Routed) inference broken — TUI returns HTTP 503 after successful onboard #3255 clauses were available to verify complete acceptance.
Since last review details

Current findings:

  • Source-of-truth review needed: Docker Hub login retry/fallback in model-router-provider-routed-inference-vitest: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The Authenticate to Docker Hub step retries docker login three times, continues with anonymous pulls on failure, writes auth under DOCKER_CONFIG, and cleans up after the test.
  • Source-of-truth review needed: Detailed workflow boundary source-of-truth for model-router-provider-routed-inference-vitest: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: validateModelRouterProviderRoutedInferenceVitestJob enforces the intended job shape, but support tests added in the diff only verify selector and matrix behavior for the new job.
  • sandbox.exec still loses OPENSHELL_GATEWAY through the fixture env filter (test/e2e-scenario/live/model-router-provider-routed-inference.test.ts:203): The workflow now sets OPENSHELL_GATEWAY=nemoclaw, but this sandbox exec call passes env: buildAvailabilityProbeEnv(). That helper builds a filtered child env and does not allowlist OPENSHELL_GATEWAY, while SandboxClient.exec does not inject the openshellProbeEnv default used by specialized recovery helpers. The test can still fail with the same no-active-gateway path identified in the previous review.
    • Recommendation: Preserve OPENSHELL_GATEWAY for this exec path, either by routing SandboxClient.exec through the same default OpenShell env helper or by explicitly passing an env that includes OPENSHELL_GATEWAY=nemoclaw after fixture filtering. Add a support test that proves sandbox.exec receives the gateway env.
    • Evidence: The new test calls sandbox.exec(..., { env: buildAvailabilityProbeEnv(), ... }); buildAvailabilityProbeEnv only allows Docker-related extra env keys, and fixtures/clients/sandbox.ts only injects OPENSHELL_GATEWAY in specialized helpers, not in exec().
  • Docker Hub auth remains reachable by branch-controlled build and test code (.github/workflows/e2e-vitest-scenarios.yaml:1131): The new job improves the previous Docker credential handling by isolating DOCKER_CONFIG under runner.temp and cleaning it up in an always() step. However, DOCKER_CONFIG is job-level and docker login happens before npm ci, npm run build:cli, and the checked-out Vitest test. Branch-controlled code in those later steps can still read or use the Docker credential file until cleanup. The retry/fallback behavior also continues with anonymous pulls without documenting the invalid state, source boundary, source-fix constraint, regression test, or removal condition.
    • Recommendation: Minimize the lifetime and scope of Docker Hub auth as much as the scenario allows: use a pull-only credential, move login as close as possible to the operation that needs it, unset/remove DOCKER_CONFIG before any step that no longer needs it, and add boundary tests that require the isolation and cleanup policy. Also document the fallback source-of-truth: why anonymous-pull fallback is necessary, when it can be removed, and what regression test covers it.
    • Evidence: The job sets DOCKER_CONFIG in job env, logs in during Authenticate to Docker Hub, then runs Install root dependencies, Build CLI, and Run Model Router provider-routed inference live test before Clean up Docker auth.
  • Model Router workflow boundary validator lacks focused mutation tests (test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts:69): A dedicated validateModelRouterProviderRoutedInferenceVitestJob function now exists, which addresses the previous generic-validator gap at the implementation level. The support tests added in this PR cover selector and matrix behavior, but I did not find focused negative tests mutating the new secret-bearing job to prove the validator rejects drift in action pinning, credential placement, Docker auth isolation/cleanup, artifact policy, or exact build/test commands.
    • Recommendation: Add targeted workflow-boundary tests that mutate model-router-provider-routed-inference-vitest and assert the validator rejects each high-risk drift case: unpinned actions, checkout persist-credentials true, NVIDIA_API_KEY outside the Vitest step, Docker Hub secrets outside auth, missing DOCKER_CONFIG isolation or cleanup, missing npm ci --ignore-scripts/build, changed test path, and broad artifact uploads.
    • Evidence: The diff adds positive selector/matrix assertions for model-router-provider-routed-inference, while the existing targeted negative tests in this file cover runtime-overrides and Hermes root-entrypoint rather than the new model-router job.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@jyaunches jyaunches changed the title test(e2e): migrate model router guard Migrate test-model-router-provider-routed-inference.sh to vitest Jun 11, 2026
@jyaunches jyaunches changed the title Migrate test-model-router-provider-routed-inference.sh to vitest test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest test(e2e): migrate model router guard Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): migrate model router guard test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest test(e2e): P2 independent 2 migrate test-model-router-provider-routed-inference.sh to vitest Jun 11, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
.github/workflows/e2e-vitest-scenarios.yaml (2)

50-64: 💤 Low value

JOBS env variable set but not passed to matrix generator.

The JOBS environment variable is set at line 50 and validated at lines 61-64, but is not passed to the matrix generation script at line 65. This is intentional—the inputs.jobs filter controls which workflow jobs execute (via job-level if conditions), not which scenarios appear in the matrix. However, this may not be immediately obvious to maintainers.

📝 Optional: Add clarifying comment
       - id: matrix
         name: Generate Vitest scenario matrix
         env:
           SCENARIOS: ${{ inputs.scenarios }}
+          # JOBS is validated here but not passed to the matrix generator;
+          # it controls job execution via workflow-level if conditions.
           JOBS: ${{ inputs.jobs }}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 50 - 64, Add a
short clarifying comment above the JOBS validation block explaining that JOBS
(from inputs.jobs) is validated to control which workflow jobs run via job-level
if conditions and is intentionally not added to the matrix generation args
(args+=(--scenarios ...)), so maintainers understand that matrix generation vs
job filtering are separate concerns; reference the JOBS env variable,
inputs.jobs, the args array, and the matrix generation invocation in the comment
so it's clear why JOBS isn’t appended to args.

279-304: ⚖️ Poor tradeoff

Docker Hub authentication logic duplicated across jobs.

The retry logic at lines 279-304 (in model-router-provider-routed-inference-vitest) and lines 357-382 (in openclaw-tui-chat-correlation-vitest) is identical. While the logic is correct, this duplication could be eliminated by extracting it into a reusable composite action.

This can be deferred to a future refactor focused on workflow helpers.

Also applies to: 357-382

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 279 - 304,
Duplicate "Authenticate to Docker Hub" step appears in the
model-router-provider-routed-inference-vitest and
openclaw-tui-chat-correlation-vitest jobs; extract this retry/login logic into a
reusable composite GitHub Action (e.g., actions/docker-hub-login-composite) that
accepts inputs DOCKERHUB_USERNAME and DOCKERHUB_TOKEN and implements the same
set -euo pipefail + retry loop + timeout behavior, then replace the inline step
in both jobs with a single uses: ./path/to/composite action invocation passing
the secrets as inputs to remove duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 50-64: Add a short clarifying comment above the JOBS validation
block explaining that JOBS (from inputs.jobs) is validated to control which
workflow jobs run via job-level if conditions and is intentionally not added to
the matrix generation args (args+=(--scenarios ...)), so maintainers understand
that matrix generation vs job filtering are separate concerns; reference the
JOBS env variable, inputs.jobs, the args array, and the matrix generation
invocation in the comment so it's clear why JOBS isn’t appended to args.
- Around line 279-304: Duplicate "Authenticate to Docker Hub" step appears in
the model-router-provider-routed-inference-vitest and
openclaw-tui-chat-correlation-vitest jobs; extract this retry/login logic into a
reusable composite GitHub Action (e.g., actions/docker-hub-login-composite) that
accepts inputs DOCKERHUB_USERNAME and DOCKERHUB_TOKEN and implements the same
set -euo pipefail + retry loop + timeout behavior, then replace the inline step
in both jobs with a single uses: ./path/to/composite action invocation passing
the secrets as inputs to remove duplication.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ed446ae5-7576-43dd-948c-687999e63ef3

📥 Commits

Reviewing files that changed from the base of the PR and between 45eaea8 and 1a3fa6c.

📒 Files selected for processing (1)
  • .github/workflows/e2e-vitest-scenarios.yaml

@jyaunches jyaunches changed the title test(e2e): P2 independent 2 migrate test-model-router-provider-routed-inference.sh to vitest test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest Jun 11, 2026
@jyaunches jyaunches changed the title test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest test(e2e): migrate test-model-router-provider-routed-inference.sh to vitest [IND-2] Jun 11, 2026
@cv cv added the v0.0.64 Release target label Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27377303092
Workflow ref: e2e-migrate/test-model-router-provider-routed-simple
Requested scenarios: (default — all supported)
Requested jobs: model-router-provider-routed-inference-vitest
Summary: 3 passed, 0 failed, 5 skipped

Job Result
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ✅ success
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
validate-jobs ✅ success

@cv cv added v0.0.65 Release target and removed v0.0.64 Release target labels Jun 12, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 1125: Update the job's if condition so it also checks inputs.scenarios
(like the working inference-routing-vitest job) to handle
free_standing_scenarios; specifically, wherever the current condition uses
(inputs.jobs == '' && inputs.scenarios == '') || contains(format(',{0},',
inputs.jobs), ',model-router-provider-routed-inference-vitest,'), mirror that
logic to also test contains(format(',{0},', inputs.scenarios),
',model-router-provider-routed-inference,') (or the exact scenario name used in
free_standing_scenarios) so the job runs when selected via inputs.scenarios as
well as inputs.jobs.
- Line 43: The workflow dispatch/validator is missing the new scenario IDs:
update tools/e2e-scenarios/workflow-boundary.mts to include both
"model-router-provider-routed-inference-vitest" and
"model-router-provider-routed-inference" in the allowed/validated job/scenario
lists, and modify .github/workflows/e2e-vitest-scenarios.yaml so the job
"model-router-provider-routed-inference-vitest" checks both inputs.jobs and
inputs.scenarios (not only inputs.jobs) — specifically adjust the job's if
condition that currently gates on inputs.jobs and ensure free_standing_scenarios
includes the matching id so selecting via inputs.scenarios will dispatch the
job.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 465cc85d-c2fd-438b-b0b9-068c7ca6f987

📥 Commits

Reviewing files that changed from the base of the PR and between 1a3fa6c and c9865f7.

📒 Files selected for processing (1)
  • .github/workflows/e2e-vitest-scenarios.yaml

Comment thread .github/workflows/e2e-vitest-scenarios.yaml
Comment thread .github/workflows/e2e-vitest-scenarios.yaml Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 1129: The job-level env sets DOCKER_CONFIG to ${{ runner.temp
}}/docker-config-model-router-provider-routed-inference but the runner context
isn't available at job-level; move the DOCKER_CONFIG setting out of the
jobs.<job_id>.env and instead add an env DOCKER_CONFIG with the same value into
each step that needs it (for example the "Authenticate to Docker Hub" and "Clean
up Docker auth" steps), or replace the job-level value with a job-safe path such
as ${{ github.workspace }}/.docker-config-model-router-provider-routed-inference
if you must keep it at job-level; update references to DOCKER_CONFIG
accordingly.

In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 1299-1307: The validator currently checks job-level env via
jobEnv.DOCKER_CONFIG against the runner.temp pattern (the block reading const
jobEnv = asRecord(job.env) and the DOCKER_CONFIG comparison); update this to
stop asserting a job-level runner.temp usage — either remove this job-level
DOCKER_CONFIG check or change it to inspect step-level envs (e.g., iterate
job.steps and check each step.env for DOCKER_CONFIG) and add corresponding
assertions in the Docker login and cleanup step validators (the checks around
the Docker auth and cleanup step validations) to ensure DOCKER_CONFIG is present
on those specific steps instead of at the job level.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 23b8d20f-4255-4c67-8f1e-9d6382e5dfa4

📥 Commits

Reviewing files that changed from the base of the PR and between 710256a and 88f771f.

📒 Files selected for processing (2)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • tools/e2e-scenarios/workflow-boundary.mts

runs-on: ubuntu-latest
timeout-minutes: 45
env:
DOCKER_CONFIG: ${{ runner.temp }}/docker-config-model-router-provider-routed-inference

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

runner context is not available at job-level env; workflow will fail.

The runner context (including runner.temp) is only available within step execution, not in the jobs.<job_id>.env block. Per GitHub Actions context availability rules, job-level env only supports: github, needs, strategy, matrix, secrets, inputs, vars.

This will cause the workflow to error when the job starts.

🐛 Proposed fix: move DOCKER_CONFIG to step-level env

Remove DOCKER_CONFIG from job-level env and set it in each step that needs it:

     env:
-      DOCKER_CONFIG: ${{ runner.temp }}/docker-config-model-router-provider-routed-inference
       E2E_ARTIFACT_DIR: ${{ github.workspace }}/e2e-artifacts/vitest/model-router-provider-routed-inference

Then add to the "Authenticate to Docker Hub" step and "Clean up Docker auth" step:

       - name: Authenticate to Docker Hub
         env:
           DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
           DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
+          DOCKER_CONFIG: ${{ runner.temp }}/docker-config-model-router-provider-routed-inference
         shell: bash
         run: |
       - name: Clean up Docker auth
         if: always()
+        env:
+          DOCKER_CONFIG: ${{ runner.temp }}/docker-config-model-router-provider-routed-inference
         run: |

Alternatively, you can use a path that doesn't require runner.temp, such as ${{ github.workspace }}/.docker-config-model-router-provider-routed-inference, which is available at job-level.

🧰 Tools
🪛 actionlint (1.7.12)

[error] 1129-1129: context "runner" is not allowed here. available contexts are "github", "inputs", "matrix", "needs", "secrets", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml at line 1129, The job-level env
sets DOCKER_CONFIG to ${{ runner.temp
}}/docker-config-model-router-provider-routed-inference but the runner context
isn't available at job-level; move the DOCKER_CONFIG setting out of the
jobs.<job_id>.env and instead add an env DOCKER_CONFIG with the same value into
each step that needs it (for example the "Authenticate to Docker Hub" and "Clean
up Docker auth" steps), or replace the job-level value with a job-safe path such
as ${{ github.workspace }}/.docker-config-model-router-provider-routed-inference
if you must keep it at job-level; update references to DOCKER_CONFIG
accordingly.

Source: Linters/SAST tools

Comment on lines +1299 to +1307
const jobEnv = asRecord(job.env);
if (
jobEnv.DOCKER_CONFIG !==
"${{ runner.temp }}/docker-config-model-router-provider-routed-inference"
) {
errors.push(
"model-router-provider-routed-inference-vitest job must isolate Docker auth with DOCKER_CONFIG under runner.temp",
);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validator expects invalid job-level runner.temp usage.

This validation enforces the exact pattern that actionlint flagged as invalid in the workflow (runner context unavailable at job-level env). When the workflow is fixed by moving DOCKER_CONFIG to step-level env, this validator block should be updated accordingly.

♻️ Suggested adjustment after workflow fix

If the workflow fix moves DOCKER_CONFIG to step-level env (in Docker auth and cleanup steps), update this validator to either remove the job-level check or validate step-level env instead:

-  if (
-    jobEnv.DOCKER_CONFIG !==
-    "${{ runner.temp }}/docker-config-model-router-provider-routed-inference"
-  ) {
-    errors.push(
-      "model-router-provider-routed-inference-vitest job must isolate Docker auth with DOCKER_CONFIG under runner.temp",
-    );
-  }
+  // DOCKER_CONFIG is set at step-level (Docker auth and cleanup steps) where runner context is available

Then add validation in the Docker login and cleanup step checks (around lines 1367-1383 and 1456-1461) to verify DOCKER_CONFIG is set in those step-level envs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 1299 - 1307, The
validator currently checks job-level env via jobEnv.DOCKER_CONFIG against the
runner.temp pattern (the block reading const jobEnv = asRecord(job.env) and the
DOCKER_CONFIG comparison); update this to stop asserting a job-level runner.temp
usage — either remove this job-level DOCKER_CONFIG check or change it to inspect
step-level envs (e.g., iterate job.steps and check each step.env for
DOCKER_CONFIG) and add corresponding assertions in the Docker login and cleanup
step validators (the checks around the Docker auth and cleanup step validations)
to ensure DOCKER_CONFIG is present on those specific steps instead of at the job
level.

@jyaunches jyaunches merged commit 561bd2f into main Jun 12, 2026
42 checks passed
@jyaunches jyaunches deleted the e2e-migrate/test-model-router-provider-routed-simple branch June 12, 2026 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance v0.0.65 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants