test(e2e): migrate OpenClaw inference switch scenario #5357

Merged
cv merged 7 commits into main from codex/5098-openclaw-inference-switch on Jun 13, 2026

@cv cv commented Jun 12, 2026

Collaborator

Summary

Migrates test/e2e/test-openclaw-inference-switch.sh to a focused live Vitest scenario while preserving the real install.sh/OpenShell/Docker/inference switch boundary. Adds manual workflow dispatch wiring so the free-standing scenario can run through e2e-vitest-scenarios.

Related Issue

Fixes #5098

Changes

  • Added test/e2e-scenario/live/openclaw-inference-switch.test.ts to cover install, inference route switching, OpenClaw config/hash, registry/session state, inference.local, and OpenClaw agent PONG checks.
  • Registered openclaw-inference-switch in the free-standing Vitest scenario inventory and workflow dispatch job.
  • Extended workflow selector tests for the new scenario/job mapping and gave the free-standing matrix guard a larger timeout.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

Summary by CodeRabbit

  • Tests
    • Added a live end-to-end scenario that validates switching inference provider/model, confirms runtime behavior (including a “PONG” response) and saved artifacts, supports optional local mock provider, and includes robust retries, skips, and cleanup.
  • Chores
    • Added a dedicated CI job to run the live inference-switch test, uploads test artifacts/logs, and updated PR reporting to include the job’s result.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 12, 2026
copy-pr-bot Bot commented Jun 12, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai Bot commented Jun 12, 2026
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds a Vitest live E2E scenario that migrates the legacy openclaw inference-switch script into TypeScript tests (including a mock Anthropic provider, resilient inference/agent checks, retries, cleanup) and wires a new conditional GitHub Actions job to run the test and report results.

Changes

OpenClaw Inference Switch E2E Test Migration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Test setup, types, and execution infrastructure | test/e2e-scenario/live/openclaw-inference-switch.test.ts | Test SPDX/docs, live-test gating, environment-driven constants, response/config interface types, string/shell helpers, host/sandbox execution utilities, and best-effort cleanup and HTTP server helpers. |
| Mock Anthropic provider server and registration | test/e2e-scenario/live/openclaw-inference-switch.test.ts | Local HTTP mock server exposing /health, /v1/models*, and /v1/messages (supports SSE); ensureCompatibleAnthropicSwitchProvider registers/updates provider in OpenShell when configured. |
| Route, config, and registry verification helpers | test/e2e-scenario/live/openclaw-inference-switch.test.ts | Capture gateway PID, fetch OpenShell route inference output and assert provider/model, validate sandboxes.json and onboard-session.json, verify openclaw.json wiring and .config-hash, and provide HTTP response parsing helpers. |
| Sandbox inference and agent interaction with resilience | test/e2e-scenario/live/openclaw-inference-switch.test.ts | checkSandboxInference runs curl from sandbox with retries and transient-skips; resilient agent JSON parsing extracts embedded JSON despite wrapper chatter; checkOpenClawAgentTurn runs sandbox SSH agent command with timeout skip; transient failure classification and retry/backoff for nemoclaw inference set. |
| Main test orchestration and validation | test/e2e-scenario/live/openclaw-inference-switch.test.ts | RUN_OPENCLAW_INFERENCE_SWITCH_TEST writes scenario.json, validates prerequisites, provisions temp home, runs install.sh, optionally starts mock Anthropic and registers provider, performs inference switch with retries/fallback, validates gateway PID stability and all checks, optionally destroys sandbox, and writes scenario-result.json. |
| GitHub Actions workflow job definition | .github/workflows/e2e-vitest-scenarios.yaml | Adds openclaw-inference-switch-vitest free-standing job with conditional dispatch gating, extended timeout, NemoClaw/OpenShell gateway env configuration, Docker Hub auth with retries and anonymous-pull fallback, CLI build, Vitest run using NVIDIA_API_KEY, artifact uploads (scenario dir and /tmp/...install.log), and always-run Docker logout cleanup. |
| Workflow dispatch and PR reporting integration | .github/workflows/e2e-vitest-scenarios.yaml, test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts | Adds openclaw-inference-switch-vitest to report-to-pr job needs; extends dispatch-selector tests to resolve openclaw-inference-switch to the new job when selected via scenarios or jobs selector with liveScenariosRuns: false and empty registryScenarios. |
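
The "resilient agent JSON parsing" row above can be illustrated with a minimal sketch. It assumes the agent prints a single JSON object somewhere inside otherwise unstructured wrapper output. The name `extractEmbeddedJson` is hypothetical and is not taken from the test file, whose actual parser may work differently.

```typescript
// Hypothetical helper (illustrative name, not the PR's actual function).
// Scans for the first balanced {...} span that parses as JSON and returns it.
function extractEmbeddedJson(output: string): unknown {
  for (
    let start = output.indexOf("{");
    start !== -1;
    start = output.indexOf("{", start + 1)
  ) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < output.length; i++) {
      const ch = output[i];
      if (inString) {
        // Braces inside string literals must not affect nesting depth.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}") {
        depth--;
        if (depth === 0) {
          try {
            return JSON.parse(output.slice(start, i + 1));
          } catch {
            break; // Balanced but not valid JSON: try the next "{".
          }
        }
      }
    }
  }
  return undefined; // No parseable JSON object found.
}
```

Tracking string state while counting braces keeps a `}` inside a quoted value from ending the candidate span early, which is the main failure mode of a plain first-brace/last-brace slice.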

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Extends the free-standing Vitest jobs selector framework referenced by this PR's dispatch-selector tests.
  • NVIDIA/NemoClaw#5234: Related selector/dispatch normalization work around free-standing live Vitest jobs and wiring detection.
  • NVIDIA/NemoClaw#5336: Adds another free-standing live Vitest job and updates dispatch/report wiring similarly.

Suggested labels

area: e2e

Suggested reviewers

  • prekshivyas

Poem

🐰 A switch flips true in the inference day,
Mock servers hum and PONGs come out to play,
JSON peels through chatter, retries pave the way,
Sandbox boots, the agent speaks—a bright hooray,
From bash to Vitest, the E2E hops onstage.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: migrating a legacy bash E2E test into a Vitest scenario. |
| Linked Issues check | ✅ Passed | The PR meets all Definition of Done requirements from #5098: equivalent Vitest coverage exists, user-visible contracts preserved, deterministic artifacts/cleanup implemented, PR documents the mapping, test wired into e2e-vitest-scenarios.yaml, and unique naming used. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the migration objective: new Vitest test, workflow job setup, workflow selector tests, and boundary validation; no unrelated modifications present. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/5098-openclaw-inference-switch

Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions Bot commented Jun 12, 2026
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

github-actions Bot commented Jun 12, 2026
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

Comment thread test/e2e-scenario/live/openclaw-inference-switch.test.ts Fixed
github-actions Bot commented Jun 12, 2026
Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.

Consider writing more tests for

  • Runtime validation: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27436819548
Workflow ref: codex/5098-openclaw-inference-switch
Requested scenarios: (default — all supported)
Requested jobs: openclaw-inference-switch-vitest
Summary: 1 passed, 1 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ❌ failure
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

Failed jobs: openclaw-inference-switch-vitest. Check run artifacts for logs.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27437824573
Workflow ref: codex/5098-openclaw-inference-switch
Requested scenarios: (default — all supported)
Requested jobs: openclaw-inference-switch-vitest
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ✅ success
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27438779472
Workflow ref: codex/5098-openclaw-inference-switch
Requested scenarios: (default — all supported)
Requested jobs: openclaw-inference-switch-vitest
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ✅ success
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@cv cv marked this pull request as ready for review June 12, 2026 21:03

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/openclaw-inference-switch.test.ts`:
- Around line 34-39: The test currently uses fixed defaults for SANDBOX_NAME and
SWITCH_MOCK_PORT which cause collisions; update the initialization to generate
unique sandbox names and use a validated port parser that accepts 0
(auto-assign) for SWITCH_MOCK_PORT (e.g., implement a parsePortEnv(name,
fallback) that validates integer 0–65535 and returns Number.parseInt or
fallback), and replace direct uses of SWITCH_MOCK_PORT and SANDBOX_NAME with
these new validated values so startMockAnthropicProvider() can read the bound
port and cleanup/pre-final steps use the unique sandbox name.
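
A minimal sketch of the kind of helpers this comment asks for follows. It is not the code that landed in the PR. The explicit `env` parameter (added so the function can be exercised without touching the real environment), the sandbox-name format, and the name `uniqueSandboxName` are all assumptions made for illustration; only `parsePortEnv(name, fallback)` and the 0 to 65535 range come from the comment itself.

```typescript
// Sketch only: the env parameter is an assumption for testability;
// a caller in Node would pass process.env.
function parsePortEnv(
  name: string,
  fallback: number,
  env: Record<string, string | undefined>,
): number {
  const raw = (env[name] ?? "").trim();
  // Accept plain decimal digits only; reject signs, decimals, and words.
  if (!/^\d{1,5}$/.test(raw)) return fallback;
  const port = Number.parseInt(raw, 10);
  // 0 is valid and asks the OS to assign a free port at listen time.
  return port <= 65535 ? port : fallback;
}

// Hypothetical name generator; the suffix format is an assumption and
// real sandbox naming rules may be stricter.
function uniqueSandboxName(prefix: string): string {
  const time = Date.now().toString(36);
  const rand = Math.random().toString(36).slice(2, 8);
  return `${prefix}-${time}-${rand}`;
}
```

When the parsed port is 0, a Node HTTP server started with `listen(0)` binds to an OS-assigned port, and the bound port can then be read back from `server.address()`, which is how a mock provider could report the port it actually received.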
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2cd3681f-b72d-42c7-a6ea-e0cb27eaf8e3

📥 Commits

Reviewing files that changed from the base of the PR and between 0e30bff and abe968d.

📒 Files selected for processing (4)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/live/openclaw-inference-switch.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/free-standing-jobs.env

Comment thread test/e2e-scenario/live/openclaw-inference-switch.test.ts Outdated
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27446110743
Workflow ref: codex/5098-openclaw-inference-switch
Requested scenarios: openclaw-inference-switch
Requested jobs: (default — all free-standing when no scenarios are requested)
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-inference-switch-vitest ✅ success
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped

@cv cv added the v0.0.65 Release target label Jun 13, 2026
cv added 3 commits June 12, 2026 18:19
…nference-switch

# Conflicts:
#	.github/workflows/e2e-vitest-scenarios.yaml
#	test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)

2543-2543: ⚡ Quick win

Add full boundary assertions for openclaw-inference-switch-vitest, not just selector mapping.

Line 2543 currently validates only the if selector contract. Please add a dedicated validator (like other live free-standing jobs) to enforce required env values, secret scoping, run command target, artifact paths, timeout, and cleanup step behavior; otherwise workflow drift in this job can slip past boundary tests.
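
To make the request concrete, here is a hedged sketch of what a per-job boundary check could look like. It does not reproduce the validators in workflow-boundary.mts. The `WorkflowJob` and `JobBoundary` shapes and the function name `validateJobBoundary` are hypothetical, and the sketch covers only env keys, the run target, the timeout, and the cleanup step; secret scoping and artifact paths are left out for brevity.

```typescript
// Hypothetical shapes for a parsed workflow job (not the repository's types).
interface WorkflowStep {
  name?: string;
  if?: string;
  run?: string;
}

interface WorkflowJob {
  "timeout-minutes"?: number;
  env?: Record<string, string>;
  steps?: WorkflowStep[];
}

interface JobBoundary {
  requiredEnv: string[];
  runTarget: string; // substring expected in some step's run command
  maxTimeoutMinutes: number;
  cleanupCommand: string; // substring identifying the cleanup step
}

// Returns a list of human-readable violations; an empty list means no drift.
function validateJobBoundary(
  jobName: string,
  job: WorkflowJob,
  boundary: JobBoundary,
): string[] {
  const errors: string[] = [];
  const steps = job.steps ?? [];

  for (const key of boundary.requiredEnv) {
    if (!job.env || !(key in job.env)) {
      errors.push(`${jobName}: missing env ${key}`);
    }
  }

  if (!steps.some((step) => step.run?.includes(boundary.runTarget))) {
    errors.push(`${jobName}: no step runs ${boundary.runTarget}`);
  }

  const timeout = job["timeout-minutes"];
  if (timeout === undefined || timeout > boundary.maxTimeoutMinutes) {
    errors.push(
      `${jobName}: timeout-minutes must be set and <= ${boundary.maxTimeoutMinutes}`,
    );
  }

  // The cleanup step must run even when earlier steps fail.
  const cleanup = steps.find((step) => step.run?.includes(boundary.cleanupCommand));
  if (!cleanup || !cleanup.if?.includes("always()")) {
    errors.push(`${jobName}: cleanup step must run with if: always()`);
  }

  return errors;
}
```

Collecting violations into a list rather than throwing on the first one lets a single test run report every way the job has drifted.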

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` at line 2543, The selector-only
check for openclaw-inference-switch-vitest is insufficient; add a full boundary
validator alongside validateFreeStandingJobSelector by creating/invoking the
same per-job validator used for other live free-standing jobs (e.g.,
validateFreeStandingJob or validateFreeStandingJobBoundary) for
"openclaw-inference-switch-vitest" and assert the required env vars, scoped
secrets, expected run command target, artifact input/output paths, timeout
value, and presence/behavior of the cleanup step so this job cannot drift
undetected; reference the existing validateFreeStandingJobSelector call and
mirror the validation pattern and assertions used for other jobs in this file
when adding the new validator invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Line 2543: The selector-only check for openclaw-inference-switch-vitest is
insufficient; add a full boundary validator alongside
validateFreeStandingJobSelector by creating/invoking the same per-job validator
used for other live free-standing jobs (e.g., validateFreeStandingJob or
validateFreeStandingJobBoundary) for "openclaw-inference-switch-vitest" and
assert the required env vars, scoped secrets, expected run command target,
artifact input/output paths, timeout value, and presence/behavior of the cleanup
step so this job cannot drift undetected; reference the existing
validateFreeStandingJobSelector call and mirror the validation pattern and
assertions used for other jobs in this file when adding the new validator
invocation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 52253fa8-e86d-4aaf-b493-84f364b3ddb2

📥 Commits

Reviewing files that changed from the base of the PR and between e203f42 and 8447c4d.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
✅ Files skipped from review due to trivial changes (1)
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts

@cv cv merged commit 6b01347 into main Jun 13, 2026
42 checks passed
@cv cv deleted the codex/5098-openclaw-inference-switch branch June 13, 2026 03:00
Labels

v0.0.65 Release target

Development

Successfully merging this pull request may close these issues.

Epic: Migrate legacy bash E2E into the Vitest E2E system

3 participants