test(e2e): migrate state backup restore to vitest#5353

Merged
cv merged 6 commits into main from codex/5098-state-backup-restore
Jun 13, 2026

Conversation

@cv

@cv cv commented Jun 12, 2026

Collaborator

Summary

Migrates the legacy test/e2e/test-state-backup-restore.sh contract into a live Vitest scenario that exercises the real scripts/backup-workspace.sh backup and restore boundaries. Adds manual E2E Vitest dispatch wiring and workflow contract coverage for the new free-standing scenario.

Related Issue

Refs #5098

Changes

  • Adds test/e2e-scenario/live/state-backup-restore.test.ts for the real backup -> destroy -> re-onboard -> restore lifecycle.
  • Adds state-backup-restore-vitest to .github/workflows/e2e-vitest-scenarios.yaml and the free-standing job inventory.
  • Extends workflow-boundary validation and selector tests for the new scenario alias and job.
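
As a rough illustration of the final step in the backup -> destroy -> re-onboard -> restore lifecycle, the sketch below compares the marker files seeded before the backup with what is present after the restore. The `FileSnapshot` shape, the function name, and the example paths are hypothetical and are not taken from `state-backup-restore.test.ts`, which reads files from a live sandbox rather than from in-memory maps.

```typescript
// Hypothetical shape: relative path -> file content. The real test inspects a
// live sandbox; this sketch only models the comparison step.
type FileSnapshot = Record<string, string>;

interface MarkerMismatch {
  path: string;
  reason: "missing" | "content-differs";
}

// Report every seeded marker that is absent or changed after the restore.
function findMarkerMismatches(
  seeded: FileSnapshot,
  restored: FileSnapshot,
): MarkerMismatch[] {
  const mismatches: MarkerMismatch[] = [];
  for (const [path, content] of Object.entries(seeded)) {
    if (!Object.prototype.hasOwnProperty.call(restored, path)) {
      mismatches.push({ path, reason: "missing" });
    } else if (restored[path] !== content) {
      mismatches.push({ path, reason: "content-differs" });
    }
  }
  return mismatches;
}

// Example paths and contents are invented for illustration.
const seededMarkers: FileSnapshot = {
  "workspace/NOTES.md": "marker-1",
  "memory/marker.md": "marker-2",
};
const restoredMarkers: FileSnapshot = { "workspace/NOTES.md": "marker-1" };

console.log(findMarkerMismatches(seededMarkers, restoredMarkers));
```

An empty result would mean every seeded marker survived the round trip. Extra files in the restored snapshot are deliberately ignored in this sketch.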

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

Targeted checks run: live Vitest registration/skip, workflow support Vitest, Biome formatting/linting, source-shape/test-size checks, and git diff --check. Full local prek/push hooks were attempted but are not marked passing because unrelated CLI timeout failures occurred in src/lib/onboard/web-search-flow.test.ts.

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Summary by CodeRabbit

  • Tests

    • Added a new live end-to-end test that verifies workspace backup and restore, including verification of preserved files and memory markers.
    • Expanded scenario workflow tests and improved detection of sandbox cleanup signatures.
  • Chores

    • CI updated to run the new scenario as a standalone workflow job and to always collect artifacts for diagnosis.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv self-assigned this Jun 12, 2026
@copy-pr-bot

copy-pr-bot Bot commented Jun 12, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 2ead9b12-715e-4a12-b4a2-0d2ca6f0b712

📥 Commits

Reviewing files that changed from the base of the PR and between 288b5ee and d4b4814.

📒 Files selected for processing (4)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (4)
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • .github/workflows/e2e-vitest-scenarios.yaml
  • tools/e2e-scenarios/workflow-boundary.mts

📝 Walkthrough

Walkthrough

Adds a live multi-phase state backup-restore E2E test, a selector-gated state-backup-restore-vitest workflow job that runs that test and uploads artifacts, and a workflow-boundary validator plus selector-test coverage enforcing the job's structure and secrets constraints.

Changes

State Backup-Restore E2E Test and Workflow Integration

Layer / File(s) Summary
Live backup-restore test implementation
test/e2e-scenario/live/state-backup-restore.test.ts, test/e2e-scenario/fixtures/phases/onboarding.ts
Implements constants, helpers (sandbox validation, backup discovery, shell failure detection, retry/destroy loops), and the multi-phase live test: onboard with marker seed, run backup-workspace.sh and verify host backup contents, destroy and re-onboard sandbox, run restore and validate restored files and memory marker, emit artifacts. Extends sandbox delete error pattern recognition.
Workflow job definition and PR reporter integration
.github/workflows/e2e-vitest-scenarios.yaml
Defines state-backup-restore-vitest job with checkout, root dependency install, CLI build, Docker Hub login (anonymous fallback), hermetic OpenShell installation, Vitest test execution targeting test/e2e-scenario/live/state-backup-restore.test.ts, and unconditional artifact upload. Adds job to PR reporter needs list.
Workflow boundary validation and selector wiring
tools/e2e-scenarios/workflow-boundary.mts, test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
Adds validateStateBackupRestoreVitestJob to enforce job configuration (runs-on/timeout, env, forbidden secrets exposure, pinned checkout/setup-node, required install/build steps, OpenShell install constraints, exact Vitest command, and artifact upload settings), wires the validator into validateE2eVitestScenariosWorkflowBoundary, and extends selector dispatch tests to verify state-backup-restore mapping.
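
To make the validator row above more concrete, here is a minimal sketch of a job-shape check in the same spirit. It is not the code of `validateStateBackupRestoreVitestJob`. The `WorkflowJob` and `WorkflowStep` interfaces, the name `validateJobSketch`, and the specific rules (a 40-character SHA counts as "pinned", `if: always()` counts as "unconditional", any `secrets.` reference in job-level `env` counts as exposure) are assumptions made for illustration.

```typescript
// Hypothetical, simplified view of a GitHub Actions job.
interface WorkflowStep {
  name?: string;
  uses?: string;
  run?: string;
  if?: string;
}

interface WorkflowJob {
  "runs-on"?: string;
  "timeout-minutes"?: number;
  env?: Record<string, string>;
  steps?: WorkflowStep[];
}

const LIVE_TEST_PATH = "test/e2e-scenario/live/state-backup-restore.test.ts";

// Returns a list of human-readable violations; an empty list means the job passes.
function validateJobSketch(job: WorkflowJob): string[] {
  const errors: string[] = [];
  if (!job["runs-on"]) errors.push("job must declare runs-on");
  if (typeof job["timeout-minutes"] !== "number") {
    errors.push("job must declare timeout-minutes");
  }

  // Assumption: "forbidden secrets exposure" means no secrets in job-level env.
  for (const [key, value] of Object.entries(job.env ?? {})) {
    if (value.includes("secrets.")) {
      errors.push(`job-level env ${key} must not reference secrets`);
    }
  }

  const steps = job.steps ?? [];
  // Assumption: "pinned" means a full commit SHA after the "@".
  for (const step of steps) {
    if (step.uses !== undefined && !/@[0-9a-f]{40}$/.test(step.uses)) {
      errors.push(`action is not pinned to a commit SHA: ${step.uses}`);
    }
  }

  if (!steps.some((step) => step.run?.includes(LIVE_TEST_PATH) ?? false)) {
    errors.push("no step runs the live state-backup-restore test");
  }

  // Assumption: unconditional upload is expressed as `if: always()`.
  const upload = steps.find(
    (step) => step.uses?.startsWith("actions/upload-artifact@") ?? false,
  );
  if (upload === undefined) {
    errors.push("missing artifact upload step");
  } else if (upload.if !== "always()") {
    errors.push("artifact upload must run unconditionally");
  }
  return errors;
}
```

Returning a list of violations rather than throwing on the first one lets a support test report every problem in a single run.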

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Introduces the inputs.jobs selector infrastructure used by free-standing Vitest jobs and validators.
  • NVIDIA/NemoClaw#5370: Refactors free-standing selector inventory derivation and workflow-boundary validation that this PR extends.
  • NVIDIA/NemoClaw#5330: Extends the Vitest free-standing job selector/validation machinery related to this PR's mapping.

Suggested labels

area: e2e, area: ci

Suggested reviewers

  • prekshivyas

Poem

🐇 In a sandbox where markers gleam and hide,
I planted notes with carrot-coded pride.
Backup ran, restore replied, and files came home,
CI hummed softly while I hopped and combed.
A tiny rabbit cheers — the flow is tested, wide-eyed! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: migrating a state backup restore test from shell to Vitest, which is the primary objective of the pull request.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.


@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

E2E Recommendation Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Failed: Could not parse JSON from advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/e2e-advisor/e2e-scenario-advisor-raw-output.txt

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

PR Review Advisor

Findings: 0 needs attention, 1 worth checking, 0 nice ideas
Top item: PR review advisor unavailable

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • PR review advisor unavailable: The automated advisor could not complete: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt
    • Recommendation: Re-run the PR Review Advisor or perform a manual review.
    • Evidence: Could not parse JSON from PR review advisor output; see /home/runner/work/NemoClaw/NemoClaw/artifacts/pr-review-advisor/pr-review-advisor-raw-output.txt

🌱 Nice ideas

  • None.
Consider writing more tests for
  • **Runtime validation**: Add or identify targeted runtime/integration validation for the changed behavior; do not report external E2E job pass/fail here. Runtime/sandbox/infrastructure paths need behavioral runtime validation: .github/workflows/e2e-vitest-scenarios.yaml, tools/e2e-scenarios/workflow-boundary.mts.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27436103496
Workflow ref: codex/5098-state-backup-restore
Requested scenarios: (default — all supported)
Requested jobs: state-backup-restore-vitest
Summary: 1 passed, 1 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ❌ failure
token-rotation-vitest ⏭️ skipped

Failed jobs: state-backup-restore-vitest. Check run artifacts for logs.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27437481487
Workflow ref: codex/5098-state-backup-restore
Requested scenarios: (default — all supported)
Requested jobs: state-backup-restore-vitest
Summary: 1 passed, 1 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ❌ failure
token-rotation-vitest ⏭️ skipped

Failed jobs: state-backup-restore-vitest. Check run artifacts for logs.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27438030535
Workflow ref: codex/5098-state-backup-restore
Requested scenarios: (default — all supported)
Requested jobs: state-backup-restore-vitest
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ✅ success
token-rotation-vitest ⏭️ skipped

@cv cv marked this pull request as ready for review June 12, 2026 21:03

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/e2e-scenario/live/state-backup-restore.test.ts (1)

130-336: 📐 Maintainability & Code Quality | 🏗️ Heavy lift

Reduce scenario-function complexity by splitting phase helpers.

The single test body currently bundles preflight, onboarding, write/backup, destroy/re-onboard, restore, and verification logic in one function. Extracting phase helpers will make failures easier to isolate and future changes safer.

As per coding guidelines, **/*.{js,ts,tsx,jsx}: “Keep function complexity low in JavaScript and TypeScript code.”

Also applies to: 341-417

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/state-backup-restore.test.ts` around lines 130 - 336,
The test "state-backup-restore: backup-workspace.sh restores workspace files and
memory directory" is too large and should be split into smaller phase helper
functions to reduce complexity; extract logical phases into named helpers such
as preparePreflightChecks (handles docker check, API key assertions,
artifacts.writeJson contract), performOnboard (wraps environment.assertReady +
onboard.from and returns NemoClawInstance), writeMarkerFiles (writes markers via
stateValidation.writeMarkerFile using WORKSPACE_FILES and MEMORY_FILE),
runBackupAndValidate (invokes host.command to run backup-workspace.sh, verifies
output, determines createdBackupDir and validates backup contents),
performDestroy (wraps destroySandboxUntilAbsent / onboard.destroySandbox /
sandbox.openshell delete steps), and restoreAndVerify (runs restore logic and
final assertions); replace the large inline blocks in the test body with calls
to these helpers so each helper has a single responsibility and the main test
reads as a sequence of phases.

Source: Coding guidelines
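
One possible shape for the suggested split is sketched below, with each phase as a named async helper and the test body reduced to a sequence. The helper names are the ones proposed in the nitpick. `ScenarioContext`, `Phase`, `runPhases`, and all the stub bodies are hypothetical placeholders, not code from the test file.

```typescript
// Hypothetical shared state passed between phases.
interface ScenarioContext {
  log: string[];
  backupDir?: string;
}

interface Phase {
  name: string;
  run: (ctx: ScenarioContext) => Promise<void>;
}

// Run phases in order and prefix any failure with the phase name so that a
// failing run points directly at the phase that broke.
async function runPhases(ctx: ScenarioContext, phases: Phase[]): Promise<string[]> {
  const completed: string[] = [];
  for (const phase of phases) {
    try {
      await phase.run(ctx);
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      throw new Error(`phase "${phase.name}" failed: ${detail}`);
    }
    completed.push(phase.name);
  }
  return completed;
}

// Stub bodies stand in for the real Docker, onboarding, and shell work.
const phases: Phase[] = [
  { name: "preparePreflightChecks", run: async (ctx) => { ctx.log.push("preflight"); } },
  { name: "performOnboard", run: async (ctx) => { ctx.log.push("onboard"); } },
  { name: "writeMarkerFiles", run: async (ctx) => { ctx.log.push("markers"); } },
  {
    name: "runBackupAndValidate",
    run: async (ctx) => { ctx.backupDir = "/tmp/hypothetical-backup"; },
  },
  { name: "performDestroy", run: async (ctx) => { ctx.log.push("destroy"); } },
  {
    name: "restoreAndVerify",
    run: async (ctx) => {
      if (ctx.backupDir === undefined) throw new Error("no backup directory recorded");
      ctx.log.push(`restore from ${ctx.backupDir}`);
    },
  },
];

runPhases({ log: [] }, phases).then((done) => console.log(done.join(" -> ")));
```

In this sketch the phase name is attached to every failure, which is the failure-isolation benefit the review comment is after.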

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/live/state-backup-restore.test.ts`:
- Around line 337-340: The second onboarding call (onboard.from when creating
restoredInstance) currently hard-fails on endpoint-validation outages; wrap this
call in a try/catch and convert the specific endpoint-validation outage error
into a test skip path instead of failing the test. Detect the outage by checking
the error's unique marker (e.g., error.message or error.code matching the
endpoint-validation outage pattern), call the test-skip handler used in the file
(or return/skip the rest of the test) and avoid rethrowing for that case;
otherwise rethrow the error so real failures still fail. Ensure references
remain to onboard.from, restoredInstance, ready, SANDBOX_NAME, and
ONBOARD_TIMEOUT_MS so the change is localized to this onboarding step.
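
The inline comment above could be addressed along these lines. This is only a sketch: the outage pattern, the `onboardOrSkip` wrapper, and the `skip` callback are hypothetical stand-ins, since the real outage signature and the file's skip handler are not shown in this conversation.

```typescript
// Hypothetical outage signature; the real marker would come from the onboarding fixture.
const ENDPOINT_OUTAGE_PATTERN = /endpoint validation.*(unavailable|timed out)/i;

function isEndpointValidationOutage(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return ENDPOINT_OUTAGE_PATTERN.test(message);
}

type OnboardResult<T> =
  | { status: "ready"; instance: T }
  | { status: "skipped"; reason: string };

// Wrap an onboarding call so that a recognized outage becomes a skip, while
// every other error is rethrown and still fails the test.
async function onboardOrSkip<T>(
  onboard: () => Promise<T>,
  skip: (reason: string) => void,
): Promise<OnboardResult<T>> {
  try {
    return { status: "ready", instance: await onboard() };
  } catch (error) {
    if (isEndpointValidationOutage(error)) {
      const reason = error instanceof Error ? error.message : String(error);
      skip(reason);
      return { status: "skipped", reason };
    }
    throw error;
  }
}
```

In the test itself, `onboard` would wrap the second `onboard.from` call that creates `restoredInstance`, and `skip` would delegate to whatever skip mechanism the file already uses.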


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a7011f51-b1d3-4d2d-a51f-2e1d5e0280d3

📥 Commits

Reviewing files that changed from the base of the PR and between 0e30bff and 46655ed.

📒 Files selected for processing (6)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/live/state-backup-restore.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/free-standing-jobs.env
  • tools/e2e-scenarios/workflow-boundary.mts

Comment thread test/e2e-scenario/live/state-backup-restore.test.ts Outdated
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27444251616
Workflow ref: codex/5098-state-backup-restore
Requested scenarios: state-backup-restore
Requested jobs: (default — all free-standing when no scenarios are requested)
Summary: 2 passed, 0 failed, 22 skipped

Job Result
credential-migration-vitest ⏭️ skipped
credential-sanitization-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ⏭️ skipped
sandbox-survival-vitest ⏭️ skipped
shields-config-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
state-backup-restore-vitest ✅ success
token-rotation-vitest ⏭️ skipped
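
The earlier runs requested the job by name, while this run requested the `state-backup-restore` scenario alias and still ran only `state-backup-restore-vitest`. A minimal sketch of selector logic consistent with those results follows. Only the `state-backup-restore` to `state-backup-restore-vitest` mapping is confirmed by the runs. The precedence rules, the function names, and the short job inventory are assumptions, not the behavior of the real workflow.

```typescript
// Only this alias is confirmed by the runs on this page.
const SCENARIO_ALIAS_TO_JOB: Record<string, string> = {
  "state-backup-restore": "state-backup-restore-vitest",
};

// A small subset of job names from the results table, assumed to be free-standing.
const FREE_STANDING_JOBS: string[] = [
  "credential-migration-vitest",
  "state-backup-restore-vitest",
  "token-rotation-vitest",
];

// Split a comma-separated workflow input into trimmed, non-empty entries.
function parseList(input: string): string[] {
  return input
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Assumed precedence: explicit jobs win, then scenario aliases, then
// "everything" when neither input is given.
function selectFreeStandingJobs(inputs: { scenarios: string; jobs: string }): string[] {
  const requestedJobs = parseList(inputs.jobs);
  if (requestedJobs.length > 0) {
    return FREE_STANDING_JOBS.filter((job) => requestedJobs.includes(job));
  }
  const requestedScenarios = parseList(inputs.scenarios);
  if (requestedScenarios.length === 0) {
    return [...FREE_STANDING_JOBS];
  }
  const mapped = requestedScenarios
    .map((scenario) => SCENARIO_ALIAS_TO_JOB[scenario])
    .filter((job): job is string => job !== undefined);
  return FREE_STANDING_JOBS.filter((job) => mapped.includes(job));
}

console.log(selectFreeStandingJobs({ scenarios: "state-backup-restore", jobs: "" }));
console.log(selectFreeStandingJobs({ scenarios: "", jobs: "state-backup-restore-vitest" }));
```

Under these assumptions both calls select the same single job, matching the two dispatch styles seen in the result comments.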

@cv cv added the v0.0.65 Release target label Jun 13, 2026
@cv cv merged commit 03f5f16 into main Jun 13, 2026
42 checks passed
@cv cv deleted the codex/5098-state-backup-restore branch June 13, 2026 02:24

Labels

v0.0.65 Release target


2 participants