Skip to content

test(e2e): migrate test-sandbox-rebuild.sh to vitest#5333

Merged
jyaunches merged 5 commits into
mainfrom
e2e-migrate/test-sandbox-rebuild-helpers
Jun 12, 2026
Merged

test(e2e): migrate test-sandbox-rebuild.sh to vitest#5333
jyaunches merged 5 commits into
mainfrom
e2e-migrate/test-sandbox-rebuild-helpers

Conversation

@jyaunches

@jyaunches jyaunches commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary

Migrate test/e2e/test-sandbox-rebuild.sh with equivalent live Vitest coverage and small rebuild/state helper extractions for dependent rebuild migrations.

Related Issues

Refs #5098
Refs #6076156

Contract mapping

  • Legacy assertion: onboard/create sandbox, write marker state, simulate stale registry version, run rebuild, verify marker survives, registry refreshes, and backup files do not leak credentials.
    • Replacement: test/e2e-scenario/live/sandbox-rebuild.test.ts plus sandbox-rebuild-vitest workflow job.
    • Boundary preserved: real nemoclaw onboard, real OpenShell sandbox exec, local ~/.nemoclaw/sandboxes.json mutation, real nemoclaw <sandbox> rebuild --yes, backup directory scan.

Simplicity check

  • Test shape: simple live Vitest dependent migration.
  • Original runner/lane: manual legacy shell script on Ubuntu + Docker/OpenShell + NVIDIA_API_KEY; no scheduled workflow lane currently referenced this script.
  • Replacement runner: ubuntu-latest free-standing sandbox-rebuild-vitest job with Docker/OpenShell install and NVIDIA_API_KEY secret.
  • New shared helpers: small lifecycle/state helpers for repeated rebuild/state boundaries (rebuildSandbox, assertSandboxReadyAfterRebuild, marker read/write/assertions, registry snapshot/patch/assertions, backup discovery/leak scan); local helper was insufficient because rebuild dependents share state mutation/cleanup and credential-scan risk.
  • New framework/registry/ledger: none
  • Workflow changes: add selective sandbox-rebuild-vitest job to .github/workflows/e2e-vitest-scenarios.yaml; legacy shell deletion/retirement deferred to Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 Phase 11.
  • Selective dispatch: gh workflow run e2e-vitest-scenarios.yaml --repo NVIDIA/NemoClaw --ref e2e-migrate/test-sandbox-rebuild-helpers -f jobs=sandbox-rebuild-vitest -f pr_number=<PR>

Verification

  • npm run build:cli
  • npm run typecheck:cli
  • npx vitest run --project e2e-vitest-support test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts test/e2e-scenario/support-tests/e2e-phase-onboarding.test.ts test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • NEMOCLAW_RUN_E2E_SCENARIOS=1 npx vitest run --project e2e-scenarios-live test/e2e-scenario/live/sandbox-rebuild.test.ts --silent=false --reporter=default (local no-secret/Docker path skipped)
  • git diff --check
  • Note: pre-push full CLI suite was attempted and failed on unrelated existing local failures/timeouts (test/generate-hermes-config.test.ts, test/ssrf-parity.test.ts missing nemoclaw/dist, Docker WSL assertions, shell-supervisor timing, etc.); focused migration checks above passed.
  • PR URL: test(e2e): migrate test-sandbox-rebuild.sh to vitest #5333
  • Same-runner selective run: https://github.com/NVIDIA/NemoClaw/actions/runs/27420621638 (in_progress)

Summary by CodeRabbit

  • Tests

    • Added a new end-to-end Vitest scenario for sandbox rebuilds: live rebuild, readiness polling, marker-file validation, registry snapshot/restore, rebuild backup inspection, and credential-leak scanning.
    • Extended test fixtures and support tests with helpers for rebuild orchestration, sandbox destruction, marker-file operations, registry/backup utilities, and related unit tests.
  • Chores

    • CI updated to run the sandbox-rebuild Vitest scenario and include its results in PR reporting.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: eaee9191-7458-43cf-9074-3bf274c8bbd1

📥 Commits

Reviewing files that changed from the base of the PR and between 2524394 and a15b5f7.

📒 Files selected for processing (5)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts

📝 Walkthrough

Walkthrough

Adds a sandbox-rebuild e2e Vitest scenario with fixture rebuild/readiness helpers, state/registry/backup utilities, a live test, supporting unit and workflow tests, a CI job sandbox-rebuild-vitest, and workflow-boundary validator updates.

Changes

Sandbox Rebuild E2E Testing

Layer / File(s) Summary
Lifecycle & Onboarding fixture helpers
test/e2e-scenario/fixtures/phases/lifecycle.ts, test/e2e-scenario/fixtures/phases/onboarding.ts
Adds rebuildSandbox(...), assertSandboxReadyAfterRebuild(...), and centralizes sandbox destruction into destroySandbox(...).
State & backup validation utilities
test/e2e-scenario/fixtures/phases/state-validation.ts
Adds snapshot/restore for registry/session, registry read/patch, latest-rebuild-backup discovery, rebuild-manifest reader, credential-leak scanning, marker-file write/read/expect helpers, and related types.
Live sandbox-rebuild Vitest scenario
test/e2e-scenario/live/sandbox-rebuild.test.ts
Live Vitest test that onboards a sandbox, forces stale agentVersion, runs rebuild, asserts marker persistence and registry update, and scans backup artifacts for credential leaks.
Fixture & support tests
test/e2e-scenario/support-tests/*
Adds host-side tests for marker I/O, registry entry patching, snapshot/restore, rebuild-backup discovery/manifest parsing, credential leak path detection, and readiness matcher behavior.
Workflow job and wiring
.github/workflows/e2e-vitest-scenarios.yaml
Registers sandbox-rebuild-vitest in selectors and matrix, adds the job (Docker Hub login, Node setup, deps/build, install OpenShell, run test, upload artifacts), and includes the job in report-to-pr needs.
Boundary validator updates
tools/e2e-scenarios/workflow-boundary.mts
Maps sandbox-rebuildsandbox-rebuild-vitest, adds validateSandboxRebuildVitestJob, and wires validation into the main flow and report-to-pr dependency checks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5243: Extends the free-standing Vitest selector and report-to-pr.needs plumbing; touches the same workflow file.
  • NVIDIA/NemoClaw#5231: Adds another free-standing Vitest job and alters selector/matrix wiring in the same workflows.
  • NVIDIA/NemoClaw#5234: Related workflow/advisor detection logic that interacts with the Vitest workflow wiring.

Suggested labels

area: sandbox, area: ci

Suggested reviewers

  • cv
  • prekshivyas

Poem

🐰 I hopped through CI with a curious grin,
I patched the registry, then watched rebuild begin,
Marker files stayed snug through each command,
Backups scanned clean by my little pawed hand,
Tests ticked green — carrots all around!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: migrating a shell script test to Vitest. It clearly indicates what was changed (test-sandbox-rebuild.sh) and how (migrated to vitest).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch e2e-migrate/test-sandbox-rebuild-helpers

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: sandbox-rebuild-vitest
Optional E2E: rebuild-openclaw-vitest, double-onboard-vitest, sandbox-survival-vitest

Dispatch hint: sandbox-rebuild-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • sandbox-rebuild-vitest (high): Primary coverage for the changed workflow job and new live sandbox rebuild path. It exercises real onboarding, OpenShell sandbox state preservation, nemoclaw <sandbox> rebuild --yes, registry metadata refresh, readiness polling, and rebuild backup credential hygiene.

Optional E2E

  • rebuild-openclaw-vitest (high): Adjacent rebuild coverage already present in the workflow. Useful to confirm the newly added sandbox rebuild job and fixture changes did not regress the existing OpenClaw rebuild live path.
  • double-onboard-vitest (high): Optional confidence for onboarding and cleanup fixture behavior touched by the PR, especially destructive pre-cleanup and teardown around real OpenShell sandboxes.
  • sandbox-survival-vitest (high): Optional adjacent sandbox lifecycle confidence because the lifecycle/state-validation fixtures are shared with survival-style sandbox recovery flows.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: e2e-vitest-scenarios.yaml
  • jobs input: sandbox-rebuild-vitest

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: sandbox-rebuild-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=sandbox-rebuild-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • sandbox-rebuild-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/sandbox-rebuild.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=sandbox-rebuild-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/live/sandbox-rebuild.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 1222: The DOCKER_CONFIG env var is set to the checkout
(${GITHUB_WORKSPACE}) which exposes Docker credentials to the repo; change the
DOCKER_CONFIG assignment in the workflow (replace DOCKER_CONFIG: ${{
github.workspace }}/.docker-config-model-router-provider-routed-inference) to
point to a runner temporary dir (e.g., ${{ runner.temp }}/.docker-config-...) or
another non-workspace path, and update the boundary validator in
tools/e2e-scenarios/workflow-boundary.mts to enforce that DOCKER_CONFIG (and any
docker auth paths) are outside the checkout by rejecting or warning on paths
inside ${GITHUB_WORKSPACE} rather than blessing them. Ensure the env variable
name DOCKER_CONFIG and the validator function(s) that currently allow workspace
paths are the targets of the change.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts`:
- Around line 342-349: The equality check currently ignores leading/trailing
whitespace by using actual.trim(), which hides intentional whitespace
differences; in the marker verification (where actual is read via
this.readMarkerFile and compared to expected for markerPath), remove the
actual.trim() comparison and enforce exact equality, or if necessary normalize
only a single trailing newline (e.g., allow actual === expected OR
(actual.endsWith("\n") && actual.slice(0,-1) === expected)); update the
conditional that throws the Error to use this stricter comparison and keep the
same sandboxName/error message logic.

In `@test/e2e-scenario/live/sandbox-rebuild.test.ts`:
- Around line 127-132: The cleanup uses os.homedir() directly while
snapshotRegistryAndSession() and latestRebuildBackupDir() use process.env.HOME
?? os.homedir(), causing mismatched backup paths when HOME is overridden; update
the cleanup to compute backupRoot the same way (use process.env.HOME ??
os.homedir() or call the existing latestRebuildBackupDir() helper) and then use
that value in the fs.rmSync call and the cleanup description so the test removes
the same rebuild-backup tree created during the run (also ensure
restoreRegistryAndSession(stateSnapshot) remains invoked).

In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 701-705: The validator for the "Install OpenShell" job
(installOpenShell) lacks checks that the workflow scrubs Docker Hub credentials;
add requireRunContains assertions for the installOpenShell step that assert the
install command unsets DOCKERHUB_USERNAME and DOCKERHUB_TOKEN (i.e., add
requireRunContains(errors, installOpenShell, "-u DOCKERHUB_USERNAME") and
requireRunContains(errors, installOpenShell, "-u DOCKERHUB_TOKEN")) alongside
the existing env -u DOCKER_CONFIG and token unsets to mirror the
sandbox-rebuild-vitest behavior and prevent re-exposing Docker creds in
scripts/install-openshell.sh.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 913b1e82-ec9d-4088-9066-157aaebeaeeb

📥 Commits

Reviewing files that changed from the base of the PR and between 561bd2f and d706c57.

📒 Files selected for processing (8)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/onboarding.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/live/sandbox-rebuild.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts

Comment thread .github/workflows/e2e-vitest-scenarios.yaml Outdated
Comment on lines +342 to +349
const actual = await this.readMarkerFile(instance, markerPath, options);
if (actual !== expected && actual.trim() !== expected) {
const sandboxName = typeof instance === "string" ? instance : instance.sandboxName;
throw new Error(
`marker ${markerPath} in ${sandboxName} did not match expected content: got ${JSON.stringify(
actual,
)}`,
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don’t trim marker contents during equality checks.

actual.trim() !== expected makes leading/trailing whitespace invisible, so a rebuild that changes "token" to " token " (or drops intentional whitespace) would still pass this assertion. This helper is meant to prove exact workspace persistence, so the comparison should stay exact, or at most normalize a single trailing newline if the transport is known to add one.

Suggested fix
   async expectMarkerFileContent(
     instance: NemoClawInstance | string,
     markerPath: string,
     expected: string,
     options: MarkerFileOptions = {},
   ): Promise<void> {
     const actual = await this.readMarkerFile(instance, markerPath, options);
-    if (actual !== expected && actual.trim() !== expected) {
+    if (actual !== expected) {
       const sandboxName = typeof instance === "string" ? instance : instance.sandboxName;
       throw new Error(
         `marker ${markerPath} in ${sandboxName} did not match expected content: got ${JSON.stringify(
           actual,
         )}`,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const actual = await this.readMarkerFile(instance, markerPath, options);
if (actual !== expected && actual.trim() !== expected) {
const sandboxName = typeof instance === "string" ? instance : instance.sandboxName;
throw new Error(
`marker ${markerPath} in ${sandboxName} did not match expected content: got ${JSON.stringify(
actual,
)}`,
);
const actual = await this.readMarkerFile(instance, markerPath, options);
if (actual !== expected) {
const sandboxName = typeof instance === "string" ? instance : instance.sandboxName;
throw new Error(
`marker ${markerPath} in ${sandboxName} did not match expected content: got ${JSON.stringify(
actual,
)}`,
);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts` around lines 342 -
349, The equality check currently ignores leading/trailing whitespace by using
actual.trim(), which hides intentional whitespace differences; in the marker
verification (where actual is read via this.readMarkerFile and compared to
expected for markerPath), remove the actual.trim() comparison and enforce exact
equality, or if necessary normalize only a single trailing newline (e.g., allow
actual === expected OR (actual.endsWith("\n") && actual.slice(0,-1) ===
expected)); update the conditional that throws the Error to use this stricter
comparison and keep the same sandboxName/error message logic.

Comment on lines +127 to +132
const stateSnapshot = snapshotRegistryAndSession();
const backupRoot = path.join(os.homedir(), ".nemoclaw", "rebuild-backups", SANDBOX_NAME);
cleanup.add(`restore NemoClaw state files for ${SANDBOX_NAME}`, () => {
restoreRegistryAndSession(stateSnapshot);
fs.rmSync(backupRoot, { recursive: true, force: true });
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align backup cleanup with the helper’s HOME resolution.

snapshotRegistryAndSession() and latestRebuildBackupDir() both key off process.env.HOME ?? os.homedir(), but this cleanup hardcodes os.homedir(). If the live runner overrides HOME for isolation, the test mutates/scans one rebuild-backup tree and deletes another, which leaves stale backups behind for reruns.

Suggested fix
     const stateSnapshot = snapshotRegistryAndSession();
-    const backupRoot = path.join(os.homedir(), ".nemoclaw", "rebuild-backups", SANDBOX_NAME);
+    const home = process.env.HOME ?? os.homedir();
+    const backupRoot = path.join(home, ".nemoclaw", "rebuild-backups", SANDBOX_NAME);
     cleanup.add(`restore NemoClaw state files for ${SANDBOX_NAME}`, () => {
       restoreRegistryAndSession(stateSnapshot);
       fs.rmSync(backupRoot, { recursive: true, force: true });
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const stateSnapshot = snapshotRegistryAndSession();
const backupRoot = path.join(os.homedir(), ".nemoclaw", "rebuild-backups", SANDBOX_NAME);
cleanup.add(`restore NemoClaw state files for ${SANDBOX_NAME}`, () => {
restoreRegistryAndSession(stateSnapshot);
fs.rmSync(backupRoot, { recursive: true, force: true });
});
const stateSnapshot = snapshotRegistryAndSession();
const home = process.env.HOME ?? os.homedir();
const backupRoot = path.join(home, ".nemoclaw", "rebuild-backups", SANDBOX_NAME);
cleanup.add(`restore NemoClaw state files for ${SANDBOX_NAME}`, () => {
restoreRegistryAndSession(stateSnapshot);
fs.rmSync(backupRoot, { recursive: true, force: true });
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/live/sandbox-rebuild.test.ts` around lines 127 - 132, The
cleanup uses os.homedir() directly while snapshotRegistryAndSession() and
latestRebuildBackupDir() use process.env.HOME ?? os.homedir(), causing
mismatched backup paths when HOME is overridden; update the cleanup to compute
backupRoot the same way (use process.env.HOME ?? os.homedir() or call the
existing latestRebuildBackupDir() helper) and then use that value in the
fs.rmSync call and the cleanup description so the test removes the same
rebuild-backup tree created during the run (also ensure
restoreRegistryAndSession(stateSnapshot) remains invoked).

Comment thread tools/e2e-scenarios/workflow-boundary.mts
@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 2 needs attention, 7 worth checking, 1 nice ideas
Since last review: 0 prior items resolved, 5 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • Backup credential scan can pass without scanning a rebuild backup (test/e2e-scenario/live/sandbox-rebuild.test.ts:235): The PR contract says the migration verifies that backup files do not leak credentials, but the live test treats a missing backup directory as a passing condition. In that path no rebuild backup files are inspected, so the credential-hygiene acceptance clause is only partially covered.
    • Recommendation: If rebuild is expected to create a backup for this scenario, require a backup directory and scan it. If some runtime legitimately produces no backup, make that an explicit separate assertion with a runtime-specific reason and avoid counting it as successful credential-scan coverage.
    • Evidence: latestRebuildBackupDir(SANDBOX_NAME) may return undefined; listCredentialLeakPaths(undefined, { extraSecrets: [apiKey] }) returns []; the test writes note: 'No backup directory found; legacy shell skipped this check.' and still expects leaks toEqual([]).
  • Legacy backup-failure abort clause is not covered by the Vitest migration (test/e2e-scenario/live/sandbox-rebuild.test.ts:1): The legacy test documents 'Rebuild aborts safely when backup fails (sandbox not running)' as part of the sandbox rebuild proof, but the new live Vitest test only covers the successful rebuild path. If the PR is claiming equivalent live coverage for test/e2e/test-sandbox-rebuild.sh, this negative path is missing.
    • Recommendation: Add a behavior-specific negative test or explicitly narrow the migration contract. A useful test would simulate the backup-failure or sandbox-not-running condition and assert that rebuild aborts without destroying the original sandbox/registry state.
    • Evidence: test/e2e/test-sandbox-rebuild.sh lists 'Rebuild aborts safely when backup fails (sandbox not running)' in its validation contract. test/e2e-scenario/live/sandbox-rebuild.test.ts onboards, writes a marker, patches stale metadata, runs a successful rebuild, and verifies post-rebuild state, but does not exercise backup-failure abort behavior.

🔎 Worth checking

  • Source-of-truth review needed: sandbox-rebuild-vitest Docker auth storage: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: sandbox-rebuild-vitest has an Authenticate to Docker Hub step with DOCKERHUB_USERNAME and DOCKERHUB_TOKEN but no DOCKER_CONFIG setup or Clean up Docker auth step.
  • Source-of-truth review needed: Lifecycle readiness parsing: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: outputContainsReadySandbox checks `${sandboxName}.*\bReady\b`; support coverage only proves ANSI-colored Ready is accepted.
  • Source-of-truth review needed: Best-effort sandbox cleanup: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: bestEffort catches all errors and does nothing; it wraps pre-cleanup and final cleanup destroy/delete calls.
  • Source-of-truth review needed: Missing rebuild backup directory tolerance: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: latestRebuildBackupDir can return undefined; listCredentialLeakPaths returns [] for undefined; the live test records a note and still expects no leaks.
  • Keep Docker Hub auth out of the runner default config (.github/workflows/e2e-vitest-scenarios.yaml:990): The new sandbox-rebuild-vitest job logs in to Docker Hub with DOCKERHUB_USERNAME and DOCKERHUB_TOKEN but does not set an isolated DOCKER_CONFIG, log out, or remove Docker auth material in an always() cleanup step. That can leave Docker credentials in the runner user's default Docker config for later steps in the same job.
    • Recommendation: Add a step that creates a restrictive DOCKER_CONFIG under RUNNER_TEMP and exports it for Docker login, then add an if: always() cleanup step that runs docker logout docker.io || true and rm -rf "${DOCKER_CONFIG}". Update tools/e2e-scenarios/workflow-boundary.mts and support tests to require that boundary for sandbox-rebuild-vitest.
    • Evidence: sandbox-rebuild-vitest has an Authenticate to Docker Hub step using secrets.DOCKERHUB_USERNAME and secrets.DOCKERHUB_TOKEN. The job contains no isolated DOCKER_CONFIG setup and no Clean up Docker auth step; tools/e2e-scenarios/workflow-boundary.mts validates the Docker login but does not require auth isolation or cleanup.
  • Readiness probe can accept a Not Ready sandbox row (test/e2e-scenario/fixtures/phases/lifecycle.ts:91): assertSandboxReadyAfterRebuild relies on outputContainsReadySandbox(), which matches the sandbox name followed anywhere later by the word Ready. A row such as '<sandbox> ... Not Ready' also contains Ready, so the helper can stop polling before the rebuilt sandbox is actually ready.
    • Recommendation: Parse the matching sandbox row/status more strictly, require a status token or column that equals Ready, and explicitly reject Not Ready. Add support coverage proving the helper keeps polling for '<sandbox> Not Ready' and only returns for exact Ready.
    • Evidence: outputContainsReadySandbox builds new RegExp(`${escaped}.*\\bReady\\b`, 'i') over stripped stdout/stderr. The added support test only proves ANSI-colored Ready is accepted.
  • Best-effort cleanup suppresses sandbox teardown failures without evidence (test/e2e-scenario/live/sandbox-rebuild.test.ts:66): The destructive live test swallows all pre-cleanup and final cleanup errors. Preserving the primary failure is useful, but completely silent cleanup failures can hide residual sandboxes, containers, Docker state, or local registry state that affect later runs.
    • Recommendation: Record suppressed cleanup failures to artifacts or stderr while still preserving the primary test failure. Add focused support coverage for the cleanup wrapper's behavior.
    • Evidence: bestEffort catches all errors with an empty catch block except for a comment. It wraps both pre-cleanup and final cleanup destroy/delete calls.

🌱 Nice ideas

  • Add direct support coverage for rebuildSandbox behavior (test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts:200): The new rebuildSandbox helper is the fixture contract for a long-running destructive rebuild path, but support tests do not verify its argv, environment/redaction plumbing, timeout, verbose flag, or non-zero rejection behavior.
    • Recommendation: Add fake-runner tests for 'rebuildSandbox invokes nemoclaw <sandbox> rebuild --yes with redacted env and rejects non-zero exits' and 'rebuildSandbox appends --verbose only when requested'.
    • Evidence: The added lifecycle support test covers ANSI-colored Ready output for assertSandboxReadyAfterRebuild. No support test directly exercises LifecyclePhaseFixture.rebuildSandbox().
Consider writing more tests for
  • **Runtime validation** — sandbox-rebuild-vitest configures DOCKER_CONFIG under RUNNER_TEMP and always logs out/removes Docker auth. This PR changes a secret-bearing workflow and destructive live sandbox rebuild coverage. Support tests cover selectors and several helper utilities, but the most important credential-boundary, readiness negative, backup-scan, and rebuild abort behaviors still need targeted validation.
  • **Runtime validation** — workflow boundary rejects sandbox-rebuild-vitest Docker login without isolated DOCKER_CONFIG cleanup. This PR changes a secret-bearing workflow and destructive live sandbox rebuild coverage. Support tests cover selectors and several helper utilities, but the most important credential-boundary, readiness negative, backup-scan, and rebuild abort behaviors still need targeted validation.
  • **Runtime validation** — assertSandboxReadyAfterRebuild keeps polling for '<sandbox> Not Ready' and returns only for exact Ready status. This PR changes a secret-bearing workflow and destructive live sandbox rebuild coverage. Support tests cover selectors and several helper utilities, but the most important credential-boundary, readiness negative, backup-scan, and rebuild abort behaviors still need targeted validation.
  • **Runtime validation** — sandbox-rebuild live test fails or records explicit expected-no-backup evidence when rebuild completes without a backup directory. This PR changes a secret-bearing workflow and destructive live sandbox rebuild coverage. Support tests cover selectors and several helper utilities, but the most important credential-boundary, readiness negative, backup-scan, and rebuild abort behaviors still need targeted validation.
  • **Runtime validation** — sandbox rebuild abort path preserves original sandbox state when backup fails or sandbox is not running. This PR changes a secret-bearing workflow and destructive live sandbox rebuild coverage. Support tests cover selectors and several helper utilities, but the most important credential-boundary, readiness negative, backup-scan, and rebuild abort behaviors still need targeted validation.
  • **Add direct support coverage for rebuildSandbox behavior** — Add fake-runner tests for 'rebuildSandbox invokes nemoclaw <sandbox> rebuild --yes with redacted env and rejects non-zero exits' and 'rebuildSandbox appends --verbose only when requested'.
  • **Acceptance clause:** Migrate `test/e2e/test-sandbox-rebuild.sh` with equivalent live Vitest coverage and small rebuild/state helper extractions for dependent rebuild migrations. — add test evidence or identify existing coverage. The PR adds test/e2e-scenario/live/sandbox-rebuild.test.ts, a sandbox-rebuild-vitest workflow job, and lifecycle/state/onboarding helper extractions. Equivalence is weakened because backup credential hygiene can pass with no backup directory, readiness parsing can accept Not Ready, and the legacy backup-failure abort path is not covered.
  • **Acceptance clause:** Refs Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 — add test evidence or identify existing coverage. No linked issue clauses or comments for Epic: Migrate legacy bash E2E into the Vitest E2E system #5098 were provided in deterministic context; only the PR-body reference is available.
Since last review details

Current findings:

  • Source-of-truth review needed: sandbox-rebuild-vitest Docker auth storage: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: sandbox-rebuild-vitest has an Authenticate to Docker Hub step with DOCKERHUB_USERNAME and DOCKERHUB_TOKEN but no DOCKER_CONFIG setup or Clean up Docker auth step.
  • Source-of-truth review needed: Lifecycle readiness parsing: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: outputContainsReadySandbox checks `${sandboxName}.*\bReady\b`; support coverage only proves ANSI-colored Ready is accepted.
  • Source-of-truth review needed: Best-effort sandbox cleanup: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: bestEffort catches all errors and does nothing; it wraps pre-cleanup and final cleanup destroy/delete calls.
  • Source-of-truth review needed: Missing rebuild backup directory tolerance: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: latestRebuildBackupDir can return undefined; listCredentialLeakPaths returns [] for undefined; the live test records a note and still expects no leaks.
  • Keep Docker Hub auth out of the runner default config (.github/workflows/e2e-vitest-scenarios.yaml:990): The new sandbox-rebuild-vitest job logs in to Docker Hub with DOCKERHUB_USERNAME and DOCKERHUB_TOKEN but does not set an isolated DOCKER_CONFIG, log out, or remove Docker auth material in an always() cleanup step. That can leave Docker credentials in the runner user's default Docker config for later steps in the same job.
    • Recommendation: Add a step that creates a restrictive DOCKER_CONFIG under RUNNER_TEMP and exports it for Docker login, then add an if: always() cleanup step that runs docker logout docker.io || true and rm -rf "${DOCKER_CONFIG}". Update tools/e2e-scenarios/workflow-boundary.mts and support tests to require that boundary for sandbox-rebuild-vitest.
    • Evidence: sandbox-rebuild-vitest has an Authenticate to Docker Hub step using secrets.DOCKERHUB_USERNAME and secrets.DOCKERHUB_TOKEN. The job contains no isolated DOCKER_CONFIG setup and no Clean up Docker auth step; tools/e2e-scenarios/workflow-boundary.mts validates the Docker login but does not require auth isolation or cleanup.
  • Readiness probe can accept a Not Ready sandbox row (test/e2e-scenario/fixtures/phases/lifecycle.ts:91): assertSandboxReadyAfterRebuild relies on outputContainsReadySandbox(), which matches the sandbox name followed anywhere later by the word Ready. A row such as '<sandbox> ... Not Ready' also contains Ready, so the helper can stop polling before the rebuilt sandbox is actually ready.
    • Recommendation: Parse the matching sandbox row/status more strictly, require a status token or column that equals Ready, and explicitly reject Not Ready. Add support coverage proving the helper keeps polling for '<sandbox> Not Ready' and only returns for exact Ready.
    • Evidence: outputContainsReadySandbox builds new RegExp(`${escaped}.*\\bReady\\b`, 'i') over stripped stdout/stderr. The added support test only proves ANSI-colored Ready is accepted.
  • Backup credential scan can pass without scanning a rebuild backup (test/e2e-scenario/live/sandbox-rebuild.test.ts:235): The PR contract says the migration verifies that backup files do not leak credentials, but the live test treats a missing backup directory as a passing condition. In that path no rebuild backup files are inspected, so the credential-hygiene acceptance clause is only partially covered.
    • Recommendation: If rebuild is expected to create a backup for this scenario, require a backup directory and scan it. If some runtime legitimately produces no backup, make that an explicit separate assertion with a runtime-specific reason and avoid counting it as successful credential-scan coverage.
    • Evidence: latestRebuildBackupDir(SANDBOX_NAME) may return undefined; listCredentialLeakPaths(undefined, { extraSecrets: [apiKey] }) returns []; the test writes note: 'No backup directory found; legacy shell skipped this check.' and still expects leaks toEqual([]).
  • Best-effort cleanup suppresses sandbox teardown failures without evidence (test/e2e-scenario/live/sandbox-rebuild.test.ts:66): The destructive live test swallows all pre-cleanup and final cleanup errors. Preserving the primary failure is useful, but completely silent cleanup failures can hide residual sandboxes, containers, Docker state, or local registry state that affect later runs.
    • Recommendation: Record suppressed cleanup failures to artifacts or stderr while still preserving the primary test failure. Add focused support coverage for the cleanup wrapper's behavior.
    • Evidence: bestEffort catches all errors with an empty catch block except for a comment. It wraps both pre-cleanup and final cleanup destroy/delete calls.
  • Legacy backup-failure abort clause is not covered by the Vitest migration (test/e2e-scenario/live/sandbox-rebuild.test.ts:1): The legacy test documents 'Rebuild aborts safely when backup fails (sandbox not running)' as part of the sandbox rebuild proof, but the new live Vitest test only covers the successful rebuild path. If the PR is claiming equivalent live coverage for test/e2e/test-sandbox-rebuild.sh, this negative path is missing.
    • Recommendation: Add a behavior-specific negative test or explicitly narrow the migration contract. A useful test would simulate the backup-failure or sandbox-not-running condition and assert that rebuild aborts without destroying the original sandbox/registry state.
    • Evidence: test/e2e/test-sandbox-rebuild.sh lists 'Rebuild aborts safely when backup fails (sandbox not running)' in its validation contract. test/e2e-scenario/live/sandbox-rebuild.test.ts onboards, writes a marker, patches stale metadata, runs a successful rebuild, and verifies post-rebuild state, but does not exercise backup-failure abort behavior.
  • Add direct support coverage for rebuildSandbox behavior (test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts:200): The new rebuildSandbox helper is the fixture contract for a long-running destructive rebuild path, but support tests do not verify its argv, environment/redaction plumbing, timeout, verbose flag, or non-zero rejection behavior.
    • Recommendation: Add fake-runner tests for 'rebuildSandbox invokes nemoclaw <sandbox> rebuild --yes with redacted env and rejects non-zero exits' and 'rebuildSandbox appends --verbose only when requested'.
    • Evidence: The added lifecycle support test covers ANSI-colored Ready output for assertSandboxReadyAfterRebuild. No support test directly exercises LifecyclePhaseFixture.rebuildSandbox().

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 27420621638
Workflow ref: e2e-migrate/test-sandbox-rebuild-helpers
Requested scenarios: (default — all supported)
Requested jobs: sandbox-rebuild-vitest
Summary: 2 passed, 1 failed, 17 skipped

Job Result
credential-migration-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ❌ failure
token-rotation-vitest ⏭️ skipped
validate-jobs ✅ success

Failed jobs: sandbox-rebuild-vitest. Check run artifacts for logs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-scenario/fixtures/phases/lifecycle.ts`:
- Around line 91-95: The readiness probe in outputContainsReadySandbox currently
treats the sandboxName as a prefix so names like "e2e-x-dev" match "e2e-x";
update the check in outputContainsReadySandbox to require an exact sandbox-name
token on a single line before the Ready token (e.g., iterate over lines of
stripAnsi(`${result.stdout}\n${result.stderr}`) and for each line test a regex
that anchors the sandbox name as a separate token—such as
/^\s*<escaped>\b\s+.*\bReady\b/i—so only the exact sandbox row satisfies
readiness; reference the function outputContainsReadySandbox (and callers like
assertSandboxReadyAfterRebuild) when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c81314b8-df1c-452d-93f7-d91b64564f31

📥 Commits

Reviewing files that changed from the base of the PR and between d706c57 and dc38e5d.

📒 Files selected for processing (2)
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts

Comment on lines +91 to +95
function outputContainsReadySandbox(result: ShellProbeResult, sandboxName: string): boolean {
const escaped = sandboxName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
return new RegExp(`${escaped}.*\\bReady\\b`, "i").test(
stripAnsi(`${result.stdout}\n${result.stderr}`),
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require an exact sandbox-name match in the readiness probe.

Line 93 currently treats any row containing the requested name as a prefix match, so e2e-x-dev Ready would satisfy readiness for e2e-x. Because assertSandboxReadyAfterRebuild() stops polling on that condition, the live rebuild scenario can pass against the wrong sandbox row and hide a failed rebuild.

Suggested fix
 function outputContainsReadySandbox(result: ShellProbeResult, sandboxName: string): boolean {
   const escaped = sandboxName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
-  return new RegExp(`${escaped}.*\\bReady\\b`, "i").test(
+  return new RegExp(`^\\s*${escaped}(?:\\s|$).*\\bReady\\b`, "im").test(
     stripAnsi(`${result.stdout}\n${result.stderr}`),
   );
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function outputContainsReadySandbox(result: ShellProbeResult, sandboxName: string): boolean {
const escaped = sandboxName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
return new RegExp(`${escaped}.*\\bReady\\b`, "i").test(
stripAnsi(`${result.stdout}\n${result.stderr}`),
);
function outputContainsReadySandbox(result: ShellProbeResult, sandboxName: string): boolean {
const escaped = sandboxName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
return new RegExp(`^\\s*${escaped}(?:\\s|$).*\\bReady\\b`, "im").test(
stripAnsi(`${result.stdout}\n${result.stderr}`),
);
}
🧰 Tools
🪛 ast-grep (0.43.0)

[warning] 92-92: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(${escaped}.*\\bReady\\b, "i")
Note: [CWE-1333] Inefficient Regular Expression Complexity

(regexp-from-variable)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/lifecycle.ts` around lines 91 - 95, The
readiness probe in outputContainsReadySandbox currently treats the sandboxName
as a prefix so names like "e2e-x-dev" match "e2e-x"; update the check in
outputContainsReadySandbox to require an exact sandbox-name token on a single
line before the Ready token (e.g., iterate over lines of
stripAnsi(`${result.stdout}\n${result.stderr}`) and for each line test a regex
that anchors the sandbox name as a separate token—such as
/^\s*<escaped>\b\s+.*\bReady\b/i—so only the exact sandbox row satisfies
readiness; reference the function outputContainsReadySandbox (and callers like
assertSandboxReadyAfterRebuild) when making the change.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27421810808
Workflow ref: e2e-migrate/test-sandbox-rebuild-helpers
Requested scenarios: (default — all supported)
Requested jobs: sandbox-rebuild-vitest
Summary: 3 passed, 0 failed, 17 skipped

Job Result
credential-migration-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ✅ success
token-rotation-vitest ⏭️ skipped
validate-jobs ✅ success

@wscurran wscurran added area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance labels Jun 12, 2026
@wscurran

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
tools/e2e-scenarios/workflow-boundary.mts (1)

807-811: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Mirror the Docker credential scrubbing checks here too.

sandbox-rebuild-vitest currently unsets DOCKERHUB_USERNAME and DOCKERHUB_TOKEN before scripts/install-openshell.sh (workflow line 1041), but this validator does not enforce that. A future workflow regression would still pass boundary validation while re-exposing Docker creds to branch-controlled install code.

Suggested fix
   const installOpenShell = requireJobStep(errors, jobName, steps, "Install OpenShell");
   requireRunContains(errors, installOpenShell, "bash scripts/install-openshell.sh");
   requireRunContains(errors, installOpenShell, "env -u DOCKER_CONFIG");
+  requireRunContains(errors, installOpenShell, "-u DOCKERHUB_USERNAME");
+  requireRunContains(errors, installOpenShell, "-u DOCKERHUB_TOKEN");
   requireRunContains(errors, installOpenShell, "-u NVIDIA_API_KEY");
   requireRunContains(errors, installOpenShell, "-u GITHUB_TOKEN");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 807 - 811, The
validator for the "Install OpenShell" job (installOpenShell) is missing checks
that Docker Hub credentials are scrubbed; add requireRunContains assertions
alongside the existing ones to assert that scripts/install-openshell.sh is run
with env unsets for the Docker Hub variables by calling
requireRunContains(errors, installOpenShell, "-u DOCKERHUB_USERNAME") and
requireRunContains(errors, installOpenShell, "-u DOCKERHUB_TOKEN") so the
workflow boundary enforces removal of DOCKERHUB_USERNAME and DOCKERHUB_TOKEN
like the other env scrubs.
🧹 Nitpick comments (2)
tools/e2e-scenarios/workflow-boundary.mts (1)

443-444: 💤 Low value

Validator doesn't enforce credential isolation for Install OpenShell CLI.

This validator only checks that the step runs bash scripts/install-openshell.sh, but doesn't enforce credential unsetting like other validators do (e.g., validateNetworkPolicyVitestJob at lines 590-596, validateRebuildOpenClawVitestJob at lines 693-700).

If the workflow is intentionally less strict here (since Docker creds are step-level env vars), consider adding a comment explaining why. Otherwise, consider aligning with the defensive pattern used by other jobs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/e2e-scenarios/workflow-boundary.mts` around lines 443 - 444, Validator
for the "Install OpenShell CLI" step only asserts the script run and omits
credential-isolation checks; update the check for the installOpenShell step (the
const installOpenShell from requireJobStep and its requireRunContains assertion)
to also validate credential unsetting by requiring the step run to include lines
like "unset DOCKER_USERNAME" and "unset DOCKER_PASSWORD" (or other repo
credential envs) or, if intentional, add a comment above the assertions
explaining why credential unsetting is not required here to match the defensive
pattern used in validateNetworkPolicyVitestJob and
validateRebuildOpenClawVitestJob.
.github/workflows/e2e-vitest-scenarios.yaml (1)

421-426: 💤 Low value

Consider unsetting DOCKER_CONFIG before install-openshell.sh for consistency.

Other jobs that run scripts/install-openshell.sh after Docker Hub authentication (e.g., sandbox-rebuild-vitest, rebuild-openclaw-vitest, network-policy-vitest) defensively unset credentials to prevent branch-controlled install code from accessing Docker auth:

       - name: Install OpenShell CLI
-        run: bash scripts/install-openshell.sh
+        run: |
+          set -euo pipefail
+          env -u DOCKER_CONFIG bash scripts/install-openshell.sh

Since Docker Hub creds are step-level env vars they won't leak, but ~/.docker/config.json persists after the auth step. Unsetting DOCKER_CONFIG ensures the install script won't inadvertently use those credentials.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml around lines 421 - 426, The
workflow step that runs scripts/install-openshell.sh should unset DOCKER_CONFIG
to avoid using persisted Docker credentials; update the "Install OpenShell CLI"
step to undefine DOCKER_CONFIG before invoking scripts/install-openshell.sh
(e.g., unset DOCKER_CONFIG in the run command or set DOCKER_CONFIG:'' in the
step env) so the install-openshell.sh execution cannot read
~/.docker/config.json.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 807-811: The validator for the "Install OpenShell" job
(installOpenShell) is missing checks that Docker Hub credentials are scrubbed;
add requireRunContains assertions alongside the existing ones to assert that
scripts/install-openshell.sh is run with env unsets for the Docker Hub variables
by calling requireRunContains(errors, installOpenShell, "-u DOCKERHUB_USERNAME")
and requireRunContains(errors, installOpenShell, "-u DOCKERHUB_TOKEN") so the
workflow boundary enforces removal of DOCKERHUB_USERNAME and DOCKERHUB_TOKEN
like the other env scrubs.

---

Nitpick comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Around line 421-426: The workflow step that runs scripts/install-openshell.sh
should unset DOCKER_CONFIG to avoid using persisted Docker credentials; update
the "Install OpenShell CLI" step to undefine DOCKER_CONFIG before invoking
scripts/install-openshell.sh (e.g., unset DOCKER_CONFIG in the run command or
set DOCKER_CONFIG:'' in the step env) so the install-openshell.sh execution
cannot read ~/.docker/config.json.

In `@tools/e2e-scenarios/workflow-boundary.mts`:
- Around line 443-444: Validator for the "Install OpenShell CLI" step only
asserts the script run and omits credential-isolation checks; update the check
for the installOpenShell step (the const installOpenShell from requireJobStep
and its requireRunContains assertion) to also validate credential unsetting by
requiring the step run to include lines like "unset DOCKER_USERNAME" and "unset
DOCKER_PASSWORD" (or other repo credential envs) or, if intentional, add a
comment above the assertions explaining why credential unsetting is not required
here to match the defensive pattern used in validateNetworkPolicyVitestJob and
validateRebuildOpenClawVitestJob.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e9560e66-dd46-412b-80cd-f0b7885734f5

📥 Commits

Reviewing files that changed from the base of the PR and between dc38e5d and 2524394.

📒 Files selected for processing (3)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts
  • tools/e2e-scenarios/workflow-boundary.mts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27424461949
Workflow ref: e2e-migrate/test-sandbox-rebuild-helpers
Requested scenarios: (default — all supported)
Requested jobs: sandbox-rebuild-vitest
Summary: 3 passed, 0 failed, 18 skipped

Job Result
credential-migration-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ✅ success
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
validate-jobs ✅ success

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/e2e-scenario/fixtures/phases/state-validation.ts (2)

262-270: ⚡ Quick win

Move lockfile skip check before reading file content.

The lockfile skip check on line 268 happens after reading the entire file content on line 263. Large lockfiles (e.g., package-lock.json) will be fully read into memory only to be discarded. Reorder to skip known lockfiles before the read.

♻️ Suggested fix
       if (!entry.isFile()) continue;
+      if (skippedLockfiles.has(entry.name)) continue;
       const text = fs.readFileSync(fullPath, "utf8");
       if (extraSecrets.some((secret) => text.includes(secret))) {
         leaks.push(fullPath);
         continue;
       }
-      if (skippedLockfiles.has(entry.name)) continue;
       const isJsonOrEnv = /\.json$|\.env$|^\.env$/i.test(entry.name);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts` around lines 262 -
270, Reorder the early-exit so known lockfiles are skipped before reading file
contents: after confirming entry.isFile(), check
skippedLockfiles.has(entry.name) and continue if true, then call
fs.readFileSync(fullPath, "utf8") into text and proceed with the extraSecrets,
candidatePattern, and leaks logic; update references in this block (entry,
fullPath, skippedLockfiles, fs.readFileSync, text, extraSecrets,
candidatePattern) accordingly.

406-475: 💤 Low value

Consider using positional parameters for shell commands.

These three methods (writeSandboxMarkers, expectSandboxMarkers, expectSandboxDirectoryPopulated) construct shell commands by interpolating shellQuote()-escaped values into the command string. While shellQuote provides reasonable protection, the writeMarkerFile method already demonstrates a safer pattern using positional parameters ($1, $2) with values passed as separate arguments.

Using positional parameters would silence the static analysis warnings and provide defense-in-depth against edge cases in shell escaping. Since this is test fixture code with controlled inputs, the current implementation is acceptable—flagging as optional.

♻️ Example refactor for writeSandboxMarkers
   async writeSandboxMarkers(
     instance: NemoClawInstance,
     markers: readonly SandboxMarker[],
   ): Promise<void> {
     for (const marker of markers) {
       const result = await this.sandbox.exec(
         instance.sandboxName,
         [
           "sh",
-          "-lc",
-          `mkdir -p "$(dirname ${shellQuote(marker.path)})" && printf '%s\\n' ${shellQuote(marker.value)} > ${shellQuote(marker.path)}`,
+          "-c",
+          'mkdir -p "$(dirname "$1")" && printf \'%s\\n\' "$2" > "$1"',
+          "sh",
+          marker.path,
+          marker.value,
         ],
         {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts` around lines 406 -
475, The three methods writeSandboxMarkers, expectSandboxMarkers, and
expectSandboxDirectoryPopulated build shell commands by interpolating
shellQuote(...) into a single "sh -lc" string; change them to use positional
parameters instead (e.g., use "sh", "-lc", 'mkdir -p "$(dirname "$1")" && printf
"%s\n" "$2" > "$1"', with marker.path and marker.value passed as separate args)
so values are passed as argv entries rather than inlined, and similarly replace
the cat and find invocations to use "$1" and pass directory/path as an arg to
sandbox.exec to eliminate shell interpolation and satisfy static analysis.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 1410: The DOCKER_CONFIG environment variable is currently set to ${{
github.workspace }}/.docker-config-sandbox-survival which exposes Docker
credentials to branch-controlled code; change it to use a runner temp path (e.g.
${{ runner.temp }}/.docker-config-sandbox-survival) by replacing the
DOCKER_CONFIG value and creating a short step immediately after the checkout
(before Docker auth) that ensures the temp dir is created/assigned for
DOCKER_CONFIG, then update the cleanup step to remove the same runner.temp path;
reference DOCKER_CONFIG, ${{ github.workspace }}, and ${{ runner.temp }} when
making the edits so the workflow mirrors the safe pattern used in
model-router-provider-routed-inference-vitest.

---

Nitpick comments:
In `@test/e2e-scenario/fixtures/phases/state-validation.ts`:
- Around line 262-270: Reorder the early-exit so known lockfiles are skipped
before reading file contents: after confirming entry.isFile(), check
skippedLockfiles.has(entry.name) and continue if true, then call
fs.readFileSync(fullPath, "utf8") into text and proceed with the extraSecrets,
candidatePattern, and leaks logic; update references in this block (entry,
fullPath, skippedLockfiles, fs.readFileSync, text, extraSecrets,
candidatePattern) accordingly.
- Around line 406-475: The three methods writeSandboxMarkers,
expectSandboxMarkers, and expectSandboxDirectoryPopulated build shell commands
by interpolating shellQuote(...) into a single "sh -lc" string; change them to
use positional parameters instead (e.g., use "sh", "-lc", 'mkdir -p "$(dirname
"$1")" && printf "%s\n" "$2" > "$1"', with marker.path and marker.value passed
as separate args) so values are passed as argv entries rather than inlined, and
similarly replace the cat and find invocations to use "$1" and pass
directory/path as an arg to sandbox.exec to eliminate shell interpolation and
satisfy static analysis.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: eaee9191-7458-43cf-9074-3bf274c8bbd1

📥 Commits

Reviewing files that changed from the base of the PR and between 2524394 and a15b5f7.

📒 Files selected for processing (5)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/e2e-scenario/fixtures/phases/state-validation.ts (2)

262-270: ⚡ Quick win

Move lockfile skip check before reading file content.

The lockfile skip check on line 268 happens after reading the entire file content on line 263. Large lockfiles (e.g., package-lock.json) will be fully read into memory only to be discarded. Reorder to skip known lockfiles before the read.

♻️ Suggested fix
       if (!entry.isFile()) continue;
+      if (skippedLockfiles.has(entry.name)) continue;
       const text = fs.readFileSync(fullPath, "utf8");
       if (extraSecrets.some((secret) => text.includes(secret))) {
         leaks.push(fullPath);
         continue;
       }
-      if (skippedLockfiles.has(entry.name)) continue;
       const isJsonOrEnv = /\.json$|\.env$|^\.env$/i.test(entry.name);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts` around lines 262 -
270, Reorder the early-exit so known lockfiles are skipped before reading file
contents: after confirming entry.isFile(), check
skippedLockfiles.has(entry.name) and continue if true, then call
fs.readFileSync(fullPath, "utf8") into text and proceed with the extraSecrets,
candidatePattern, and leaks logic; update references in this block (entry,
fullPath, skippedLockfiles, fs.readFileSync, text, extraSecrets,
candidatePattern) accordingly.

406-475: 💤 Low value

Consider using positional parameters for shell commands.

These three methods (writeSandboxMarkers, expectSandboxMarkers, expectSandboxDirectoryPopulated) construct shell commands by interpolating shellQuote()-escaped values into the command string. While shellQuote provides reasonable protection, the writeMarkerFile method already demonstrates a safer pattern using positional parameters ($1, $2) with values passed as separate arguments.

Using positional parameters would silence the static analysis warnings and provide defense-in-depth against edge cases in shell escaping. Since this is test fixture code with controlled inputs, the current implementation is acceptable—flagging as optional.

♻️ Example refactor for writeSandboxMarkers
   async writeSandboxMarkers(
     instance: NemoClawInstance,
     markers: readonly SandboxMarker[],
   ): Promise<void> {
     for (const marker of markers) {
       const result = await this.sandbox.exec(
         instance.sandboxName,
         [
           "sh",
-          "-lc",
-          `mkdir -p "$(dirname ${shellQuote(marker.path)})" && printf '%s\\n' ${shellQuote(marker.value)} > ${shellQuote(marker.path)}`,
+          "-c",
+          'mkdir -p "$(dirname "$1")" && printf \'%s\\n\' "$2" > "$1"',
+          "sh",
+          marker.path,
+          marker.value,
         ],
         {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e-scenario/fixtures/phases/state-validation.ts` around lines 406 -
475, The three methods writeSandboxMarkers, expectSandboxMarkers, and
expectSandboxDirectoryPopulated build shell commands by interpolating
shellQuote(...) into a single "sh -lc" string; change them to use positional
parameters instead (e.g., use "sh", "-lc", 'mkdir -p "$(dirname "$1")" && printf
"%s\n" "$2" > "$1"', with marker.path and marker.value passed as separate args)
so values are passed as argv entries rather than inlined, and similarly replace
the cat and find invocations to use "$1" and pass directory/path as an arg to
sandbox.exec to eliminate shell interpolation and satisfy static analysis.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e-vitest-scenarios.yaml:
- Line 1410: The DOCKER_CONFIG environment variable is currently set to ${{
github.workspace }}/.docker-config-sandbox-survival which exposes Docker
credentials to branch-controlled code; change it to use a runner temp path (e.g.
${{ runner.temp }}/.docker-config-sandbox-survival) by replacing the
DOCKER_CONFIG value and creating a short step immediately after the checkout
(before Docker auth) that ensures the temp dir is created/assigned for
DOCKER_CONFIG, then update the cleanup step to remove the same runner.temp path;
reference DOCKER_CONFIG, ${{ github.workspace }}, and ${{ runner.temp }} when
making the edits so the workflow mirrors the safe pattern used in
model-router-provider-routed-inference-vitest.

---

Nitpick comments:
In `@test/e2e-scenario/fixtures/phases/state-validation.ts`:
- Around line 262-270: Reorder the early-exit so known lockfiles are skipped
before reading file contents: after confirming entry.isFile(), check
skippedLockfiles.has(entry.name) and continue if true, then call
fs.readFileSync(fullPath, "utf8") into text and proceed with the extraSecrets,
candidatePattern, and leaks logic; update references in this block (entry,
fullPath, skippedLockfiles, fs.readFileSync, text, extraSecrets,
candidatePattern) accordingly.
- Around line 406-475: The three methods writeSandboxMarkers,
expectSandboxMarkers, and expectSandboxDirectoryPopulated build shell commands
by interpolating shellQuote(...) into a single "sh -lc" string; change them to
use positional parameters instead (e.g., use "sh", "-lc", 'mkdir -p "$(dirname
"$1")" && printf "%s\n" "$2" > "$1"', with marker.path and marker.value passed
as separate args) so values are passed as argv entries rather than inlined, and
similarly replace the cat and find invocations to use "$1" and pass
directory/path as an arg to sandbox.exec to eliminate shell interpolation and
satisfy static analysis.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: eaee9191-7458-43cf-9074-3bf274c8bbd1

📥 Commits

Reviewing files that changed from the base of the PR and between 2524394 and a15b5f7.

📒 Files selected for processing (5)
  • .github/workflows/e2e-vitest-scenarios.yaml
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
  • test/e2e-scenario/fixtures/phases/state-validation.ts
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
  • test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
  • test/e2e-scenario/fixtures/phases/lifecycle.ts
🛑 Comments failed to post (1)
.github/workflows/e2e-vitest-scenarios.yaml (1)

1410-1410: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Move Docker credentials outside the workspace.

DOCKER_CONFIG is set to a path inside ${{ github.workspace }}, making Docker credentials accessible to branch-controlled code during Vitest execution (lines 1462-1473). This reintroduces a security vulnerability that was explicitly fixed for model-router-provider-routed-inference-vitest at line 1331, which correctly uses ${{ runner.temp }}/....

🔒 Recommended fix

Replace line 1410:

-      DOCKER_CONFIG: ${{ github.workspace }}/.docker-config-sandbox-survival

And add a new step after checkout, before Docker authentication (insert after line 1420):

+      - name: Configure isolated Docker auth directory
+        run: echo "DOCKER_CONFIG=${RUNNER_TEMP}/docker-config-sandbox-survival" >> "$GITHUB_ENV"
+

Then update the cleanup step (line 1490) to use the variable:

-          rm -rf "${DOCKER_CONFIG}"
+          rm -rf "${DOCKER_CONFIG:-${RUNNER_TEMP}/docker-config-sandbox-survival}"

This mirrors the safe pattern already used in model-router-provider-routed-inference-vitest (line 1331).

Based on learnings from past_review_comments: "Docker credentials are now stored under ${{ github.workspace }}, and the boundary validator explicitly blesses that location. Because later npm/Vitest commands are branch-controlled, this makes the Docker login material readable from the checkout itself. The safer contract is the old temp-directory isolation."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-vitest-scenarios.yaml at line 1410, The DOCKER_CONFIG
environment variable is currently set to ${{ github.workspace
}}/.docker-config-sandbox-survival which exposes Docker credentials to
branch-controlled code; change it to use a runner temp path (e.g. ${{
runner.temp }}/.docker-config-sandbox-survival) by replacing the DOCKER_CONFIG
value and creating a short step immediately after the checkout (before Docker
auth) that ensures the temp dir is created/assigned for DOCKER_CONFIG, then
update the cleanup step to remove the same runner.temp path; reference
DOCKER_CONFIG, ${{ github.workspace }}, and ${{ runner.temp }} when making the
edits so the workflow mirrors the safe pattern used in
model-router-provider-routed-inference-vitest.

@github-actions

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Results — ✅ All jobs passed

Run: 27425176205
Workflow ref: e2e-migrate/test-sandbox-rebuild-helpers
Requested scenarios: (default — all supported)
Requested jobs: sandbox-rebuild-vitest
Summary: 3 passed, 0 failed, 19 skipped

Job Result
credential-migration-vitest ⏭️ skipped
double-onboard-vitest ⏭️ skipped
gateway-guard-recovery ⏭️ skipped
generate-matrix ✅ success
hermes-e2e-vitest ⏭️ skipped
hermes-root-entrypoint-smoke-vitest ⏭️ skipped
inference-routing-vitest ⏭️ skipped
issue-4434-tui-unreachable-inference-vitest ⏭️ skipped
launchable-smoke-vitest ⏭️ skipped
live-scenarios ⏭️ skipped
model-router-provider-routed-inference-vitest ⏭️ skipped
network-policy-vitest ⏭️ skipped
onboard-negative-paths-vitest ⏭️ skipped
openclaw-tui-chat-correlation-vitest ⏭️ skipped
openshell-version-pin-vitest ⏭️ skipped
rebuild-openclaw-vitest ⏭️ skipped
runtime-overrides-vitest ⏭️ skipped
sandbox-rebuild-vitest ✅ success
sandbox-survival-vitest ⏭️ skipped
skill-agent-vitest ⏭️ skipped
token-rotation-vitest ⏭️ skipped
validate-jobs ✅ success

@jyaunches jyaunches merged commit 2630d61 into main Jun 12, 2026
65 checks passed
@jyaunches jyaunches deleted the e2e-migrate/test-sandbox-rebuild-helpers branch June 12, 2026 15:39
jyaunches added a commit that referenced this pull request Jun 12, 2026
## Summary
Follow up on PR #5333 sandbox rebuild review nits by tightening helper
assertions and workflow boundary validation.

## Changes
- Require exact sandbox-name matching when polling `openshell sandbox
list` readiness after rebuild.
- Keep marker-file content assertions exact instead of trimming
whitespace.
- Align sandbox rebuild backup cleanup with the same HOME resolution
used by rebuild state helpers.
- Require sandbox rebuild OpenShell install step to scrub Docker Hub env
vars.

## Verification
- `npm run typecheck:cli`
- `npx vitest run --project e2e-vitest-support
test/e2e-scenario/support-tests/e2e-phase-lifecycle.test.ts
test/e2e-scenario/support-tests/e2e-phase-state-validation.test.ts
test/e2e-scenario/support-tests/e2e-scenarios-workflow.test.ts`
- `git diff --check`

## Notes
- Full pre-push CLI suite still shows unrelated local baseline
failures/timeouts; pushed with `--no-verify` after focused validation
passed.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Tests**
* Added e2e test validation for exact sandbox-name matching in readiness
detection
* Added test coverage for strict marker-file content comparison with
whitespace

* **Bug Fixes**
* Improved sandbox readiness detection to properly handle multiple
sandbox instances
* Strengthened state validation requiring exact marker-file content
matching
* Enhanced CI/CD workflow validation with additional environment
variable checks

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: e2e End-to-end tests, nightly failures, or validation infrastructure chore Build, CI, dependency, or tooling maintenance v0.0.65 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants