Adopt phase fixtures + registry-driven test discovery for Vitest E2E scenarios

> Companion design proposal to #4941. **Builds on, does not replace, the Vitest fixture decision.**

## Architecture at a glance

```
┌───────────────────────────────────────────────────────────────────────────────┐
│  TYPED REGISTRY  —  scenarios/scenarios/baseline.ts (matrix data)              │
│                                                                                 │
│   { id: "ubuntu-repo-cloud-openclaw-slack",                                    │
│     environment: ubuntuRepoDocker("cloud-nvidia-openclaw-slack"),              │
│     suiteIds: ["smoke", "inference", "messaging-slack", "credentials"],        │
│     ... }                                                                       │
│                                                                                 │
│   matrix axes per scenario:                                                     │
│      ┌──────────┬──────────┬──────────┬──────────┬──────────┬──────────┐      │
│      │ platform │ install  │ runtime  │onboarding│lifecycle │  suites  │      │
│      ├──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤      │
│      │ ubuntu   │ repo-    │ docker-  │ cloud-   │ rebuild- │ smoke    │      │
│      │ -local   │ current  │ running  │ openclaw │ current- │ inference│      │
│      │          │          │          │ -slack   │ version  │ messaging│      │
│      └──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘      │
│           │          │          │          │          │          │             │
│           ▼          ▼          ▼          ▼          ▼          ▼             │
│        ╔════════════════════════╗   ╔══════════════════════════════╗          │
│        ║  GHA / WORKFLOW STEPS  ║   ║  VITEST + PHASE FIXTURES     ║          │
│        ║  precondition layer    ║   ║  application-logic layer     ║          │
│        ╚════════════════════════╝   ╚══════════════════════════════╝          │
└───────────────────────────────────────────────────────────────────────────────┘

   scenarios/run.ts --emit-matrix
   produces one matrix entry per
   wired scenario, with platform/
   install/runtime carried as
   matrix.* fields the workflow
   reads at job-level.
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│  GHA MATRIX FAN-OUT  (precondition layer)                 │
│                                                           │
│   ✓ runs-on label per scenario      (axis: platform)     │
│   ✓ N parallel jobs                  (free parallelism)  │
│   ✓ secret allowlist per scenario    (requiredSecrets)   │
│   ✓ fail-fast: false                 (negative scenarios) │
│   ✓ matrix.exclude                   (cheap drops)       │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│  PER-JOB SETUP STEPS  (precondition layer, ctd.)          │
│                                                           │
│   ✓ checkout + setup-node + npm ci                       │
│   ✓ install step  (matrix.install:                       │
│                       repo-current → npm run build:cli   │
│                       launchable   → installer)          │
│   ✓ runtime prep  (matrix.runtime:                       │
│                       docker-running → noop              │
│                       docker-missing → install shim      │
│                       gpu-docker-cdi → already on image) │
│   ✓ wsl-bootstrap  (composite action, windows-latest)    │
│   ✓ brev-provision (composite action, ubuntu-latest)     │
│                                                           │
│   By the time Vitest starts, the host satisfies the      │
│   scenario's environment precondition.                    │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼ npx vitest run -t "^${matrix.id}$"
┌─────────────────────────────────────────────────────────┐
│  VITEST + PHASE FIXTURES  (application-logic layer)       │
│                                                           │
│   environment.assertReady(scenario.environment)          │
│     verifies the precondition steps left us in the       │
│     state the scenario declared (CLI on PATH, docker     │
│     state matches, etc.). Asserts, doesn't install.      │
│                                                           │
│   onboard.from(scenario.environment, env) →               │
│     cloudOpenclaw | cloudHermes | cloudOpenclawSlack |   │
│     cloudOpenclawDiscord | cloudOpenclawTelegram |       │
│     localOllamaOpenclaw | …  (one method per dispatcher) │
│                                                           │
│   stateValidation.from(scenario.expectedStateId, instance)│
│     gatewayHealthy + sandboxRunning   (positive)         │
│     gatewayAbsent + sandboxAbsent     (negative)         │
│                                                           │
│   lifecycle.from(scenario.environment.lifecycle, instance)│
│     rebuildCurrentVersion | snapshotCreateRestore | …    │
│                                                           │
│   runSuite(suiteId, instance) →                           │
│     smoke | inference | credentials | security-* |       │
│     sandbox-lifecycle | snapshot | docs-validation | …   │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼ uses
┌─────────────────────────────────────────────────────────┐
│  CLI WRAPPERS  —  framework/clients/  (LANDING in #4966)  │
│   host  ·  gateway  ·  sandbox  ·  provider  ·  state    │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼ uses
┌─────────────────────────────────────────────────────────┐
│  PRIMITIVES  —  framework/  (LANDED in #4965)             │
│   artifacts  ·  secrets  ·  cleanup  ·  shellProbe       │
│   redaction is canonical (parity-tested with             │
│   src/lib/security/secret-patterns.ts)                    │
└─────────────────────────────────────────────────────────┘
                    ▲
                    │ runs everything above
┌─────────────────────────────────────────────────────────┐
│  VITEST  —  the runner  (per #4941)                       │
└─────────────────────────────────────────────────────────┘
```

The seam: **GHA carries everything that's a precondition for `nemoclaw` being callable; Vitest carries everything from `nemoclaw onboard` onward.**

The matrix axes survive the migration to Vitest. The simplicity #4941 argued for is preserved (Vitest is the runner, fixtures are the API surface, no custom runner). What this proposal adds is one more layer of fixture composition that keeps the existing matrix dispatch coherent — and uses GHA matrix natively for the part it's good at.

---

## Problem Statement

#4941 decided that Vitest is the E2E scenario execution runner and that NemoClaw provides typed fixtures, clients, assertions, and migration inventory. That decision is settled and the right call: Vitest owns lifecycle, fixture composition, reporters, timeouts, and CI integration; NemoClaw owns the domain.

What #4941 did not nail down is **where the scenario matrix lives** under the new model. The current foundation stack (#4965 → #4969) lands the right runner and the right primitives, but the first live scenario is a **hand-authored single test file** (`live/ubuntu-repo-cli-smoke.test.ts`) that:

- hardcodes the platform via `process.execPath`,
- assumes the install state implicitly (repo is cloned, dist is built),
- has no notion of runtime axis (`docker-running` / `docker-missing` / `gpu-docker-cdi` / `macos-docker-optional`),
- onboards nothing,
- and exists outside `scenarios/registry.ts`.

If subsequent live scenarios follow the same template, every scenario becomes a hand-written test file. We lose the combinatorial matrix the typed-shell-runner explicitly preserves today via `scenarios/scenarios/baseline.ts` × `scenarios/matrix.ts` helpers.

The matrix is not aesthetic. It is the reason "does cloud-openclaw onboarding work on WSL?" is **a one-line constructor change** today (`wslRepoDocker("cloud-openclaw")` instead of `ubuntuRepoDocker("cloud-openclaw")`) rather than a fork-and-edit of an entire test file. It is also what makes `--emit-matrix` and the dynamic GHA fan-out (#4359) coherent — one row per registry entry.

This issue proposes that the Vitest scenario layer keep the same matrix vocabulary, with three layers carrying it: GHA matrix for platform fan-out and per-job preconditions, workflow setup steps for install/runtime state, and Vitest phase fixtures for onboarding-and-after.

cc @cv @jyaunches

## Background — what the typed-shell-runner already gets right

The current `scenarios/` tree decomposes every scenario into 6 axes:

| Axis | Type | Examples | Lives in |
|------|------|----------|----------|
| platform | enum | `ubuntu-local`, `wsl-local`, `macos-local`, `gpu-runner`, `brev-launchable` | `ScenarioEnvironment.platform` |
| install | enum | `repo-current`, `launchable` | `ScenarioEnvironment.install` |
| runtime | enum | `docker-running`, `docker-missing`, `macos-docker-optional`, `gpu-docker-cdi` | `ScenarioEnvironment.runtime` |
| onboarding | string id | `cloud-openclaw`, `cloud-hermes`, `cloud-nvidia-openclaw-slack`, `local-ollama-openclaw` | `ScenarioEnvironment.onboarding` |
| lifecycle | string id (optional) | `rebuild-current-version`, `snapshot`, `upgrade` | `ScenarioEnvironment.lifecycle` |
| runtime-suites | string array | `[smoke, inference, credentials, security, lifecycle, ...]` | `ScenarioDefinition.suiteIds` |

These axes compose via `scenarios/matrix.ts`:

```ts
ubuntuRepoDocker("cloud-nvidia-openclaw-slack")    // axes 1+2+3+4
wslRepoDocker("cloud-openclaw")                    // same axis 4, different 1
ubuntuRepoNoDocker("cloud-openclaw")               // axis 3 = docker-missing
                                                   //   compiler rewrites axis 4
                                                   //   → cloud-openclaw-no-docker
ubuntuRepoDockerLifecycle("cloud-openclaw",        // + axis 5
                          "rebuild-current-version")
```
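
As a rough illustration of why the platform swap is a one-line change, the composers can be read as partial applications over the first three axes. This is a minimal sketch, not the contents of `scenarios/matrix.ts`: the `composer` helper, the trimmed `ScenarioEnvironment` shape, and `effectiveOnboardingId` are hypothetical, and the `-no-docker` suffix rule is generalized from the single `cloud-openclaw-no-docker` rewrite shown above.

```typescript
// Hypothetical, trimmed shapes; the real ones live in scenarios/types.ts.
type Platform = "ubuntu-local" | "wsl-local" | "macos-local" | "gpu-runner" | "brev-launchable";
type Install = "repo-current" | "launchable";
type Runtime = "docker-running" | "docker-missing" | "macos-docker-optional" | "gpu-docker-cdi";

interface ScenarioEnvironment {
  readonly platform: Platform;
  readonly install: Install;
  readonly runtime: Runtime;
  readonly onboarding: string;
  readonly lifecycle?: string;
}

// Fixes axes 1-3 and leaves onboarding (axis 4) as the only argument.
function composer(platform: Platform, install: Install, runtime: Runtime) {
  return (onboarding: string): ScenarioEnvironment => ({ platform, install, runtime, onboarding });
}

const ubuntuRepoDocker = composer("ubuntu-local", "repo-current", "docker-running");
const wslRepoDocker = composer("wsl-local", "repo-current", "docker-running");
const ubuntuRepoNoDocker = composer("ubuntu-local", "repo-current", "docker-missing");

// Adds axis 5 on top of an ubuntu / repo / docker environment.
function ubuntuRepoDockerLifecycle(onboarding: string, lifecycle: string): ScenarioEnvironment {
  return { ...ubuntuRepoDocker(onboarding), lifecycle };
}

// Assumed form of the docker-missing rewrite: axis 3 changes the effective axis 4 id.
function effectiveOnboardingId(env: ScenarioEnvironment): string {
  return env.runtime === "docker-missing" ? `${env.onboarding}-no-docker` : env.onboarding;
}
```

Under these assumptions, `wslRepoDocker("cloud-openclaw")` and `ubuntuRepoDocker("cloud-openclaw")` differ only in `platform`, which is the property the registry relies on.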

And the phase orchestrator runs them in fixed order:

```
environment → onboarding → state-validation → lifecycle → runtime
```
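
The fixed order can be expressed as data plus a filter for the optional phases. The sketch below is illustrative only: `planPhases` and `ScenarioLike` are hypothetical names, and skipping the runtime phase when `suiteIds` is empty is an assumption rather than documented orchestrator behavior.

```typescript
// Phase order as stated above; lifecycle is the only axis declared optional.
const PHASE_ORDER = ["environment", "onboarding", "state-validation", "lifecycle", "runtime"] as const;
type Phase = (typeof PHASE_ORDER)[number];

// Hypothetical minimal view of a registry entry.
interface ScenarioLike {
  readonly environment: { readonly lifecycle?: string };
  readonly suiteIds: readonly string[];
}

// Returns the phases a scenario would run, always in PHASE_ORDER.
function planPhases(scenario: ScenarioLike): Phase[] {
  return PHASE_ORDER.filter((phase) => {
    if (phase === "lifecycle") return scenario.environment.lifecycle !== undefined;
    if (phase === "runtime") return scenario.suiteIds.length > 0; // assumption
    return true;
  });
}
```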

The bash side honors the same axes via `nemoclaw_scenarios/{install,onboard,lifecycle,probes}/dispatch.sh` routers — one bash worker per id per axis, dispatched by the typed runner.

This decomposition is real architecture, not metadata. It's what makes the matrix dispatch in `e2e-scenarios-all.yaml` work.

## Proposed Design

### Where each axis lives

| Axis | Carrier | Why this layer |
|------|---------|----------------|
| **platform** | GHA `matrix.runs-on` (resolved by `scenarios/runner-routing.ts`) | Native to GHA. `ubuntu-local` → `ubuntu-latest`, `gpu-runner` → self-hosted, `macos-local` → `macos-26`, `wsl-local` → `windows-latest` (with a WSL bootstrap composite action), `brev-launchable` → `ubuntu-latest` (with a Brev provisioning composite action / fixture). Adding a platform value is a registry edit + a routing-table edit. |
| **install** | Workflow setup step (matrix-gated) | `if: matrix.install == 'repo-current'` runs `npm ci && npm run build:cli && npm link`. `if: matrix.install == 'launchable'` runs the installer. By the time Vitest starts, `nemoclaw` is on PATH. The phase fixture only asserts readiness; it doesn't install. |
| **runtime** | Workflow setup step (matrix-gated) for state mutations the runner image doesn't already provide | `docker-running` is `ubuntu-latest` default — noop. `docker-missing` requires a shim setup step (existing `nemoclaw_scenarios/onboard/cloud-openclaw-no-docker.sh` does this; promotes to a composite action). `gpu-docker-cdi` is already on the GPU runner image — noop. `macos-docker-optional` is `macos-26`'s default — noop. |
| **onboarding** | Vitest phase fixture (`framework/phases/onboard.ts`) | Calling `nemoclaw onboard --provider nvidia --agent openclaw --channel slack` is application logic. There is no GHA primitive for "run this command and parse the output." |
| **lifecycle** | Vitest phase fixture (`framework/phases/lifecycle.ts`) | State mutations on the running system (rebuild, snapshot, upgrade). Sequential, stateful, single-process. |
| **runtime-suites** | Vitest phase fixture (`framework/phases/runtime.ts:runSuite`) | Assertion bodies. Run sequentially within one Vitest test so they share onboarding state. |
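
To make the platform row concrete, here is a sketch of how a registry entry could be flattened into the `matrix.*` fields the workflow reads. The platform-to-runner pairs come from the table above; `toMatrixEntry`, `MatrixEntry`, and the trimmed `Scenario` shape are hypothetical and do not claim to match `scenarios/runner-routing.ts` or the real `--emit-matrix` payload.

```typescript
type Platform = "ubuntu-local" | "wsl-local" | "macos-local" | "gpu-runner" | "brev-launchable";

// Routing table as described in the "platform" row.
const RUNNER_BY_PLATFORM: Record<Platform, string> = {
  "ubuntu-local": "ubuntu-latest",
  "wsl-local": "windows-latest", // plus the WSL bootstrap composite action
  "macos-local": "macos-26",
  "gpu-runner": "self-hosted",
  "brev-launchable": "ubuntu-latest", // plus the Brev provisioning step
};

// Hypothetical minimal registry entry.
interface Scenario {
  readonly id: string;
  readonly environment: { readonly platform: Platform; readonly install: string; readonly runtime: string };
}

// One flat object per scenario, suitable for `strategy.matrix.include`.
interface MatrixEntry {
  readonly id: string;
  readonly runner: string;
  readonly platform: Platform;
  readonly install: string;
  readonly runtime: string;
}

function toMatrixEntry(scenario: Scenario): MatrixEntry {
  const { platform, install, runtime } = scenario.environment;
  return { id: scenario.id, runner: RUNNER_BY_PLATFORM[platform], platform, install, runtime };
}
```

Because `RUNNER_BY_PLATFORM` is typed as `Record<Platform, string>`, adding a platform value without a routing entry would fail type-checking, which matches the "registry edit + routing-table edit" rule.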

### What stays the same as #4941

- Vitest is the runner. No custom NemoClaw runner.
- `framework/` owns the domain layer.
- Existing primitives (#4965) and CLI wrappers (#4966) keep their shape.
- Migration inventory (#4969) keeps the deletion-readiness contract for legacy `test/e2e/test-*.sh`.
- Bash retained only at true system-boundary probes.

### What this proposal adds

- A **`framework/phases/` directory** holding phase fixtures: `environment.ts` (assertion-only), `onboard.ts`, `state-validation.ts`, `lifecycle.ts`, `runtime.ts`.
- Each phase fixture exposes both:
  - a `from(scenarioPart, ...prereqs)` method that resolves a registry id to the right call (registry-driven path),
  - and named methods for explicit one-off scenarios (e.g. `onboard.cloudOpenclawSlack({...})`).
- A **single registry-driven scenario file** (`live/scenarios.test.ts`) that iterates `listScenarios()` and produces one Vitest test per registry entry. Hand-authored `live/<name>.test.ts` files remain valid for one-off cases that don't fit the matrix.
- A **GHA matrix workflow** (`e2e-vitest-scenarios.yaml`, evolving from #4968) that consumes `--emit-matrix`, fans out one job per scenario id, sets up install/runtime preconditions per matrix axis, and invokes Vitest with a scenario-id filter.
- The `runtime-support` filter (the #4978 follow-up to #4380) extends to gate the Vitest matrix the same way: scenarios whose phase fixtures aren't wired yet are filtered out with structured reasons rather than failing silently.
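
The "structured reasons" gate in the last bullet could look roughly like the following. This is a sketch under assumptions: `SUPPORTED_ONBOARDING_IDS` is named later in this proposal, but the lifecycle and suite sets, the reason strings, and the trimmed `ScenarioLike` shape are hypothetical and not taken from `scenarios/runtime-support.ts`.

```typescript
// Hypothetical minimal view of a registry entry.
interface ScenarioLike {
  readonly id: string;
  readonly environment: { readonly onboarding: string; readonly lifecycle?: string };
  readonly suiteIds: readonly string[];
}

interface WiredResult {
  readonly ok: boolean;
  readonly reasons: string[];
}

// Illustrative contents only; each family migration would extend these sets.
const SUPPORTED_ONBOARDING_IDS = new Set<string>(["cloud-openclaw"]);
const SUPPORTED_LIFECYCLE_IDS = new Set<string>(); // hypothetical
const SUPPORTED_SUITE_IDS = new Set<string>(["smoke", "inference"]); // hypothetical

// Collects every unwired part so a skipped scenario explains itself.
function isScenarioFullyWired(scenario: ScenarioLike): WiredResult {
  const reasons: string[] = [];
  const { onboarding, lifecycle } = scenario.environment;
  if (!SUPPORTED_ONBOARDING_IDS.has(onboarding)) {
    reasons.push(`onboarding '${onboarding}' has no phase fixture method`);
  }
  if (lifecycle !== undefined && !SUPPORTED_LIFECYCLE_IDS.has(lifecycle)) {
    reasons.push(`lifecycle '${lifecycle}' has no phase fixture method`);
  }
  for (const suiteId of scenario.suiteIds) {
    if (!SUPPORTED_SUITE_IDS.has(suiteId)) reasons.push(`suite '${suiteId}' is not migrated`);
  }
  return { ok: reasons.length === 0, reasons };
}
```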

### What naturally retires

- `scenarios/orchestrators/{phase,runner,context,negative-matcher}.ts` (~750 LOC of typed-shell phase orchestration) — once every scenario runs through Vitest phase fixtures, the parallel orchestrator becomes dead code.
- `scenarios/clients/*` stubs (80 LOC) — already replaced by `framework/clients/*` in #4966; should be deleted in that PR per the precedent.
- `nemoclaw_scenarios/{install,onboard,lifecycle,probes}/*.sh` workers (~1,500 LOC) — install + runtime prep promote to composite GHA actions; onboarding/lifecycle workers move into phase fixtures one id at a time, files retire per inventory.
- `validation_suites/**/*.sh` (~3,000 LOC of bash assertions) — logic migrates into runtime-suite fixtures one suite at a time, files retire per inventory.
- `scenarios/run.ts` (the typed-shell entry point) — `--emit-matrix` keeps being the matrix builder; the live-execution path retires once Vitest is the only runner.

What stays as **typed test data** (per #4941 explicit):
- `scenarios/types.ts` (vocabulary)
- `scenarios/builder.ts` (construction)
- `scenarios/registry.ts` + `scenarios/scenarios/baseline.ts` (the matrix data)
- `scenarios/matrix.ts` (composer helpers)
- `scenarios/runner-routing.ts` (platform → GHA runner)
- `scenarios/runtime-support.ts` (wired-fan-out filter)
- `scenarios/run.ts:--emit-matrix` (matrix payload builder)

### Concrete fixture sketch

```ts
// test/e2e-scenario/framework/phases/environment.ts
//
// Assertion-only. The actual install + runtime prep happen as workflow
// setup steps before Vitest starts. This fixture verifies the host is
// in the state the scenario declared.

import type { ScenarioEnvironment } from "../../scenarios/types.ts";
import type { HostCliClient } from "../clients/index.ts";

export interface EnvironmentReady {
  readonly platform: ScenarioEnvironment["platform"];
  readonly install: ScenarioEnvironment["install"];
  readonly runtime: ScenarioEnvironment["runtime"];
  readonly cliPath: string;
}

export interface EnvironmentFixture {
  /** Asserts CLI is on PATH and runtime state matches scenario.environment. */
  assertReady(env: ScenarioEnvironment): Promise<EnvironmentReady>;
}
```
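
One way to keep `assertReady` assertion-only is to separate probing from judging: the fixture gathers host facts, and a pure function lists mismatches. The sketch below assumes two hypothetical probe results (`cliPath`, `dockerUsable`) and assumes that `gpu-docker-cdi` implies a usable docker; neither is specified by this proposal.

```typescript
type Runtime = "docker-running" | "docker-missing" | "macos-docker-optional" | "gpu-docker-cdi";

// Hypothetical facts the fixture would gather via the host CLI client.
interface HostProbes {
  readonly cliPath: string | undefined; // resolved path of `nemoclaw`, if found
  readonly dockerUsable: boolean; // whether a working docker responds
}

// Pure check: returns every mismatch between declared and observed state.
function environmentMismatches(runtime: Runtime, probes: HostProbes): string[] {
  const problems: string[] = [];
  if (!probes.cliPath) problems.push("nemoclaw is not on PATH");
  switch (runtime) {
    case "docker-running":
    case "gpu-docker-cdi": // assumption: CDI runners also expose a usable docker
      if (!probes.dockerUsable) problems.push(`runtime '${runtime}' expects a usable docker`);
      break;
    case "docker-missing":
      if (probes.dockerUsable) problems.push("runtime 'docker-missing' expects docker to be unusable");
      break;
    case "macos-docker-optional":
      break; // either state is acceptable
  }
  return problems;
}
```

An `assertReady` implementation could then throw when the returned list is non-empty, so a failed precondition names the workflow step that needs fixing instead of surfacing later as an onboarding error.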

```ts
// test/e2e-scenario/framework/phases/onboard.ts

import type { ScenarioEnvironment } from "../../scenarios/types.ts";
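import type { EnvironmentReady } from "./environment.ts";

// Placeholders so the sketch type-checks: ErrorClass and NegativeOutcome are
// not defined elsewhere in this proposal, and their real shape is still open.
type ErrorClass = string;
interface NegativeOutcome {
  readonly errorClass: ErrorClass;
}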

export interface OpenClawInstance {
  readonly sandboxName: string;
  readonly gatewayUrl: string;
  readonly agent: "openclaw" | "hermes";
  readonly provider: "nvidia" | "ollama-local" | "openai-compatible";
  readonly channels: ReadonlyArray<"slack" | "discord" | "telegram" | "brave">;
}

export interface OnboardFixture {
  /**
   * Registry-driven entry point. Routes by the scenario's onboarding id
   * (with the docker-missing rewrite the existing compiler.ts already
   * does) to the right named method below.
   */
  from(env: ScenarioEnvironment, hostState: EnvironmentReady): Promise<OpenClawInstance>;

  // Named methods — same as bash dispatcher cases.
  cloudOpenclaw(opts?: { model?: string }): Promise<OpenClawInstance>;
  cloudOpenclawNoDocker(opts: { expectError: ErrorClass }): Promise<NegativeOutcome>;
  cloudOpenclawCustomPolicies(opts: { presets: string[] }): Promise<OpenClawInstance>;
  cloudOpenclawSlack(opts: { allowedChannels?: string[] }): Promise<OpenClawInstance>;
  cloudOpenclawDiscord(opts: { allowedChannels?: string[] }): Promise<OpenClawInstance>;
  cloudOpenclawTelegram(opts: { /* ... */ }): Promise<OpenClawInstance>;
  cloudHermes(opts?: { /* ... */ }): Promise<OpenClawInstance>;
  cloudHermesSlack(opts: { /* ... */ }): Promise<OpenClawInstance>;
  cloudHermesDiscord(opts: { /* ... */ }): Promise<OpenClawInstance>;
  localOllamaOpenclaw(opts?: { /* ... */ }): Promise<OpenClawInstance>;
  // ...one per dispatcher case
}
```
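
The registry-driven `from(...)` path is essentially a lookup from onboarding id to named method. A minimal sketch follows, with several assumptions: the id-to-method pairs are inferred from names used in this proposal (in particular `cloud-nvidia-openclaw-slack` mapping to `cloudOpenclawSlack`), the `-no-docker` suffix rule generalizes the one rewrite mentioned above, and `resolveOnboardMethod` is a hypothetical helper.

```typescript
type OnboardMethod =
  | "cloudOpenclaw"
  | "cloudOpenclawNoDocker"
  | "cloudOpenclawSlack"
  | "cloudHermes"
  | "localOllamaOpenclaw";

// Assumed mapping; the real table would mirror the bash dispatcher cases.
const METHOD_BY_ONBOARDING_ID = new Map<string, OnboardMethod>([
  ["cloud-openclaw", "cloudOpenclaw"],
  ["cloud-openclaw-no-docker", "cloudOpenclawNoDocker"],
  ["cloud-nvidia-openclaw-slack", "cloudOpenclawSlack"],
  ["cloud-hermes", "cloudHermes"],
  ["local-ollama-openclaw", "localOllamaOpenclaw"],
]);

// Applies the docker-missing rewrite, then resolves the named method.
function resolveOnboardMethod(env: { onboarding: string; runtime: string }): OnboardMethod {
  const id = env.runtime === "docker-missing" ? `${env.onboarding}-no-docker` : env.onboarding;
  const method = METHOD_BY_ONBOARDING_ID.get(id);
  if (method === undefined) {
    throw new Error(`no onboard fixture method wired for onboarding id '${id}'`);
  }
  return method;
}
```

Throwing on an unknown id is a backstop only; with the `runtime-support` gate in place, unwired ids should already have been skipped before `from(...)` is reached.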

```ts
// test/e2e-scenario/live/scenarios.test.ts
// Registry-driven matrix — one Vitest test per scenario in baseline.ts.

import { test, expect } from "../framework/e2e-test.ts";
import { listScenarios } from "../scenarios/registry.ts";
import { isScenarioFullyWired } from "../scenarios/runtime-support.ts";

for (const scenario of listScenarios()) {
  const wired = isScenarioFullyWired(scenario);
  if (!wired.ok) {
    test.skip(`${scenario.id} (not yet wired: ${wired.reasons.join("; ")})`, () => {});
    continue;
  }

  test(scenario.id, async ({
    environment, onboard, stateValidation, lifecycle, runSuite,
  }) => {
    // GHA setup steps already ran install + runtime prep. Just verify.
    const env = await environment.assertReady(scenario.environment);

    const instance = await onboard.from(scenario.environment, env);
    await stateValidation.from(scenario.expectedStateId, instance);

    if (scenario.environment.lifecycle) {
      await lifecycle.from(scenario.environment.lifecycle, instance);
    }

    for (const suiteId of scenario.suiteIds) {
      await runSuite(suiteId, instance);
    }
  });
}
```

### Concrete workflow sketch

```yaml
# .github/workflows/e2e-vitest-scenarios.yaml (evolves from #4968)

name: E2E / Vitest Scenarios

on:
  workflow_dispatch:
    inputs:
      scenarios:
        description: "Comma-separated scenario ids, or empty for full registry"
        required: false
        default: ""

permissions:
  contents: read

concurrency:
  group: e2e-vitest-scenarios-${{ github.ref }}-${{ inputs.scenarios || 'all' }}
  cancel-in-progress: false

jobs:
  generate-matrix:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.emit.outputs.matrix }}
    steps:
      - uses: actions/checkout@...
      - uses: actions/setup-node@...
      - run: npm ci --ignore-scripts
      - id: emit
        run: |
          matrix="$(npx tsx test/e2e-scenario/scenarios/run.ts --emit-matrix)"
          echo "matrix=$matrix" >> "$GITHUB_OUTPUT"

  run-scenario:
    needs: generate-matrix
    strategy:
      fail-fast: false
      matrix:
        include: ${{ fromJSON(needs.generate-matrix.outputs.matrix) }}
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@...
      - uses: actions/setup-node@...

      # Install axis — matrix-gated setup step.
      - name: Install (repo-current)
        if: matrix.install == 'repo-current'
        run: npm ci && npm run build:cli && npm link
      - name: Install (launchable)
        if: matrix.install == 'launchable'
        run: ./scripts/install-launchable.sh

      # Runtime axis — matrix-gated for state mutations.
      - name: Runtime prep (docker-missing)
        if: matrix.runtime == 'docker-missing'
        run: sudo install -m 0755 ./scripts/test-fixtures/docker-shim /usr/local/bin/docker

      # Platform-specific bootstraps.
      - name: WSL bootstrap
        if: matrix.platform == 'wsl-local'
        uses: ./.github/actions/wsl-setup
      - name: Brev provision
        if: matrix.platform == 'brev-launchable'
        uses: ./.github/actions/brev-provision

      - name: Run scenario via Vitest
        env:
          NEMOCLAW_RUN_E2E_SCENARIOS: "1"
          E2E_ARTIFACT_DIR: ${{ github.workspace }}/.e2e/vitest
          # Secret allowlist scoped to this scenario only:
          NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
        run: |
          npx vitest run --project e2e-scenarios-live -t "^${{ matrix.id }}$"

      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@...
        with:
          name: e2e-scenario-${{ matrix.id }}
          path: .e2e/vitest/
```

This mirrors the existing `e2e-scenarios-all.yaml` shape one-to-one; the only difference is that it dispatches Vitest instead of `scenarios/run.ts`. Same `--emit-matrix` payload, same runner routing, same secret allowlist semantics, same `fail-fast: false`.
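
The anchored `-t "^<id>$"` filter matters because scenario ids share prefixes. The sketch below shows the intended matching behavior with plain `RegExp`; it assumes each registry-driven test sits at the top level of its file, so the full test name equals the scenario id. `escapeRegExp` and `exactTitlePattern` are hypothetical helpers, not part of Vitest.

```typescript
// Escapes regex metacharacters so an id is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds the same kind of anchored pattern the workflow passes via -t.
function exactTitlePattern(scenarioId: string): RegExp {
  return new RegExp(`^${escapeRegExp(scenarioId)}$`);
}

const pattern = exactTitlePattern("ubuntu-repo-cloud-openclaw");

// Matches only the exact id, not a longer id that shares the prefix.
const matchesSelf = pattern.test("ubuntu-repo-cloud-openclaw");
const matchesSlackVariant = pattern.test("ubuntu-repo-cloud-openclaw-slack");

// Skipped placeholders carry a suffix in their title, so they do not match either.
const matchesSkippedTitle = pattern.test("ubuntu-repo-cloud-openclaw (not yet wired: x)");
```

Without the anchors, a job for `ubuntu-repo-cloud-openclaw` would also select `ubuntu-repo-cloud-openclaw-slack` and run a scenario whose secrets and preconditions were never set up for that job.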

## Migration Plan

### 1. Land cv's foundation stack

#4965 → #4969 land as scoped. They give us the runner, primitives, CLI wrappers, first scenario, workflow shape, and migration inventory. Nothing in this proposal blocks them.

### 2. Add phase fixtures (this proposal)

Authored as one PR per phase fixture so each is small and reviewable. Suggested order:

1. **`framework/phases/environment.ts`** — assertion-only (`assertReady(env)`). Verifies CLI is on PATH and docker state matches. Setup is in workflow steps.
2. **`framework/phases/onboard.ts`** — starts with `cloudOpenclaw` and `cloudOpenclawNoDocker` only. New onboarding profiles slot in one method at a time.
3. **`framework/phases/state-validation.ts`** — implements the existing `cli-installed` / `gateway-healthy` / `sandbox-running` / `gateway-absent` / `sandbox-absent` probes from `scenarios/expected-states.ts` as fixture methods.
4. **`framework/phases/lifecycle.ts`** — starts with `rebuildCurrentVersion` and `snapshotCreateRestore` (the two failing today in the typed-shell-runner). Implementing these here naturally fixes the Mode-B failures the typed-shell-runner exposes.
5. **`framework/phases/runtime.ts`** — `runSuite(suiteId, instance)` dispatcher. One suite at a time, mirroring `scenarios/probes/*` and `validation_suites/<category>/*.sh` content.

### 3. Promote install + runtime prep to composite GHA actions

Once the workflow shape stabilizes, extract the install + runtime-prep steps into reusable composite actions under `.github/actions/` so:
- `e2e-scenarios-all.yaml` (typed-shell-runner) and `e2e-vitest-scenarios.yaml` share the same setup steps.
- A new platform value (e.g. a future ARM64 runner) only needs the action updated once.

### 4. Add the registry-driven scenario file + matrix workflow

`live/scenarios.test.ts` as sketched above. `e2e-vitest-scenarios.yaml` evolves to consume `--emit-matrix` (sketch above). As phase fixtures land, more registry entries flip from `test.skip(...)` to running.

### 5. Family-by-family scenario migration

Same as #4941's family-by-family plan. Each family migration:

1. Implements the missing phase fixture method (e.g. `onboard.cloudOpenclawSlack`).
2. Adds the scenario id to `SUPPORTED_ONBOARDING_IDS` in `scenarios/runtime-support.ts`.
3. Updates `migration/legacy-inventory.json` (#4969) with the corresponding bash retirement entry.
4. Verifies parity (Vitest scenario passes the same assertions as the bash suite).
5. Deletes the bash worker + assertion files in a follow-up PR.

The `runtime-support` filter ensures unwired scenarios stay registered (visible in the registry, documented as roadmap) but never produce silent-fail jobs.

### 6. Inventory extends to typed-shell-runner retirement

#4969 currently tracks legacy `test/e2e/test-*.sh`. Extend to also track:
- `scenarios/orchestrators/{phase,runner,context,negative-matcher}.ts`
- `nemoclaw_scenarios/{install,onboard,lifecycle,probes,helpers}/*.sh`
- `validation_suites/**/*.sh`
- `runtime/lib/*.sh`

Each entry gets a `bridgeSurface` (the Vitest phase fixture or composite action that replaces it) and a `deletionReady` flag. Once phase fixtures cover every file in an area, the bash in that area retires.
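
The retirement rule can be checked mechanically from the inventory. In this sketch only `bridgeSurface` and `deletionReady` come from the text above; the `path` field, the prefix-based notion of an "area", and `areaReadyToRetire` are hypothetical and do not describe the actual `migration/legacy-inventory.json` schema.

```typescript
// Hypothetical inventory entry shape.
interface LegacyInventoryEntry {
  readonly path: string; // legacy file being tracked (assumed field)
  readonly bridgeSurface: string; // fixture or composite action that replaces it
  readonly deletionReady: boolean; // true once parity is proven
}

// An area retires only when it has entries and every one is deletion-ready.
function areaReadyToRetire(entries: readonly LegacyInventoryEntry[], areaPrefix: string): boolean {
  const inArea = entries.filter((entry) => entry.path.startsWith(areaPrefix));
  return inArea.length > 0 && inArea.every((entry) => entry.deletionReady);
}

const sampleInventory: LegacyInventoryEntry[] = [
  {
    path: "nemoclaw_scenarios/onboard/cloud-openclaw.sh",
    bridgeSurface: "framework/phases/onboard.ts:cloudOpenclaw",
    deletionReady: true,
  },
  {
    path: "nemoclaw_scenarios/onboard/cloud-hermes.sh",
    bridgeSurface: "framework/phases/onboard.ts:cloudHermes",
    deletionReady: false,
  },
];
```

Requiring at least one entry avoids treating an untracked area as retirable simply because nothing has been inventoried yet.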

## Alternatives Considered

### Per-scenario hand-written test files

This is what `live/ubuntu-repo-cli-smoke.test.ts` does today. Simple, but loses every matrix axis. Adding `wsl-repo-cloud-openclaw-slack` becomes "fork the test file, edit the platform call, edit the onboarding call, edit the channel" — exactly the duplication the typed-shell-runner avoids via `wslRepoDocker(...)`. Acceptable for true one-off probes; not acceptable as the default pattern.

### Single giant `live/all-scenarios.test.ts` with `it.each(...)`

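Folds all scenarios into one Vitest file, parameterized by registry. Less flexible than `for-of test()` because `it.each` passes the case values as the callback arguments rather than the test context, which makes `test.extend` fixtures awkward to reach, and because per-scenario `test.skip` with a structured reason has no natural place in the table. The for-of pattern in the sketch above is idiomatic Vitest and gives each scenario its own test name + artifacts directory.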

### Keep typed-shell-runner phase orchestrator, just call it from Vitest

Wraps `scenarios/orchestrators/runner.ts:ScenarioRunner.run()` inside a Vitest test. Preserves the matrix but keeps the duplicated phase orchestration alive forever. Loses #4941's "Vitest owns lifecycle" win.

### Do install + runtime prep inside Vitest fixtures (no GHA matrix)

`environment.from(env)` actually installs (npm ci + build) and mutates runtime state (sets up docker shim) before continuing. Possible but loses GHA's free parallelism on runner selection — one `runs-on: ubuntu-latest` job iterating internally vs N parallel jobs of the right type. Also re-implements work the runner image already does (e.g., `ubuntu-latest` already has node + docker; we shouldn't pretend it doesn't). The hybrid (GHA carries preconditions, fixtures carry application logic) is closer to "use each tool for what it's good at."

### Ignore the matrix; let it lapse

What we're trending toward today if no one objects. The typed registry stays as data, but nothing reads it for Vitest test discovery. Every new scenario is a hand-authored file. After 20 scenarios we have 20 files with 90% duplicate setup. Fixable later, but expensive.

## Proposed Decisions

- [ ] Agree that the matrix axes (platform / install / runtime / onboarding / lifecycle / runtime-suites) survive the migration to Vitest, split between **GHA workflow steps** (platform / install / runtime) and **Vitest phase fixtures** (onboarding / lifecycle / suites).
- [ ] Agree that `live/` test discovery is **registry-driven by default** — one Vitest test per `listScenarios()` entry — with hand-authored files allowed for true one-off probes.
- [ ] Agree that `framework/phases/` is the right home for the application-logic phase fixtures, with `environment.ts` being assertion-only.
- [ ] Agree that `e2e-vitest-scenarios.yaml` (#4968) evolves to consume `--emit-matrix` for fan-out, mirroring the existing `e2e-scenarios-all.yaml` pattern, with install + runtime prep as matrix-gated workflow steps (eventually composite actions).
- [ ] Agree that `scenarios/runtime-support.ts:isScenarioFullyWired` (the existing typed-shell-runner gate) is the same gate for the Vitest matrix: unwired scenarios skip with a structured reason rather than failing silently.
- [ ] Agree to extend `migration/legacy-inventory.json` (#4969) to track retirement of `scenarios/orchestrators/`, `nemoclaw_scenarios/`, and `validation_suites/` per family migration.

## Acceptance Criteria

- `framework/phases/environment.ts` (assertion-only) and `framework/phases/onboard.ts` exist and expose at least one method each plus `from(scenarioPart, ...prereqs)`.
- `live/scenarios.test.ts` runs the registry-driven matrix, with `test.skip` for unwired scenarios.
- `e2e-vitest-scenarios.yaml` consumes `--emit-matrix`, fans out one job per scenario id, and runs install + runtime prep as matrix-gated workflow steps.
- One canonical scenario (suggest `ubuntu-repo-cloud-openclaw`) runs end-to-end through phase fixtures and passes its smoke + inference suites.
- The `runtime-support` filter governs both the typed-shell `--emit-matrix` (existing) AND the Vitest registry-driven runner (new).
- Migration inventory entry exists for at least one phase fixture's bash counterpart with `deletionReady: false` (until parity proven).
- Adding a new scenario in `scenarios/scenarios/baseline.ts` automatically produces a Vitest test in CI without touching `live/` files.

## Category

Testing

## Checklist

- [x] I searched existing issues — this is a companion proposal to #4941, not a duplicate.
- [x] This is a design proposal, not a "please build this" request.

