Phase 2: Onboarding and Installer Audit Coverage (E2E audit-coverage) #4348

# Phase 2: Onboarding and Installer Audit Coverage

**Parent epic:** #3588

## Goal

Cover every audited assertion and control-flow row from the legacy onboarding and installer E2E scripts. The full OpenClaw cloud onboarding path must be proven through three distinct inference surfaces: direct provider, sandbox `inference.local`, and OpenClaw-mediated agent. Negative onboarding paths must fail closed with no forbidden side effects. Scenarios that claim a public or launchable install source must not be satisfied by repo-current install manifests.

## Audit rows in scope

| AQ | Phase | Legacy subject | Required coverage | Boundary | Planned scenario/assertion | Fixtures/actions |
|---|---:|---|---|---|---|---|
| AQ-010 | 2 | `test-full-e2e.sh` | Full OpenClaw path covers install, onboard, gateway, sandbox, credentials, policy, and inference surfaces. | host CLI/sandbox/provider | full e2e scenario contract | install/onboard actions |
| AQ-011 | 2 | `test-cloud-onboard-e2e.sh` | Cloud onboard covers OpenClaw setup, sandbox state, gateway route, credentials, and expected policy presets. | sandbox/provider | cloud onboard assertions | cloud provider fixture / live secrets |
| AQ-012 | 2 | `test/e2e/e2e-cloud-experimental/checks/*.sh` | Delegated cloud checks cover inference-local HTTP, security checks, and Landlock/read-only behavior. | sandbox/security-policy | cloud delegated check assertions | cloud sandbox fixture |
| AQ-013 | 2 | `test-cloud-inference-e2e.sh` | Cloud inference proves direct provider chat, sandbox `inference.local`, and OpenClaw-mediated response as **distinct evidence**. | provider/integration | cloud inference surface assertions | fake/live provider |
| AQ-014 | 2 | `test-hermes-e2e.sh` | Hermes onboarding validates agent selection, sandbox readiness, inference, and Hermes-specific health. | agent runtime | hermes onboard assertions | hermes sandbox fixture |
| AQ-015 | 2 | `test-double-onboard.sh` and `test-gpu-double-onboard.sh` | Repeated onboarding preserves/updates registry correctly and rejects stale or duplicate state. | durable state | double-onboard state assertions | staged registry fixture |
| AQ-016 | 2 | `test-onboard-negative-paths.sh` | Invalid key, Docker/preflight failure, gateway port conflict, and bad input all fail closed without forbidden side effects. | host CLI / durable state | negative onboard assertions | bad key / port-holder fixtures |
| AQ-017 | 2 | `test-onboard-resume.sh` and `test-onboard-repair.sh` | Resume/repair preserves expected state and repairs incomplete onboarding artifacts. | durable state | resume/repair assertions | staged session fixture |
| AQ-018 | 2 | `test-launchable-smoke.sh` and Brev launchable flow | Public/launchable install path is **not satisfied by repo-current**; the scenario proves launchable sentinel/readiness. | host CLI | launchable smoke assertions | fake download/Brev fixture |

## Required manifests to add (`test/e2e-scenario/nemoclaw_scenarios/manifests/`)

- [ ] `openclaw-nvidia.yaml`
- [ ] `openclaw-nvidia-public-curl.yaml`
- [ ] `openclaw-nvidia-cloud-inference.yaml` (when explicit model evidence is required)
- [ ] `openclaw-openai-compatible-double-onboard.yaml`
- [ ] `openclaw-nvidia-invalid-key-negative.yaml`
- [ ] `openclaw-nvidia-gateway-port-conflict.yaml`
- [ ] `openclaw-nvidia-custom-policies.yaml`
- [ ] `openclaw-nvidia-resume-after-interrupt.yaml`
- [ ] `openclaw-nvidia-repair-existing-config.yaml`
- [ ] `launchable-cloud-nvidia-openclaw.yaml`
- [ ] `dgx-spark-install-only.yaml` *or* explicit setup-only no-manifest scenario

## Required fixtures / runtime actions

- [ ] Public installer source/ref/log verification fixture
- [ ] Fake OpenAI endpoint for double-onboard
- [ ] Port-holder fixture for gateway port conflict
- [ ] Bad-key fixture (injected by scenario, **not stored in manifest**)
- [ ] Interrupted session fixture for resume/repair
- [ ] Missing recorded sandbox fixture for repair
- [ ] Launchable clone/sentinel fixture
- [ ] Direct cloud / sandbox route / OpenClaw-mediated prompt payload fixtures
- [ ] Hermes health/config/log fixtures

## Required assertions

- [ ] Install source/ref correctness
- [ ] CLI/OpenShell availability
- [ ] **Direct NVIDIA chat**, **sandbox `inference.local` chat**, and **OpenClaw-mediated agent response** as three distinct passing assertions
- [ ] Hermes runtime health, config, and log checks as distinct assertions
- [ ] Gateway reuse and no port conflicts during double-onboard
- [ ] Stale registry reconciliation and lifecycle guidance
- [ ] Resume cached-step skipping and session completion
- [ ] Repair recreates missing recorded sandbox and rejects conflicting resume requests
- [ ] Launchable artifacts and sentinel readiness
- [ ] Delegated cloud-experimental check PASS/FAIL outcomes (inference-local HTTP, security checks, Landlock read-only)
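
The delegated-check assertion above could be sketched roughly as follows. This is a minimal illustration, assuming the delegated scripts print one `PASS: <label>` or `FAIL: <label>` line per check; the `FAIL:` line format, the `CheckOutcome` type, and both function names are hypothetical and not taken from the repository.

```typescript
// Hypothetical outcome record; field names are illustrative, not from the repo.
type CheckOutcome = { status: "PASS" | "FAIL"; label: string };

// Collect every log line that starts with a `PASS:` or `FAIL:` marker.
function parseOutcomes(log: string): CheckOutcome[] {
  const outcomes: CheckOutcome[] = [];
  for (const rawLine of log.split(/\r?\n/)) {
    const match = /^(PASS|FAIL):\s*(.*)$/.exec(rawLine.trim());
    if (match) {
      const status = match[1] === "PASS" ? "PASS" : "FAIL";
      outcomes.push({ status, label: (match[2] ?? "").trim() });
    }
  }
  return outcomes;
}

// The check set passes only if every expected label reported PASS and
// nothing reported FAIL. A missing marker counts as a failure.
function allChecksPassed(log: string, expectedLabels: string[]): boolean {
  const outcomes = parseOutcomes(log);
  if (outcomes.some((o) => o.status === "FAIL")) return false;
  const passed = new Set(
    outcomes.filter((o) => o.status === "PASS").map((o) => o.label),
  );
  return expectedLabels.every((label) => passed.has(label));
}
```

Treating a missing marker as a failure keeps a silently skipped check from counting as coverage.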

## Validation scenarios — all must pass in PR workflow artifacts

### Scenario 2.1 — Cloud OpenClaw onboarding is complete only with all three inference surfaces (Happy Path)

- **Given** A live or hermetic OpenClaw cloud onboarding scenario completes onboarding.
- **When** Direct provider chat, sandbox `inference.local` chat, and OpenClaw-mediated agent response assertions all run.
- **Then** Onboarding can be marked complete with **distinct evidence paths for each surface**.
- **Steps**
  1. Filter rows AQ-010…AQ-018; provision OpenClaw cloud onboarding contract with declared secrets or hermetic fake provider; collect PR workflow evidence.
  2. Scenario runner / Vitest workflow jobs run onboarding and three inference assertion modules.
  3. Verify each matched row is present with passing stable assertion IDs/artifact paths; onboarding is complete only after all three surface assertions pass.
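
As a rough sketch of the completion rule in Scenario 2.1, the check below accepts onboarding only when all three surfaces have passing evidence and no assertion ID or evidence path is shared between surfaces. The `Surface` names, the `SurfaceEvidence` shape, and `onboardingComplete` are assumptions made for illustration; the real scenario contract may model this differently.

```typescript
// Hypothetical surface identifiers for the three inference paths.
type Surface = "direct-provider" | "sandbox-inference-local" | "openclaw-agent";

// Hypothetical evidence record emitted by one assertion module.
interface SurfaceEvidence {
  surface: Surface;
  assertionId: string;
  evidencePath: string;
  passed: boolean;
}

const REQUIRED_SURFACES: Surface[] = [
  "direct-provider",
  "sandbox-inference-local",
  "openclaw-agent",
];

// True when the same ID or path is claimed by two different surfaces.
function reusedAcrossSurfaces(
  records: SurfaceEvidence[],
  key: "assertionId" | "evidencePath",
): boolean {
  const owner = new Map<string, Surface>();
  for (const record of records) {
    const seen = owner.get(record[key]);
    if (seen !== undefined && seen !== record.surface) return true;
    owner.set(record[key], record.surface);
  }
  return false;
}

// Onboarding is complete only with passing, distinct evidence per surface.
function onboardingComplete(evidence: SurfaceEvidence[]): boolean {
  const passing = evidence.filter(
    (e) => e.passed && e.assertionId !== "" && e.evidencePath !== "",
  );
  const coversAll = REQUIRED_SURFACES.every((surface) =>
    passing.some((e) => e.surface === surface),
  );
  return (
    coversAll &&
    !reusedAcrossSurfaces(passing, "assertionId") &&
    !reusedAcrossSurfaces(passing, "evidencePath")
  );
}
```

Rejecting shared paths is one way to enforce "distinct evidence": a single artifact cannot be relabelled to cover a second surface.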

### Scenario 2.2 — Negative onboarding leaves no forbidden side effects (Sad Path)

- **Given** An invalid NVIDIA key or gateway port conflict fixture is injected by scenario setup, **not by product manifests**.
- **When** Onboarding is executed.
- **Then** It exits with the expected message, **no stack trace**, and no sandbox/gateway/credential side effects.
- **Steps**
  1. Filter row AQ-016; stage bad-key or port-holder fixture; collect PR workflow evidence for negative onboarding.
  2. Run onboarding action.
  3. Verify AQ-016 remains incomplete unless workflow evidence shows expected message, no stack trace, and no side effects.
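
The three conditions in Scenario 2.2 can be expressed as one check that returns every violation it finds. This is a sketch under stated assumptions: `OnboardRunResult` and its fields are hypothetical, a non-zero exit code is assumed to signal failure, and the stack-trace detection is a heuristic that recognises only Node-style `at file:line:col` frames and Python `Traceback` headers.

```typescript
// Hypothetical summary of one negative onboarding run.
interface OnboardRunResult {
  exitCode: number;
  output: string; // combined stdout and stderr
  sandboxesCreated: string[];
  gatewaysStarted: string[];
  credentialsWritten: string[];
}

// Returns an empty list when the run failed closed as required.
function negativeOnboardViolations(
  result: OnboardRunResult,
  expectedMessage: string,
): string[] {
  const violations: string[] = [];
  if (result.exitCode === 0) violations.push("exit code was 0");
  if (!result.output.includes(expectedMessage)) {
    violations.push("expected message missing");
  }
  // Heuristic only: Node-style frames or a Python traceback header.
  const nodeFrame = /^\s+at .+:\d+:\d+\)?$/m;
  if (
    nodeFrame.test(result.output) ||
    result.output.includes("Traceback (most recent call last)")
  ) {
    violations.push("stack trace present");
  }
  if (result.sandboxesCreated.length > 0) violations.push("sandbox side effect");
  if (result.gatewaysStarted.length > 0) violations.push("gateway side effect");
  if (result.credentialsWritten.length > 0) {
    violations.push("credential side effect");
  }
  return violations;
}
```

Returning the full list instead of a boolean lets the workflow artifact record exactly which condition kept AQ-016 incomplete.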

### Scenario 2.3 — Public installer and launchable flows are not satisfied by repo-current (Sad Path)

- **Given** A public-curl, launchable, Spark, or installer scenario is wired to a repo-current install manifest.
- **When** Contract validation runs.
- **Then** Validation **fails** and asks for explicit install source/ref/log evidence or a setup-only scenario.
- **Steps**
  1. Filter row AQ-018; create invalid install-source metadata fixture.
  2. Run manifest/contract validation in workflow.
  3. Verify workflow evidence shows repo-current substitution is rejected and AQ-018 remains unresolved until public/launchable evidence exists.
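
A contract validator for Scenario 2.3 might look roughly like the sketch below. The `ScenarioContract` shape, the `kind` values, and the field names (`installSource`, `installRef`, `installLogPath`, `setupOnly`) are all hypothetical; only the `repo-current` value and the source/ref/log requirement come from this issue.

```typescript
// Hypothetical contract fields relevant to install-source validation.
interface ScenarioContract {
  id: string;
  kind: "public-curl" | "launchable" | "spark" | "installer" | "standard";
  installSource?: string;
  installRef?: string;
  installLogPath?: string;
  setupOnly?: boolean;
}

// Returns validation errors; an empty list means the contract is acceptable.
function validateInstallSource(contract: ScenarioContract): string[] {
  if (contract.kind === "standard") return [];
  // repo-current can never stand in for a public or launchable install.
  if (contract.installSource === "repo-current") {
    return [
      `${contract.id}: repo-current cannot satisfy a ${contract.kind} scenario; ` +
        "declare explicit install source/ref/log evidence or a setup-only scenario",
    ];
  }
  // An explicit setup-only scenario needs no install evidence.
  if (contract.setupOnly === true) return [];
  const errors: string[] = [];
  if (!contract.installSource) errors.push(`${contract.id}: install source missing`);
  if (!contract.installRef) errors.push(`${contract.id}: install ref missing`);
  if (!contract.installLogPath) errors.push(`${contract.id}: install log evidence missing`);
  return errors;
}
```

The repo-current check runs before the setup-only escape hatch, so a setup-only flag cannot be used to smuggle in a repo-current manifest.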

## Acceptance criteria — issue is NOT DONE until ALL are true

1. **PR landed** in `test/e2e-scenario/` adding all manifests, fixtures, runtime actions, and assertion modules above.
2. **PR CI passing**:
   - All 9 audit-row scenarios run in workflow and emit `PASS:` markers
   - Happy-path OpenClaw cloud onboarding asserts **all three** inference surfaces — single-surface evidence is rejected
   - Negative onboarding asserts failure message + no stack trace + no forbidden side effects
3. **Validation Scenarios 2.1–2.3 all pass** in PR workflow artifacts.
4. **Audit work queue updated**: AQ-010 through AQ-018 flipped to `evidence-complete` with stable assertion IDs and evidence paths.
5. **Phase-specific validation gate** (from spec): happy-path onboarding complete only when all three inference surfaces are covered; negative onboarding complete only when failure message + no stack trace + no forbidden side effects are all asserted; public installer/launchable cannot be satisfied by repo-current install manifests.
6. **No-cheat gate**: generic `/health` or single-surface inference cannot satisfy AQ-013/AQ-014; repo-current cannot satisfy AQ-018.
7. **Secret gate**: bad-key fixture is scenario-injected, never in product manifest; no raw secrets in manifests/logs.
8. **PR description references AQ rows** covered, links this issue.
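
One way to check the secret gate in criterion 7 is to scan manifests and logs for the literal secret values the scenario injected. The sketch below assumes the runner can supply those values in memory; `Artifact` and `findLeakedSecrets` are hypothetical names, and a literal-match scan will not catch encoded or truncated secrets.

```typescript
// Hypothetical in-memory view of a manifest, log, or report file.
interface Artifact {
  path: string;
  text: string;
}

// Returns the paths of artifacts that contain any injected secret verbatim.
function findLeakedSecrets(artifacts: Artifact[], secrets: string[]): string[] {
  // Ignore empty strings, which would match every artifact.
  const candidates = secrets.filter((secret) => secret.length > 0);
  return artifacts
    .filter((artifact) =>
      candidates.some((secret) => artifact.text.includes(secret)),
    )
    .map((artifact) => artifact.path);
}
```

For example, a bad-key fixture value that shows up in a manifest under `test/e2e-scenario/nemoclaw_scenarios/manifests/` would be reported by path, without the secret itself being echoed into the report.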

## Dependencies

- **Phase 1 (#4347)** must be `evidence-complete` — fake services, fixture cleanup framework, runtime action runner, contract schema.

## Out of scope

- Provider/routing/config-shape (Phase 3)
- GPU/Ollama (Phase 4)
- Messaging lifecycle (Phase 5+)
- Hermes Discord/Slack deep flow (Phase 6)
- Security/credentials (Phase 7)

---

## Cross-phase acceptance gates (apply to every phase)

1. **Setup gate** — scenario contract declares environment, manifest or no-manifest reason, fixtures, runtime actions, assertions.
2. **No-cheat gate** — preview/dry-run output cannot mark an audit row complete.
3. **Boundary gate** — assertions touch the same SUT boundary as the legacy script.
4. **Evidence gate** — every assertion emits an evidence path and stable assertion ID.
5. **Secret gate** — no manifest, log, report, or fixture file contains raw secrets.
6. **Cleanup gate** — fixtures that mutate host or repo state have restore/cleanup logic and tests.
7. **Audit completeness gate** — every assigned audit row has owner, planned scenario/assertion, phase assignment, evidence status.
8. **Phase completion gate** — phase complete only when every assigned row has executable evidence (or independent audit amendment).
9. **Executable assertion gate** — completed scenarios point to concrete suite steps / assertion modules, not `pendingStep(...)`, TODOs, generic probes, or prose.
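
The evidence, phase completion, and executable assertion gates can be combined into a single row-status check, sketched below. The `AssertionRecord` and `AuditRow` shapes are hypothetical, the `pending` flag stands in for `pendingStep(...)` or TODO placeholders, and `incomplete` is an assumed name for any status other than `evidence-complete`.

```typescript
// Hypothetical record for one assertion attached to an audit row.
interface AssertionRecord {
  assertionId: string;
  evidencePath: string;
  passed: boolean;
  pending: boolean; // true for pendingStep(...)-style placeholders
}

// Hypothetical audit work queue row.
interface AuditRow {
  id: string;
  owner: string;
  assertions: AssertionRecord[];
}

type RowStatus = "evidence-complete" | "incomplete";

// A row is evidence-complete only when it has an owner and every assertion
// is executable, passing, and carries a stable ID plus an evidence path.
function rowStatus(row: AuditRow): RowStatus {
  if (row.owner === "" || row.assertions.length === 0) return "incomplete";
  const allExecutable = row.assertions.every(
    (a) =>
      !a.pending &&
      a.passed &&
      a.assertionId !== "" &&
      a.evidencePath !== "",
  );
  return allExecutable ? "evidence-complete" : "incomplete";
}

// A phase is complete only when every assigned row is evidence-complete.
function phaseComplete(rows: AuditRow[]): boolean {
  return rows.length > 0 && rows.every((row) => rowStatus(row) === "evidence-complete");
}
```

This sketch does not model the independent audit amendment path mentioned in the phase completion gate; a real implementation would need a separate, explicitly recorded exception for it.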

