Establish the foundational primitives the entire scenario E2E framework depends on. This phase introduces the 5-part scenario contract schema (environment / manifest-or-no-manifest-reason / fixtures / runtime actions / assertions), a setup-only/host-only scenario type that does not require a NemoClawInstance manifest, the runtime action runner with ordered evidence, and reusable fake-service / state-staging / dangerous-fixture-cleanup primitives that downstream phases consume.
This phase also delivers the audit-row coverage for legacy root scripts that are setup-heavy but assertion-light, where the entire test can be modeled as a hermetic / host-only scenario contract.
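The contract shape described above can be sketched as a TypeScript type with a small validator. This is only an illustration: every type, field, and function name below is hypothetical, and the environment values "hermetic" and "host-only" are an assumption about what the environment section might hold. The sketch shows one way to make "manifest or no-manifest reason" an explicit either/or choice, so that a host-only scenario never needs a placeholder NemoClawInstance manifest.

```typescript
// Hypothetical shapes: none of these names are taken from the repository.
type ManifestSection =
  | { kind: "manifest"; path: string }
  | { kind: "no-manifest"; reason: string };

interface ScenarioContract {
  id: string;
  environment: "hermetic" | "host-only"; // assumed values
  manifest: ManifestSection;
  fixtures: string[];
  runtimeActions: string[];
  assertions: string[];
}

// Returns a list of problems; an empty list means the contract is well formed.
function validateContract(c: ScenarioContract): string[] {
  const problems: string[] = [];
  if (c.id.trim() === "") {
    problems.push("scenario id must be non-empty");
  }
  if (c.manifest.kind === "no-manifest" && c.manifest.reason.trim() === "") {
    problems.push("no-manifest scenarios must state a reason");
  }
  if (c.manifest.kind === "manifest" && c.manifest.path.trim() === "") {
    problems.push("manifest path must be non-empty");
  }
  if (c.assertions.length === 0) {
    problems.push("at least one assertion is required");
  }
  return problems;
}

// Example contract; the fixture and assertion names are made up.
const gatewayDrift: ScenarioContract = {
  id: "gateway-drift-preflight",
  environment: "hermetic",
  manifest: { kind: "no-manifest", reason: "hermetic preflight; no product instance is created" },
  fixtures: ["fake-gateway-config"],
  runtimeActions: [],
  assertions: ["gateway.drift.detected"],
};

const contractProblems = validateContract(gatewayDrift);
```

Modeling the manifest section as a discriminated union means a contract cannot omit both the manifest and the reason, which is the property the host-only scenario type relies on.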
Audit rows in scope
| AQ | Phase | Legacy subject | Required coverage | Boundary | Planned scenario/assertion | Fixtures/actions |
| --- | --- | --- | --- | --- | --- | --- |
| AQ-001 | 1 | Setup and onboarding manifest audit section | Scenario contracts declare environment, manifest or no-manifest reason, fixtures, runtime actions, assertions, expected failures, and runner requirements. | host CLI | contract schema/resolver assertions | none |
| AQ-002 | 1 | test-gateway-drift-preflight.sh | Hermetic preflight contract detects gateway drift without requiring product manifest. | gateway | gateway drift assertion module | fake gateway/config fixture |
| AQ-003 | 1 | test-gateway-health-honest.sh | Gateway health scenario distinguishes honest healthy/unhealthy states and cannot pass from a generic HTTP probe alone. | gateway | gateway health-honest assertion module | fake gateway fixture |
| AQ-004 | 1 | test-openshell-version-pin.sh | Host-only scenario verifies expected OpenShell version pin behavior. | host CLI | openshell version assertion module | fake openshell CLI fixture |
| AQ-005 | 1 | test-onboard-inference-smoke.sh | Host-only/onboard smoke contract proves inference smoke setup without live provider secrets. | provider/integration | onboard inference smoke assertion module | fake provider fixture |
| AQ-006 | 1 | test-docs-validation.sh | Docs validation is represented as a setup-only/host-only scenario with real command assertions. | | | |
State staging: ~/.nemoclaw/sandboxes.json, onboard-session.json, legacy credentials.json, provider records
Port holders + port probes
Old image fixtures: OpenClaw / Hermes / rebuild / upgrade
Crash shim fixture for openshell-gateway
Cleanup/restore obligations declared and tested for every dangerous fixture (Docker daemon mutation, /etc/hosts, blueprint, policy, image, port mutations)
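The cleanup obligation for dangerous fixtures could be enforced when fixtures are registered. The sketch below is a minimal illustration under stated assumptions: the `Fixture` interface, `registerFixtures`, and `withFixture` are hypothetical names, and the "hosts file" is an in-memory stand-in so the example never touches the real /etc/hosts.

```typescript
// Hypothetical fixture shape; a dangerous fixture must carry a restore step.
interface Fixture {
  name: string;
  dangerous: boolean;
  setup: () => void;
  restore?: () => void;
}

// Rejects any dangerous fixture that declares no restore logic.
function registerFixtures(fixtures: Fixture[]): Map<string, Fixture> {
  const registry = new Map<string, Fixture>();
  for (const f of fixtures) {
    if (f.dangerous && f.restore === undefined) {
      throw new Error(`dangerous fixture "${f.name}" declares no restore step`);
    }
    registry.set(f.name, f);
  }
  return registry;
}

// Applies the fixture, runs the body, and restores even if the body throws.
function withFixture<T>(f: Fixture, body: () => T): T {
  f.setup();
  try {
    return body();
  } finally {
    f.restore?.();
  }
}

// In-memory stand-in for a hosts file (assumption: the real fixture edits a file).
const fakeHosts = { content: "127.0.0.1 localhost\n" };
const originalHosts = fakeHosts.content;

const hostsFixture: Fixture = {
  name: "etc-hosts-entry",
  dangerous: true,
  setup: () => {
    fakeHosts.content += "127.0.0.1 gateway.test\n";
  },
  restore: () => {
    fakeHosts.content = originalHosts;
  },
};

registerFixtures([hostsFixture]);
const entrySeenDuringTest = withFixture(hostsFixture, () =>
  fakeHosts.content.includes("gateway.test"),
);
```

Putting the restore call in `finally` is what makes a restore test meaningful: the state is put back whether the scenario body passes or throws.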
Validation scenarios — all must pass in PR workflow artifacts
Scenario 1.1 — Host-only hermetic scripts run without product manifests (Happy Path)
Given Gateway drift, gateway health honest, OpenShell version pin, docs validation, or Ollama auth proxy host-only scenarios declare explicit no-manifest reasons.
When Scenario resolution and preview run.
Then The resolved contract includes environment, fixtures, runtime actions if needed, real assertions, and no fake NemoClawInstance manifest.
Steps
Filter audit rows AQ-001…AQ-009 for Phase 1; collect the PR workflow evidence report for the planned contract/schema/assertion modules.
Run npx vitest run test/e2e-scenario/framework-tests, as executed by the PR workflow.
Verify each matched audit row is present in workflow evidence with stable assertion IDs, artifact paths, and passing status; verify no-manifest reason is present and scenario is not blocked by missing product manifest.
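The verification step above can be sketched as a check over an evidence report. The report format is not specified in this issue, so the `EvidenceEntry` fields, the `source` flag that separates executed evidence from preview output, and the sample IDs and paths are all assumptions made for illustration.

```typescript
// Hypothetical evidence record; field names are illustrative only.
interface EvidenceEntry {
  auditRow: string; // e.g. "AQ-002"
  assertionId: string;
  artifactPath: string;
  status: "pass" | "fail";
  source: "executed" | "preview";
}

// An entry counts only if it passed, was actually executed, and is fully identified.
function isAcceptable(e: EvidenceEntry): boolean {
  return (
    e.status === "pass" &&
    e.source === "executed" &&
    e.assertionId !== "" &&
    e.artifactPath !== ""
  );
}

// Returns the audit rows that have no evidence or any unacceptable evidence.
function rowsLackingEvidence(rows: string[], report: EvidenceEntry[]): string[] {
  return rows.filter((row) => {
    const entries = report.filter((e) => e.auditRow === row);
    return entries.length === 0 || !entries.every(isAcceptable);
  });
}

// Sample report with made-up assertion IDs and artifact paths.
const sampleReport: EvidenceEntry[] = [
  {
    auditRow: "AQ-002",
    assertionId: "gateway.drift.detected",
    artifactPath: "artifacts/aq-002.json",
    status: "pass",
    source: "executed",
  },
  {
    auditRow: "AQ-003",
    assertionId: "gateway.health.honest",
    artifactPath: "artifacts/aq-003.json",
    status: "pass",
    source: "preview",
  },
];

const lackingRows = rowsLackingEvidence(["AQ-002", "AQ-003", "AQ-004"], sampleReport);
```

In this sample, AQ-003 is flagged because its only evidence comes from a preview, and AQ-004 is flagged because it has no evidence at all.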
Scenario 1.3: Runtime action evidence preserves declared order (Happy Path)
Given A scenario declares ordered runtime actions such as channels.add, inference.set, snapshot.create, rebuild.
When The runtime action runner creates a plan or executes hermetically.
Then Evidence records appear in declaration order and assertions can reference action outputs.
Steps
Filter row AQ-009; collect PR workflow evidence for scenarios with multiple runtime actions.
Run runtime action planner/runner tests as workflow jobs.
Verify AQ-009 is present in workflow evidence with stable assertion IDs/artifact paths and ordering/output-dependency assertions pass.
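A hermetic runner that keeps evidence in declaration order might look like the sketch below. The action names come from this issue; the `RuntimeAction` and `EvidenceRecord` shapes, the `runActions` function, and the fake outputs are hypothetical. Each action receives the outputs of the actions that ran before it, so later steps and assertions can reference them.

```typescript
type ActionOutput = Record<string, string>;

// Hypothetical action shape: a name plus a function of earlier outputs.
interface RuntimeAction {
  name: string;
  run: (previous: Map<string, ActionOutput>) => ActionOutput;
}

interface EvidenceRecord {
  index: number; // position in the declared plan
  action: string;
  output: ActionOutput;
}

// Executes actions strictly in array order and records one evidence entry per action.
function runActions(actions: RuntimeAction[]): EvidenceRecord[] {
  const outputs = new Map<string, ActionOutput>();
  const evidence: EvidenceRecord[] = [];
  actions.forEach((action, index) => {
    const output = action.run(outputs);
    outputs.set(action.name, output);
    evidence.push({ index, action: action.name, output });
  });
  return evidence;
}

// Fake, side-effect-free actions standing in for the real ones.
const declaredPlan: RuntimeAction[] = [
  { name: "channels.add", run: () => ({ channel: "fake-channel" }) },
  { name: "inference.set", run: () => ({ model: "fake-model" }) },
  {
    name: "snapshot.create",
    run: (prev) => ({
      snapshot: `snap-of-${prev.get("channels.add")?.channel ?? "none"}`,
    }),
  },
  {
    name: "rebuild",
    run: (prev) => ({ from: prev.get("snapshot.create")?.snapshot ?? "none" }),
  },
];

const orderedEvidence = runActions(declaredPlan);
```

An ordering assertion can then compare `orderedEvidence.map((e) => e.action)` with the declared list, and an output-dependency assertion can check that the rebuild record refers to the snapshot produced earlier.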
Acceptance criteria — issue is NOT DONE until ALL are true
PR landed in test/e2e-scenario/ adding all scenario contracts, fixtures, runtime actions, and assertion modules listed above.
PR CI passing:
npx vitest run test/e2e-scenario/framework-tests — passes
All 6 hermetic / host-only scenarios run in workflow and emit PASS: markers with stable assertion IDs
Validation Scenarios 1.1–1.3 all pass in PR workflow artifacts (Given/When/Then above).
Audit work queue updated: rows AQ-001 through AQ-009 in docs/e2e-audit-work-queue.md flipped from not-started to evidence-complete, with stable assertion IDs and evidence artifact paths populated.
Phase-specific validation gate (from spec): host-only/hermetic scenarios do not require fake product manifests; fixture setup/teardown tested without live cloud secrets; dangerous fixtures include cleanup/restore tests; the four named scripts (test-gateway-drift-preflight.sh, test-gateway-health-honest.sh, test-openshell-version-pin.sh, test-onboard-inference-smoke.sh) represented as hermetic scenario contracts with real assertions.
No-cheat gate: no row marked complete by preview/dry-run/metadata-only output.
Cleanup gate: every dangerous fixture has restore logic + test (AQ-008).
Secret gate: no raw secrets, fake bad keys, or assertion IDs in any product-facing manifest.
PR description references AQ rows covered, links this issue.
Out of scope
Product manifest expansion (Phase 2+)
Onboarding/install flow assertions (Phase 2)
Provider/routing/inference assertions (Phase 3)
Legacy script deletion (Phase 11+)
Dependencies
None. This is the foundation phase; everything else waits on it.
Cross-phase acceptance gates (apply to every phase)
No-cheat gate — preview/dry-run output cannot mark an audit row complete.
Boundary gate — assertions touch the same SUT boundary as the legacy script.
Evidence gate — every assertion emits an evidence path and stable assertion ID.
Secret gate — no manifest, log, report, or fixture file contains raw secrets.
Cleanup gate — fixtures that mutate host or repo state have restore/cleanup logic and tests.
Audit completeness gate — every assigned audit row has owner, planned scenario/assertion, phase assignment, evidence status.
Phase completion gate — phase complete only when every assigned row has executable evidence (or independent audit amendment).
Executable assertion gate — completed scenarios point to concrete suite steps / assertion modules, not pendingStep(...), TODOs, generic probes, or prose.
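The evidence gate can be illustrated with a marker formatter and parser. The exact PASS:/FAIL: marker layout is not given in this issue, so the `STATUS: <assertion-id> evidence=<path>` format, the `Marker` type, and both function names are assumptions chosen only for this sketch.

```typescript
// Assumed marker layout: "PASS: <assertion-id> evidence=<path>".
interface Marker {
  status: "PASS" | "FAIL";
  assertionId: string;
  evidencePath: string;
}

function formatMarker(m: Marker): string {
  return `${m.status}: ${m.assertionId} evidence=${m.evidencePath}`;
}

// Returns null for any line that lacks an assertion ID or an evidence path.
function parseMarker(line: string): Marker | null {
  const match = /^(PASS|FAIL): (\S+) evidence=(\S+)$/.exec(line);
  if (match === null) {
    return null;
  }
  return {
    status: match[1] as "PASS" | "FAIL",
    assertionId: match[2],
    evidencePath: match[3],
  };
}

// Made-up assertion ID and artifact path.
const markerLine = formatMarker({
  status: "PASS",
  assertionId: "openshell.version.pin",
  evidencePath: "artifacts/aq-004.json",
});
const parsedMarker = parseMarker(markerLine);
const rejectedMarker = parseMarker("PASS: generic probe ok");
```

A strict parser of this kind rejects a bare "PASS" line from a generic probe, because such a line carries neither a stable assertion ID nor an evidence path.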
Phase 1: Environment, Manifest, Fixture, and Runtime Action Primitives
Parent epic: #3588
Audit rows in scope
test-gateway-drift-preflight.sh
test-gateway-health-honest.sh
test-openshell-version-pin.sh
test-onboard-inference-smoke.sh
test-docs-validation.sh
test-ollama-auth-proxy-e2e.sh setup requirements
/etc/hosts, policy, blueprint, image, and port mutations require cleanup/restore tests.
channels.add, inference.set, snapshot.create, rebuild, and upgrade emit ordered evidence.
Required scenario contracts to add
Each lands with a stable scenario ID, no fake NemoClawInstance manifest, and real assertions emitting PASS:/FAIL: markers and evidence paths.
gateway-drift-preflight: hermetic, fake gateway/config fixture
gateway-health-honest: hermetic, fake gateway fixture, distinguishes honest healthy/unhealthy
installer-openshell-version-pin: host-only, fake OpenShell CLI fixture
onboard-inference-smoke: host-only, fake provider fixture
docs-validation: host-only, docs fixture
host-ollama-auth-proxy: host-only, Ollama/proxy fixture, declares port/token/cleanup
Required primitives to add
Fixtures (test/e2e-scenario/nemoclaw_scenarios/fixtures/)
State staging: ~/.nemoclaw/sandboxes.json, onboard-session.json, legacy credentials.json, provider records
Crash shim fixture for openshell-gateway
Cleanup/restore obligations declared and tested for every dangerous fixture (Docker daemon mutation, /etc/hosts, blueprint, policy, image, port mutations)
Runtime actions (test/e2e-scenario/nemoclaw_scenarios/runtime-actions/)
channels.add, inference.set, snapshot.create, rebuild, upgrade
Assertion modules (test/e2e-scenario/validation_suites/assert/)
Validation scenarios: all must pass in PR workflow artifacts
Scenario 1.1: Host-only hermetic scripts run without product manifests (Happy Path)
Scenario 1.2: Dangerous fixtures cannot omit cleanup (Sad Path)
[...] /etc/hosts, policies, blueprint files, or images.
Scenario 1.3: Runtime action evidence preserves declared order (Happy Path)