Phase 9: Rebuild, Upgrade, Installer Version, and Runtime Edge Audit Coverage (E2E audit-coverage) #4355

@jyaunches

Parent epic: #3588

Goal

Cover the audited assertions for rebuild from old sandbox/image fixtures, upgrade of stale sandbox/gateway versions, the installer version pin, runtime overrides, overlayfs autofix, Spark install, and runtime edge cases (root entrypoint, plugin runtime EXDEV).

Three constraints apply. A fresh-sandbox rebuild does not satisfy a rebuild row. Installer rows must remain hermetic/setup-specific, with no product instance manifests. Policy preservation must be asserted in the registry, the live gateway, and the backup manifest.

Audit rows in scope

| AQ | Phase | Legacy subject | Required coverage | Boundary | Planned scenario/assertion | Fixtures/actions |
|---|---|---|---|---|---|---|
| AQ-050 | 9 | test-rebuild-openclaw.sh, test-rebuild-hermes.sh, test-sandbox-rebuild.sh | Rebuild starts from old sandbox/image fixtures and preserves required state/config. | durable state / sandbox | rebuild assertions | old image/sandbox fixture |
| AQ-051 | 9 | test-upgrade-stale-sandbox.sh and test-openshell-gateway-upgrade.sh | Upgrade handles stale sandbox/gateway versions and verifies reachable upgraded survivor. | gateway / durable state | upgrade assertions | stale sandbox fixture |
| AQ-052 | 9 | validation_suites/rebuild_upgrade/* | State preserved, version upgraded, post-rebuild inference works, policy/Hermes config preserved. | durable state / provider | rebuild-upgrade suite assertions | old image/action fixture |
| AQ-053 | 9 | test-runtime-overrides.sh | Runtime overrides affect expected setup/runtime behavior without polluting defaults. | host CLI / sandbox | runtime override assertions | override fixture |
| AQ-054 | 9 | test-overlayfs-autofix.sh | Overlayfs autofix path is hermetic/setup-specific and reports expected repair behavior. | host CLI / sandbox | overlayfs assertions | fake Docker/FS fixture |
| AQ-055 | 9 | test-spark-install.sh | Spark install path covers setup-specific installer behavior. | host CLI | spark install assertions | fake installer fixture |
| AQ-056 | 9 | test-hermes-root-entrypoint-smoke.sh and test-openclaw-plugin-runtime-exdev.sh | Runtime edge checks validate root entrypoint and plugin runtime EXDEV behavior. | agent runtime / sandbox | runtime edge assertions | runtime fixture |

Required manifests to add

  • openclaw-nvidia-custom-policies-rebuild.yaml
  • hermes-nvidia-discord-rebuild.yaml
  • openclaw-nvidia-upgrade-stale.yaml
  • openshell-gateway-upgrade-survivor.yaml
  • No product manifest for the OpenShell version pin: use the explicit no-manifest contract installer-openshell-version-pin instead
  • openclaw-nvidia-overlayfs-autofix.yaml
  • openclaw-build-plugin-runtime-exdev.yaml

Required fixtures / runtime actions

  • Old OpenClaw + Hermes base image fixtures
  • Temporary blueprint min-version mutation (with guaranteed restore)
  • Registry/session staging with old agent versions
  • Fake compatible endpoint for upgrade survivor
  • Old/current installer refs
  • macOS hermetic installer asset fixtures
  • Docker daemon mutation backup/restore
  • Patched cluster image fixture (overlayfs)
  • Temporary policy mutation for /dev and /dev/shm (with restore)
  • EXDEV repro source/destination fixture
  • Hermes root-entrypoint container smoke fixtures (only if static audit confirms presence)
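
Several fixtures above mutate host or repo state and promise a restore. A minimal sketch of that pattern, assuming the mutated target is a single file: the helper name `temporary_file_mutation`, the file name `blueprint.yaml`, and the `min_version` key are all hypothetical and not taken from the repository.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


@contextlib.contextmanager
def temporary_file_mutation(path: Path, new_text: str):
    """Overwrite `path` with `new_text` and restore the original on exit."""
    backup_dir = Path(tempfile.mkdtemp(prefix="fixture-backup-"))
    backup = backup_dir / path.name
    shutil.copy2(path, backup)  # back up before touching anything
    try:
        path.write_text(new_text)
        yield path
    finally:
        shutil.copy2(backup, path)  # runs even if the body raises
        shutil.rmtree(backup_dir, ignore_errors=True)


def demo_restore() -> bool:
    """Mutate a stand-in blueprint, fail mid-run, and report whether it was restored."""
    workdir = Path(tempfile.mkdtemp(prefix="blueprint-demo-"))
    blueprint = workdir / "blueprint.yaml"  # hypothetical file name
    original = "min_version: 1.0.0\n"       # hypothetical content
    blueprint.write_text(original)
    try:
        with temporary_file_mutation(blueprint, "min_version: 99.0.0\n"):
            raise RuntimeError("simulated scenario failure")
    except RuntimeError:
        pass
    restored = blueprint.read_text() == original
    shutil.rmtree(workdir, ignore_errors=True)
    return restored
```

Putting the restore in a `finally` block is what makes it "guaranteed" for ordinary failures; it would not survive a hard kill of the process, so a cleanup-gate assertion after the run is still needed.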

Required assertions

Rebuild (AQ-050)

  • Rebuild preserves markers, policies, and messaging config
  • Backup is sanitized
  • Agent version changes from the old version to the expected current version
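
The Goal requires policy preservation to be asserted in three places: registry, live gateway, and backup manifest. The sketch below shows one way to express that check once the policy names have been read from each location. The function name, the location keys, and the policy names in the usage note are illustrative assumptions; how each location is actually queried is out of scope here.

```python
REQUIRED_LOCATIONS = ("registry", "live_gateway", "backup_manifest")


def policy_preservation_failures(expected, observed):
    """Return one message per location where an expected policy is missing.

    expected: set of policy names present before the rebuild.
    observed: mapping of location name -> set of policy names read afterwards.
    """
    failures = []
    for location in REQUIRED_LOCATIONS:
        if location not in observed:
            # A location that was never inspected cannot count as preserved.
            failures.append(f"{location}: not inspected")
            continue
        missing = sorted(set(expected) - set(observed[location]))
        if missing:
            failures.append(f"{location}: missing {', '.join(missing)}")
    return failures
```

For example, with expected policies `{"a", "b"}` and an observation that covers only the registry and a live gateway that lost `b`, the function reports both the missing policy and the uninspected backup manifest, so a partial check cannot pass silently.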

Upgrade (AQ-051, AQ-052)

  • Upgrade check reports stale before rebuild and up-to-date after rebuild
  • OpenShell gateway upgrade backs up/restores survivor and preserves marker/registry
  • Post-rebuild inference works
  • Policy and Hermes config preserved
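
The stale-before / up-to-date-after assertion can be sketched as a pair of version comparisons. This assumes plain dotted numeric versions and a hypothetical `UpgradeCheck` record; the real upgrade check may use a different version scheme or status vocabulary.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple:
    """Parse a dotted numeric version such as '1.2.0' or 'v1.2.0'."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


@dataclass(frozen=True)
class UpgradeCheck:
    installed: str  # version found in the sandbox
    minimum: str    # minimum version required by the blueprint

    @property
    def status(self) -> str:
        if parse_version(self.installed) < parse_version(self.minimum):
            return "stale"
        return "up-to-date"


def assert_upgrade_transition(before: UpgradeCheck, after: UpgradeCheck) -> None:
    """Fail unless the check was stale before the rebuild and up-to-date after it."""
    if before.status != "stale":
        raise AssertionError(f"expected stale before rebuild, got {before.status}")
    if after.status != "up-to-date":
        raise AssertionError(f"expected up-to-date after rebuild, got {after.status}")
```

Requiring the `stale` state first is the point of the check: a run that starts from an already current sandbox fails the transition instead of passing trivially.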

Installer (AQ-051 cont.)

  • Installer replaces too-new OpenShell with pinned compatible version

Runtime edge / overrides (AQ-053, AQ-054, AQ-056)

  • Runtime overrides affect setup/runtime without polluting defaults
  • Overlayfs autofix creates and reuses patched image
  • Overlayfs opt-out negative path behaves correctly
  • OpenClaw plugin runtime dependency replacement avoids EXDEV/cross-device rename failure
  • Hermes root-entrypoint smoke assertions (if applicable)
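
For context on the EXDEV item: a rename fails with `EXDEV` when source and destination are on different filesystems, which can happen between container layers or mounts. The sketch below shows the general fallback technique for a single file (copy next to the destination, then rename within that filesystem). It is an illustration of the failure mode, not the OpenClaw plugin runtime's actual fix, and `replace_across_devices` is a hypothetical name.

```python
import contextlib
import errno
import os
import shutil
import tempfile
from pathlib import Path


def replace_across_devices(src: Path, dst: Path) -> None:
    """Move file `src` over `dst`, tolerating a cross-device boundary."""
    try:
        os.replace(src, dst)  # atomic when both paths share a filesystem
        return
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
    # Cross-device case: stage a copy beside the destination so the final
    # rename happens inside a single filesystem.
    fd, staged = tempfile.mkstemp(dir=dst.parent, prefix=dst.name + ".")
    os.close(fd)
    try:
        shutil.copy2(src, staged)
        os.replace(staged, dst)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(staged)
        raise
    os.unlink(src)  # remove the source only after the destination is in place
```

A repro fixture for this path needs source and destination on genuinely different filesystems (or a stubbed rename that raises `EXDEV`); two directories on the same mount will never exercise the fallback.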

Spark install (AQ-055)

  • Spark install path exercises the setup-specific installer behavior

Validation scenarios — all must pass in PR workflow artifacts

Scenario 9.1 — Rebuild/upgrade starts from old sandbox/image fixtures (Happy Path)

  • Given: old OpenClaw/Hermes images, old registry/session state, and a temporary blueprint min-version mutation (with restore) are staged.
  • When: rebuild or upgrade actions run.
  • Then: the checks for markers, policies, messaging config, backup sanitization, agent version change, stale-before / up-to-date-after status, and survivor reachability all pass.
  • Steps
    1. Filter rows AQ-050…AQ-056; stage old image/registry/session/blueprint/runtime fixtures; collect PR workflow evidence.
    2. Run rebuild/upgrade/runtime edge actions.
    3. Verify rows evidence-complete only when artifacts include passing markers, policy, messaging, backup, version, stale/up-to-date, reachability, overlayfs, Spark, runtime-edge assertions.

Scenario 9.2 — Fresh sandbox rebuild remains unresolved for old-sandbox rebuild audit rows (Sad Path)

  • Given: rebuild audit coverage is claimed from a fresh sandbox, with no old-sandbox fixture.
  • When: audit-coverage validation runs.
  • Then: the audit row remains unresolved and the missing old-sandbox fixture is reported.
  • Steps
    1. Filter rows AQ-050…AQ-052; inject fresh-only rebuild coverage claim.
    2. Run audit-coverage validation in workflow.
    3. Verify rejection; missing old-sandbox fixture/evidence reported.
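
The rejection in Scenario 9.2 could be sketched as a small validator over coverage claims. The row set AQ-050 through AQ-052 comes from step 1 above; the `CoverageClaim` shape, the `old-sandbox` fixture tag, and the status strings other than evidence-complete are assumptions made for illustration.

```python
from dataclasses import dataclass, field

REBUILD_ROWS = {"AQ-050", "AQ-051", "AQ-052"}
OLD_SANDBOX_FIXTURE = "old-sandbox"  # hypothetical fixture tag


@dataclass
class CoverageClaim:
    row: str
    fixtures: set = field(default_factory=set)
    evidence_paths: list = field(default_factory=list)


def validate_claim(claim: CoverageClaim):
    """Return (status, problems); a rebuild row without the old-sandbox fixture stays unresolved."""
    problems = []
    if claim.row in REBUILD_ROWS and OLD_SANDBOX_FIXTURE not in claim.fixtures:
        problems.append("missing old-sandbox fixture")
    if not claim.evidence_paths:
        problems.append("no evidence artifacts")
    status = "unresolved" if problems else "evidence-complete"
    return status, problems
```

A fresh-only claim such as `CoverageClaim("AQ-050", {"fresh-sandbox"}, ["artifacts/rebuild.log"])` comes back as unresolved with the missing fixture named, which is the report the scenario expects.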

Scenario 9.3 — Installer-only and runtime edge scenarios stay hermetic/setup-specific (Happy Path)

  • Given: the OpenShell version pin, overlayfs autofix, and EXDEV runtime dependency replacement contracts are setup/hermetic and have their required fixtures.
  • When: contract validation and hermetic assertions run.
  • Then: version pin replacement, overlayfs patched-image create/reuse plus the opt-out negative path, and EXDEV repro source/destination behavior all pass without requiring product manifests.
  • Steps
    1. Filter rows AQ-053…AQ-056; stage installer asset, Docker daemon mutation backup, patched image, policy mutation, EXDEV fixtures; collect PR workflow evidence.
    2. Run hermetic assertions.
    3. Verify rows evidence-complete only when artifacts include passing version replacement, overlayfs behavior, opt-out negative, EXDEV fix evidence, cleanup restore, stable assertion IDs.
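
Contract validation for hermetic scenarios might look like the sketch below: a contract names either a product manifest or an explicit no-manifest reason, never both and never neither. The contract ID `installer-openshell-version-pin` appears in this issue; the `ScenarioContract` fields and the error wording are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScenarioContract:
    contract_id: str
    manifest: Optional[str] = None            # product manifest file, if any
    no_manifest_reason: Optional[str] = None  # why no manifest is required
    fixtures: Tuple[str, ...] = ()


def contract_errors(contract: ScenarioContract) -> list:
    """List the setup-gate problems found in a scenario contract."""
    errors = []
    if contract.manifest and contract.no_manifest_reason:
        errors.append("declare a manifest or a no-manifest reason, not both")
    if not contract.manifest and not contract.no_manifest_reason:
        errors.append("declare a manifest or a no-manifest reason")
    if not contract.fixtures:
        errors.append("declare at least one fixture")
    return errors
```

Under these assumptions, the version-pin contract passes with `no_manifest_reason` set and a fixture listed, while an installer contract that is forced into a product manifest and also keeps its no-manifest reason is flagged.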

Acceptance criteria — issue is NOT DONE until ALL are true

  1. PR landed in test/e2e-scenario/ with manifests, fixtures, runtime actions, assertion modules above.
  2. PR CI passing:
    • All 7 audit-row scenarios run in workflow and emit PASS: markers
    • Rebuild coverage rejected if started from fresh sandbox (must use old-sandbox fixture)
    • Policy preservation asserted in registry, live gateway, and backup manifest
    • Installer-only audit coverage stays setup/hermetic — not forced into product instance manifest
  3. Validation Scenarios 9.1–9.3 all pass in PR workflow artifacts.
  4. Audit work queue updated: AQ-050 through AQ-056 flipped to evidence-complete.
  5. Phase-specific validation gate (from spec): rebuild rows require old-sandbox fixture (not fresh sandbox); policy preservation in registry, live gateway, and backup manifest where legacy did so; installer-only coverage hermetic, not forced into product instance manifest.
  6. Cleanup gate: blueprint min-version mutation, Docker daemon mutation, policy mutations all restored after run (asserted).
  7. PR description references AQ rows covered, links this issue.

Dependencies

Out of scope

  • Gateway crash-loop / dashboard / tunnel / Brev (Phase 10)
  • Final reconciliation (Phase 11)

Cross-phase acceptance gates (apply to every phase)

  1. Setup gate — scenario contract declares environment, manifest or no-manifest reason, fixtures, runtime actions, assertions.
  2. No-cheat gate — preview/dry-run output cannot mark an audit row complete.
  3. Boundary gate — assertions touch the same SUT boundary as the legacy script.
  4. Evidence gate — every assertion emits an evidence path and stable assertion ID.
  5. Secret gate — no manifest, log, report, or fixture file contains raw secrets.
  6. Cleanup gate — fixtures that mutate host or repo state have restore/cleanup logic and tests.
  7. Audit completeness gate — every assigned audit row has owner, planned scenario/assertion, phase assignment, evidence status.
  8. Phase completion gate — phase complete only when every assigned row has executable evidence (or independent audit amendment).
  9. Executable assertion gate — completed scenarios point to concrete suite steps / assertion modules, not pendingStep(...), TODOs, generic probes, or prose.
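
Three of these gates (no-cheat, evidence, executable assertion) lend themselves to a mechanical check over the assertion records a run emits. A minimal sketch, assuming each record carries an ID, an evidence path, and a source tag; the `AssertionRecord` shape and the source vocabulary are invented for illustration.

```python
from dataclasses import dataclass

NON_EXECUTABLE_SOURCES = {"dry-run", "preview", "pending"}


@dataclass(frozen=True)
class AssertionRecord:
    assertion_id: str   # stable ID, expected to be unique within a run
    evidence_path: str  # where the artifact for this assertion was written
    source: str         # "live", "dry-run", "preview", or "pending"


def gate_violations(records) -> list:
    """Return one message per gate violation found in the records."""
    violations = []
    seen = set()
    for record in records:
        label = record.assertion_id or "<no id>"
        if not record.assertion_id:
            violations.append(f"{label}: evidence gate: missing assertion ID")
        elif record.assertion_id in seen:
            violations.append(f"{label}: evidence gate: duplicate assertion ID")
        seen.add(record.assertion_id)
        if not record.evidence_path:
            violations.append(f"{label}: evidence gate: missing evidence path")
        if record.source in NON_EXECUTABLE_SOURCES:
            violations.append(f"{label}: no-cheat gate: {record.source} output cannot complete a row")
    return violations
```

An empty result is a necessary condition for phase completion under this sketch, not a sufficient one: the boundary, secret, and cleanup gates still need their own checks.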

Metadata

Labels

area: e2e (End-to-end tests, nightly failures, or validation infrastructure)