Phase 3: Inference Provider, Routing, and Config-Shape Audit Coverage (E2E audit-coverage)

# Phase 3: Inference Provider, Routing, and Config-Shape Audit Coverage

**Parent epic:** #3588

## Goal

Cover the audited assertions for inference provider integration, routing, configuration shape, and inference switching. Three constraints apply: a generic `/v1/models` health probe cannot satisfy a provider-specific routing row; Kimi compatibility requires evidence that the trajectory splits combined tool calls; and an inference switch requires evidence for route state, registry/session state, config hash/shape, and a live post-switch request.

## Audit rows in scope

| AQ | Phase | Legacy subject | Required coverage | Boundary | Planned scenario/assertion | Fixtures/actions |
|---|---:|---|---|---|---|---|
| AQ-019 | 3 | `test-bedrock-runtime-compatible-anthropic.sh` | Bedrock-compatible path covers adapter config, runtime requests, route behavior, and secret-redaction checks. | provider/integration | bedrock compatible assertions | fake Bedrock endpoint |
| AQ-020 | 3 | `test-cloud-inference-e2e.sh` provider routing rows | Provider-specific routing **cannot** be satisfied by generic `/v1/models` health alone. | provider/integration | provider route assertions | fake provider fixture |
| AQ-021 | 3 | `test-inference-routing.sh` | Inference routing proves route health and routed chat/completion behavior through the expected route. | provider/integration | inference routing assertions | route fixture |
| AQ-022 | 3 | `test-openclaw-inference-switch.sh` and `test-hermes-inference-switch.sh` | Inference switch proves state update, config hash/change, and live post-switch request. | durable state / provider | inference switch assertions | provider switch action |
| AQ-023 | 3 | `test-kimi-inference-compat.sh` | Kimi compatibility covers plugin wiring, the Kimi-compatible models route, and trajectory tool-call splitting. | provider/integration | kimi compatibility assertions | fake Kimi endpoint |
| AQ-024 | 3 | `test-model-router-provider-routed-inference.sh` | Model-router provider path proves healthy endpoint and routed completion. | provider/integration | model-router assertions | fake router endpoint |
| AQ-025 | 3 | `test-messaging-compatible-endpoint.sh` and `test-brave-search-e2e.sh` | Compatible endpoint / provider integration checks cover route-specific config and runtime behavior. | provider/integration | compatible endpoint assertions | fake endpoint fixture |

## Required manifests to add

- [ ] `openclaw-bedrock-compatible-anthropic.yaml`
- [ ] `hermes-bedrock-compatible-anthropic.yaml`
- [ ] `openai-openclaw-routing.yaml`
- [ ] `anthropic-openclaw-routing.yaml`
- [ ] `compatible-openclaw-routing.yaml`
- [ ] `compatible-openclaw-kimi.yaml`
- [ ] `routed-nvidia-openclaw-model-router.yaml`
- [ ] `openclaw-nvidia-inference-switch.yaml`
- [ ] `cloud-nvidia-hermes.yaml`
- [ ] `cloud-nvidia-hermes-inference-switch.yaml`
- [ ] `telegram-compatible-openclaw.yaml`
- [ ] `openclaw-runtime-overrides.yaml` (only as image-entrypoint setup if useful)

## Required fixtures / runtime actions

- [ ] Fake Bedrock Runtime endpoint and host mapping fixture
- [ ] Bedrock adapter state/log/token fixture
- [ ] Fake compatible OpenAI / Kimi endpoints
- [ ] Model-router health endpoint fixture (or live setup contract)
- [ ] `inference.set` runtime action for OpenClaw and Hermes
- [ ] Runtime override container/image fixture
- [ ] Provider-key leak scan fixture
- [ ] Brave: API secret gate, policy/config, direct-curl fixtures
- [ ] Trajectory/session artifact reader

## Required assertions

- [ ] Provider route identity and provider registry shape
- [ ] OpenClaw and Hermes config shape after compatible provider setup
- [ ] Adapter health, including fake endpoint, region, and token hash
- [ ] Authenticated Converse/ConverseStream or compatible traffic observed by fake endpoint
- [ ] Sandbox route chat returns expected content
- [ ] OpenClaw/Hermes runtime path returns expected content
- [ ] **Kimi trajectory splits combined tool calls into discrete `hostname`, `date`, `uptime` exec calls**
- [ ] Model-router `healthy_count > 0` and routed completion returns `model: nvidia-routed*` plus content
- [ ] Inference switch updates route/session/registry/config with no unwanted restart where the legacy script checked for one
- [ ] Runtime overrides update config and config hash; reject invalid values
- [ ] Brave: secret gate, policy preset, OpenClaw web search config, credential hygiene, agent search, direct-curl search/skip behavior
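The Kimi trajectory assertion in the list above could be checked roughly as follows. This is a minimal sketch, not the suite's actual assertion module: the trajectory shape (steps carrying a `tool_calls` list with `name` and `arguments.command`) and the function names are assumptions made for illustration.

```python
from typing import Any

# Hypothetical trajectory shape: a list of steps, each with a "tool_calls" list
# of {"name": ..., "arguments": {"command": ...}} entries.
EXPECTED_COMMANDS = ("hostname", "date", "uptime")


def exec_commands(trajectory: list[dict[str, Any]]) -> list[str]:
    """Return the command of every exec tool call, in trajectory order."""
    commands: list[str] = []
    for step in trajectory:
        for call in step.get("tool_calls", []):
            if call.get("name") == "exec":
                command = call.get("arguments", {}).get("command", "")
                commands.append(command.strip())
    return commands


def is_split_into_discrete_calls(trajectory: list[dict[str, Any]]) -> bool:
    """True only when each expected command is its own exec call."""
    return sorted(exec_commands(trajectory)) == sorted(EXPECTED_COMMANDS)
```

A single exec call carrying `hostname && date && uptime` fails this check, because the combined string matches none of the expected discrete commands.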

## Validation scenarios — all must pass in PR workflow artifacts

### Scenario 3.1 — Bedrock-compatible audit coverage covers adapter, configs, runtime, traffic, and leaks (Happy Path)

- **Given** A fake Bedrock-compatible Anthropic endpoint, host mapping, adapter token, and OpenClaw/Hermes scenario contracts are available.
- **When** Onboarding and runtime assertions execute.
- **Then** Health, registry, config shape, sandbox route chat, agent runtime chat, authenticated Converse/ConverseStream traffic, safe-log, and leak-scan assertions **all pass**.
- **Steps**
  1. Filter rows AQ-019…AQ-025; start fake Bedrock endpoint/host mapping fixture; collect PR workflow evidence for provider-specific assertions.
  2. Run onboarding and Bedrock/provider assertion modules.
  3. Verify each row passes with stable assertion IDs/artifact paths; Bedrock includes config/runtime/traffic/leak evidence.
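The leak-scan evidence in Scenario 3.1 could be produced by something like the sketch below. It assumes artifacts are already loaded as a path-to-text mapping and that logs record a token fingerprint rather than the raw adapter token; both helper names are hypothetical.

```python
import hashlib


def token_fingerprint(token: str) -> str:
    """Short SHA-256 fingerprint that can be logged instead of the raw token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:12]


def find_leaks(artifacts: dict[str, str], secrets: list[str]) -> list[str]:
    """Return the sorted paths of artifacts that contain any raw secret."""
    return sorted(
        path
        for path, text in artifacts.items()
        if any(secret and secret in text for secret in secrets)
    )
```

An empty result from `find_leaks` is the passing case; any returned path names an artifact that must be fixed before the row can be marked complete.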

### Scenario 3.2 — Generic `/v1/models` health cannot satisfy provider-specific routing audit rows (Sad Path)

- **Given** A Kimi, model-router, inference-switch, or runtime-overrides audit row has only a generic models-health assertion.
- **When** Audit-coverage validation runs.
- **Then** The audit row remains **unresolved** and completion is rejected.
- **Steps**
  1. Filter rows AQ-020…AQ-025; inject provider-specific contract with only generic health assertion.
  2. Run audit-coverage validation in workflow.
  3. Verify that completion is rejected, the rows stay unresolved, and the report names the missing provider/config/trajectory assertions.
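The rejection in Scenario 3.2 can be sketched as a validator that ignores generic health assertions when deciding whether a row is covered. The assertion-kind labels and the per-row requirement table here are hypothetical placeholders, not the audit tooling's real vocabulary.

```python
# Hypothetical assertion-kind labels; the real audit tooling may name these differently.
GENERIC_KINDS = frozenset({"models-health"})

REQUIRED_KINDS = {
    "AQ-020": frozenset({"provider-route"}),
    "AQ-022": frozenset(
        {"route-state", "registry-session-state", "config-hash", "live-request"}
    ),
    "AQ-023": frozenset({"plugin-wiring", "trajectory-tool-call-split"}),
}


def validate_row(row_id: str, passed_kinds: set[str]) -> tuple[str, list[str]]:
    """Return (status, missing kinds); generic health never counts as coverage."""
    specific = set(passed_kinds) - GENERIC_KINDS
    missing = sorted(REQUIRED_KINDS[row_id] - specific)
    status = "unresolved" if missing else "evidence-complete"
    return status, missing
```

With this shape, a row whose only passing assertion is `models-health` stays `unresolved`, and the returned list names the assertions that are still missing.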

### Scenario 3.3 — Inference switch proves state, config hash, and live post-switch request (Happy Path)

- **Given** An OpenClaw or Hermes inference switch action is declared.
- **When** The action runs and assertions inspect route state, registry/session state, config hash/shape, and a live post-switch request.
- **Then** Switch coverage is complete only if all surfaces match expected values and no unwanted restart occurred where the legacy script checked for one.
- **Steps**
  1. Filter row AQ-022; provision switchable provider fixtures; collect PR workflow evidence.
  2. Run `inference.set` and assertion modules.
  3. Verify AQ-022 is marked `evidence-complete` only when the route/session/registry/config/live-request assertion IDs all pass.
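One way to express the config-hash and completeness checks in Scenario 3.3 is sketched below. The canonical-JSON hashing scheme and the surface names are assumptions; the legacy scripts may compute the config hash differently.

```python
import hashlib
import json

# Hypothetical surface names mirroring the route/session/registry/config/live-request list.
REQUIRED_SURFACES = ("route", "session", "registry", "config", "live-request")


def config_hash(config: dict) -> str:
    """SHA-256 over canonical JSON, so key order does not change the hash."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def config_changed(before: dict, after: dict) -> bool:
    """True when the switch produced a different config hash."""
    return config_hash(before) != config_hash(after)


def switch_evidence_complete(results: dict[str, bool]) -> bool:
    """True only when every required surface has a passing assertion."""
    return all(results.get(surface) is True for surface in REQUIRED_SURFACES)
```

A missing surface counts as a failure, so a switch with passing route and config assertions but no live post-switch request is still incomplete.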

## Acceptance criteria — issue is NOT DONE until ALL are true

1. **PR landed** in `test/e2e-scenario/` with the manifests, fixtures, runtime actions, and assertion modules listed above.
2. **PR CI passing**:
   - All 7 audit-row scenarios run in workflow and emit `PASS:` markers
   - Generic `/v1/models` health is **rejected** as coverage for provider-specific routing
   - Kimi requires the trajectory/tool-call assertion to pass
   - Inference switch requires route + registry/session + config + live request assertions all to pass
3. **Validation Scenarios 3.1–3.3 all pass** in PR workflow artifacts.
4. **Audit work queue updated**: AQ-019 through AQ-025 flipped to `evidence-complete`.
5. **Phase-specific validation gate** (from spec): generic `/v1/models` cannot satisfy provider-specific routing or config-shape; Kimi not complete unless trajectory/tool-call semantics asserted; inference switch not complete unless route state, registry/session state, config hash/shape, and live post-switch request all covered.
6. **No-cheat gate**: generic health probes alone cannot satisfy any provider-specific row.
7. **Secret gate**: provider keys never in manifests; leak scan asserted on Bedrock fixture.
8. **PR description references the AQ rows** covered and links this issue.

## Dependencies

- **Phase 1 (#4347)** — fake provider endpoints, runtime actions runner.
- **Phase 2 (#4348)** — baseline manifests (`openclaw-nvidia.yaml`, Hermes baseline).

## Out of scope

- Local Ollama/GPU (Phase 4)
- Messaging matrix (Phase 5)
- Hermes Discord/Slack flow (Phase 6)

---

## Cross-phase acceptance gates (apply to every phase)

1. **Setup gate** — scenario contract declares environment, manifest or no-manifest reason, fixtures, runtime actions, assertions.
2. **No-cheat gate** — preview/dry-run output cannot mark an audit row complete.
3. **Boundary gate** — assertions touch the same SUT boundary as the legacy script.
4. **Evidence gate** — every assertion emits an evidence path and stable assertion ID.
5. **Secret gate** — no manifest, log, report, or fixture file contains raw secrets.
6. **Cleanup gate** — fixtures that mutate host or repo state have restore/cleanup logic and tests.
7. **Audit completeness gate** — every assigned audit row has owner, planned scenario/assertion, phase assignment, evidence status.
8. **Phase completion gate** — phase complete only when every assigned row has executable evidence (or independent audit amendment).
9. **Executable assertion gate** — completed scenarios point to concrete suite steps / assertion modules, not `pendingStep(...)`, TODOs, generic probes, or prose.
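The evidence gate and the executable assertion gate lend themselves to a mechanical check. The sketch below assumes each assertion is reported as a record with `assertion_id`, `evidence_path`, and `step` fields; those field names and the helper are hypothetical.

```python
# Markers that indicate a placeholder rather than an executable step.
PLACEHOLDER_MARKERS = ("pendingStep(", "TODO")


def gate_violations(record: dict[str, str]) -> list[str]:
    """List the gate violations for one assertion record (empty means it passes)."""
    violations: list[str] = []
    if not record.get("assertion_id", "").strip():
        violations.append("missing assertion ID")
    if not record.get("evidence_path", "").strip():
        violations.append("missing evidence path")
    step = record.get("step", "")
    if not step.strip() or any(marker in step for marker in PLACEHOLDER_MARKERS):
        violations.append("step is not executable")
    return violations
```

A phase-completion check could then require an empty violation list for every record attached to an assigned audit row.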


Phase 3: Inference Provider, Routing, and Config-Shape Audit Coverage (E2E audit-coverage) #4349
