[QA harness] Mock approval followthrough emits undeclared read for Codex app-server lane

# Correction TLDR

**Status: harness/mock-provider artifact, not a proven user-facing Codex app-server bug.**

The original issue overstated this as a P1 Codex runtime problem. A higher-confidence audit shows that the mock provider emits a provider-level `read` function call based on prompt text, even when the Codex app-server lane does not declare `read` as an OpenClaw dynamic tool. Codex intentionally owns workspace tools such as `read`, `write`, `edit`, `exec`, and `apply_patch` natively rather than exposing them through the OpenClaw dynamic-tool bridge.

**What actually breaks:** the QA parity harness was comparing a malformed mock-provider plan against the Codex app-server lane. This is not enough evidence that real users lose approval-followthrough reads.

**Product impact if OpenClaw moved fully to Codex today: P4 until live/native proof says otherwise.** The remaining risk is live proof coverage, not a demonstrated production approval-read regression.

# Latest Beta.5 Evidence

```text
OpenClaw baseline: v2026.5.10-beta.5
PR: #80323
PR head: 3336dec6419c9cc9a87dc7cfa6f48118ca2d838e
Remote proof run: https://github.com/electricsheephq/openclaw-local-test/actions/runs/25719383976
Confidence tracker: #80936
```

The corrected `first-hour-20` mock gate now has zero hard failures:

```json
{
  "first-hour-20-direct": { "total": 18, "passed": 15, "skipped": 3, "failed": 0 }
}
```
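As a quick sanity check on the summary above, the counts can be verified mechanically. This is only a sketch: `check_gate` is a hypothetical helper, not part of the QA harness, and it assumes that skipped rows do not count as hard failures.

```python
import json

# Gate summary copied verbatim from the report above.
summary = json.loads("""
{
  "first-hour-20-direct": { "total": 18, "passed": 15, "skipped": 3, "failed": 0 }
}
""")


def check_gate(row: dict) -> bool:
    """Return True when the counts add up to the total and nothing failed.

    Assumption: skipped rows are tolerated and are not hard failures.
    """
    accounted = row["passed"] + row["skipped"] + row["failed"]
    return accounted == row["total"] and row["failed"] == 0


# Map each gate name to whether it is internally consistent and failure-free.
gate_ok = {name: check_gate(row) for name, row in summary.items()}
```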

`approval-turn-tool-followthrough` is one of the 3 report-only rows:

```text
mock-openai still models approval followthrough as a Pi-style read call; Codex-native approval/read behavior requires native/live proof
```

# Correct Fix

- Gate mock `read` planning on declared/available tools, or model Codex-native read through the real Codex app-server native tool protocol.
- Keep mock provider-plan diagnostics separate from runtime transcript/tool-call evidence.
- Reopen/escalate as a product bug only if a live/native Codex run shows approved reads fail outside this mock contract.
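The first two items could look roughly like the sketch below. It is illustrative only: `MockPlan`, `plan_followthrough`, and the keyword check on the prompt are hypothetical stand-ins, not the actual mock-openai or OpenClaw code. The idea is that a `read` call is planned only when the lane declares `read`, and a skipped call is recorded as a plan-level diagnostic rather than as runtime tool-call evidence.

```python
from dataclasses import dataclass, field


@dataclass
class MockPlan:
    # Provider-level tool calls the mock would emit for this turn.
    tool_calls: list = field(default_factory=list)
    # Plan-level notes, kept separate from runtime transcript/tool-call evidence.
    diagnostics: list = field(default_factory=list)


def plan_followthrough(prompt: str, declared_tools: set) -> MockPlan:
    """Plan a mock approval followthrough, gated on the lane's declared tools."""
    plan = MockPlan()
    # Crude stand-in for the mock's prompt-text heuristic (an assumption).
    wants_read = "read" in prompt.lower()
    if not wants_read:
        return plan
    if "read" in declared_tools:
        plan.tool_calls.append({"name": "read", "arguments": {}})
    else:
        # The lane owns reads natively (or has none): emit no undeclared call.
        plan.diagnostics.append("read not declared for this lane; mock read call skipped")
    return plan


# A lane that declares read gets the call; a lane that does not gets a diagnostic.
declaring_plan = plan_followthrough("Approved. Please read the file.", {"read", "write"})
codex_plan = plan_followthrough("Approved. Please read the file.", set())
```

With this shape, a parity comparison can ignore `diagnostics` entirely and compare only emitted calls against what each lane declares.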

# Superseded Original Report

The earlier reproduction and observed drift were useful for finding the harness flaw, but they should not be read as proof of a Codex product/runtime bug.

# Links

- Umbrella RFC/tracker: #80171
- PR: #80323
- Confidence proof: #80936
- Live/Testbox proof tracker: #80397


- This issue: #80236
