Improve Codex happy path prompt snapshots #76229

Merged
pashpashpash merged 5 commits into main from
codex/rendered-prompt-snapshots
May 2, 2026

Conversation

@pashpashpash (Contributor) commented May 2, 2026

This builds on the prompt snapshot work by making the Codex happy-path fixtures show the part we actually care about during audits: the model-bound layer stack, not just the OpenClaw app-server payload.

Before this, the snapshots showed selected thread and turn params, OpenClaw developer instructions, user input, and dynamic tools. That was useful, but it left the most important upstream layer implicit: the Codex gpt-5.5 model instructions that Codex resolves from its model catalog or models_cache.json.

This PR pins a pragmatic gpt-5.5 Codex prompt fixture generated from Codex's runtime model cache shape, records source metadata, and renders it into the Telegram direct, Discord group, and heartbeat happy-path snapshots alongside the Codex permission developer text, OpenClaw developer instructions, user input, dynamic tool references, and rough layer token estimates. The snapshots still call out the remaining runtime-owned gap clearly: this is a deterministic reconstructed layer view, not a byte-for-byte raw OpenAI request capture from Codex core.
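To make the rendered layer view concrete, here is a minimal sketch of how such a stack could be laid out with rough per-layer token estimates. Everything in it is an assumption for illustration: the `PromptLayer` shape, the `renderLayerStack` and `estimateTokens` names, the header format, and the four-characters-per-token heuristic are hypothetical and are not taken from the actual snapshot helper.

```typescript
// Hypothetical layer shape; the real snapshot helper's types may differ.
type PromptLayer = {
  name: string;
  owner: "codex" | "openclaw" | "user";
  text: string;
};

// Rough estimate only (about 4 characters per token); not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Render layers in model-bound order, one header per layer.
function renderLayerStack(layers: PromptLayer[]): string {
  return layers
    .map((layer, index) => {
      const header = `## Layer ${index + 1}: ${layer.name} [${layer.owner}] (~${estimateTokens(layer.text)} tokens)`;
      return `${header}\n\n${layer.text}`;
    })
    .join("\n\n");
}

// Placeholder text only; the pinned fixture content is not reproduced here.
const exampleStack: PromptLayer[] = [
  { name: "Codex model instructions", owner: "codex", text: "<pinned gpt-5.5 prompt fixture>" },
  { name: "Codex permission developer text", owner: "codex", text: "<permission text>" },
  { name: "OpenClaw developer instructions", owner: "openclaw", text: "<developer instructions>" },
  { name: "User input", owner: "user", text: "hello" },
];

console.log(renderLayerStack(exampleStack));
```

Keeping the estimate as a plain character heuristic keeps the output deterministic, which matters more for an audit fixture than tokenizer accuracy.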

I also added pnpm prompt:snapshots:sync-codex-model so maintainers can refresh the pinned Codex prompt from Codex's normal $CODEX_HOME/models_cache.json location, from the default ~/.codex/models_cache.json, or from an explicit local Codex checkout/catalog path. If none of those default sources exist, the command now exits cleanly without changing the committed fixture, because most contributors should not need a Codex checkout just to work in this repo.
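The source lookup described above might be sketched roughly as follows. This is not the script's actual code: the function name `resolveModelCacheSource`, the injected `exists` parameter, and the precedence order (explicit path first, then `$CODEX_HOME`, then `~/.codex`) are assumptions made for illustration.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical resolver. Assumed precedence: explicit path, then
// $CODEX_HOME/models_cache.json, then ~/.codex/models_cache.json.
function resolveModelCacheSource(
  explicitPath: string | undefined,
  env: Env,
  exists: (path: string) => boolean = existsSync,
): string | undefined {
  // Assumption: an explicit path is returned as-is so the caller can
  // report a clear error if it is missing.
  if (explicitPath !== undefined) return explicitPath;

  const candidates: string[] = [];
  if (env.CODEX_HOME) candidates.push(join(env.CODEX_HOME, "models_cache.json"));
  candidates.push(join(homedir(), ".codex", "models_cache.json"));

  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  // No default source found: the caller should exit cleanly and leave
  // the committed fixture untouched.
  return undefined;
}

const source = resolveModelCacheSource(undefined, process.env);
if (source === undefined) {
  console.log("No Codex model cache found; leaving the committed fixture unchanged.");
} else {
  console.log(`Would refresh the pinned Codex prompt from ${source}`);
}
```

Returning `undefined` instead of throwing is what lets the missing-source case be a no-op rather than a failure for contributors without a Codex install.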

The prompt snapshot drift check now runs inside CI's additional boundary shard. That means a PR that changes prompt composition or the pinned Codex prompt fixture has to carry the regenerated snapshots with it, instead of letting the committed audit artifacts silently drift.
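As an illustration of what a drift check of this kind does, the sketch below compares committed snapshot contents against freshly regenerated ones and reports any file that differs, is missing, or is stale. The `SnapshotSet` type, the `findSnapshotDrift` name, and the example file names are hypothetical; the real check in the repo may be structured differently.

```typescript
// Hypothetical in-memory view: snapshot file name -> file content.
type SnapshotSet = Record<string, string>;

// Return the sorted list of snapshot files whose committed content does
// not match what the generator produces now (changed, missing, or stale).
function findSnapshotDrift(committed: SnapshotSet, regenerated: SnapshotSet): string[] {
  const files = Object.keys(committed).concat(Object.keys(regenerated));
  const drifted: string[] = [];
  for (const file of files) {
    const differs = committed[file] !== regenerated[file];
    if (differs && drifted.indexOf(file) === -1) drifted.push(file);
  }
  return drifted.sort();
}

// Illustrative data only; these are not the repo's real snapshot files.
const committedSnapshots: SnapshotSet = {
  "telegram-direct.md": "layer stack v1",
  "discord-group.md": "layer stack v1",
};
const regeneratedSnapshots: SnapshotSet = {
  "telegram-direct.md": "layer stack v2",
  "discord-group.md": "layer stack v1",
  "heartbeat.md": "layer stack v1",
};

const drift = findSnapshotDrift(committedSnapshots, regeneratedSnapshots);
console.log(drift.length === 0 ? "Snapshots are up to date." : `Drift in: ${drift.join(", ")}`);
```

In a CI job, a non-empty result would be turned into a failing exit status so the PR has to include the regenerated files.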

While running the widened local changed gate, the Docker E2E boundary guard exposed an existing mismatch for live-codex-npm-plugin: that lane is intentionally both live and package-backed. The PR keeps that exception explicit so script lint remains aligned with the existing Docker plan tests.
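The guard's actual rule is not shown in this description, so the following is only a guess at its general shape: assume it flags lanes that are both live and package-backed unless they appear on an explicit exception list. The `DockerLane` type, the `findBoundaryViolations` name, and every lane name other than `live-codex-npm-plugin` are hypothetical.

```typescript
// Hypothetical lane description; the real plan and scenario types may differ.
type DockerLane = { name: string; live: boolean; packageBacked: boolean };

// Lanes that are intentionally both live and package-backed.
const LIVE_PACKAGE_BACKED_EXCEPTIONS: readonly string[] = ["live-codex-npm-plugin"];

// Assumed rule: a lane should not be both live and package-backed unless
// it is listed as an explicit exception.
function findBoundaryViolations(lanes: DockerLane[]): string[] {
  return lanes
    .filter((lane) => lane.live && lane.packageBacked)
    .filter((lane) => LIVE_PACKAGE_BACKED_EXCEPTIONS.indexOf(lane.name) === -1)
    .map((lane) => `${lane.name}: lane is both live and package-backed`);
}

// Illustrative lanes; only live-codex-npm-plugin is named in the PR.
const exampleLanes: DockerLane[] = [
  { name: "live-codex-npm-plugin", live: true, packageBacked: true },
  { name: "example-live-lane", live: true, packageBacked: false },
  { name: "example-mixed-lane", live: true, packageBacked: true },
];

console.log(findBoundaryViolations(exampleLanes));
```

Naming the exception in one list keeps the lint rule and the plan tests pointing at the same source of truth instead of each encoding the special case separately.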

After rebasing over the plugin externalization work on main, this also keeps the root package files exclusions for the publishable acpx, googlechat, and line plugin dist trees. Those plugin builds have their own package paths and should not be swept into the core OpenClaw npm tarball.

@openclaw-barnacle openclaw-barnacle Bot added labels on May 2, 2026: docs (Improvements or additions to documentation), scripts (Repository scripts), docker (Docker and sandbox tooling), size: XL, maintainer (Maintainer-authored PR)

@clawsweeper (Bot, Contributor) commented May 2, 2026

Codex review: needs maintainer review before merge.

Summary
The PR adds reconstructed Codex model-bound prompt snapshot fixtures, a Codex model fixture sync script, prompt snapshot CI drift checking, and package/Docker boundary adjustments.

Reproducibility: not applicable, as this is a feature/test-fixture PR. The current-main gap is reproducible by reading the existing docs and snapshot helper: they explicitly omit the Codex model prompt layer and render only OpenClaw-owned app-server layers.

Next step before merge
The protected maintainer label and external Codex prompt fixture provenance require human review; I found no narrow automated repair defect.

Security
Cleared: The diff adds docs, tests, generated fixtures, package exclusions, and local maintainer scripts without new dependencies, workflow permission changes, secret handling, or downloaded code execution.

Review details

Best possible solution:

Have a maintainer review the protected PR, confirm the Codex fixture provenance, and merge once the exact-head boundary checks are green.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a feature/test-fixture PR. The current-main gap is reproducible by reading the existing docs and snapshot helper: they explicitly omit the Codex model prompt layer and render only OpenClaw-owned app-server layers.

Is this the best way to solve the issue?

Yes, subject to maintainer sign-off. A pinned fixture with source metadata and explicit reconstructed-layer wording is a narrow, maintainable solution that avoids claiming a byte-for-byte raw Codex request capture.

What I checked:

Likely related people:

  • pashpashpash: GitHub commit history for the central snapshot helper and generator shows that merged PR #75807, "Add Codex happy path prompt snapshots", introduced the current Codex happy-path prompt snapshot feature that this PR builds on. (role: prior feature contributor; confidence: high; commits: 563dca82f429, f8e2bd4f0102, 6fb1c0d539dd; files: test/helpers/agents/happy-path-prompt-snapshots.ts, scripts/generate-prompt-snapshots.ts, test/fixtures/agents/prompt-snapshots/happy-path)
  • steipete: Recent history on docs/concepts/system-prompt.md includes prompt and system-prompt maintenance work adjacent to the documented behavior this PR updates. (role: adjacent prompt/docs maintainer; confidence: medium; commits: 8f4cbbbe6658, 496a5eb56f46, 22bff819abd3; files: docs/concepts/system-prompt.md)
  • vincentkoc: Recent history on the Docker E2E boundary guard includes the lane-resource guard this PR adjusts for the live package-backed Codex lane. (role: adjacent Docker/package boundary maintainer; confidence: medium; commits: b9eb31b54cfa, edfef73ffceb, 6cba12caaec0; files: scripts/check-docker-e2e-boundaries.mjs, scripts/lib/docker-e2e-scenarios.mjs, scripts/lib/docker-e2e-plan.mjs)

Remaining risk / open question:

  • The exact-head check-additional-boundaries job, which includes the newly added prompt snapshot drift check, was still in progress at review time.
  • The PR commits an external Codex model prompt fixture, so a maintainer should confirm the pinned source/provenance is acceptable before merge.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fd83c49cffc0.

@pashpashpash pashpashpash force-pushed the codex/rendered-prompt-snapshots branch from 47068c5 to da8dc21 on May 2, 2026 20:13
@pashpashpash pashpashpash force-pushed the codex/rendered-prompt-snapshots branch from e23c131 to 2c569a5 on May 2, 2026 21:00
@pashpashpash pashpashpash merged commit 9e57b98 into main May 2, 2026
85 checks passed
@pashpashpash pashpashpash deleted the codex/rendered-prompt-snapshots branch May 2, 2026 21:40
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
* test: add Codex model prompt layers to snapshots

* test: keep rendered prompt snapshots raw

* test: check prompt snapshot drift in ci

* test: prefer codex model cache for prompt fixtures

* fix: exclude publishable plugin dist from core package
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
* test: add Codex model prompt layers to snapshots

* test: keep rendered prompt snapshots raw

* test: check prompt snapshot drift in ci

* test: prefer codex model cache for prompt fixtures

* fix: exclude publishable plugin dist from core package