Improve Codex happy path prompt snapshots #76229
Codex review: needs maintainer review before merge.
Summary
Reproducibility: not applicable, as this is a feature/test-fixture PR. The current-main gap is reproducible by reading the existing docs and snapshot helper: they explicitly omit the Codex model prompt layer and render only OpenClaw-owned app-server layers.
Next step before merge
Security
Review details
Best possible solution: Have a maintainer review the protected PR, confirm the Codex fixture provenance, and merge once the exact-head boundary checks are green.
Do we have a high-confidence way to reproduce the issue? Not applicable, as this is a feature/test-fixture PR. The current-main gap is reproducible by reading the existing docs and snapshot helper: they explicitly omit the Codex model prompt layer and render only OpenClaw-owned app-server layers.
Is this the best way to solve the issue? Yes, subject to maintainer sign-off. A pinned fixture with source metadata and explicit reconstructed-layer wording is a narrow, maintainable solution that avoids claiming byte-for-byte raw Codex request capture.
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against fd83c49cffc0.
47068c5 to da8dc21
e23c131 to 2c569a5
* test: add Codex model prompt layers to snapshots
* test: keep rendered prompt snapshots raw
* test: check prompt snapshot drift in ci
* test: prefer codex model cache for prompt fixtures
* fix: exclude publishable plugin dist from core package
This builds on the prompt snapshot work by making the Codex happy-path fixtures show the part we actually care about during audits: the model-bound layer stack, not just the OpenClaw app-server payload.
Before this, the snapshots showed selected thread and turn params, OpenClaw developer instructions, user input, and dynamic tools. That was useful, but it left the most important upstream layer implicit: the Codex `gpt-5.5` model instructions that Codex resolves from its model catalog or `models_cache.json`.
This PR pins a pragmatic `gpt-5.5` Codex prompt fixture generated from Codex's runtime model cache shape, records source metadata, and renders it into the Telegram direct, Discord group, and heartbeat happy-path snapshots alongside the Codex permission developer text, OpenClaw developer instructions, user input, dynamic tool references, and rough layer token estimates. The snapshots still call out the remaining runtime-owned gap clearly: this is a deterministic reconstructed layer view, not a byte-for-byte raw OpenAI request capture from Codex core.
I also added `pnpm prompt:snapshots:sync-codex-model` so maintainers can refresh the pinned Codex prompt from Codex's normal `$CODEX_HOME/models_cache.json` location, from the default `~/.codex/models_cache.json`, or from an explicit local Codex checkout/catalog path. If none of those default sources exist, the command now exits cleanly without changing the committed fixture, because most contributors should not need a Codex checkout just to work in this repo.
The prompt snapshot drift check now runs inside CI's additional boundary shard. That means a PR that changes prompt composition or the pinned Codex prompt fixture has to carry the regenerated snapshots with it, instead of letting the committed audit artifacts silently drift.
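For reviewers who want a mental model of the sync command's source lookup, here is a minimal sketch. It is not the code in this PR: the function name, the options shape, and the precedence order (explicit path first, then `$CODEX_HOME`, then `~/.codex`) are assumptions made for illustration. The file-existence check is injected so the logic can be exercised without touching the disk.

```typescript
// Hypothetical sketch: resolveCodexModelSource and SourceOptions are
// illustrative names, and the precedence order below is an assumption.
interface SourceOptions {
  explicitPath?: string; // explicit local Codex checkout/catalog path, if given
  codexHome?: string; // value of $CODEX_HOME, if set
  homeDir: string; // the user's home directory
  exists: (path: string) => boolean; // injected file-existence check
}

function resolveCodexModelSource(options: SourceOptions): string | null {
  const candidates: Array<string | undefined> = [
    options.explicitPath,
    options.codexHome ? `${options.codexHome}/models_cache.json` : undefined,
    `${options.homeDir}/.codex/models_cache.json`,
  ];
  for (const candidate of candidates) {
    if (candidate !== undefined && options.exists(candidate)) {
      return candidate;
    }
  }
  // No source found: the caller should leave the committed fixture untouched.
  return null;
}
```

Under this sketch, a `null` result corresponds to the clean-exit case: the command would report that there is nothing to sync and return success without rewriting the pinned fixture.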
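The drift check can be pictured as a comparison between the committed snapshots and a fresh regeneration. The sketch below is an assumption about the general shape of such a check, not the script added in this PR; `findSnapshotDrift` and the snapshot file names are hypothetical.

```typescript
// Hypothetical sketch of a snapshot drift check. A snapshot set maps a
// snapshot file name to its rendered text.
type SnapshotSet = Record<string, string>;

function findSnapshotDrift(committed: SnapshotSet, regenerated: SnapshotSet): string[] {
  // Consider every name on either side so added and removed snapshots count as drift.
  const names = new Set<string>([...Object.keys(committed), ...Object.keys(regenerated)]);
  return Array.from(names)
    .filter((name) => committed[name] !== regenerated[name])
    .sort();
}

// Example with made-up file names: one snapshot changed, one is missing from the commit.
const drifted = findSnapshotDrift(
  { "telegram-direct.md": "layers v1", "discord-group.md": "layers v1" },
  { "telegram-direct.md": "layers v2", "discord-group.md": "layers v1", "heartbeat.md": "layers v1" },
);
console.log(drifted);
```

A CI step built on this idea would fail whenever the returned list is non-empty, which is what forces a PR to carry its regenerated snapshots.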
While running the widened local changed gate, the Docker E2E boundary guard exposed an existing mismatch for `live-codex-npm-plugin`: that lane is intentionally both live and package-backed. The PR keeps that exception explicit so script lint remains aligned with the existing Docker plan tests.
After rebasing over the plugin externalization work on `main`, this also keeps the root package `files` exclusions for the publishable `acpx`, `googlechat`, and `line` plugin dist trees. Those plugin builds have their own package paths and should not be swept into the core OpenClaw npm tarball.