Skip to content

test(codex): preserve mirrored transcript branch across turns#77051

Open
mjamiv wants to merge 2 commits intoopenclaw:mainfrom
mjamiv:test/codex-webchat-transcript-branch-regression
Open

test(codex): preserve mirrored transcript branch across turns#77051
mjamiv wants to merge 2 commits intoopenclaw:mainfrom
mjamiv:test/codex-webchat-transcript-branch-regression

Conversation

@mjamiv
Copy link
Copy Markdown
Contributor

@mjamiv mjamiv commented May 4, 2026

Summary

Test

  • pnpm test extensions/codex/src/app-server/transcript-mirror.test.ts --run

Refs #77012

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented May 4, 2026

Codex review: needs maintainer review before merge.

Summary
The PR adds a Codex transcript-mirror regression test that writes two scoped app-server turns and asserts the JSONL roles, contents, and parent chain are preserved.

Reproducibility: yes. for the PR coverage path: the added test performs two mirror writes with distinct scopes and inspects the resulting JSONL branch. I did not run local tests because this review is read-only, but the source path and PR-head checks support the scenario.

Next step before merge
No repair lane is needed because the prior mechanical blocker has been addressed and no new actionable patch defect was found.

Security
Cleared: The diff only adds plugin test coverage and does not alter runtime code, dependencies, workflows, secrets, or package execution paths.

Review details

Best possible solution:

Merge the focused regression test after exact-head CI finishes, and handle the WebChat overwrite behavior in the linked implementation issue or PR.

Do we have a high-confidence way to reproduce the issue?

Yes for the PR coverage path: the added test performs two mirror writes with distinct scopes and inspects the resulting JSONL branch. I did not run local tests because this review is read-only, but the source path and PR-head checks support the scenario.

Is this the best way to solve the issue?

Yes: for a test-only PR, the updated approach is boundary-clean and narrowly covers consecutive Codex mirror turns without adding a new public API or reaching into core internals.

What I checked:

  • Updated PR diff: The final PR head adds only one test in the Codex plugin transcript mirror test file and removes the earlier direct import from src/gateway/session-utils.fs.js. (extensions/codex/src/app-server/transcript-mirror.test.ts:122, f58124c5f38a)
  • Boundary policy: The scoped extension guide treats bundled plugins like third-party plugins and forbids importing core internals from src/**; the updated test imports only plugin SDK helpers and the local transcript mirror module. (extensions/AGENTS.md:3, f58124c5f38a)
  • Current implementation path: Current main mirrors app-server result snapshots with a per-turn idempotency scope and the mirror helper appends through the shared session transcript append path under a session write lock. (extensions/codex/src/app-server/run-attempt.ts:1778, 9dc3271efbd7)
  • Existing append behavior: appendSessionTranscriptMessage reads the current transcript leaf and appends the next message with parentId set to the prior leaf, which is the behavior the new regression test locks down for consecutive mirror calls. (src/config/sessions/transcript-append.ts:188, 9dc3271efbd7)
  • CI status: The exact PR head shows the previously failing check-additional-boundaries-c now successful; one unrelated broader check was still in progress at review time. (f58124c5f38a)
  • Related bug context: The PR references the open WebChat transcript-overwrite regression report, but it is regression coverage only and should not close that issue by itself.

Likely related people:

  • steipete: Recent GitHub path history shows repeated work on transcript-mirror.ts, transcript-append.ts, and the async session transcript changes that this regression test exercises. (role: recent maintainer and transcript mirror/append owner; confidence: high; commits: c02bf2f399db, f7ed29e11812, 6147e1b91d3e; files: extensions/codex/src/app-server/transcript-mirror.ts, src/config/sessions/transcript-append.ts)
  • vincentkoc: Recent history shows work on the Codex run-attempt caller path and gateway transcript reader behavior adjacent to the reported session-history loss. (role: recent adjacent maintainer; confidence: medium; commits: e9ca63cf06ec, eb1a0aa574b5, d122a3492d17; files: extensions/codex/src/app-server/run-attempt.ts, src/gateway/session-utils.fs.ts)

Remaining risk / open question:

  • This is regression coverage only; the broader WebChat overwrite report remains open and should be fixed or resolved separately.
  • One exact-head CI check was still in progress during this read-only review, although the prior boundary failure is now green.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 90d25d59c610.

@mjamiv
Copy link
Copy Markdown
Contributor Author

mjamiv commented May 4, 2026

Updated #77051 at f58124c to address the plugin-test core import blocker. The regression now stays inside the Codex extension test boundary and asserts the mirrored JSONL roles, text content, and parent chain directly.

Validation passed:

  • pnpm test extensions/codex/src/app-server/transcript-mirror.test.ts --run
  • pnpm exec oxfmt --check --threads=1 extensions/codex/src/app-server/transcript-mirror.test.ts
  • OPENCLAW_LOCAL_CHECK=0 pnpm check:changed -- --base upstream/main --head HEAD

@mjamiv
Copy link
Copy Markdown
Contributor Author

mjamiv commented May 4, 2026

Follow-up on the remaining red CI after f58124c: the local focused test, format, and changed-check lane passed, and the failed GitHub jobs are downstream of checks-node-agentic-control-plane-runtime timing out with no output in the gateway-server shard.

The log shows visible gateway-server tests passing, then Vitest hitting the 300s no-output watchdog, retrying, and hitting the same no-output watchdog again. checks-node-core failed because it aggregates that non-dist shard failure. I attempted to rerun failed jobs, but GitHub returned HTTP 403 because this account lacks repo admin rights for workflow reruns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants