fix(agents): persist user turn before attempt failures by TurboTheTurtle · Pull Request #86764 · openclaw/openclaw

TurboTheTurtle · 2026-05-26T05:10:39Z

Real behavior proof

Behavior addressed: Internal ACP/CLI session-effects turns now persist the accepted user message into the active internal attempt transcript before the external runner can fail, and later successful transcript persistence reuses that same internal file without duplicating the user row or leaking the internal prompt into the visible session transcript.
Real environment tested: Local OpenClaw runtime modules from /Users/andy/openclaw/openclaw-86592 on macOS, writing real temporary OpenClaw session JSONL files through the production transcript persistence helpers.
Exact steps or command run after this patch: node --import tsx - <<'EOF' ... EOF runtime smoke that imports persistUserTurnTranscript and persistAcpTurnTranscript, starts with a visible session transcript, routes the current internal turn through sessionFileOverride, then appends the assistant reply with userAlreadyPersisted: true.
Evidence after fix:

{
  "internalSessionFileEndsWith": "internal-agent-runs/run-proof.jsonl",
  "visibleSessionFileEndsWith": "visible-session.jsonl",
  "afterUserSessionFileIsInternal": true,
  "internalMessages": [
    "user:sanitized internal user prompt",
    "assistant:sanitized internal assistant reply"
  ],
  "visibleMessages": [
    "user:existing visible session prompt"
  ],
  "visibleContainsInternalPrompt": false,
  "internalUserMessageCount": 1
}

I also ran the targeted ClawSweeper regression suites after this patch:

node scripts/run-vitest.mjs src/agents/command/attempt-execution.cli.test.ts src/commands/agent.test.ts src/agents/agent-command.live-model-switch.test.ts
Test Files  5 passed (5)
Tests  137 passed (137)

node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts
Test Files  2 passed (2)
Tests  190 passed (190)

Observed result after fix: The pre-run user message lands in internal-agent-runs/run-proof.jsonl, the ACP assistant append uses the same internal transcript with userAlreadyPersisted: true, the visible transcript retains only its original visible prompt, and the internal user message appears exactly once.
What was not tested: I did not launch a live external Claude/Codex CLI subprocess or a live ACP backend; this was a local OpenClaw runtime transcript smoke using real session persistence code, supplemented by targeted regression tests.

Summary

add a shared user-only transcript persistence helper
persist ACP and CLI user turns before the runner can throw
route internal session-effects persistence to the active internal transcript
skip duplicate user-row mirroring when the current turn was already persisted

Tests

node --import tsx - <<'EOF' ... EOF local OpenClaw runtime transcript smoke
node scripts/run-vitest.mjs src/agents/command/attempt-execution.cli.test.ts src/commands/agent.test.ts src/agents/agent-command.live-model-switch.test.ts
node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts

Fixes #86592

clawsweeper · 2026-05-26T05:13:04Z

Codex review: needs maintainer review before merge. Reviewed June 7, 2026, 11:02 PM ET / 03:02 UTC.

Summary
The PR adds pre-run user-turn transcript persistence for CLI and ACP attempts, routes internal session-effects writes to internal transcript files, and avoids duplicate user rows during later transcript mirroring.

PR surface: Source +121, Tests +460. Total +581 across 5 files.

Reproducibility: yes. Source inspection shows current main persists CLI/ACP user rows from post-attempt transcript paths, so an exception before those paths leaves the accepted prompt unwritten; I did not execute the repro because this review was read-only.

Review metrics: 1 noteworthy metric.

Session write timing: 2 runtime paths changed: CLI and ACP. The PR moves accepted user-turn persistence earlier on both runtime paths, which is the core compatibility and session-state fact to verify before merge.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Rerun the focused transcript persistence suites and git diff --check on the final merge ref before landing.

Risk before merge

[P1] This intentionally changes session-state timing: accepted user rows can become durable before backend completion, so the final merge ref should keep focused transcript tests in the land gate.
[P1] The diff touches the prompt persistence, before_agent_run, before_message_write, and redaction boundary; the latest source looks correct, but this remains a security-sensitive persistence surface.
[P1] The codegraph comment reports function-level overlap with other active agent-command/session PRs, so maintainers should recheck the actual merge result before landing.

Maintainer options:

Land With Focused Merge-Ref Proof (recommended)
Rerun the targeted agent-command, attempt-execution, CLI runner, auto-reply, and diff-check gates on the final merge ref before landing this session-timing change.
Pause For Overlap Sequencing
If maintainers intend to land overlapping agent-command/session PRs first, refresh this branch afterward and review only the resulting merge diff.
Accept The Timing Change
Maintainers may accept the new pre-run durability behavior once they are comfortable that prompt redaction and duplicate-suppression coverage are sufficient.

Next step before merge

No automated repair is indicated; the remaining action is maintainer validation of the final merge ref and sequencing with overlapping agent-command/session PRs.

Security
Cleared: No concrete security or supply-chain defect found; the diff changes durable prompt persistence timing but keeps CLI persistence behind before_agent_run and uses existing before_message_write/redaction hooks.

Review details

Best possible solution:

Land the narrow transcript-persistence fix after final merge-ref validation confirms CLI, ACP, internal session-effects, fallback retry, and hook/redaction behavior still match the intended transcript contract.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main persists CLI/ACP user rows from post-attempt transcript paths, so an exception before those paths leaves the accepted prompt unwritten; I did not execute the repro because this review was read-only.

Is this the best way to solve the issue?

Yes. The PR uses the existing approved user-turn recorder for CLI after before_agent_run, adds the same pre-run persistence for ACP, and uses userAlreadyPersisted/sessionFileOverride to avoid duplicate or visible/internal transcript drift.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 4780546c124d.

Label changes

Label justifications:

P1: The PR addresses lost accepted user turns that can corrupt session history and wedge agent workflows after attempt failures.
merge-risk: 🚨 session-state: The diff changes when and where user turns are persisted in session transcripts, including internal session-effects transcripts.
merge-risk: 🚨 security-boundary: The diff moves prompt persistence earlier in the run lifecycle and therefore depends on the hook/redaction boundary remaining intact.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix terminal output from a local runtime transcript smoke using production persistence helpers plus targeted regression output, which is sufficient for this internal transcript-persistence path.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal output from a local runtime transcript smoke using production persistence helpers plus targeted regression output, which is sufficient for this internal transcript-persistence path.

Evidence reviewed

PR surface:

Source +121, Tests +460. Total +581 across 5 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	3	167	46	+121
Tests	2	461	1	+460
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	5	628	47	+581

Acceptance criteria:

[P1] node scripts/run-vitest.mjs src/agents/command/attempt-execution.cli.test.ts src/commands/agent.test.ts src/agents/agent-command.live-model-switch.test.ts.
[P1] node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts.
[P1] git diff --check.

What I checked:

Repository policy read: Read the root repository AGENTS.md and the scoped src/agents/AGENTS.md; the review applied the session-state, security-boundary, and scoped agent testing guidance. (AGENTS.md:1, 4780546c124d)
Current main ACP failure window: On current main, ACP transcript persistence is called only after acpManager.runTurn completes, so an exception before that post-run block skips user-row persistence. (src/agents/agent-command.ts:944, 4780546c124d)
Current main CLI post-run mirroring: On current main, CLI transcript mirroring happens after runWithModelFallback returns; failures before that path do not call persistCliTurnTranscript. (src/agents/agent-command.ts:2010, 4780546c124d)
PR head pre-run user helper: At PR head, persistUserTurnTranscript reuses persistTextTurnTranscript with an empty assistant reply, while userAlreadyPersisted prevents the later mirror from appending the same user row twice. (src/agents/command/attempt-execution.ts:274, 762e7ada8bfc)
PR head CLI recorder wiring: At PR head, runAgentAttempt builds a UserTurnTranscriptRecorder from transcriptBody/body and passes it to the CLI runner after suppressPromptPersistenceOnRetry is considered. (src/agents/command/attempt-execution.ts:619, 762e7ada8bfc)
CLI runner boundary: The existing CLI runner persists the approved user turn only after before_agent_run handling and before llm_input/model execution, preserving the hook gate the PR depends on. (src/agents/cli-runner.ts:791, 4780546c124d)

Likely related people:

@shakkernerd: History shows Shakker refactored command transcript user-turn persistence and restored user-turn persistence checks shortly before this PR's linked issue and patch surface. (role: introduced adjacent user-turn persistence behavior; confidence: high; commits: 840cea5d6e36, 5268bf900e7b, 696fb41c5b2e; files: src/agents/command/attempt-execution.ts, src/agents/cli-runner.ts, src/sessions/user-turn-transcript.ts)
Andy Ye: The PR branch commits directly implement the new pre-run user persistence, internal transcript routing, cwd preservation, and hook gating on the affected files. (role: current PR author and branch implementer; confidence: high; commits: 8b5b2edba500, 8ff6d0b4a34f, 22dbafb92060; files: src/agents/agent-command.ts, src/agents/command/attempt-execution.ts, src/agents/command/attempt-execution.cli.test.ts)
@steipete: Recent main history shows Peter Steinberger repeatedly touched the same agent-command, ACP, and session transcript surfaces through refactors and docs around the affected area. (role: recent adjacent owner; confidence: medium; commits: 77f1359612f6, 7dea2837565e, 6c113837b88c; files: src/agents/agent-command.ts, src/agents/command/attempt-execution.ts)
Marvinthebored: A recent merged change stabilized user-turn serialization across turns and touched the same attempt-execution and user-turn transcript helpers. (role: recent adjacent contributor; confidence: medium; commits: 1af55bc6654f; files: src/agents/command/attempt-execution.ts, src/sessions/user-turn-transcript.ts, src/agents/embedded-agent-runner/run/attempt.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

TurboTheTurtle · 2026-05-26T05:17:44Z

@clawsweeper re-review

clawsweeper · 2026-05-26T05:17:46Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26433802370
Updated: 2026-05-26T05:26:08.403Z

clawsweeper · 2026-05-26T05:26:01Z

ClawSweeper PR egg

✨ Hatched: ✨ glimmer Brave Proofling

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: ✨ glimmer.
Trait: sniffs out flaky tests.
Image traits: location green-check meadow; accessory green check lantern; palette moonlit blue and soft silver; mood sparkly; pose curling around a status light; shell glossy opal shell; lighting warm desk-lamp glow; background subtle branch markers.
Share on X: post this hatch
Copy: My PR egg hatched a ✨ glimmer Brave Proofling in ClawSweeper.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

TurboTheTurtle · 2026-05-26T05:52:04Z

@clawsweeper re-review

clawsweeper · 2026-05-26T05:52:07Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26434935263
Updated: 2026-05-26T06:00:21.285Z

TurboTheTurtle · 2026-06-05T00:54:01Z

@clawsweeper re-review

clawsweeper · 2026-06-05T00:54:03Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26988716384
Updated: 2026-06-05T01:03:37.447Z

TurboTheTurtle · 2026-06-05T01:32:57Z

@clawsweeper re-review

clawsweeper · 2026-06-05T01:33:00Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26989977662
Updated: 2026-06-05T01:41:56.093Z

openclaw-barnacle Bot added commands Command implementations agents Agent runtime and tooling size: M triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 26, 2026

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 26, 2026

openclaw-barnacle Bot added size: L and removed size: M proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 26, 2026

This comment was marked as spam.

Sign in to view

RomneyDa added the dependency-guard-backfill label May 29, 2026

clawsweeper Bot removed proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 29, 2026

clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 29, 2026

RomneyDa removed the dependency-guard-backfill label May 29, 2026

openperf mentioned this pull request May 29, 2026

fix(agents): drop stale exec approval followups after session rebind #85679

Merged

4 tasks

TurboTheTurtle force-pushed the fix/persist-user-turn-on-attempt-failure-86592 branch from c37df1b to 6574b73 Compare May 31, 2026 19:54

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 5, 2026

TurboTheTurtle added 4 commits June 7, 2026 19:51

fix(agents): persist user turn before attempt failures

8b5b2ed

fix(agents): route internal turn persistence to active transcript

8ff6d0b

fix(agents): preserve cwd for pre-run CLI persistence

22dbafb

fix(agents): gate CLI user persistence behind hooks

762e7ad

TurboTheTurtle force-pushed the fix/persist-user-turn-on-attempt-failure-86592 branch from b98f590 to 762e7ad Compare June 8, 2026 02:52

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 8, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents): persist user turn before attempt failures#86764

fix(agents): persist user turn before attempt failures#86764
TurboTheTurtle wants to merge 4 commits into
openclaw:mainfrom
TurboTheTurtle:fix/persist-user-turn-on-attempt-failure-86592

TurboTheTurtle commented May 26, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 26, 2026 •

edited

Loading

Uh oh!

TurboTheTurtle commented May 26, 2026

Uh oh!

clawsweeper Bot commented May 26, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 26, 2026 •

edited

Loading

Uh oh!

TurboTheTurtle commented May 26, 2026

Uh oh!

clawsweeper Bot commented May 26, 2026 •

edited

Loading

Uh oh!

This comment was marked as spam.

TurboTheTurtle commented Jun 5, 2026

Uh oh!

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

TurboTheTurtle commented Jun 5, 2026

Uh oh!

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

TurboTheTurtle commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Real behavior proof

Summary

Tests

Uh oh!

clawsweeper Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TurboTheTurtle commented May 26, 2026

Uh oh!

clawsweeper Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clawsweeper Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Hatch command

Uh oh!

TurboTheTurtle commented May 26, 2026

Uh oh!

clawsweeper Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as spam.

TurboTheTurtle commented Jun 5, 2026

Uh oh!

clawsweeper Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TurboTheTurtle commented Jun 5, 2026

Uh oh!

clawsweeper Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

TurboTheTurtle commented May 26, 2026 •

edited

Loading

clawsweeper Bot commented May 26, 2026 •

edited

Loading

clawsweeper Bot commented May 26, 2026 •

edited

Loading

clawsweeper Bot commented May 26, 2026 •

edited

Loading

clawsweeper Bot commented May 26, 2026 •

edited

Loading

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading

clawsweeper Bot commented Jun 5, 2026 •

edited

Loading