fix(agents): dedupe transcript rewrite suffix replay#89387
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed June 2, 2026, 8:57 PM ET / 00:57 UTC. Summary PR surface: Source +81, Tests +173. Total +254 across 2 files. Reproducibility: yes. at source level. Current main replays every suffix entry after a transcript rewrite, and the linked issue provides persisted JSONL evidence of duplicate user rows and compaction chains; I did not execute the failing scenario in this read-only review. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Merge a narrow shared transcript-rewrite dedupe only after real persisted-session/live overflow proof is added or maintainers explicitly accept a proof override; keep the broader duplicate-transcript umbrella separate. Do we have a high-confidence way to reproduce the issue? Yes, at source level. Current main replays every suffix entry after a transcript rewrite, and the linked issue provides persisted JSONL evidence of duplicate user rows and compaction chains; I did not execute the failing scenario in this read-only review. Is this the best way to solve the issue? Yes, this is the best layer for the scoped bug: the shared transcript rewrite helper is the common seam used by context-engine maintenance and tool-result truncation. The implementation is deliberately narrower than prior attempts, but it still needs real behavior proof before merge. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 8b546facaf49. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +81, Tests +173. Total +254 across 2 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
Context-overflow recovery re-runs transcript suffix replay (rewriteTranscriptEntriesInSessionManager / rewriteTranscriptEntriesInState) without a replay-identity guard, so byte-identical role=user entries are re-appended on every recovery pass. The persisted session JSONL then accumulates duplicate messages, amplifying transcript growth and cascading into further overflows. Collapse only the proven overflow-clone pattern during replay: byte-identical role=user messages (whose timestamp distinguishes a recovery clone from a legitimate repeat) and byte-identical compactions (identity folds in the remapped firstKeptEntryId so distinct kept boundaries survive). Exact suffix history is preserved for assistant, tool-result, and custom entries, whose identical payloads can be legitimate history. Closes openclaw#66443
f0591e3 to
2e18181
Compare
Summary
Context-overflow recovery re-runs transcript suffix replay without a replay-identity guard, so byte-identical
role=userentries are re-appended on every recovery pass and the persisted session JSONL accumulates duplicate messages (issue #66443).role=usermessages and byte-identical compactions. Exact suffix history is preserved for assistant, tool-result, and custom entries.firstKeptEntryIdso two distinct compactions are not merged.Linked context
Closes #66443
Related #79150 — superseded prior attempt; this PR addresses its "include the kept boundary in compaction identity" review note.
Not maintainer-requested; selected from the
clawsweeper:queueable-fixbacklog.Real behavior proof (required for external PRs)
role=userentries in the persisted session JSONL (issue Overflow recovery duplicates role=user messages in session JSONL, amplifying transcript growth #66443 captured one unique message stored as three byte-identical copies).rewriteTranscriptEntriesInSessionManagerandrewriteTranscriptEntriesInStateagainst a real in-memorySessionManagerand persisted transcript-file-state — these functions are run for real, not mocked.node scripts/run-vitest.mjs src/agents/embedded-agent-runner/transcript-rewrite.test.tsmainand passes after the fix, and added tests prove legitimate repeats survive:role=userclone is collapsed to one entry, while (a) two user messages with distinct timestamps, (b) byte-identical repeated assistant/tool-result entries, and (c) two distinct compactions are all preserved.crabbox check:changedgate). A maintainer with a live setup can confirm end-to-end behavior, or applyproof: overridefor this logic-only dedup fix; I am happy to add a live capture if someone can reproduce an overflow.expected 2 to be 1) is the before-state captured from unfixedmain.Tests and validation
node scripts/run-vitest.mjs src/agents/embedded-agent-runner/transcript-rewrite.test.ts-> 10 passed (4 new).tool-result-truncation,context-engine-maintenance,compact.hooks,cli-runner.context-engine-> 126 passed.tsgo -p tsconfig.core.jsonand core test types: no errors in the changed files (two unrelated pre-existing errors insrc/config/io.tsandsrc/secrets/config-io.tsare present onmain).transcript-rewrite.test.ts: clone collapse (fails first withexpected 2 to be 1, passes after); distinct compactions preserved; legitimate repeated user messages (distinct timestamps) preserved; byte-identical repeated assistant/tool-result entries preserved.Risk checklist
merge-risk: session-stateconcern).role=usermessages and compactions only; the full-payload hash (including timestamp) distinguishes a recovery clone from a legitimate repeat; compaction identity includes the remappedfirstKeptEntryId; assistant/tool-result/custom entries are never deduped. Covered by the four regression tests above.Current review state
proof: override.