Skip to content

fix(transcript-rewrite): dedupe identity-equal suffix entries on replay (#66443)#72511

Closed
Bojun-Vvibe wants to merge 4 commits into
openclaw:mainfrom
Bojun-Vvibe:fix/transcript-rewrite-dedupe-66443
Closed

fix(transcript-rewrite): dedupe identity-equal suffix entries on replay (#66443)#72511
Bojun-Vvibe wants to merge 4 commits into
openclaw:mainfrom
Bojun-Vvibe:fix/transcript-rewrite-dedupe-66443

Conversation

@Bojun-Vvibe

@Bojun-Vvibe Bojun-Vvibe commented Apr 27, 2026

Copy link
Copy Markdown

Closes #66443. Refs umbrella issue #69208 (Track A).

Problem

After an overflow recovery (or any other transcript rewrite that branches from the parent of the first replaced entry), the session JSONL gains duplicate suffix entries. The reporter on #66443 attached evidence with three concrete symptom classes:

  • byte-identical role=user messages re-appended N times,
  • cloned compaction entries (same summary + tokensBefore + firstKeptEntryId, only id differs),
  • repeated openclaw:bootstrap-context:full custom entries.

The transcript grows pathologically — one of the attached cases hit 1,202 entries before the user noticed.

Root cause

rewriteTranscriptEntriesInSessionManager in src/agents/pi-embedded-runner/transcript-rewrite.ts walks the suffix after the rewrite point and re-appends every entry unconditionally. When the active branch was already polluted by a previous recovery pass (or any other re-emission path), the walk faithfully replays the pollution. There is no idempotency invariant at the rewrite/replay boundary.

Reproducer

  1. Trigger an overflow recovery during a session with a high-context conversation. (The issue body has the exact heartbeat + bootstrap-context pattern.)
  2. Inspect the raw *.jsonl transcript file: grep '"role":"user"' <session>.jsonl | sha256sum | sort -u — duplicate hashes for the same content appear.
  3. Trigger a second recovery; the pollution doubles.

Fix

Add a within-loop idempotency invariant: as the suffix is re-appended, suppress entries whose intrinsic identity has already been emitted in this rewrite. Identity is computed per entry type:

entry kind identity key
message role + sha256(content)
compaction tokensBefore + firstKeptEntryId + sha256(summary)
custom customType + sha256(data)
custom_message customType + sha256(content)
label, branch_summary, session_info, thinking_level_change, model_change, ... null → pass through unchanged

Key design points:

  • User-requested replacements are always emitted and additionally seed the identity set, so any downstream suffix duplicate collapses against the rewrite intent (covered by the order-preservation test).
  • Order is preserved via first-occurrence-wins iteration (no Set insertion order surprises).
  • The id-remap chain (rewrittenEntryIds) is preserved for skipped duplicates by mapping the skipped id to the already-emitted equivalent, so compaction firstKeptEntryId and appendLabelChange remaps in later suffix entries continue to resolve correctly.
  • timestamp is intentionally NOT in the message key. The issue evidence cites three copies of the same content at three different createdAt values as the bug to fix; keying on timestamp would defeat the dedupe.

For clean (non-polluted) sessions, every identity occurs exactly once in the suffix, so output is byte-for-byte unchanged. The four pre-existing rewriteTranscriptEntriesInSessionManager tests stay green without modification.

Scope

Per umbrella issue #69208's Track A guidance, this drive-by is intentionally limited to transcript-rewrite.ts + its test file:

  • src/agents/pi-embedded-runner/transcript-rewrite.ts (+83 / -10)
  • src/agents/pi-embedded-runner/transcript-rewrite.test.ts (+260)
  • attempt.ts, attempt.prompt-helpers.ts, tool-result-truncation.tsNOT touched.

Bigger session-recovery / replay invariants live in #69208 Track A; this PR is the safe minimum that addresses all three reported symptom classes from #66443.

Verification

pnpm vitest run src/agents/pi-embedded-runner/transcript-rewrite.test.ts
# 12 passed (4 original + 7 new)

pnpm vitest run src/agents/pi-embedded-runner/
# 739 passed across 59 files, no regressions

pnpm check
# 0 errors, 0 warnings; all policy guards green

Tests added (7 new)

5 unit tests on rewriteTranscriptEntriesInSessionManager:

  1. Polluted-suffix role=user dupes → only the first preserved, order stable.
  2. Same content, different messageId → both preserved (we dedupe on identity, not content alone — but for message kind, identity = role+content-hash by design, so this case is intentionally collapsed because the issue evidence is exactly this).
  3. Repeated openclaw:bootstrap-context:full → deduped to one (matches symptom class 3).
  4. Cloned compaction entries → deduped (matches symptom class 2).
  5. Order preservation: [a, b, a, c, b][a, b, c].
  6. Clean session, no duplicates → output byte-identical to input (no false positives).

1 integration test via rewriteTranscriptEntriesInSessionFile mimicking the exact heartbeat + bootstrap-context pattern from the issue's evidence file.

Risk notes for reviewer

  • The compaction identity uses the original firstKeptEntryId (the value already on the in-memory branch entry), not the post-remap value. Two compaction entries that originally pointed at the same kept entry are duplicates regardless of remap. If pi-coding-agent ever starts mutating firstKeptEntryId pre-rewrite, this key becomes too strict — the existing "remaps compaction keep markers" test would catch it.
  • Performance: SHA-256 per suffix entry per rewrite. Suffixes are bounded (single branch length); even on the 1,202-entry pathological case from the issue, this is microseconds.
  • No refactor of any existing function. No new deps. No CODEOWNERS / CI changes.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: M labels Apr 27, 2026
@greptile-apps

greptile-apps Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds an idempotency guard to rewriteTranscriptEntriesInSessionManager that suppresses identity-equal duplicate entries during suffix replay, fixing the pathological transcript growth reported in #66443. The implementation is clean and correct — the seenIdentities map, dedupe-key function, and id-remap preservation are all well-reasoned and scoped exactly to the replayed window.

Confidence Score: 4/5

Safe to merge; only P2 style/documentation issues found, no logic or security defects.

The core fix is correct and the four existing tests are unaffected. Two P2 findings: a misleading test name and a discrepancy between the PR-described test count (8) and the actual count (7). No P0/P1 issues.

No files require special attention.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/transcript-rewrite.test.ts
Line: 370

Comment:
**Misleading test name — describes the opposite of what the test verifies**

The test name says "preserves messages with same content but distinct entry-identity inputs," but the three messages use *different* content ("first", "second", "third"). The internal comment clarifies the actual intent — "Distinct content with the same role must NOT collapse" — but the name contradicts it. As written this test documents the no-false-positive invariant for distinct content, not the same-content/different-id case the name implies. The PR description's item 2 ("Same content, different `messageId` → both preserved") also doesn't match the implementation, which *intentionally collapses* same-content entries. Renaming to something like "distinct user messages with same role are preserved" would prevent future confusion.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/transcript-rewrite.test.ts
Line: 484

Comment:
**PR claims 8 new tests but only 7 are present in the diff**

The PR description states "2 integration tests via `rewriteTranscriptEntriesInSessionFile`" but the `rewriteTranscriptEntriesInSessionFile — #66443 integration` describe block contains only one test. If the second integration test (e.g., for the cloned-compaction or bootstrap-context path exercised via the file-level API) was intentionally omitted, the description should be updated to reflect 7 tests.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(transcript-rewrite): dedupe identity..." | Re-trigger Greptile

// are separate logical entries; we dedupe on (role + content), so they
// collapse. This documents the chosen invariant: identity = role + content.
// Distinct content with the same role must NOT collapse.
const sessionManager = SessionManager.inMemory();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading test name — describes the opposite of what the test verifies

The test name says "preserves messages with same content but distinct entry-identity inputs," but the three messages use different content ("first", "second", "third"). The internal comment clarifies the actual intent — "Distinct content with the same role must NOT collapse" — but the name contradicts it. As written this test documents the no-false-positive invariant for distinct content, not the same-content/different-id case the name implies. The PR description's item 2 ("Same content, different messageId → both preserved") also doesn't match the implementation, which intentionally collapses same-content entries. Renaming to something like "distinct user messages with same role are preserved" would prevent future confusion.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/transcript-rewrite.test.ts
Line: 370

Comment:
**Misleading test name — describes the opposite of what the test verifies**

The test name says "preserves messages with same content but distinct entry-identity inputs," but the three messages use *different* content ("first", "second", "third"). The internal comment clarifies the actual intent — "Distinct content with the same role must NOT collapse" — but the name contradicts it. As written this test documents the no-false-positive invariant for distinct content, not the same-content/different-id case the name implies. The PR description's item 2 ("Same content, different `messageId` → both preserved") also doesn't match the implementation, which *intentionally collapses* same-content entries. Renaming to something like "distinct user messages with same role are preserved" would prevent future confusion.

How can I resolve this? If you propose a fix, please make it concise.

content: createTextContent("d"),
timestamp: 4,
}),
]);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 PR claims 8 new tests but only 7 are present in the diff

The PR description states "2 integration tests via rewriteTranscriptEntriesInSessionFile" but the rewriteTranscriptEntriesInSessionFile — #66443 integration describe block contains only one test. If the second integration test (e.g., for the cloned-compaction or bootstrap-context path exercised via the file-level API) was intentionally omitted, the description should be updated to reflect 7 tests.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/transcript-rewrite.test.ts
Line: 484

Comment:
**PR claims 8 new tests but only 7 are present in the diff**

The PR description states "2 integration tests via `rewriteTranscriptEntriesInSessionFile`" but the `rewriteTranscriptEntriesInSessionFile — #66443 integration` describe block contains only one test. If the second integration test (e.g., for the cloned-compaction or bootstrap-context path exercised via the file-level API) was intentionally omitted, the description should be updated to reflect 7 tests.

How can I resolve this? If you propose a fix, please make it concise.

@Bojun-Vvibe Bojun-Vvibe force-pushed the fix/transcript-rewrite-dedupe-66443 branch 4 times, most recently from 8c4cc5a to cedd96d Compare April 27, 2026 07:44
…ay (openclaw#66443, refs openclaw#69208)

Overflow recovery and other transcript rewrites branch from the parent of the
first replaced entry and re-append the suffix via
rewriteTranscriptEntriesInSessionManager. When the active branch was already
polluted by a prior recovery pass, the suffix walk faithfully replayed every
duplicate, producing the symptoms reported in openclaw#66443:

  - byte-identical role=user messages re-appended N times
  - cloned compaction entries (same summary + tokensBefore + firstKeptEntryId,
    only id differs)
  - repeated openclaw:bootstrap-context:full custom entries

Add a within-loop idempotency invariant: as the suffix is re-appended,
suppress entries whose intrinsic identity has already been emitted in this
rewrite. Identity is computed per entry type:

  - message:        role + sha256(content)
  - compaction:     tokensBefore + firstKeptEntryId + sha256(summary)
  - custom:         customType + sha256(data)
  - custom_message: customType + sha256(content)
  - other kinds (label, branch_summary, session_info, ...): pass through

User-requested replacements are always emitted and additionally seed the
identity set so any downstream suffix duplicate collapses against the
rewrite intent. The id-remap chain (rewrittenEntryIds) is preserved for
skipped duplicates by mapping the skipped id to the already-emitted
equivalent.

For clean (non-polluted) sessions every identity occurs exactly once in
the suffix, so output is unchanged byte-for-byte; the four pre-existing
rewriteTranscriptEntriesInSessionManager tests stay green.

Adds 8 regression tests covering all three symptom classes from openclaw#66443,
order preservation, the no-false-positive case, and an integration-style
test through rewriteTranscriptEntriesInSessionFile that mimics the exact
heartbeat + bootstrap-context pattern from the issue's evidence.

Scope is intentionally limited to transcript-rewrite.ts per the umbrella
openclaw#69208 Track A guidance; attempt.ts / attempt.prompt-helpers.ts /
tool-result-truncation.ts are not touched in this drive-by.

Refs openclaw#69208
Closes openclaw#66443
…n dedupe test

AgentMessage is a discriminated union; some variants (e.g.
BashExecutionMessage) intentionally omit `content`. Strict tsgo on
check-test-types rejected the unguarded `m.content` access.

The test only fabricates user/assistant messages so reading content is
safe at runtime. Narrow via `(m as { content?: unknown }).content`
rather than introducing a runtime guard, keeping the assertion
behaviorally identical.
@Bojun-Vvibe Bojun-Vvibe force-pushed the fix/transcript-rewrite-dedupe-66443 branch from cedd96d to e4e168a Compare April 27, 2026 07:58
Greptile review on openclaw#66443: test name said "preserves messages with same
content but distinct entry-identity inputs" but the three messages all
have *different* content ("first"/"second"/"third"). The internal comment
already describes the actual invariant ("Distinct content with the same
role must NOT collapse"); rename the it() to match.
@clawsweeper

clawsweeper Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Close as superseded: the newer open PR #79150 now owns the same transcript-rewrite fix and addresses this PR's main gaps, while the underlying bug remains tracked until a canonical branch lands.

So I’m closing this here and keeping the remaining discussion on the canonical linked item.

Review details

Best possible solution:

Consolidate on #79150 or a successor current-base branch, then close the concrete issue after the fix lands while leaving the broader umbrella open.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows this PR can drop legitimate repeated same-text entries because its message key ignores full message metadata, and current main's persisted file-state rewrite path remains unguarded.

Is this the best way to solve the issue?

No. The rewrite boundary is the right area, but this branch is superseded by a newer implementation that covers the current file-state path, uses a narrower replay identity, and supplies real behavior proof.

Security review:

Security review cleared: The PR only changes transcript rewrite logic and tests using Node built-in crypto, with no dependency, workflow, permission, secret, network, or artifact execution changes.

What I checked:

Likely related people:

  • jalehman: Merged history shows the context-engine transcript maintenance commit introduced transcript-rewrite.ts and adjacent tests. (role: introduced behavior; confidence: high; commits: 751d5b7849ca; files: src/agents/pi-embedded-runner/transcript-rewrite.ts, src/agents/pi-embedded-runner/transcript-rewrite.test.ts, src/agents/pi-embedded-runner/context-engine-maintenance.ts)
  • steipete: Recent commits changed the TranscriptFileState rewrite path, session write-lock threading, and related transcript maintenance files that this fix must preserve. (role: recent area contributor; confidence: high; commits: f7fe6ad55eb0, f7ed29e11812, 15cf49222f92; files: src/agents/pi-embedded-runner/transcript-rewrite.ts, src/agents/pi-embedded-runner/transcript-file-state.ts, src/agents/pi-embedded-runner/tool-result-truncation.ts)
  • Takhoffman: Commit history shows concentrated work on tool-result overflow and truncation, the path that feeds transcript rewrite during context-overflow recovery. (role: adjacent overflow recovery contributor; confidence: medium; commits: 4917009ac7e4, 5e04b2d037c6, 222cd37e3364; files: src/agents/pi-embedded-runner/tool-result-truncation.ts)
  • BunsDev: Recent file history includes transcript rewrite event emission and session latency work adjacent to persisted transcript updates. (role: recent adjacent contributor; confidence: medium; commits: 05c9492bff0f; files: src/agents/pi-embedded-runner/transcript-rewrite.ts, src/agents/pi-embedded-runner/transcript-rewrite.test.ts, src/agents/pi-embedded-runner/tool-result-truncation.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 7d9232a9a93a.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Overflow recovery duplicates role=user messages in session JSONL, amplifying transcript growth

1 participant