fix(sessions): dedupe redundant delivery mirrors#67185
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a814c8f564
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const candidateText = extractAssistantMessageText(candidate); | ||
| if (candidateText !== expectedText) { | ||
| continue; | ||
| } | ||
| if (typeof parsed.id === "string" && parsed.id) { |
There was a problem hiding this comment.
Restrict mirror dedupe to latest assistant message
This dedupe returns an existing ID for any prior assistant message with matching text, so a repeated assistant reply on a later turn (with no idempotency key) will be treated as a duplicate and not appended. In practice, common repeated outputs like "Done."/"OK" can collapse distinct turns into one transcript entry, which drops history and breaks message sequencing for consumers. The scan should only dedupe against the immediately latest assistant artifact for the current turn (or another turn-scoped signal), not any historical text match.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR adds a text-based dedup guard to
Confidence Score: 3/5Not safe to merge as-is: the dedup scan can match across conversation turns, silently dropping delivery mirrors when reply text repeats. One P1 logic bug: src/config/sessions/transcript.ts — specifically the Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/config/sessions/transcript.ts
Line: 301-325
Comment:
**Scan continues past non-matching assistant messages — allows cross-turn false positives**
The loop `continue`s on both non-assistant entries and non-matching assistant entries, so it searches the entire transcript backward. If the most-recent assistant message doesn't match the delivery mirror text but an older one (from a previous turn) does, that stale ID is returned and the delivery mirror is silently suppressed for the current turn.
Concrete failure: a repeated response such as "I'm sorry, I can't help with that" appended by the non-Codex path will match an earlier occurrence from a different turn. The delivery mirror is dropped, `mirrorResult.messageId` points to the old turn's entry, and downstream code relying on per-turn IDs (e.g. Mattermost dedup tracking) may silently lose the current reply.
The fix is to stop scanning as soon as the first non-matching assistant message is encountered, rather than continuing backward:
```suggestion
const candidateText = extractAssistantMessageText(candidate);
if (candidateText !== expectedText) {
return undefined;
}
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(sessions): dedupe redundant delivery..." | Re-trigger Greptile |
| for (let index = lines.length - 1; index >= 0; index -= 1) { | ||
| const line = lines[index]; | ||
| if (!line.trim()) { | ||
| continue; | ||
| } | ||
| try { | ||
| const parsed = JSON.parse(line) as { | ||
| id?: unknown; | ||
| message?: SessionTranscriptAssistantMessage; | ||
| }; | ||
| const candidate = parsed.message; | ||
| if (!candidate || candidate.role !== "assistant") { | ||
| continue; | ||
| } | ||
| const candidateText = extractAssistantMessageText(candidate); | ||
| if (candidateText !== expectedText) { | ||
| continue; | ||
| } | ||
| if (typeof parsed.id === "string" && parsed.id) { | ||
| return parsed.id; | ||
| } | ||
| } catch { | ||
| continue; | ||
| } | ||
| } |
There was a problem hiding this comment.
Scan continues past non-matching assistant messages — allows cross-turn false positives
The loop continues on both non-assistant entries and non-matching assistant entries, so it searches the entire transcript backward. If the most-recent assistant message doesn't match the delivery mirror text but an older one (from a previous turn) does, that stale ID is returned and the delivery mirror is silently suppressed for the current turn.
Concrete failure: a repeated response such as "I'm sorry, I can't help with that" appended by the non-Codex path will match an earlier occurrence from a different turn. The delivery mirror is dropped, mirrorResult.messageId points to the old turn's entry, and downstream code relying on per-turn IDs (e.g. Mattermost dedup tracking) may silently lose the current reply.
The fix is to stop scanning as soon as the first non-matching assistant message is encountered, rather than continuing backward:
| for (let index = lines.length - 1; index >= 0; index -= 1) { | |
| const line = lines[index]; | |
| if (!line.trim()) { | |
| continue; | |
| } | |
| try { | |
| const parsed = JSON.parse(line) as { | |
| id?: unknown; | |
| message?: SessionTranscriptAssistantMessage; | |
| }; | |
| const candidate = parsed.message; | |
| if (!candidate || candidate.role !== "assistant") { | |
| continue; | |
| } | |
| const candidateText = extractAssistantMessageText(candidate); | |
| if (candidateText !== expectedText) { | |
| continue; | |
| } | |
| if (typeof parsed.id === "string" && parsed.id) { | |
| return parsed.id; | |
| } | |
| } catch { | |
| continue; | |
| } | |
| } | |
| const candidateText = extractAssistantMessageText(candidate); | |
| if (candidateText !== expectedText) { | |
| return undefined; | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/sessions/transcript.ts
Line: 301-325
Comment:
**Scan continues past non-matching assistant messages — allows cross-turn false positives**
The loop `continue`s on both non-assistant entries and non-matching assistant entries, so it searches the entire transcript backward. If the most-recent assistant message doesn't match the delivery mirror text but an older one (from a previous turn) does, that stale ID is returned and the delivery mirror is silently suppressed for the current turn.
Concrete failure: a repeated response such as "I'm sorry, I can't help with that" appended by the non-Codex path will match an earlier occurrence from a different turn. The delivery mirror is dropped, `mirrorResult.messageId` points to the old turn's entry, and downstream code relying on per-turn IDs (e.g. Mattermost dedup tracking) may silently lose the current reply.
The fix is to stop scanning as soon as the first non-matching assistant message is encountered, rather than continuing backward:
```suggestion
const candidateText = extractAssistantMessageText(candidate);
if (candidateText !== expectedText) {
return undefined;
}
```
How can I resolve this? If you propose a fix, please make it concise.|
Live repro evidence from a real Mattermost install using
Why this lines up with the source:
Local verification before opening this PR:
So this is not just a Mattermost delivery quirk. The duplicate visible reply is downstream of a real redundant transcript append on Codex-backed turns. |
|
Reviewer note on the added test: The new It first appends a real assistant transcript entry with
So the assertion is specifically guarding the redundant second write that appears on Codex-backed turns. |
a814c8f to
1a10d95
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a10d95fac
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!candidate || candidate.role !== "assistant") { | ||
| continue; |
There was a problem hiding this comment.
Abort mirror dedupe when trailing entry is not assistant
Fresh evidence: the reverse scan skips non-assistant entries, so in a normal turn boundary (assistant from prior turn, then user, then new mirror send) it can walk past the latest user line and reuse the older assistant ID if the text repeats (for example "OK"), which drops the current turn’s assistant artifact. This still collapses repeated answers across turns; dedupe should only apply when the most recent transcript message itself is the matching assistant.
Useful? React with 👍 / 👎.
| const latestEquivalentAssistantId = isRedundantDeliveryMirror(params.message) | ||
| ? await findLatestEquivalentAssistantMessageId(sessionFile, params.message) | ||
| : undefined; |
There was a problem hiding this comment.
Respect explicit idempotency keys before text dedupe
After an explicit idempotencyKey lookup misses, the new text-based dedupe still runs for delivery-mirror and can return an older message ID based only on text. That suppresses appending a new transcript entry and never records the new key, so distinct sends with different idempotency keys can collapse into one message. The text dedupe path should be gated off when a caller supplies an explicit key (or include the key in the equivalence check).
Useful? React with 👍 / 👎.
1a10d95 to
1fea15d
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1fea15ddd6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const parts = message.content | ||
| .filter( | ||
| ( | ||
| part, | ||
| ): part is { | ||
| type: "text"; | ||
| text: string; | ||
| } => part.type === "text" && typeof part.text === "string" && part.text.trim().length > 0, | ||
| ) |
There was a problem hiding this comment.
Ignore commentary phases in mirror text comparison
The new text matcher concatenates all assistant text blocks, so phased messages that include hidden commentary are compared as commentary + final instead of the user-visible final answer. In Codex/OpenAI phased transcripts, appendExactAssistantMessageToSessionTranscript can store both commentary and final_answer blocks, while the delivery mirror text is only the visible final output; this makes candidateText !== expectedText and the dedupe path appends a duplicate mirror anyway. This means the duplicate-prevention fix misses a real production shape of assistant messages.
Useful? React with 👍 / 👎.
1fea15d to
b40ec19
Compare
|
Landed via rebase merge. Source/head: Review follow-up included:
Gates:
|
Summary
delivery-mirrortranscript appends when the latest assistant message already has the same visible textProblem
When replies go through the Codex harness, OpenClaw already persists the real assistant message into the session transcript. The generic outbound delivery path still appends a synthetic
provider:"openclaw", model:"delivery-mirror"assistant entry with the same text.In Mattermost this can surface as duplicate visible replies for a single turn.
Root Cause
Two transcript writers become active on the same turn:
delivery-mirrorThe old idempotency-key guard does not help here because the real assistant message and the synthetic mirror do not share an idempotency key.
Fix
Before appending a
delivery-mirrormessage, scan backward through the transcript and reuse the latest assistant message id when its visible text already matches. This keeps the mirror path intact for channels that still need it, but avoids creating a second assistant artifact when the transcript already contains the real reply.Repro
provider:"codex"provider:"openclaw", model:"delivery-mirror"with the same textTest Plan
src/config/sessions/transcript.test.tsthat appends a realcodexassistant message first, then attempts to append the matchingdelivery-mirror, and asserts that no second transcript line is written.node_modulesinstalled, so the PR currently has code-level validation plus the focused regression test only.