fix: harden afterTurn dedup guard against false-positive drops#257
Conversation
Improves the replay dedup introduced in Martian-Engineering#246 with two fixes: 1. Replace hasMessage() fast-path with aligned-tail boundary check. The old approach checks if batch[0] exists *anywhere* in the DB, which false-positives on legitimate repeated first messages (e.g. user sends 'hello' again). The new check verifies the DB's last message aligns with the exact replay boundary position in the incoming batch. 2. Run dedup on newMessages before prepending autoCompactionSummary. The merged Martian-Engineering#246 deduplicates the full ingestBatch including the synthetic summary, which can interfere with replay detection when the summary content matches historical messages. Both changes are conservative: any mismatch falls through to the existing full ordered-prefix proof, and mismatches always preserve the batch unchanged (no data loss on false negatives).
There was a problem hiding this comment.
Pull request overview
Hardens afterTurn replay deduplication to avoid false-positive drops and to exclude synthetic autoCompactionSummary content from replay detection.
Changes:
- Replaces the previous
hasMessage()fast-path with an O(1) aligned-tail boundary check before performing full ordered-prefix proof. - Runs deduplication on
newMessagesbefore prependingautoCompactionSummaryto the ingest batch. - Adds extra conservatism checks (e.g., stored message count alignment) before trimming.
Comments suppressed due to low confidence (1)
src/engine.ts:2335
afterTurnnow buildsingestBatch(autoCompactionSummary + dedupedNewMessages), but theingestBatchcall passesmessages: dedupedBatch, which is no longer defined. This will fail to compile and will prevent ingestion entirely. Pass the constructedingestBatch(or otherwise define the intended variable) toingestBatch().
try {
await this.ingestBatch({
sessionId: params.sessionId,
sessionKey: params.sessionKey,
messages: dedupedBatch,
isHeartbeat: params.isHeartbeat === true,
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| */ | ||
| /** | ||
| * Remove messages from the batch that already exist in the DB for this session. | ||
| * Conservative replay detection: only strip a prefix when the incoming | ||
| * batch begins with the entire stored transcript for the session. |
There was a problem hiding this comment.
There are now two consecutive JSDoc blocks above deduplicateAfterTurnBatch that repeat the same description. This duplication makes the file harder to read and suggests an accidental leftover from the previous comment; please remove the redundant block and keep a single up-to-date doc comment.
| */ | |
| /** | |
| * Remove messages from the batch that already exist in the DB for this session. | |
| * Conservative replay detection: only strip a prefix when the incoming | |
| * batch begins with the entire stored transcript for the session. |
Fix the follow-up replay dedup change so afterTurn passes the constructed ingest batch into ingestBatch instead of referencing a removed variable. Add a regression test covering restart replay when auto-compaction summary text is prepended, and include a patch changeset for release notes. Regeneration-Prompt: | Review PR 257 in lossless-claw and fix the blocking typo left in the afterTurn replay-dedup follow-up. Preserve the aligned-tail replay detection approach, keep the fix additive, and avoid changing unrelated behavior. Add targeted regression coverage for the summary-prepend edge case that the PR description calls out, then add a patch changeset so the data-loss hardening lands in release notes. Validate with the repo's existing vitest binary from the main checkout because the PR worktree does not have its own node_modules.
|
Re: the duplicate JSDoc comment — I checked the source and there is only one JSDoc block above Re: the |
|
Thank you! |
Problem
The dedup guard merged in #246 has two issues that can cause data loss in production:
1.
hasMessage()fast-path drops legitimate repeated messagesThe current fast-path checks if
batch[0]exists anywhere in the conversation DB. This false-positives when a user legitimately sends the same message content again:After slicing:
newMessages = ["hello", "second reply"]The current code sees
"hello"exists in DB → enters the replay path → compares only 2 stored messages against the first 2 batch items → drops the legitimate new messages.2. Dedup runs on ingestBatch (includes autoCompactionSummary)
The dedup currently runs on the full
ingestBatchwhich includes the syntheticautoCompactionSummaryat index 0. This means the summary content participates in replay detection, which can interfere when summary text happens to match historical message content.Fix
Replace
hasMessage()with aligned-tail boundary check: Instead of checking ifbatch[0]exists anywhere, verify that the DB's last message matches the message at the exact replay boundary position (batch[storedMessageCount - 1]). This is O(1) and only triggers on actual replay geometry.Dedup newMessages before prepending autoCompactionSummary: Slice first, dedup second, then prepend the synthetic summary. This keeps the summary out of replay detection entirely.
Both changes are conservative — any mismatch falls through to the full ordered-prefix proof, and mismatches always preserve the batch unchanged (no data loss on false negatives).
Evidence
We hit this in production when a 120K-message conversation whale formed from restart-replay duplication. The forensic analysis showed restart-aligned replay bursts accounting for 62.4% of rows. After fixing this locally, we also identified that the
autoCompactionSummaryordering issue could mask legitimate replays in the other direction.Testing
node --experimental-strip-types --check src/engine.ts)