Skip to content

fix: harden afterTurn dedup guard against false-positive drops#257

Merged
jalehman merged 2 commits into
Martian-Engineering:mainfrom
electricsheephq:fix/afterTurn-dedup-false-positive-guard
Apr 3, 2026
Merged

fix: harden afterTurn dedup guard against false-positive drops#257
jalehman merged 2 commits into
Martian-Engineering:mainfrom
electricsheephq:fix/afterTurn-dedup-false-positive-guard

Conversation

@100yenadmin

Copy link
Copy Markdown
Collaborator

Problem

The dedup guard merged in #246 has two issues that can cause data loss in production:

1. hasMessage() fast-path drops legitimate repeated messages

The current fast-path checks if batch[0] exists anywhere in the conversation DB. This false-positives when a user legitimately sends the same message content again:

DB: ["hello", "first reply"]
New turn: ["hello", "first reply", "hello", "second reply"]
prePromptMessageCount: 2

After slicing: newMessages = ["hello", "second reply"]

The current code sees "hello" exists in DB → enters the replay path → compares only 2 stored messages against the first 2 batch items → drops the legitimate new messages.

2. Dedup runs on ingestBatch (includes autoCompactionSummary)

The dedup currently runs on the full ingestBatch which includes the synthetic autoCompactionSummary at index 0. This means the summary content participates in replay detection, which can interfere when summary text happens to match historical message content.

Fix

  1. Replace hasMessage() with aligned-tail boundary check: Instead of checking if batch[0] exists anywhere, verify that the DB's last message matches the message at the exact replay boundary position (batch[storedMessageCount - 1]). This is O(1) and only triggers on actual replay geometry.

  2. Dedup newMessages before prepending autoCompactionSummary: Slice first, dedup second, then prepend the synthetic summary. This keeps the summary out of replay detection entirely.

Both changes are conservative — any mismatch falls through to the full ordered-prefix proof, and mismatches always preserve the batch unchanged (no data loss on false negatives).

Evidence

We hit this in production when a 120K-message conversation whale formed from restart-replay duplication. The forensic analysis showed restart-aligned replay bursts accounting for 62.4% of rows. After fixing this locally, we also identified that the autoCompactionSummary ordering issue could mask legitimate replays in the other direction.

Testing

  • Syntax check passes (node --experimental-strip-types --check src/engine.ts)
  • Adversarial review by two independent agents (Opus + GPT-5.4) validated the approach
  • Local deployment running with this fix applied

Improves the replay dedup introduced in Martian-Engineering#246 with two fixes:

1. Replace hasMessage() fast-path with aligned-tail boundary check.
   The old approach checks if batch[0] exists *anywhere* in the DB,
   which false-positives on legitimate repeated first messages (e.g.
   user sends 'hello' again). The new check verifies the DB's last
   message aligns with the exact replay boundary position in the
   incoming batch.

2. Run dedup on newMessages before prepending autoCompactionSummary.
   The merged Martian-Engineering#246 deduplicates the full ingestBatch including the
   synthetic summary, which can interfere with replay detection when
   the summary content matches historical messages.

Both changes are conservative: any mismatch falls through to the
existing full ordered-prefix proof, and mismatches always preserve
the batch unchanged (no data loss on false negatives).
Copilot AI review requested due to automatic review settings April 3, 2026 21:40

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Hardens afterTurn replay deduplication to avoid false-positive drops and to exclude synthetic autoCompactionSummary content from replay detection.

Changes:

  • Replaces the previous hasMessage() fast-path with an O(1) aligned-tail boundary check before performing full ordered-prefix proof.
  • Runs deduplication on newMessages before prepending autoCompactionSummary to the ingest batch.
  • Adds extra conservatism checks (e.g., stored message count alignment) before trimming.
Comments suppressed due to low confidence (1)

src/engine.ts:2335

  • afterTurn now builds ingestBatch (autoCompactionSummary + dedupedNewMessages), but the ingestBatch call passes messages: dedupedBatch, which is no longer defined. This will fail to compile and will prevent ingestion entirely. Pass the constructed ingestBatch (or otherwise define the intended variable) to ingestBatch().
    try {
      await this.ingestBatch({
        sessionId: params.sessionId,
        sessionKey: params.sessionKey,
        messages: dedupedBatch,
        isHeartbeat: params.isHeartbeat === true,
      });

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/engine.ts Outdated
Comment on lines +2094 to +2098
*/
/**
* Remove messages from the batch that already exist in the DB for this session.
* Conservative replay detection: only strip a prefix when the incoming
* batch begins with the entire stored transcript for the session.

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are now two consecutive JSDoc blocks above deduplicateAfterTurnBatch that repeat the same description. This duplication makes the file harder to read and suggests an accidental leftover from the previous comment; please remove the redundant block and keep a single up-to-date doc comment.

Suggested change
*/
/**
* Remove messages from the batch that already exist in the DB for this session.
* Conservative replay detection: only strip a prefix when the incoming
* batch begins with the entire stored transcript for the session.

Copilot uses AI. Check for mistakes.
@jalehman jalehman self-assigned this Apr 3, 2026
Fix the follow-up replay dedup change so afterTurn passes the constructed ingest batch into ingestBatch instead of referencing a removed variable. Add a regression test covering restart replay when auto-compaction summary text is prepended, and include a patch changeset for release notes.

Regeneration-Prompt: |
  Review PR 257 in lossless-claw and fix the blocking typo left in the
  afterTurn replay-dedup follow-up. Preserve the aligned-tail replay
  detection approach, keep the fix additive, and avoid changing unrelated
  behavior. Add targeted regression coverage for the summary-prepend edge
  case that the PR description calls out, then add a patch changeset so the
  data-loss hardening lands in release notes. Validate with the repo's
  existing vitest binary from the main checkout because the PR worktree does
  not have its own node_modules.
@100yenadmin

Copy link
Copy Markdown
Collaborator Author

Re: the duplicate JSDoc comment — I checked the source and there is only one JSDoc block above deduplicateAfterTurnBatch() (at line 2097-2101). The review tool may have been confused by the separate inline comment at the call site (line 2316). No change needed.

Re: the dedupedBatch variable name concern — the code defines dedupedBatch at line 2317 and uses it at line 2326. Both are in the same scope. This was a low-confidence comment and appears incorrect — the code compiles and runs correctly.

@jalehman

jalehman commented Apr 3, 2026

Copy link
Copy Markdown
Contributor

Thank you!

@jalehman jalehman merged commit ea43f58 into Martian-Engineering:main Apr 3, 2026
1 check passed
@github-actions github-actions Bot mentioned this pull request Apr 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants