fix: prevent bootstrap replay flood after maintain() JSONL rewrite #280
Merged
jalehman merged 2 commits on Apr 5, 2026
maintain() calls rewriteTranscriptEntries(), which rewrites the JSONL via branch-and-reappend, changing file size, mtime, and entry IDs. But the conversation_bootstrap_state was only updated during bootstrap(), leaving a stale checkpoint after every maintenance rewrite.

On the next gateway restart, bootstrap saw the stale checkpoint:
- Fast path 1 (exact size+mtime match) failed (file shrank)
- Fast path 2 (append-only, size > stored) failed (file shrank)
- Fell through to reconcileSessionTail() full content-based reconcile
- On conversations with many identical (role, content) pairs, the occurrence-counting anchor matched at the wrong position by coincidence
- Result: thousands of duplicate messages imported in one second

Three fixes:
1. Checkpoint update after maintain(): After a successful JSONL rewrite, update the bootstrap checkpoint so the next bootstrap() hits the fast path instead of falling through to reconcile.
2. messageOnly skip in readLastJsonlEntryBeforeOffset(): The function now accepts a messageOnly flag so it skips non-message JSONL entries (cache-ttl, tool-result, session-meta) when computing the tail entry hash for checkpoint comparison.
3. Import cap in reconcileSessionTail(): If reconcile would import more than 20% of the existing DB message count (minimum 50), abort and log an error instead of blindly importing.

Root cause of 19+ bootstrap flood events across March 29 - April 5, 2026.

Fixes Martian-Engineering#271
Relates to Martian-Engineering#268
Relates to Martian-Engineering#276

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
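The fall-through described in this commit message can be sketched as a small decision function. This is a simplified illustration, not the project's code: the `Checkpoint` and `FileStat` shapes and the `chooseBootstrapPath` name are assumptions, and the real fast paths likely check more than size and mtime (for example the tail entry hash).

```typescript
// Hypothetical shapes; the real checkpoint schema is not shown in this PR.
interface Checkpoint { size: number; mtimeMs: number; }
interface FileStat { size: number; mtimeMs: number; }

type BootstrapPath = "unchanged" | "append-only" | "reconcile";

function chooseBootstrapPath(cp: Checkpoint, stat: FileStat): BootstrapPath {
  // Fast path 1: exact size + mtime match, nothing to import.
  if (stat.size === cp.size && stat.mtimeMs === cp.mtimeMs) return "unchanged";
  // Fast path 2: the file only grew, so only the appended tail is read.
  if (stat.size > cp.size) return "append-only";
  // Anything else (for example a file that shrank after a rewrite)
  // falls through to the full content-based reconcile.
  return "reconcile";
}

// Checkpoint recorded before maintain() rewrote the transcript.
const staleCheckpoint: Checkpoint = { size: 10000, mtimeMs: 1000 };
// File state after the rewrite: smaller file, newer mtime.
const afterRewrite: FileStat = { size: 7500, mtimeMs: 2000 };

const stalePath = chooseBootstrapPath(staleCheckpoint, afterRewrite);
// If maintain() refreshes the checkpoint, the same file hits fast path 1.
const refreshedPath = chooseBootstrapPath({ size: 7500, mtimeMs: 2000 }, afterRewrite);
console.log(stalePath, refreshedPath);
```

With the stale checkpoint both fast paths miss because the file shrank; once the checkpoint matches the rewritten file, the next bootstrap stays on the fast path.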
5bf3b6b to 4e8c262
Keep maintain() checkpoint writes aligned with bootstrap state semantics by normalizing the stored mtime, and prevent capped reconcile aborts from advancing bootstrap state as though the transcript had been fully processed.

Add regression coverage for the unchanged maintain->bootstrap fast path and the capped reconcile retry path, plus the required patch changeset.

Regeneration-Prompt: |
  Address the two review findings on PR 280 without broadening scope. Make bootstrap checkpoint updates after transcript maintenance use the same timestamp semantics as the existing bootstrap path, and ensure the replay-safety import cap in reconcileSessionTail does not mark the transcript as fully processed when it aborts. Add targeted regression tests proving maintain() leaves the next unchanged bootstrap on the fast path and that a capped reconcile preserves the stale checkpoint so a later retry is still possible. Include the required patch changeset and rerun the repository's relevant test and packaging gates before pushing back to the contributor fork.
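A minimal sketch of the two behaviors this commit describes, under stated assumptions: `ReconcileResult`, `nextCheckpoint`, and `normalizeMtime` are hypothetical names, and truncating mtime to whole milliseconds is only one plausible normalization; the commit does not say which one the project uses.

```typescript
// Hypothetical types for illustration only.
interface Checkpoint { size: number; mtimeMs: number; }
interface ReconcileResult { imported: number; aborted: boolean; }

// Assumption: "normalizing" means dropping sub-millisecond precision so the
// value written by maintain() compares equal to the one bootstrap() computes.
function normalizeMtime(mtimeMs: number): number {
  return Math.trunc(mtimeMs);
}

function nextCheckpoint(
  current: Checkpoint,
  observed: Checkpoint,
  result: ReconcileResult,
): Checkpoint {
  // A capped abort means the transcript was not fully processed, so the
  // stale checkpoint is preserved and a later bootstrap can retry.
  if (result.aborted) return current;
  return { size: observed.size, mtimeMs: normalizeMtime(observed.mtimeMs) };
}

const current: Checkpoint = { size: 10000, mtimeMs: 1000 };
const observed: Checkpoint = { size: 7500, mtimeMs: 2000.75 };

const kept = nextCheckpoint(current, observed, { imported: 0, aborted: true });
const advanced = nextCheckpoint(current, observed, { imported: 3, aborted: false });
console.log(kept, advanced);
```

The point of the guard is that an aborted reconcile must look, to the next bootstrap, exactly like a reconcile that never ran.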
liu51115 pushed a commit to liu51115/lossless-claw that referenced this pull request on Apr 7, 2026
Round-trip integration test: create conv → maintain() rewrites JSONL → bootstrap() → assert 0 re-imports. Also tests import cap on stale checkpoint. Covers both PR Martian-Engineering#280 fixes (checkpoint update + import cap).
liu51115 pushed a commit to liu51115/lossless-claw that referenced this pull request on Apr 7, 2026
Regression test covering both fixes from PR Martian-Engineering#280:
1. maintain() updates checkpoint after rewriteTranscriptEntries(): prevents stale checkpoint on restart
2. Import cap blocks mass re-imports when checkpoint is stale (>max(existingDbCount*0.2, 50))

Tests:
- Round-trip: create conv → maintain() → bootstrap() → assert 0 re-imports
- Import cap: corrupt checkpoint → append flood messages → assert cap blocks
- Defense-in-depth: both fixes working together
jalehman pushed a commit that referenced this pull request on Apr 7, 2026
Regression test covering both fixes from PR #280:
1. maintain() updates checkpoint after rewriteTranscriptEntries(): prevents stale checkpoint on restart
2. Import cap blocks mass re-imports when checkpoint is stale (>max(existingDbCount*0.2, 50))

Tests:
- Round-trip: create conv → maintain() → bootstrap() → assert 0 re-imports
- Import cap: corrupt checkpoint → append flood messages → assert cap blocks
- Defense-in-depth: both fixes working together

Co-authored-by: Claw Liu <liu51115claw@brun.taild04815.ts.net>
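The cap expression quoted in these commit messages, max(existingDbCount*0.2, 50), can be worked through with a short sketch. The function names `importCap` and `exceedsImportCap` are illustrative assumptions and do not come from the repository.

```typescript
// Threshold from the commit message: 20% of the existing DB message count,
// with a floor of 50 so small conversations are not blocked too eagerly.
function importCap(existingDbCount: number): number {
  return Math.max(existingDbCount * 0.2, 50);
}

// True when a reconcile that wants to import `pendingImports` messages
// should abort instead of importing.
function exceedsImportCap(pendingImports: number, existingDbCount: number): boolean {
  return pendingImports > importCap(existingDbCount);
}

// Small conversation: the floor of 50 applies (20% of 100 would be 20).
const smallConversationCap = importCap(100);
// Large conversation: a flood of thousands is far above 20% of 5000.
const floodBlocked = exceedsImportCap(3000, 5000);
// A modest catch-up import stays under the cap and is allowed.
const catchUpBlocked = exceedsImportCap(40, 5000);
console.log(smallConversationCap, floodBlocked, catchUpBlocked);
```

The floor matters for short conversations, where 20% of the message count alone would reject even a handful of legitimate new messages.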
Summary
maintain() calls rewriteTranscriptEntries(), which rewrites the JSONL via branch-and-reappend, changing file size, mtime, and entry IDs. But the conversation_bootstrap_state was only updated during bootstrap(), leaving a stale checkpoint after every maintenance rewrite.

On the next gateway restart, bootstrap saw the stale checkpoint, fell through the fast paths, and hit reconcileSessionTail(), where occurrence-counting anchored at the wrong position, importing thousands of duplicate messages in one second.

Three fixes
1. Checkpoint update after maintain(): After a successful JSONL rewrite, update the bootstrap checkpoint so the next bootstrap() hits the fast path instead of falling through to reconcile.
2. messageOnly skip in readLastJsonlEntryBeforeOffset(): The function now accepts a messageOnly flag so it skips non-message JSONL entries (cache-ttl, tool-result, session-meta) when computing the tail entry hash for checkpoint comparison. Previously, a trailing non-message entry would cause the hash check to fail even when the checkpoint was otherwise correct.
3. Import cap in reconcileSessionTail(): If reconcile would import more than 20% of the existing DB message count (minimum 50), abort and log an error instead of blindly importing. Defense-in-depth against any future stale-checkpoint scenario.

Root cause of 19+ bootstrap flood events across March 29 – April 5, 2026.
Supersedes #278.
Fixes #271
Relates to #268, #276
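To illustrate the messageOnly skip from the fixes above, here is a simplified sketch that scans a JSONL string backwards for the last entry. It is not the project's readLastJsonlEntryBeforeOffset(): the name `lastEntryBeforeOffset`, the assumption that every entry carries a `type` field with message entries marked "message", and the use of a character offset into an in-memory string (rather than a byte offset into a file) are all assumptions made for the example.

```typescript
// Hypothetical entry shape: a `type` discriminator plus an optional id.
interface JsonlEntry { type: string; id?: string; }

function lastEntryBeforeOffset(
  jsonl: string,
  offset: number,
  messageOnly = false,
): JsonlEntry | null {
  // Look only at text before `offset`; a line cut off by the offset
  // fails to parse below and is skipped.
  const lines = jsonl.slice(0, offset).split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = (lines[i] ?? "").trim();
    if (line === "") continue;
    let entry: JsonlEntry;
    try {
      entry = JSON.parse(line) as JsonlEntry;
    } catch {
      continue; // unparsable or truncated line
    }
    // With messageOnly set, bookkeeping entries such as cache-ttl are skipped.
    if (messageOnly && entry.type !== "message") continue;
    return entry;
  }
  return null;
}

const transcript =
  [
    '{"type":"message","id":"m1"}',
    '{"type":"message","id":"m2"}',
    '{"type":"cache-ttl"}',
  ].join("\n") + "\n";

const rawTail = lastEntryBeforeOffset(transcript, transcript.length);
const messageTail = lastEntryBeforeOffset(transcript, transcript.length, true);
console.log(rawTail, messageTail);
```

Without the flag the tail is the trailing cache-ttl entry, which is the situation the PR says made the hash check fail; with the flag the tail is the last message, m2.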
Test plan
- readLastJsonlEntryBeforeOffset with messageOnly flag (test/bootstrap-message-only.test.ts)
- npm test

🤖 Generated with Claude Code