Skip to content

db/snapshotsync: fix crash due to double close of decompressor #21545

Merged
AskAlexSharov merged 1 commit into
mainfrom
sudeep/snapshot-close-refcount-safety
Jun 1, 2026
Merged

db/snapshotsync: fix crash due to double close of decompressor #21545
AskAlexSharov merged 1 commit into
mainfrom
sudeep/snapshot-close-refcount-safety

Conversation

@sudeepdino008

@sudeepdino008 sudeepdino008 commented Jun 1, 2026

Copy link
Copy Markdown
Member

Problem

panic: runtime error: invalid memory address or nil pointer dereference
  seg.(*Decompressor).FilePath
  snapshotsync.(*DirtySegment).closeAndRemoveFiles   snapshots.go:420
  snapshotsync.(*RoTx).Close                         snapshots.go:537
  snapshotsync.(*View).Close

Fix

In closeWhatNotInList, skip segments with refcount > 0: a live reader still references them, so closing now would invalidate that reader. They are reaped on a later pass once the reader releases them (closeWhatNotInList already runs on every OpenFolder).

View/BeginRo stays lock-free (#20490) — the fix is purely in the close path.

Test

TestCloseWhatNotInListVsLiveViewDoesNotCrash reproduces the crash deterministically (pure snapshotsync, no merge machinery): it builds sub-segments, opens a View over them, drops a covering merged file on disk, reopens (so NoOverlaps removes the subs from the list), and asserts View.Close does not crash. It fails before this change and passes after.

@sudeepdino008 sudeepdino008 changed the title db/snapshotsync: don't close refcount-held segments in closeWhatNotInList db/snapshotsync: fix crash due to double close of decompressor Jun 1, 2026
@sudeepdino008 sudeepdino008 requested review from JkLondon and awskii June 1, 2026 08:23
@sudeepdino008 sudeepdino008 marked this pull request as ready for review June 1, 2026 08:24
…List

closeWhatNotInList closed dirty segments that were not in the canonical
on-disk list without checking their refcount, so it could close a segment
(nilling its decompressor) while a live View/RoTx still held it. After a
merge, TypedSegments->NoOverlaps drops the merged-away sub-segments from the
list even though their files remain on disk; a concurrent OpenFolder then
closed those subsumed segments out from under the merge's own View, and the
View's later closeAndRemoveFiles hit FilePath() on the nil decompressor and
crashed.

Skip refcount-held segments in closeWhatNotInList; they are reaped on a
later pass once the reader releases them. View/BeginRo stays lock-free.
@sudeepdino008 sudeepdino008 force-pushed the sudeep/snapshot-close-refcount-safety branch from 1fa270f to c2eba3d Compare June 1, 2026 08:32
@AskAlexSharov AskAlexSharov enabled auto-merge June 1, 2026 08:46
@AskAlexSharov AskAlexSharov added this pull request to the merge queue Jun 1, 2026
Merged via the queue into main with commit 92b22dc Jun 1, 2026
90 checks passed
@AskAlexSharov AskAlexSharov deleted the sudeep/snapshot-close-refcount-safety branch June 1, 2026 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants