manifest: unref files with range keys when version refcnt is zero#2022
Conversation
|
I can probably add some unit tests to the |
jbowens
left a comment
There was a problem hiding this comment.
Right here in DB.Close can we error if len(d.mu.versions.zombieTables) > 0?
We've already:
- asserted
Version.Refs() == 0for all previous versions and.Refs() == 1for the current version. - and scheduled cleaning jobs to remove all obsolete files, which will remove files from
d.mu.versions.zombieTables.
I don't think it's possible for len(d.mu.versions.zombieTables) to be greater than zero without a ref count leak.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @itsbilal)
sumeerbhola
left a comment
There was a problem hiding this comment.
Can we assert 0 zombie files and memtables when closing a DB in all our tests, including metamorphic tests?
Reviewable status:
complete! all files reviewed, all discussions resolved (waiting on @itsbilal)
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! all files reviewed, all discussions resolved (waiting on @itsbilal)
A manifest version uses reference counting to ensure that obsolete SSTables are only removed once it is safe to do so (i.e. no iterators or snapshots are open referencing tables in a older manifest version). In cockroachdb#1771, a new slice of `LevelMetadata` was added to support range keys. A slice already existed for point keys (plus range dels). Currently, when a version's ref count reaches zero, only the files present in the point key `LevelMetadata` slice have their ref counts decremented. If a file also contains range keys, the reference count never becomes zero and the file becomes "zombied", and can never be deleted from disk. Decrement the ref counts on files present in the range keys `LevelMetadata` slice, allowing the ref counts to be correctly zeroed. Add regression test to exercise the bug observed in cockroachdb/cockroach#90149. Fix cockroachdb/cockroach#90149.
205e149 to
2811cec
Compare
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status: 1 of 4 files reviewed, all discussions resolved (waiting on @itsbilal)
There was a problem hiding this comment.
TFTRs! Ack on the memtable count - let me follow up once this lands.
in all our tests, including metamorphic tests?
It feels like what we want is some general harness for asserting that a DB and its associated iterators are always closed after tests. Similar to what we have in Cockroach. Let's mull that over on #2032.
Reviewable status: 1 of 4 files reviewed, all discussions resolved (waiting on @itsbilal and @jbowens)
A manifest version uses reference counting to ensure that obsolete SSTables are only removed once it is safe to do so (i.e. no iterators or snapshots are open referencing tables in a older manifest version).
In #1771, a new slice of
LevelMetadatawas added to support range keys. A slice already existed for point keys (plus range dels). Currently, when a version's ref count reaches zero, only the files present in the point keyLevelMetadataslice have their ref counts decremented. If a file also contains range keys, the reference count never becomes zero and the file becomes "zombied", and can never be deleted from disk.Decrement the ref counts on files present in the range keys
LevelMetadataslice, allowing the ref counts to be correctly zeroed.Add regression test to exercise the bug observed in cockroachdb/cockroach#90149.
Fix cockroachdb/cockroach#90149.