storage/engine: experiment with enabling "filter_deletes" in RocksDB #4321
Description
From discussion on #4196:
Even with the restricted iterator, these 20k+ tombstones will get written to disk when a minor compaction happens (and potentially rewritten in future compactions until the system can determine they are completely obsolete). I believe that since we use bloom filters, we have a solution for this: set the filter_deletes option. This means that on delete, RocksDB will consult the bloom filter first and will only create a tombstone for a key that the filter indicates may exist on disk. Keys that are created and deleted between minor compactions can then be removed from the memtable without leaving any trace on disk. (Bigtable had the same optimization, but there wasn't a separate option for it: it was enabled whenever bloom filters were enabled.)
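
To make the expected effect concrete, here is a minimal toy model of the behavior described in the quote. It is a sketch only, not RocksDB's implementation: `ToyStore`, `TOMBSTONE`, `flush`, and `tombstones_flushed` are hypothetical names, a plain Python set stands in for the on-disk bloom filter (a real bloom filter can return false positives, which would only cause extra tombstones), and `flush` stands in for a minor compaction.

```python
TOMBSTONE = object()  # hypothetical marker for a deletion record


class ToyStore:
    """Toy LSM model: a memtable plus a set standing in for the on-disk bloom filter."""

    def __init__(self, filter_deletes):
        self.filter_deletes = filter_deletes
        self.memtable = {}        # key -> value or TOMBSTONE, not yet on disk
        self.disk = {}            # key -> value or TOMBSTONE, already flushed
        self.disk_filter = set()  # assumption: exact stand-in for a bloom filter

    def put(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        if self.filter_deletes and key not in self.disk_filter:
            # The filter says the key cannot be on disk, so no tombstone is
            # needed: just drop any pending write from the memtable.
            self.memtable.pop(key, None)
        else:
            # The key may exist on disk, so a tombstone must shadow it.
            self.memtable[key] = TOMBSTONE

    def flush(self):
        """Stand-in for a minor compaction; returns the number of tombstones written."""
        written = sum(1 for v in self.memtable.values() if v is TOMBSTONE)
        self.disk.update(self.memtable)
        self.disk_filter.update(self.memtable)
        self.memtable.clear()
        return written


def tombstones_flushed(filter_deletes, n=20000):
    """Create and delete n keys between flushes, then count flushed tombstones."""
    store = ToyStore(filter_deletes)
    for i in range(n):
        key = f"k{i}"
        store.put(key, "v")
        store.delete(key)
    return store.flush()
```

In this model, `tombstones_flushed(False)` writes one tombstone per key (20000), while `tombstones_flushed(True)` writes none, because every key was created and deleted before the first flush. A key that has already reached disk still gets a tombstone when deleted, regardless of the setting.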