storage/engine: experiment with enabling "filter_deletes" in RocksDB #4321

@petermattis

From discussion on #4196:

Even with the restricted iterator, these 20k+ tombstones will get written to disk when a minor compaction happens (and potentially rewritten in future compactions until the system can determine they are completely obsolete). I believe that since we use bloom filters, we have a solution for this: set the filter_deletes option. This means that on delete, RocksDB will consult the bloom filter first and will only create tombstones for keys that the filter indicates may exist on disk. Keys that are created and deleted in between minor compactions can be deleted from the memtable without leaving any trace on disk. (Bigtable had the same optimization, but there wasn't a separate option for it: it was enabled whenever bloom filters were enabled.)

Labels: C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception)
