perf: investigate deletion-only compactions #720
Description
We have "normal" compactions which take a set of input sstables and rewrite them. We have "move" compactions which take a set of input sstables and move them to a new level. Should we have deletion-only compactions which take a set of sstables and delete them?
The idea behind a deletion-only compaction is to delete one or more sstables that are completely covered by a range tombstone at a higher level and for which no snapshot is pinning some of their keys in existence. This would be a bit like the effect of the RocksDB DeleteFilesInRange operation, but safe with respect to the normal operation semantics. This idea has come up in the past, and the recent work on incorporating range tombstones in the compaction heuristic (#707) has brought it up again. #707 already determines whether an sstable is completely covered by a range tombstone. When that is the case, we could set a flag indicating that all of the keys in the sstable are deleted. We then simply have to wait until the range of seqnums in the sstable is in the same snapshot stripe as the covering range tombstone.
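The eligibility check described above could look roughly like the following. This is a minimal sketch, not Pebble's actual code: the `table` and `tombstone` types, their fields, and the `stripeIndex` and `canDeleteOnly` helpers are all hypothetical names introduced for illustration. It also assumes bytewise-ordered string keys, an exclusive tombstone end bound, and that a snapshot at seqnum `s` sees only keys with seqnums strictly less than `s`.

```go
package main

import "sort"

// tombstone is a hypothetical range tombstone covering the user keys
// [start, end) at sequence number seqNum.
type tombstone struct {
	start, end string
	seqNum     uint64
}

// table is a hypothetical summary of an sstable: its inclusive user-key
// bounds and the range of sequence numbers it contains.
type table struct {
	smallest, largest             string
	smallestSeqNum, largestSeqNum uint64
}

// stripeIndex returns the index of the snapshot stripe containing seqNum.
// snapshots must be sorted in ascending order. Assumption: a snapshot at
// seqnum s sees only keys with seqnum < s, so stripe i holds the seqnums
// in [snapshots[i-1], snapshots[i]).
func stripeIndex(seqNum uint64, snapshots []uint64) int {
	return sort.Search(len(snapshots), func(i int) bool {
		return snapshots[i] > seqNum
	})
}

// canDeleteOnly reports whether the table could be dropped outright:
// the tombstone must cover the table's whole key range, be newer than
// every key in the table, and share a snapshot stripe with the table's
// oldest key (and therefore with all of its keys).
func canDeleteOnly(t table, rt tombstone, snapshots []uint64) bool {
	covered := rt.start <= t.smallest && t.largest < rt.end
	newer := rt.seqNum > t.largestSeqNum
	if !covered || !newer {
		return false
	}
	return stripeIndex(t.smallestSeqNum, snapshots) == stripeIndex(rt.seqNum, snapshots)
}
```

For example, with a table spanning keys `b` through `d` at seqnums 5 to 10 and a tombstone over `[a, e)` at seqnum 20, `canDeleteOnly` returns true when there are no open snapshots and false while a snapshot at seqnum 15 is open, since that snapshot can still see the table's keys but not the tombstone.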
The expected result of this optimization is to make DROP/TRUNCATE TABLE in CRDB much more efficient.