Skip to content

perf: relative positioning through broad range tombstones is slow #1070

@jbowens

Description

@jbowens

In cockroachlabs/support#846, it looks like high CPU usage was caused by a SeekGE for a key just before a range tombstone's start key. If a SeekGE key is just less than a tombstone's start key, and there are no keys between the seek key the tombstone start key, SeekGE must Next through the entire range tombstone.

We have a couple longer term ideas that would have helped here:

  • perf: split L6 data into separate obsolete, live files #847 perf: split L6 data into separate obsolete, live files - Segmenting obsolete data into separate files would have allowed the SeekGE at a recent sequence number to ignore all the obsolete keys covered by the range tombstone.
  • NewSnapshot key range(s) - I couldn't find where this idea was documented, but it's the idea of specifying key ranges to NewSnapshot so that a snapshot only applies to keys within the range. This would've allowed Pebble to drop the data covered by the range tombstone, assuming the snapshot did not actually care about the data beneath the tombstone. I think this idea is tricky mostly in the interface exposed to CockroachDB, because CRDB should only be able to read snapshotted data when reading from a snapshot with a range.

In the immediate term, it seems like it should be possible to add some sort noninvasive optimization:

  • When we Next into the range tombstone, we're nexting outside the replica's keyspace. I'm not sure if iterator used by mvccPutUsingIter currently has an upper bound set on its iterator, but if not, we could set one. Then whenever we encounter a new tombstone during findNextEntry, we could compare it to the iterator upper bound and return early.
  • When we load a sstable that is wholly covered by a range tombstone (either within that file or above), can we somehow detect that and skip it? @sumeerbhola pointed out that the range tombstone would be a part of the file's metadata key range, which makes it hard to tell if the range tombstone is more recent than every other key contained within the file. This also doesn't help with files that contain live and obsolete data.
  • If we detect that we're nexting many times through a range tombstone, can we seek the iterators lower than the tombstone in the LSM to the tombstone's end key?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions