storage: clear range keys in Writer.Clear*Range methods#82041
storage: clear range keys in Writer.Clear*Range methods#82041craig[bot] merged 1 commit intocockroachdb:masterfrom
Writer.Clear*Range methods#82041Conversation
65e30a4 to
81ada7c
Compare
jbowens
left a comment
There was a problem hiding this comment.
Reviewed 3 of 8 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @itsbilal, and @nicktrav)
pkg/storage/engine.go line 1230 at r1 (raw file):
// perhaps we should fix Pebble to handle large numbers of tombstones in an // sstable better. Note that we are referring to storage-level tombstones here, // and not MVCC tombstones.
This comment applies to range keys as well: all the internal range keys in a sstable are in a single block. I'm wondering if we should test if there are any existing range keys within [start, end) before writing a clearing tombstone. Or does this function need to delete range keys, or can we make it point-specific like ClearMVCCVersions?
There's some work we can do within Pebble to help where we can't with range deletions: We can use the range-key metadata bounds of sstables lower in the LSM to elide these tombstones during flushes and compactions if there are no range keys in sstables with overlapping bounds. But that still forces iterators reading through the memtable to fragment these range key tombstones.
pkg/storage/pebble_batch.go line 418 at r1 (raw file):
} } return p.ExperimentalClearAllMVCCRangeKeys(start, end)
Previous question applies here too
erikgrinaker
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)
pkg/storage/engine.go line 1230 at r1 (raw file):
I'm wondering if we should test if there are any existing range keys within [start, end) before writing a clearing tombstone.
Sure, we can do that. Would it make sense to do that check in Pebble itself? Or are the semantics such that RangeKeyDelete() should always result in a range tombstone?
Or does this function need to delete range keys, or can we make it point-specific like ClearMVCCVersions?
This is called when destroying a replica, e.g. after a merge or transfer, so it needs to remove range keys too.
There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.
pkg/storage/pebble_batch.go line 418 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Previous question applies here too
This and ClearMVCCRange are often used interchangeably (see e.g. the ClearRange KV method), so they need to have the same range key semantics.
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @itsbilal, @jbowens, and @nicktrav)
pkg/storage/engine.go line 1230 at r1 (raw file):
Sure, we can do that. Would it make sense to do that check in Pebble itself?
Pebble doesn't have any special, cheap way of performing the check. We would be suffering the check unconditionally, even when the caller knows there definitively are range keys defined over the span.
There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.
Seems fine to me, except are there callers that are writing to portions of the keyspace that cannot hold range keys? It would be beneficial to avoid writing these range-key tombstones where they're known to be ineffectual.
81ada7c to
bc0e4e5
Compare
erikgrinaker
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)
pkg/storage/engine.go line 1230 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Sure, we can do that. Would it make sense to do that check in Pebble itself?
Pebble doesn't have any special, cheap way of performing the check. We would be suffering the check unconditionally, even when the caller knows there definitively are range keys defined over the span.
There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.
Seems fine to me, except are there callers that are writing to portions of the keyspace that cannot hold range keys? It would be beneficial to avoid writing these range-key tombstones where they're known to be ineffectual.
Right, ok. ExperimentalClearAllMVCCRangeKeys() now checks for any existing range keys (if any) and tightens the bounds to the smallest span that covers them all (if any).
This got a bit annoying because we have to deal with clears via unindexed batches that themselves contain range keys. Added tracking of range key writes via pebbleBatch.containsRangeKeys to handle this case by falling back to clearing the entire span. I don't see this really happening in practice though, but seemed worth handling.
Thanks for pointing this out btw, dropping all of these unnecessary range tombstones wouldn't have been great.
jbowens
left a comment
There was a problem hiding this comment.
Reviewed 2 of 4 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)
|
TFTR! bors r=jbowens |
|
Build failed (retrying...): |
|
bors r- Looks like this sometimes breaks Raft snapshots. Stressing |
|
Canceled. |
bc0e4e5 to
f038cc4
Compare
|
Well, that was simple enough: the patch added a metamorphic test value for Removed the metamorphic value, since we're likely exercising both paths plenty anyway. |
This patch clears range keys in the `Writer` methods `ClearRawRange`, `ClearMVCCRange`, and `ClearMVCCIteratorRange`, as well as in the `ClearRangeWithHeuristic` helper. Range keys are not cleared in `ClearMVCCVersions`, since this method is specifically for clearing MVCC point key versions, and it is not possible to clear range keys between versions of the same point key. `Engine.ExperimentalClearAllMVCCRangeKeys()` has been improved to scan for any range keys in the given span, and only clear the smallest single span that covers all range keys, to avoid dropping unnecessary Pebble range tombstones across these range key spans. Release note: None
f038cc4 to
72b19bc
Compare
|
bors r=jbowens |
|
Build succeeded: |
This patch clears range keys in the
WritermethodsClearRawRange,ClearMVCCRange, andClearMVCCIteratorRange, as well as in theClearRangeWithHeuristichelper.Range keys are not cleared in
ClearMVCCVersions, since this method isspecifically for clearing MVCC point key versions, and it is not
possible to clear range keys between versions of the same point key.
Touches #70412.
Release note: None