Skip to content

storage: clear range keys in Writer.Clear*Range methods#82041

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:mvcc-range-tombstones-clear
Jun 7, 2022
Merged

storage: clear range keys in Writer.Clear*Range methods#82041
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:mvcc-range-tombstones-clear

Conversation

@erikgrinaker
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker commented May 29, 2022

This patch clears range keys in the Writer methods ClearRawRange,
ClearMVCCRange, and ClearMVCCIteratorRange, as well as in the
ClearRangeWithHeuristic helper.

Range keys are not cleared in ClearMVCCVersions, since this method is
specifically for clearing MVCC point key versions, and it is not
possible to clear range keys between versions of the same point key.

Touches #70412.

Release note: None

@erikgrinaker erikgrinaker self-assigned this May 29, 2022
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@erikgrinaker erikgrinaker force-pushed the mvcc-range-tombstones-clear branch from 65e30a4 to 81ada7c Compare June 4, 2022 13:18
@erikgrinaker erikgrinaker marked this pull request as ready for review June 4, 2022 13:19
@erikgrinaker erikgrinaker requested a review from a team as a code owner June 4, 2022 13:19
@erikgrinaker erikgrinaker requested a review from itsbilal June 4, 2022 13:19
Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 8 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @itsbilal, and @nicktrav)


pkg/storage/engine.go line 1230 at r1 (raw file):

	// perhaps we should fix Pebble to handle large numbers of tombstones in an
	// sstable better. Note that we are referring to storage-level tombstones here,
	// and not MVCC tombstones.

This comment applies to range keys as well: all the internal range keys in a sstable are in a single block. I'm wondering if we should test if there are any existing range keys within [start, end) before writing a clearing tombstone. Or does this function need to delete range keys, or can we make it point-specific like ClearMVCCVersions?

There's some work we can do within Pebble to help where we can't with range deletions: We can use the range-key metadata bounds of sstables lower in the LSM to elide these tombstones during flushes and compactions if there are no range keys in sstables with overlapping bounds. But that still forces iterators reading through the memtable to fragment these range key tombstones.


pkg/storage/pebble_batch.go line 418 at r1 (raw file):

		}
	}
	return p.ExperimentalClearAllMVCCRangeKeys(start, end)

Previous question applies here too

Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)


pkg/storage/engine.go line 1230 at r1 (raw file):

I'm wondering if we should test if there are any existing range keys within [start, end) before writing a clearing tombstone.

Sure, we can do that. Would it make sense to do that check in Pebble itself? Or are the semantics such that RangeKeyDelete() should always result in a range tombstone?

Or does this function need to delete range keys, or can we make it point-specific like ClearMVCCVersions?

This is called when destroying a replica, e.g. after a merge or transfer, so it needs to remove range keys too.

There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.


pkg/storage/pebble_batch.go line 418 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Previous question applies here too

This and ClearMVCCRange are often used interchangeably (see e.g. the ClearRange KV method), so they need to have the same range key semantics.

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, @itsbilal, @jbowens, and @nicktrav)


pkg/storage/engine.go line 1230 at r1 (raw file):

Sure, we can do that. Would it make sense to do that check in Pebble itself?

Pebble doesn't have any special, cheap way of performing the check. We would be suffering the check unconditionally, even when the caller knows there definitively are range keys defined over the span.

There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.

Seems fine to me, except are there callers that are writing to portions of the keyspace that cannot hold range keys? It would be beneficial to avoid writing these range-key tombstones where they're known to be ineffectual.

Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)


pkg/storage/engine.go line 1230 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Sure, we can do that. Would it make sense to do that check in Pebble itself?

Pebble doesn't have any special, cheap way of performing the check. We would be suffering the check unconditionally, even when the caller knows there definitively are range keys defined over the span.

There's also the question of whether ClearRawRange should clear range keys -- it currently does, because several callers otherwise need to be modified to handle range keys. Seems fine to me personally.

Seems fine to me, except are there callers that are writing to portions of the keyspace that cannot hold range keys? It would be beneficial to avoid writing these range-key tombstones where they're known to be ineffectual.

Right, ok. ExperimentalClearAllMVCCRangeKeys() now checks for any existing range keys (if any) and tightens the bounds to the smallest span that covers them all (if any).

This got a bit annoying because we have to deal with clears via unindexed batches that themselves contain range keys. Added tracking of range key writes via pebbleBatch.containsRangeKeys to handle this case by falling back to clearing the entire span. I don't see this really happening in practice though, but seemed worth handling.

Thanks for pointing this out btw, dropping all of these unnecessary range tombstones wouldn't have been great.

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @itsbilal, @jbowens, and @nicktrav)

@erikgrinaker
Copy link
Copy Markdown
Contributor Author

TFTR!

bors r=jbowens

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 7, 2022

Build failed (retrying...):

@erikgrinaker
Copy link
Copy Markdown
Contributor Author

bors r-

Looks like this sometimes breaks Raft snapshots. Stressing TestStoreRangeMergeRaftSnapshot yields:

F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1  while applying snapshot: while applying snapshot: SST indices [3] don't match
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +(1) attached stack trace
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  -- stack trace:
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:765
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | [...repeated from below...]
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +Wraps: (2) while applying snapshot
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +Wraps: (3) attached stack trace
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  -- stack trace:
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver_test.TestStoreRangeMergeRaftSnapshot.func1
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver_test/pkg/kv/kvserver/client_merge_test.go:3887
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).applySnapshot
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raftstorage.go:941
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:763
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaftSnapshotRequest.func1
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:462
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).withReplicaForRequest
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:341
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processRaftSnapshotRequest
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:400
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).receiveSnapshot
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_snapshot.go:806
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleSnapshot.func1
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:216
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunTaskWithErr
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:344
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleSnapshot
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:213
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver_test.(*unreliableRaftHandler).HandleSnapshot
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver_test/pkg/kv/kvserver/client_raft_helpers_test.go:135
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).RaftSnapshot.func1.1
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft_transport.go:524
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).RaftSnapshot.func1
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft_transport.go:525
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:494
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | runtime.goexit
F220607 18:45:20.315283 2663 kv/kvserver/pkg/kv/kvserver/store_raft.go:463  [n3,s3,r46/3:???{-\x00\x00}] 1 +  | 	GOROOT/src/runtime/asm_amd64.s:1581

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 7, 2022

Canceled.

@erikgrinaker erikgrinaker force-pushed the mvcc-range-tombstones-clear branch from bc0e4e5 to f038cc4 Compare June 7, 2022 19:13
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

erikgrinaker commented Jun 7, 2022

Well, that was simple enough: the patch added a metamorphic test value for ClearRangeWithHeuristic that was randomized on every run, thus resulting in non-deterministic SSTs.

Removed the metamorphic value, since we're likely exercising both paths plenty anyway.

This patch clears range keys in the `Writer` methods `ClearRawRange`,
`ClearMVCCRange`, and `ClearMVCCIteratorRange`, as well as in the
`ClearRangeWithHeuristic` helper.

Range keys are not cleared in `ClearMVCCVersions`, since this method is
specifically for clearing MVCC point key versions, and it is not
possible to clear range keys between versions of the same point key.

`Engine.ExperimentalClearAllMVCCRangeKeys()` has been improved to scan
for any range keys in the given span, and only clear the smallest single
span that covers all range keys, to avoid dropping unnecessary Pebble
range tombstones across these range key spans.

Release note: None
@erikgrinaker erikgrinaker force-pushed the mvcc-range-tombstones-clear branch from f038cc4 to 72b19bc Compare June 7, 2022 19:15
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

bors r=jbowens

@craig craig bot merged commit d9fd5bf into cockroachdb:master Jun 7, 2022
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 7, 2022

Build succeeded:

@erikgrinaker erikgrinaker deleted the mvcc-range-tombstones-clear branch June 8, 2022 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants