Skip to content

internal/manifest: narrow Overlaps to exclude RANGEDEL sentinel keys#1432

Merged
jbowens merged 1 commit intocockroachdb:masterfrom
jbowens:overlaps-sentinel
Jan 13, 2022
Merged

internal/manifest: narrow Overlaps to exclude RANGEDEL sentinel keys#1432
jbowens merged 1 commit intocockroachdb:masterfrom
jbowens:overlaps-sentinel

Conversation

@jbowens
Copy link
Copy Markdown
Contributor

@jbowens jbowens commented Dec 28, 2021

When the largest key in a sstable is a range tombstone, the sstable's largest
boundary is set to the range deletion sentinel with the tombstone's exclusive
end boundary and the maximum sequence number. The range deletion sentinel
serves as a marker, indicating that the file's bounds end immediately before
the first key with the sentinel's user key.

In many places such as growing compactions or determining atomic compaction
units, Pebble considers the sentinel as exclusive and avoids unnecessarily
including files that share the same user key but only as a sentinel's exclusive
end bounary.

The (*manifest.Version).Overlaps method did not share this logic and only
considered user keys. This method is used during compaction picking when
finding the initial compaction inputs, for determining in-use key ranges,
grandparent files, etc.

This change adapts this method to consider a file with a largest user key equal
to the search range's start user key nonoverlapping if the file's largest key
is a range deletion sentinel. Additionally, Overlaps now takes an exclusiveEnd
parameter, indicating whether the end user key provided to Overlaps should
similarly be treated as an exclusive bound.

This change is expected to reduce write amplification in the presence of range
deletions by avoiding unnecessarily pulling in files due to perceived overlap.

Additionally, some of these RangeDeletionSentinel comparisons are updated to
use a new (*InternalKey).IsExclusiveSentinel helper. With the introduction of
range keys, we will have additional exclusive end boundary key kinds.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@jbowens jbowens changed the title internal/manifest: narrow Overlaps to exclude RANGEDEL sentinel keys. internal/manifest: narrow Overlaps to exclude RANGEDEL sentinel keys Dec 29, 2021
Copy link
Copy Markdown
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! It's been a while since we caught an instance of Pebble not special-casing rangedel sentinel keys, but I'm not surprised there's one more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)


internal/manifest/version.go, line 517 at r1 (raw file):

// The returned files are a subsequence of the input files, i.e., the ordering
// is not changed.
func (v *Version) Overlaps(level int, cmp Compare, start, end []byte) LevelSlice {

You're handling the f.Largest.UserKey == start && f.Largest.Trailer = RangeDelSentinel case correctly, but we can do better and also handle the end == RangeDelSentinel && f.Smallest.UserKey == end case. But since end is a user key, you'd have to add an exclusiveEnd bool in the signature and make the relevant signature changes everywhere.

What do you think? I think that'd be a good idea to also fix while we're addressing this, but if you feel otherwise, I'm good with :lgtm: -ing this as-is.

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Reviewed 3 of 4 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @itsbilal and @jbowens)


internal/manifest/version.go, line 517 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

You're handling the f.Largest.UserKey == start && f.Largest.Trailer = RangeDelSentinel case correctly, but we can do better and also handle the end == RangeDelSentinel && f.Smallest.UserKey == end case. But since end is a user key, you'd have to add an exclusiveEnd bool in the signature and make the relevant signature changes everywhere.

What do you think? I think that'd be a good idea to also fix while we're addressing this, but if you feel otherwise, I'm good with :lgtm: -ing this as-is.

+1 to handling the end being exclusive case. There seem to be enough callers who have an [InternalKey, InternalKey] and we are losing information by passing in only the UserKey.


testdata/range_del, line 1269 at r1 (raw file):

----
1:
  000008:[a-c]

can this test be changed to print the InternalKey bounds. It will make it easier to see that the test cases are actually correct.

@jbowens jbowens force-pushed the overlaps-sentinel branch 7 times, most recently from 70c92e2 to e4fab74 Compare January 5, 2022 00:18
Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the exclusiveEnd parameter. I also added a (*InternalKey).IsExclusiveSentinel helper, because range keys will also have exclusive sentinel key kinds. Some of this logic will need to further change with range keys, for example, levelIter and mergingIter will only care if the largest key that affects points (eg, point key or range del) is a sentinel. I've deferred that for now.

Reviewable status: 0 of 23 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


internal/manifest/version.go, line 517 at r1 (raw file):

Previously, sumeerbhola wrote…

+1 to handling the end being exclusive case. There seem to be enough callers who have an [InternalKey, InternalKey] and we are losing information by passing in only the UserKey.

Done.


testdata/range_del, line 1269 at r1 (raw file):

Previously, sumeerbhola wrote…

can this test be changed to print the InternalKey bounds. It will make it easier to see that the test cases are actually correct.

Done.

Copy link
Copy Markdown
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 6 of 17 files at r2, 17 of 17 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)

Copy link
Copy Markdown
Contributor

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 17 files at r2, 16 of 17 files at r3.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jbowens and @sumeerbhola)


internal/manifest/version.go, line 275 at r3 (raw file):

	if !endIter.iter.valid() {
		endIter.Prev()
	} else if c := cmp(endIter.Current().Smallest.UserKey, end); c > 0 || c == 0 && exclusiveEnd {

nit: we don't need to do the comparison if !exclusiveEnd, so may as well avoid it.


internal/manifest/version_test.go, line 264 at r3 (raw file):

		// Level 2: empty.
		{2, "a", "z", false, ""},
	}

I'm being paranoid here, but curious if coverprofile shows that we've covered all the code paths.

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @sumeerbhola)


internal/manifest/version.go, line 275 at r3 (raw file):

Previously, sumeerbhola wrote…

nit: we don't need to do the comparison if !exclusiveEnd, so may as well avoid it.

I might be misunderstanding, but if !exclusiveEnd the above for loop might've Nexted to a file with Smallest.UserKey > end, and we need the comparison to check for that.


internal/manifest/version_test.go, line 264 at r3 (raw file):

Previously, sumeerbhola wrote…

I'm being paranoid here, but curious if coverprofile shows that we've covered all the code paths.

Just checked and it does 👍

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 16 of 23 files reviewed, 3 unresolved discussions (waiting on @itsbilal, @jbowens, and @sumeerbhola)


internal/manifest/version.go, line 275 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I might be misunderstanding, but if !exclusiveEnd the above for loop might've Nexted to a file with Smallest.UserKey > end, and we need the comparison to check for that.

I see. Could you add a tiny example in a code comment -- it will make future readers very happy :)


internal/manifest/version_test.go, line 264 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Just checked and it does 👍

Thanks

When the largest key in a sstable is a range tombstone, the sstable's largest
boundary is set to the range deletion sentinel with the tombstone's exclusive
end boundary and the maximum sequence number. The range deletion sentinel
serves as a marker, indicating that the file's bounds end immediately before
the first key with the sentinel's user key.

In many places such as growing compactions or determining atomic compaction
units, Pebble considers the sentinel as exclusive and avoids unnecessarily
including files that share the same user key but only as a sentinel's exclusive
end bounary.

The `(*manifest.Version).Overlaps` method did not share this logic and only
considered user keys. This method is used during compaction picking when
finding the initial compaction inputs, for determining in-use key ranges,
grandparent files, etc.

This change adapts this method to consider a file with a largest user key equal
to the search range's start user key nonoverlapping if the file's largest key
is a range deletion sentinel. Additionally, Overlaps now takes an exclusiveEnd
parameter, indicating whether the end user key provided to Overlaps should
similarly be treated as an exclusive bound.

This change is expected to reduce write amplification in the presence of range
deletions by avoiding unnecessarily pulling in files due to perceived overlap.

Additionally, some of these RangeDeletionSentinel comparisons are updated to
use a new (*InternalKey).IsExclusiveSentinel helper. With the introduction of
range keys, we will have additional exclusive end boundary key kinds.
Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: 15 of 23 files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @sumeerbhola)


internal/manifest/version.go, line 275 at r3 (raw file):

Previously, sumeerbhola wrote…

I see. Could you add a tiny example in a code comment -- it will make future readers very happy :)

Done!

@jbowens jbowens merged commit d4f0a8d into cockroachdb:master Jan 13, 2022
@jbowens jbowens deleted the overlaps-sentinel branch January 13, 2022 02:41
jbowens added a commit to jbowens/pebble that referenced this pull request Jan 20, 2022
When Version.Overlaps is called for L0, the overlap window is
iteratively expanded until stable. In cockroachdb#1432, the Overlaps function was
adjusted to allow specifying that the end bound should be considered
exclusive. However, cockroachdb#1432 failed to update the exclusivity of the end
bound when the range widened. This improperly excluded files with
largest keys that exactly equaled the new widened end bound.

This commit also transforms the TestOverlaps test into a datadriven
test, introducing a few helpers for parsing the DebugString output of a
Version.

Fix cockroachdb#1459.
jbowens added a commit to jbowens/pebble that referenced this pull request Jan 21, 2022
When Version.Overlaps is called for L0, the overlap window is
iteratively expanded until stable. In cockroachdb#1432, the Overlaps function was
adjusted to allow specifying that the end bound should be considered
exclusive. However, cockroachdb#1432 failed to update the exclusivity of the end
bound when the range widened. This improperly excluded files with
largest keys that exactly equaled the new widened end bound.

This commit also transforms the TestOverlaps test into a datadriven
test, introducing a few helpers for parsing the DebugString output of a
Version.

Fix cockroachdb#1459.
jbowens added a commit that referenced this pull request Jan 21, 2022
When Version.Overlaps is called for L0, the overlap window is
iteratively expanded until stable. In #1432, the Overlaps function was
adjusted to allow specifying that the end bound should be considered
exclusive. However, #1432 failed to update the exclusivity of the end
bound when the range widened. This improperly excluded files with
largest keys that exactly equaled the new widened end bound.

This commit also transforms the TestOverlaps test into a datadriven
test, introducing a few helpers for parsing the DebugString output of a
Version.

Fix #1459.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants