Skip to content

*: Skip over files with no point keys in point key levelIter#1568

Merged
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:levelmeta-seek-rangekey
Mar 15, 2022
Merged

*: Skip over files with no point keys in point key levelIter#1568
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:levelmeta-seek-rangekey

Conversation

@itsbilal
Copy link
Copy Markdown
Contributor

@itsbilal itsbilal commented Mar 9, 2022

This change updates the LevelMetadata Seek{GE,LT} calls to
support specifying whether we are interested in files with
point keys, range keys, or both/combined. In future changes,
we will optimize this seek to be more efficient if we want
to skip over files with no range keys.

@itsbilal itsbilal requested review from a team, jbowens and nicktrav March 9, 2022 20:30
@itsbilal itsbilal self-assigned this Mar 9, 2022
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@itsbilal
Copy link
Copy Markdown
Contributor Author

itsbilal commented Mar 9, 2022

Flushing this relatively small change before the levelIter pieces land

@itsbilal itsbilal force-pushed the levelmeta-seek-rangekey branch from be3596b to 759314b Compare March 9, 2022 20:38
Copy link
Copy Markdown
Contributor

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Are there any tests we can add to manifest.LevelIter to exercise the filtering?

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @itsbilal and @jbowens)


level_iter.go, line 233 at r1 (raw file):

	// (see the comment in that function).

	m := l.files.SeekGE(l.cmp, key, manifest.SearchKeyPoint)

Why not points and ranges here? Would we never call (*levelIter).SeekGE with a range key? Same below. I may be missing some context.


internal/manifest/level_metadata.go, line 253 at r1 (raw file):

	// SearchKeyCombined denotes a search key in the merged view of point and
	// range keys. No files are skipped.
	SearchKeyCombined SearchKey = iota

super nit: elsewhere where there are point and range keys together, we've used ...PointsAndRanges / ...PointAndRange. Do we want to make this consistent?


internal/manifest/level_metadata.go, line 259 at r1 (raw file):

	// SearchKeyRange denotes a search key among the range keyspace. SSTables with
	// no range keys will be skipped.
	SearchKeyRange

I assume this will start being used in the next set of PRs? I didn't see it used here.


internal/manifest/level_metadata.go, line 418 at r1 (raw file):

optimization to annotate the tree and efficiently skip over files

Nice.


get_iter_test.go, line 511 at r1 (raw file):

					}
				}
				meta.SmallestPointKey = meta.Smallest

We shouldn't need to set these bounds fields directly now (see comment here). These two lines should be handled by L499.


level_iter_test.go, line 58 at r1 (raw file):

				meta := (&fileMetadata{
					FileNum: FileNum(len(metas)),
				}).ExtendPointKeyBounds(

Nice catch.

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we should lift this configuration onto the LevelIterator itself. Specifically, it's not just LevelIterator seeks that need to consider whether a file contains range keys or point keys. Nexts and Prevs too. It seems like what we really want is the ability to project a view of the level that only considers range keys or only considers point keys.

One possibility is to add a Filter method to level iterators. By default a level iterator would be constructed to iterate over all files (SearchKeyCombined or maybe renamed KeyTypesPointsAndRanges or something). The Filter method would return a LevelIterator that will guarantees it'll never return a FileMetadata that does not contain keys of the configured key type, and so it's free to seek based on the configured key type's bounds rather than the combined bounds. That would also allow us to hide the eventual b-tree annotation within the internal/manifest package.

func (i LevelIterator) Filter(keyType KeyType) LevelIterator

Another thought, which I haven't thought through the implications of, is to introduce two separate PointLevelIterator and RangeLevelIterator types that are lightweight wrappers around a LevelIterator. The goal would be to have compile-time type safety assurances that we're using the appropriate iterator everywhere. There may be reasons this is too cumbersome.

Reviewed all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @itsbilal)

Copy link
Copy Markdown
Contributor Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the Filter idea. Implemented that instead - thanks! Also has the benefit of really slimming down this PR. Also added a test as part of TestLevelIterSeek.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @nicktrav)


level_iter.go, line 233 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

Why not points and ranges here? Would we never call (*levelIter).SeekGE with a range key? Same below. I may be missing some context.

We wouldn't, no. This level iter is exclusively for point keys (and rangedels).


internal/manifest/level_metadata.go, line 253 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

super nit: elsewhere where there are point and range keys together, we've used ...PointsAndRanges / ...PointAndRange. Do we want to make this consistent?

Done.


internal/manifest/level_metadata.go, line 259 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

I assume this will start being used in the next set of PRs? I didn't see it used here.

That is correct. It'll be used in the rangekey levelIter PR for one. We could drop it from here and introduce it there if we want.


get_iter_test.go, line 511 at r1 (raw file):

Previously, nicktrav (Nick Travers) wrote…

We shouldn't need to set these bounds fields directly now (see comment here). These two lines should be handled by L499.

Done.

Copy link
Copy Markdown
Contributor

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my questions / comments answered. :lgtm:

Reviewed 12 of 12 files at r2, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @itsbilal)

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal)


internal/manifest/level_metadata.go, line 266 at r2 (raw file):

// LevelIterator positioning operations. Files not containing any keys of the
// desired type are skipped.
type KeyType int

nit: probably doesn't matter in practice since I think everything else in the LevelIterator struct is 8-byte aligned, but this could be an int8.

This change updates the LevelMetadata positioning calls to
take into account whether we are interested in
point keys, range keys, or both/combined. Currently we just
skip over files that do not have the desired key type by
iterating past them.

In future changes, we will optimize this seek to be more
efficient if we want to skip over files with no range keys, through
the use of btree annotations.
@itsbilal itsbilal force-pushed the levelmeta-seek-rangekey branch from 323d479 to 94decca Compare March 15, 2022 14:57
Copy link
Copy Markdown
Contributor Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: 11 of 12 files reviewed, all discussions resolved (waiting on @nicktrav)


internal/manifest/level_metadata.go, line 266 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: probably doesn't matter in practice since I think everything else in the LevelIterator struct is 8-byte aligned, but this could be an int8.

Done.

@itsbilal itsbilal merged commit 4ad1b57 into cockroachdb:master Mar 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants