*: Skip over files with no point keys in point key levelIter#1568
*: Skip over files with no point keys in point key levelIter#1568itsbilal merged 1 commit intocockroachdb:masterfrom
Conversation
|
Flushing this relatively small change before the levelIter pieces land |
be3596b to
759314b
Compare
nicktrav
left a comment
There was a problem hiding this comment.
Nice! Are there any tests we can add to manifest.LevelIter to exercise the filtering?
Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @itsbilal and @jbowens)
level_iter.go, line 233 at r1 (raw file):
// (see the comment in that function). m := l.files.SeekGE(l.cmp, key, manifest.SearchKeyPoint)
Why not points and ranges here? Would we never call (*levelIter).SeekGE with a range key? Same below. I may be missing some context.
internal/manifest/level_metadata.go, line 253 at r1 (raw file):
// SearchKeyCombined denotes a search key in the merged view of point and // range keys. No files are skipped. SearchKeyCombined SearchKey = iota
super nit: elsewhere where there are point and range keys together, we've used ...PointsAndRanges / ...PointAndRange. Do we want to make this consistent?
internal/manifest/level_metadata.go, line 259 at r1 (raw file):
// SearchKeyRange denotes a search key among the range keyspace. SSTables with // no range keys will be skipped. SearchKeyRange
I assume this will start being used in the next set of PRs? I didn't see it used here.
internal/manifest/level_metadata.go, line 418 at r1 (raw file):
optimization to annotate the tree and efficiently skip over files
Nice.
get_iter_test.go, line 511 at r1 (raw file):
} } meta.SmallestPointKey = meta.Smallest
We shouldn't need to set these bounds fields directly now (see comment here). These two lines should be handled by L499.
level_iter_test.go, line 58 at r1 (raw file):
meta := (&fileMetadata{ FileNum: FileNum(len(metas)), }).ExtendPointKeyBounds(
Nice catch.
jbowens
left a comment
There was a problem hiding this comment.
I'm wondering whether we should lift this configuration onto the LevelIterator itself. Specifically, it's not just LevelIterator seeks that need to consider whether a file contains range keys or point keys. Nexts and Prevs too. It seems like what we really want is the ability to project a view of the level that only considers range keys or only considers point keys.
One possibility is to add a Filter method to level iterators. By default a level iterator would be constructed to iterate over all files (SearchKeyCombined or maybe renamed KeyTypesPointsAndRanges or something). The Filter method would return a LevelIterator that will guarantees it'll never return a FileMetadata that does not contain keys of the configured key type, and so it's free to seek based on the configured key type's bounds rather than the combined bounds. That would also allow us to hide the eventual b-tree annotation within the internal/manifest package.
func (i LevelIterator) Filter(keyType KeyType) LevelIterator
Another thought, which I haven't thought through the implications of, is to introduce two separate PointLevelIterator and RangeLevelIterator types that are lightweight wrappers around a LevelIterator. The goal would be to have compile-time type safety assurances that we're using the appropriate iterator everywhere. There may be reasons this is too cumbersome.
Reviewed all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @itsbilal)
759314b to
323d479
Compare
itsbilal
left a comment
There was a problem hiding this comment.
I like the Filter idea. Implemented that instead - thanks! Also has the benefit of really slimming down this PR. Also added a test as part of TestLevelIterSeek.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @nicktrav)
level_iter.go, line 233 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Why not points and ranges here? Would we never call
(*levelIter).SeekGEwith a range key? Same below. I may be missing some context.
We wouldn't, no. This level iter is exclusively for point keys (and rangedels).
internal/manifest/level_metadata.go, line 253 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
super nit: elsewhere where there are point and range keys together, we've used
...PointsAndRanges/...PointAndRange. Do we want to make this consistent?
Done.
internal/manifest/level_metadata.go, line 259 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
I assume this will start being used in the next set of PRs? I didn't see it used here.
That is correct. It'll be used in the rangekey levelIter PR for one. We could drop it from here and introduce it there if we want.
get_iter_test.go, line 511 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
We shouldn't need to set these bounds fields directly now (see comment here). These two lines should be handled by L499.
Done.
nicktrav
left a comment
There was a problem hiding this comment.
All my questions / comments answered.
Reviewed 12 of 12 files at r2, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @itsbilal)
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal)
internal/manifest/level_metadata.go, line 266 at r2 (raw file):
// LevelIterator positioning operations. Files not containing any keys of the // desired type are skipped. type KeyType int
nit: probably doesn't matter in practice since I think everything else in the LevelIterator struct is 8-byte aligned, but this could be an int8.
This change updates the LevelMetadata positioning calls to take into account whether we are interested in point keys, range keys, or both/combined. Currently we just skip over files that do not have the desired key type by iterating past them. In future changes, we will optimize this seek to be more efficient if we want to skip over files with no range keys, through the use of btree annotations.
323d479 to
94decca
Compare
itsbilal
left a comment
There was a problem hiding this comment.
TFTRs!
Reviewable status: 11 of 12 files reviewed, all discussions resolved (waiting on @nicktrav)
internal/manifest/level_metadata.go, line 266 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: probably doesn't matter in practice since I think everything else in the
LevelIteratorstruct is 8-byte aligned, but this could be anint8.
Done.
This change updates the LevelMetadata Seek{GE,LT} calls to
support specifying whether we are interested in files with
point keys, range keys, or both/combined. In future changes,
we will optimize this seek to be more efficient if we want
to skip over files with no range keys.