*: Expose range key iterators through table cache#1517
*: Expose range key iterators through table cache#1517itsbilal merged 1 commit intocockroachdb:masterfrom
Conversation
c739fbf to
41957a7
Compare
jbowens
left a comment
There was a problem hiding this comment.
Looks great. I forgot about the block property filter complication. Pinging @nicktrav because he might remember the options we were considering before.
Reviewed 4 of 7 files at r1, all commit messages.
Reviewable status: 4 of 7 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @nicktrav)
db.go, line 264 at r1 (raw file):
tableCache *tableCacheContainer newIters tableNewIters tableNewRangeKeyIter tableNewRangeKeyIter
nit: call the field newRangeKeyIter to match the naming of newIters/tableNewIters?
level_iter.go, line 26 at r1 (raw file):
// tableNewRangeKeyIter creates a new range key iterator for the given file // number. type tableNewRangeKeyIter func(file *manifest.FileMetadata, opts *IterOptions) (internalIterator, error)
Rather than internalIterator, I think we want the return type to be new keyspan.FragmentIterator. It's for now a superset of the internalIterator interface, although we intend to eventually disentangle them.
Also, I think we want to move this type definition into internal/rangekey. I'm hoping that we'll also be able to stick the range-key level iterator in that package as well, to keep the range-key iterator stack organized together.
table_cache.go, line 409 at r1 (raw file):
} _, empty, err := c.checkAndIntersectFilters(v, opts)
I think we'll need to tweak something here, because I believe we're collecting different block properties for range keys and point keys, but we currently only have the one shared IterOptions.BlockPropertyFilters.
@nicktrav I completely forgot what we discussed about range key iterators and configuring separate filters for them. Did we have a plan here?
nicktrav
left a comment
There was a problem hiding this comment.
Reviewable status: 4 of 7 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @jbowens)
table_cache.go, line 409 at r1 (raw file):
Did we have a plan here?
We'd talked about this over here. Specifically:
It would be configured with IterOptions.{PointFilters,RangeKeyFilters}.
The general idea was that filters for points and ranges would be set separately via separate fields onIterOptions and you'd use the appropriate slice depending on the situation.
nicktrav
left a comment
There was a problem hiding this comment.
Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @itsbilal and @jbowens)
table_cache.go, line 308 at r1 (raw file):
func (c *tableCacheShard) checkAndIntersectFilters( v *tableCacheValue, opts *IterOptions, ) (filterer *sstable.BlockPropertiesFilterer, empty bool, err error) {
Could we adopt ok-like semantics and return true if the filter intersects, and false if it doesn't? e.g. rename empty to intersects and flip the return values.
table_cache.go, line 316 at r1 (raw file):
var bpfs []BlockPropertyFilter if opts != nil {
Could we return early ifopts == nil at the top of the method instead? Saves checking twice, and bpfs := opts.BlockPropertyFilters will be safe to call.
41957a7 to
30781f6
Compare
itsbilal
left a comment
There was a problem hiding this comment.
TFTR!
Reviewable status: 2 of 9 files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @nicktrav)
db.go, line 264 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: call the field
newRangeKeyIterto match the naming ofnewIters/tableNewIters?
That's what I was doing initially, but the issue is that db already has a newRangeKeyIter method that creates a merged rangeKeyIter. Since this function is to be used for creating table-specific iterators, I went with prefixing it with table, even though it's slightly inconsistent with newIters.
level_iter.go, line 26 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Rather than
internalIterator, I think we want the return type to be newkeyspan.FragmentIterator. It's for now a superset of theinternalIteratorinterface, although we intend to eventually disentangle them.Also, I think we want to move this type definition into
internal/rangekey. I'm hoping that we'll also be able to stick the range-key level iterator in that package as well, to keep the range-key iterator stack organized together.
Done - at least about the FragmentIterator. I ended up not moving the definition to internal/RangeKey as untangling the dependency on IterOptions and particularly BlockPropertiesFilter was pretty unintuitive and dirty, so it feels cleaner this way.
table_cache.go, line 308 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Could we adopt
ok-like semantics and returntrueif the filter intersects, andfalseif it doesn't? e.g. renameemptytointersectsand flip the return values.
Done. Good idea!
table_cache.go, line 409 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Did we have a plan here?
We'd talked about this over here. Specifically:
It would be configured with IterOptions.{PointFilters,RangeKeyFilters}.
The general idea was that filters for points and ranges would be set separately via separate fields on
IterOptionsand you'd use the appropriate slice depending on the situation.
Done. Incorporated the change to add those fields to IterOptions too. Thanks!
nicktrav
left a comment
There was a problem hiding this comment.
Reviewed 7 of 7 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @itsbilal and @jbowens)
options.go, line 97 at r2 (raw file):
TableFilter func(userProps map[string]string) bool // PointKeyFilters can be used to avoid scanning tables and blocks in tables // when iterating over point keys. It is requires that this slice is sorted in
nit: it is required
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @itsbilal)
level_iter.go, line 26 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Done - at least about the
FragmentIterator. I ended up not moving the definition tointernal/RangeKeyas untangling the dependency onIterOptionsand particularlyBlockPropertiesFilterwas pretty unintuitive and dirty, so it feels cleaner this way.
Dang, yeah, I forgot about the dependency from the sstable package on internal/rangekey.
sstable/reader.go, line 107 at r2 (raw file):
// support to raw block iterators. For use when the only part of the table // we want to iterate over is contained in one block, eg. the range keys block. type wrappedBlockIter struct {
#1529 reworked the range key iterator to have a separate type fragmentBlockIter, which does its own wrapping of the blockIter. I think we can move the SetCloseHook method onto that new type now.
sstable/reader.go, line 120 at r2 (raw file):
var err error if w.closeHook != nil { err = w.closeHook(w)
Soon keyspan.FragmentIterator will not be able to satisfy the sstable.Iterator interface, so we won't be able to pass the range key iterator into a func(Iterator)error. Can we change the SetCloseHook interface to take a func(FragmentIterator) error instead?
Currently, we instantiate new per-table iterators through a `tableNewIters` function that is defined on the tableCacheContainer and has a reference in the db. This function returns a point key and a rangedel iterator. Having the same function return a range key iterator would be wasteful as we are likely to have point and range key iterators open on _different_ files at once; they will be merged at the high level iterator and the range key iterators will live in their own levelIter. This change adds a newRangeKeyIter function to the table cache that creates this new rangeKeyIter using the Reader managed by the table cache. Also adds a RangeKeyFilters field to IterOptions to specify different block-level filters for range key iteration. Renames BlockFilters to PointKeyFilters.
30781f6 to
319b28b
Compare
itsbilal
left a comment
There was a problem hiding this comment.
TFTRs!
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @itsbilal and @jbowens)
sstable/reader.go, line 107 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
#1529 reworked the range key iterator to have a separate type
fragmentBlockIter, which does its own wrapping of theblockIter. I think we can move theSetCloseHookmethod onto that new type now.
Done.
sstable/reader.go, line 120 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Soon
keyspan.FragmentIteratorwill not be able to satisfy thesstable.Iteratorinterface, so we won't be able to pass the range key iterator into afunc(Iterator)error. Can we change theSetCloseHookinterface to take afunc(FragmentIterator) errorinstead?
Done.
|
options.go, line 97 at r2 (raw file): Previously, nicktrav (Nick Travers) wrote…
oops, made this fix but then lost it in a stash when rebasing 😬 I guess a future PR will fix this grammar. |
Currently, we instantiate new per-table iterators through a
tableNewItersfunction that is defined on the tableCacheContainerand has a reference in the db. This function returns a point key
and a rangedel iterator. Having the same function return a range
key iterator would be wasteful as we are likely to have
point and range key iterators open on different files at once;
they will be merged at the high level iterator and the range key
iterators will live in their own levelIter.
This change adds a newRangeKeyIter function to the table cache
that creates this new rangeKeyIter using the Reader managed
by the table cache.