sstable: add RangeKeyIter iterator#1513
Conversation
Add a range key iterator type intended to be used externally for reading the range-keys contained within an sstable. The interface is simple, providing forward iteration from beginning to end only. Each individual range key is surfaced separately, even if coalesced into single physical fragments.
jbowens
left a comment
There was a problem hiding this comment.
@erikgrinaker A simple forward iteration interface is enough, right?
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @erikgrinaker, @nicktrav, and @sumeerbhola)
erikgrinaker
left a comment
There was a problem hiding this comment.
A simple forward iteration interface is enough, right?
Yep, that'll do, thanks! Would it be easy to allow lower/upper key bounds? Otherwise, it's trivial to just do the filtering in the MVCC range key iterator too.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)
erikgrinaker
left a comment
There was a problem hiding this comment.
On second thought, we'd ideally want the CRDB SSTIterator to satisfy the SimpleMVCCIterator interface, which uses interleaved point/range key iteration. This is based on sstable.Reader.NewIter(). Would it be possible to reuse the interleaving logic that we get from pebble.Iterator?
If not, nbd -- I can create separate SST iterator implementations for point and range keys, which covers my needs for now, and we can revisit this later.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)
nicktrav
left a comment
There was a problem hiding this comment.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @sumeerbhola)
sumeerbhola
left a comment
There was a problem hiding this comment.
On second thought, we'd ideally want the CRDB SSTIterator to satisfy the SimpleMVCCIterator interface, which uses interleaved point/range key iteration.
Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Keys.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)
sstable/external.go, line 100 at r1 (raw file):
endKey, value, ok := rangekey.DecodeEndKey(i.iterKey.Kind(), v) if !ok { panic("pebble: unable to decode rangekey end")
I believe this can happen due to a corruption, in which case we should not panic.
erikgrinaker
left a comment
There was a problem hiding this comment.
Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Keys.
These will e.g. be emitted by Export or ingested by AddSSTable, so they're generated by client and server logic. Consider e.g. streaming replication, which will reads events off of a rangefeed, build SSTs which include range tombstones, and ingest these into the target cluster. Requiring range keys to have roachpb.Key bounds seems perfectly fine for these, and we can assert that in AddSSTable during race test builds (or maybe always, if it's cheap enough).
The iteration is needed e.g. by backup restoration reading exported SSTs from an S3 bucket somewhere, combining them to figure out the latest visible point keys, then building an SST containing the final visible point keys to be ingested and restored. It's also needed for tests.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)
|
Closing in favor of #1529. |
Add a range key iterator type intended to be used externally for reading the
range-keys contained within an sstable. The interface is simple, providing
forward iteration from beginning to end only. Each individual range key is
surfaced separately, even if coalesced into single physical fragments.