storage/engine: add Pebble support to ExportToSst#42134
storage/engine: add Pebble support to ExportToSst#42134craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
itsbilal
left a comment
There was a problem hiding this comment.
Great work so far! This looks pretty solid overall - just some relatively minor comments. Thanks for doing this!
Reviewed 18 of 18 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @hueypark)
pkg/ccl/storageccl/export_test.go, line 230 at r1 (raw file):
EndTime: endTime, UpperBound: endKey, EnableTimeBoundIteratorOptimization: enableTimeBoundIteratorOptimization,
I see that enableTimeBoundIteratorOptimization is a parameter of this method, but IterOptions.{Min, Max}TimestampHint isn't being set accordingly if it's true. Maybe add that check right before NewMVCCIncrementalIterator is called?
pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):
} // SSTSender is the sst sender interface to an engine's data.
Nit: Probably not the best idea to describe an interface with its name alone, maybe "SSTSender is an interface to send SST data to an engine" ?
pkg/storage/engine/pebble_batch.go, line 95 at r1 (raw file):
start, end MVCCKey, exportAllRevisions bool, io IterOptions, ) ([]byte, roachpb.BulkOpSummary, error) { return pebbleExportToSst(p, start, end, exportAllRevisions, io)
While this works, it's also worthwhile to just panic here for consistency with rocksdb. See comment below.
pkg/storage/engine/rocksdb.go, line 1711 at r1 (raw file):
start, end MVCCKey, exportAllRevisions bool, io IterOptions, ) ([]byte, roachpb.BulkOpSummary, error) { return r.parent.ExportToSst(start, end, exportAllRevisions, io)
Looks like this is just a passthrough onto the parent db instance's ExportToSst, potentially skipping over mutations in this batch. The previous code didn't support ExportToSst on a rocksdb batch anyway, so it's probably worthwhile to just do a panic("unimplemented") here in case we ever unintentionally exercise this functionality.
pkg/storage/engine/sst_writer.go, line 151 at r1 (raw file):
// MakeRocksSST creates RocksDB SST. func MakeRocksSST(kvs []MVCCKeyValue) ([]byte, error) { w, err := MakeRocksDBSstFileWriter()
I see this was copied over from testing code two different files, one is now in bulk and the other is in engine.. But I'm not sure how good of an idea it is to make it an exported non-test function - it's only called in testing code, plus we want to remove usage of MakeRocksDBSstFileWriter. I'd say move these two functions back to the test files (even if it's duplicated in both of them) and make them unexported?
petermattis
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @hueypark)
pkg/storage/engine/engine.go, line 193 at r1 (raw file):
// SSTable containing the exported keys, the size of exported data, or an error. ExportToSst( start, end MVCCKey, exportAllRevisions bool, io IterOptions,
The semantics of the range exported are a bit subtle. The fact that start.Key is inclusive while start.Timestamp is exclusive could be confusing. I think it might be better to have separate startKey,endKey roachpb.Key and startTS, endTS hlc.Timestamp options. Rather than make this change in this PR, can you add a TODO?
d8a5193 to
facb927
Compare
hueypark
left a comment
There was a problem hiding this comment.
Thank you for the review!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)
pkg/ccl/storageccl/export_test.go, line 230 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I see that
enableTimeBoundIteratorOptimizationis a parameter of this method, butIterOptions.{Min, Max}TimestampHintisn't being set accordingly if it's true. Maybe add that check right beforeNewMVCCIncrementalIteratoris called?
Nice catch! Done.
pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Nit: Probably not the best idea to describe an interface with its name alone, maybe "SSTSender is an interface to send SST data to an engine" ?
Thanks. Done.
pkg/storage/engine/engine.go, line 193 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
The semantics of the range exported are a bit subtle. The fact that
start.Keyis inclusive whilestart.Timestampis exclusive could be confusing. I think it might be better to have separatestartKey,endKey roachpb.KeyandstartTS, endTS hlc.Timestampoptions. Rather than make this change in this PR, can you add a TODO?
I added TODO.
pkg/storage/engine/pebble_batch.go, line 95 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
While this works, it's also worthwhile to just panic here for consistency with rocksdb. See comment below.
Done.
pkg/storage/engine/rocksdb.go, line 1711 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Looks like this is just a passthrough onto the parent db instance's
ExportToSst, potentially skipping over mutations in this batch. The previous code didn't supportExportToSston a rocksdb batch anyway, so it's probably worthwhile to just do apanic("unimplemented")here in case we ever unintentionally exercise this functionality.
Done.
pkg/storage/engine/sst_writer.go, line 151 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I see this was copied over from testing code two different files, one is now in
bulkand the other is inengine.. But I'm not sure how good of an idea it is to make it an exported non-test function - it's only called in testing code, plus we want to remove usage ofMakeRocksDBSstFileWriter. I'd say move these two functions back to the test files (even if it's duplicated in both of them) and make them unexported?
Done.
itsbilal
left a comment
There was a problem hiding this comment.
save for two really minor suggestions. Thanks so much!
Reviewed 10 of 10 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @hueypark, @itsbilal, and @petermattis)
pkg/ccl/storageccl/export_test.go, line 231 at r2 (raw file):
if enableTimeBoundIteratorOptimization { io.MaxTimestampHint = endTime io.MinTimestampHint = startTime
While this works too, startTime.Next() is more correct here- and matches what the old code did.
pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):
Previously, hueypark (Jaewan Park) wrote…
Thanks. Done.
👍
pkg/storage/engine/engine.go, line 192 at r2 (raw file):
// requested or if the start.Timestamp is non-zero. Returns the bytes of an // SSTable containing the exported keys, the size of exported data, or an error. // TODO(hueypark): Separate MVCCKey into roachpb.Key and hlc.Timestamp.
Nit: add a blank line before the TODO to make it more obvious.
Most of these commits are from [engine: Add ExportToSst method with all logic moved to C++](cockroachdb@75f1314). Fixes cockroachdb#41739 Release note: None
facb927 to
b058f23
Compare
hueypark
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal and @petermattis)
pkg/ccl/storageccl/export_test.go, line 231 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
While this works too,
startTime.Next()is more correct here- and matches what the old code did.
Done.
pkg/storage/engine/engine.go, line 192 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Nit: add a blank line before the TODO to make it more obvious.
Done.
itsbilal
left a comment
There was a problem hiding this comment.
I'll go ahead and have this merged now, thanks!
bors r+
Reviewed 4 of 4 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)
Build failed |
|
Unrelated CI flake. Retrying. bors r+ |
Build succeeded |
Most of these commits are from engine: Add ExportToSst method with all logic moved to C++.
Fixes #41739
Release note: None