Skip to content

storage/engine: add Pebble support to ExportToSst#42134

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
hueypark:support-export
Nov 7, 2019
Merged

storage/engine: add Pebble support to ExportToSst#42134
craig[bot] merged 1 commit intocockroachdb:masterfrom
hueypark:support-export

Conversation

@hueypark
Copy link
Copy Markdown
Contributor

@hueypark hueypark commented Nov 2, 2019

Most of these commits are from engine: Add ExportToSst method with all logic moved to C++.

Fixes #41739

Release note: None

@hueypark hueypark requested a review from a team as a code owner November 2, 2019 08:36
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work so far! This looks pretty solid overall - just some relatively minor comments. Thanks for doing this!

Reviewed 18 of 18 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @hueypark)


pkg/ccl/storageccl/export_test.go, line 230 at r1 (raw file):

		EndTime:                             endTime,
		UpperBound:                          endKey,
		EnableTimeBoundIteratorOptimization: enableTimeBoundIteratorOptimization,

I see that enableTimeBoundIteratorOptimization is a parameter of this method, but IterOptions.{Min, Max}TimestampHint isn't being set accordingly if it's true. Maybe add that check right before NewMVCCIncrementalIterator is called?


pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):

}

// SSTSender is the sst sender interface to an engine's data.

Nit: Probably not the best idea to describe an interface with its name alone, maybe "SSTSender is an interface to send SST data to an engine" ?


pkg/storage/engine/pebble_batch.go, line 95 at r1 (raw file):

	start, end MVCCKey, exportAllRevisions bool, io IterOptions,
) ([]byte, roachpb.BulkOpSummary, error) {
	return pebbleExportToSst(p, start, end, exportAllRevisions, io)

While this works, it's also worthwhile to just panic here for consistency with rocksdb. See comment below.


pkg/storage/engine/rocksdb.go, line 1711 at r1 (raw file):

	start, end MVCCKey, exportAllRevisions bool, io IterOptions,
) ([]byte, roachpb.BulkOpSummary, error) {
	return r.parent.ExportToSst(start, end, exportAllRevisions, io)

Looks like this is just a passthrough onto the parent db instance's ExportToSst, potentially skipping over mutations in this batch. The previous code didn't support ExportToSst on a rocksdb batch anyway, so it's probably worthwhile to just do a panic("unimplemented") here in case we ever unintentionally exercise this functionality.


pkg/storage/engine/sst_writer.go, line 151 at r1 (raw file):

// MakeRocksSST creates RocksDB SST.
func MakeRocksSST(kvs []MVCCKeyValue) ([]byte, error) {
	w, err := MakeRocksDBSstFileWriter()

I see this was copied over from testing code two different files, one is now in bulk and the other is in engine.. But I'm not sure how good of an idea it is to make it an exported non-test function - it's only called in testing code, plus we want to remove usage of MakeRocksDBSstFileWriter. I'd say move these two functions back to the test files (even if it's duplicated in both of them) and make them unexported?

Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @hueypark)


pkg/storage/engine/engine.go, line 193 at r1 (raw file):

	// SSTable containing the exported keys, the size of exported data, or an error.
	ExportToSst(
		start, end MVCCKey, exportAllRevisions bool, io IterOptions,

The semantics of the range exported are a bit subtle. The fact that start.Key is inclusive while start.Timestamp is exclusive could be confusing. I think it might be better to have separate startKey,endKey roachpb.Key and startTS, endTS hlc.Timestamp options. Rather than make this change in this PR, can you add a TODO?

Copy link
Copy Markdown
Contributor Author

@hueypark hueypark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)


pkg/ccl/storageccl/export_test.go, line 230 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I see that enableTimeBoundIteratorOptimization is a parameter of this method, but IterOptions.{Min, Max}TimestampHint isn't being set accordingly if it's true. Maybe add that check right before NewMVCCIncrementalIterator is called?

Nice catch! Done.


pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Nit: Probably not the best idea to describe an interface with its name alone, maybe "SSTSender is an interface to send SST data to an engine" ?

Thanks. Done.


pkg/storage/engine/engine.go, line 193 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

The semantics of the range exported are a bit subtle. The fact that start.Key is inclusive while start.Timestamp is exclusive could be confusing. I think it might be better to have separate startKey,endKey roachpb.Key and startTS, endTS hlc.Timestamp options. Rather than make this change in this PR, can you add a TODO?

I added TODO.


pkg/storage/engine/pebble_batch.go, line 95 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

While this works, it's also worthwhile to just panic here for consistency with rocksdb. See comment below.

Done.


pkg/storage/engine/rocksdb.go, line 1711 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Looks like this is just a passthrough onto the parent db instance's ExportToSst, potentially skipping over mutations in this batch. The previous code didn't support ExportToSst on a rocksdb batch anyway, so it's probably worthwhile to just do a panic("unimplemented") here in case we ever unintentionally exercise this functionality.

Done.


pkg/storage/engine/sst_writer.go, line 151 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I see this was copied over from testing code two different files, one is now in bulk and the other is in engine.. But I'm not sure how good of an idea it is to make it an exported non-test function - it's only called in testing code, plus we want to remove usage of MakeRocksDBSstFileWriter. I'd say move these two functions back to the test files (even if it's duplicated in both of them) and make them unexported?

Done.

Copy link
Copy Markdown
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: save for two really minor suggestions. Thanks so much!

Reviewed 10 of 10 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @hueypark, @itsbilal, and @petermattis)


pkg/ccl/storageccl/export_test.go, line 231 at r2 (raw file):

	if enableTimeBoundIteratorOptimization {
		io.MaxTimestampHint = endTime
		io.MinTimestampHint = startTime

While this works too, startTime.Next() is more correct here- and matches what the old code did.


pkg/storage/bulk/sst_batcher.go, line 341 at r1 (raw file):

Previously, hueypark (Jaewan Park) wrote…

Thanks. Done.

👍


pkg/storage/engine/engine.go, line 192 at r2 (raw file):

	// requested or if the start.Timestamp is non-zero. Returns the bytes of an
	// SSTable containing the exported keys, the size of exported data, or an error.
	// TODO(hueypark): Separate MVCCKey into roachpb.Key and hlc.Timestamp.

Nit: add a blank line before the TODO to make it more obvious.

Most of these commits are from [engine: Add ExportToSst method with all logic moved to C++](cockroachdb@75f1314).

Fixes cockroachdb#41739

Release note: None
Copy link
Copy Markdown
Contributor Author

@hueypark hueypark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal and @petermattis)


pkg/ccl/storageccl/export_test.go, line 231 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

While this works too, startTime.Next() is more correct here- and matches what the old code did.

Done.


pkg/storage/engine/engine.go, line 192 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Nit: add a blank line before the TODO to make it more obvious.

Done.

Copy link
Copy Markdown
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

I'll go ahead and have this merged now, thanks!

bors r+

Reviewed 4 of 4 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 7, 2019

Build failed

@petermattis
Copy link
Copy Markdown
Collaborator

Unrelated CI flake. Retrying.

bors r+

craig bot pushed a commit that referenced this pull request Nov 7, 2019
42134: storage/engine: add Pebble support to ExportToSst r=petermattis a=hueypark

Most of these commits are from [engine: Add ExportToSst method with all logic moved to C++](75f1314).

Fixes #41739

Release note: None

Co-authored-by: Jaewan Park <jaewan.huey.park@gmail.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 7, 2019

Build succeeded

@craig craig bot merged commit b058f23 into cockroachdb:master Nov 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

storage/engine: add Pebble support to ExportToSst

4 participants