Add command to stream contents of DB into another DB. #1463
parasssh
left a comment
LGTM but I think Ibrahim should also approve.
jarifibrahim
left a comment
Got some comments.
Reviewable status: 0 of 3 files reviewed, 11 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @martinmr)
db.go, line 1755 at r1 (raw file):
// Stream the contents of this DB to a new DB with options outOptions that will be
// created in outDir.
func (db *DB) StreamDB(outDir string, outOptions Options) error {
You don't need outDir. outOptions already contains the outDir.
db.go, line 1756 at r1 (raw file):
// created in outDir.
func (db *DB) StreamDB(outDir string, outOptions Options) error {
	if err := os.MkdirAll(outDir, 0700); err != nil {
We can remove this as well.
db_test.go, line 385 at r1 (raw file):
require.NoError(t, err)
defer removeDir(dir)
opts := getTestOptions(dir)
Compression is disabled by default. You should enable it here so that we can disable it while streaming out.
db_test.go, line 403 at r1 (raw file):
outDir, err := ioutil.TempDir("", "badger-test")
require.NoError(t, err)
outOpt := getTestOptions(outDir).WithCompression(options.None).WithReadOnly(false)
You can remove WithCompression and WithReadOnly. They're set correctly by default.
db_test.go, line 415 at r1 (raw file):
key := []byte(fmt.Sprintf("key%d", i))
val := []byte(fmt.Sprintf("val%d", i))
txn := db.NewTransactionAt(1, false)
This should be outDB. Inserted into db, read from outDB.
Or you could do a Get on both DBs and compare the values.
badger/cmd/stream.go, line 32 at r1 (raw file):
Short: "Stream DB into another DB with different options",
Long: `
This command streams the contents of this DB into another DB with the given options.
Mention over here that outDir should be empty.
The stream writer will drop all data if the directory already contains data.
You can add two checks as well:
- If the outDir exists, ensure it is empty.
- If the outDir exists and it is non-empty, abort.
We shouldn't modify an existing Badger DB. If there is data in outDir, the user can use a different directory or clean up the outDir.
badger/cmd/stream.go, line 42 at r1 (raw file):
// TODO: Add more options.
RootCmd.AddCommand(streamCmd)
streamCmd.Flags().StringVarP(&outDir, "out", "o", "", "Path to input DB")
Path to output DB
badger/cmd/stream.go, line 45 at r1 (raw file):
streamCmd.Flags().BoolVarP(&truncate, "truncate", "", false, "Option to truncate the DBs")
streamCmd.Flags().BoolVarP(&readOnly, "read_only", "", true, "Option to open in DB in read-only mode")
Option to open input DB in read-only mode.
badger/cmd/stream.go, line 54 at r1 (raw file):
WithTruncate(truncate).
WithValueThreshold(1 << 10 /* 1KB */).
WithNumVersionsToKeep(math.MaxInt32)
Add a flag for NumVersionsToKeep. Use math.MaxInt32 value if the flag is set to 0 (mention this in the flag description).
badger/cmd/stream.go, line 58 at r1 (raw file):
// Options for output DB.
outOpt := inOpt.WithDir(outDir).WithValueDir(outDir).
	WithCompression(options.None).WithReadOnly(false)
Allow the user to specify a different compression algorithm. You can have 3 values: 0 to disable, 1 for snappy, and 2 for zstd.
This would allow us to switch/remove compression from a badger directory.
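A sketch of the 0/1/2 mapping described above, with validation for out-of-range values. To keep the example self-contained, `compressionType` and its constants are local stand-ins for Badger's `options.CompressionType` values, and `parseCompression` is a hypothetical helper name.

```go
package main

import "fmt"

// compressionType is a local stand-in for Badger's options.CompressionType,
// defined here only so the example compiles without external packages.
type compressionType int

const (
	compressionNone   compressionType = iota // 0: compression disabled
	compressionSnappy                        // 1: snappy
	compressionZSTD                          // 2: zstd
)

// parseCompression is a hypothetical helper that validates the numeric flag
// value and converts it to a compression type.
func parseCompression(v uint32) (compressionType, error) {
	switch v {
	case 0:
		return compressionNone, nil
	case 1:
		return compressionSnappy, nil
	case 2:
		return compressionZSTD, nil
	default:
		return compressionNone, fmt.Errorf(
			"invalid compression type %d: want 0 (none), 1 (snappy) or 2 (zstd)", v)
	}
}

func main() {
	for _, v := range []uint32{0, 1, 2, 3} {
		c, err := parseCompression(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("flag %d -> compression type %d\n", v, c)
	}
}
```

Rejecting unknown values up front avoids silently writing the output DB with an unintended compression setting.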
badger/cmd/stream.go, line 65 at r1 (raw file):
}
defer inDB.Close()
return inDB.StreamDB(outDir, outOpt)
We don't need the outDir. It is set in the outOpt.
martinmr
left a comment
Reviewable status: 0 of 3 files reviewed, 11 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
db.go, line 1755 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
You don't need outDir. outOptions already contains the outDir.
Done.
db.go, line 1756 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
We can remove this as well.
Done.
db_test.go, line 385 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Compression is disabled by default. You should enable it here so that we can disable it while streaming out.
Done.
db_test.go, line 403 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
You can remove WithCompression and WithReadOnly. They're set correctly by default.
Done.
db_test.go, line 415 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
This should be outDB. Inserted into db, read from outDB.
Or you could do a Get on both DBs and compare the values.
Done.
badger/cmd/stream.go, line 32 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Mention over here that outDir should be empty.
The stream writer will drop all data if the directory already contains data.
You can add two checks as well:
- If the outDir exists, ensure it is empty.
- If the outDir exists and it is non-empty, abort.
We shouldn't modify an existing Badger DB. If there is data in outDir, the user can use a different directory or clean up the outDir.
Done.
badger/cmd/stream.go, line 42 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Path to output DB
Done.
badger/cmd/stream.go, line 45 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Option to open input DB in read-only mode.
Done.
badger/cmd/stream.go, line 54 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add a flag for NumVersionsToKeep. Use math.MaxInt32 value if the flag is set to 0 (mention this in the flag description).
Done.
badger/cmd/stream.go, line 58 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Allow the user to specify a different compression algorithm. You can have 3 values: 0 to disable, 1 for snappy, and 2 for zstd.
This would allow us to switch/remove compression from a badger directory.
Done.
badger/cmd/stream.go, line 65 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
We don't need the outDir. It is set in the outOpt.
Done.
jarifibrahim
left a comment
Reviewable status: 0 of 3 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @martinmr)
db.go, line 1756 at r2 (raw file):
// created in outDir.
func (db *DB) StreamDB(outOptions Options) error {
	if outOptions.Dir != outOptions.ValueDir {
We don't need them to match. Users can specify different output directories for the vlog and the SSTs.
martinmr
left a comment
Reviewable status: 0 of 3 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
db.go, line 1756 at r2 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
We don't need them to match. Users can specify different output directories for the vlog and the SSTs.
Done.
3ac07a9 to 9a7f62d
(cherry picked from commit dc653b0)
For now this tool streams the contents into another DB with compression turned off.