*: Introduce read triggered compaction heuristic #1009
aadityasondhi merged 1 commit into cockroachdb:master from
Conversation
itsbilal
left a comment
There was a problem hiding this comment.
Overall some really exciting work! I left some comments after a first pass that you can work on in the meantime. One thing in addition to those: we're going to need tests for this functionality, and the way I see it, there are two moving pieces that need to be individually unit tested: 1) pickReadTriggeredCompaction, and 2) the read sampling in Iterator.
For the compaction picking side, it would be a good idea to spin up a datadriven test that lets you define a couple d.mu.readCompactions scenarios (some with files moved around, some with the simple case) and confirm if pickReadTriggeredCompaction still picks the sensible top file in each case.
The sampling side will be a bit harder, but I can imagine some tests (datadriven or not) that spin up an iterator, reduce AllowedSeeks to 1 for a file at the top, really increase the sampling rate, then confirm that one read puts that file in readCompactions. Feel free to look around the existing iterator tests to see if you can draw some inspiration for something like this, and let me know if you need more help with this part.
Reviewed 15 of 15 files at r1.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @aadityasondhi, @jbowens, @petermattis, and @sumeerbhola)
compaction.go, line 1064 at r1 (raw file):
} type readCompaction struct {
Would be a good idea to add a comment here on why readCompactions are based off of key ranges and not file handles (as the version could change between the sampling and the scheduling).
compaction.go, line 1210 at r1 (raw file):
} func (d *DB) checkFlushThreshold() bool {
Nit: this can be passedFlushThreshold to clarify what the return value is.
compaction.go, line 1540 at r1 (raw file):
return } for len(d.mu.compact.readCompactions) > 0 && d.mu.compact.compactingCount < d.opts.MaxConcurrentCompactions && !d.mu.compact.flushing {
All of this should be inside the compaction picker, instead of in this db-level function. See the case of pickElisionOnlyCompaction inside compactionPickerByScore.pickAuto; I imagine pickReadTriggeredCompaction will just be another case like pickElisionOnlyCompaction at the end of that method. You can do everything except for the actual c := newCompaction(...); d.addInProgressCompaction(c); go d.compact() stuff in that method.
The advantage of this approach is that it separates compaction picking code from compaction executing code.
compaction_picker.go, line 1421 at r1 (raw file):
) (slice manifest.LevelSlice, shouldCompact bool) { numOverlap, topLevel := 0, 0 var topOverlapping manifest.LevelSlice
Nit: this can be topOverlaps for better connection with the method that generated it.
compaction_picker.go, line 1434 at r1 (raw file):
} if numOverlap >= 2 { shouldCompact = true
Nit: unnecessary
db.go, line 313 at r1 (raw file):
// inProgress is the set of in-progress flushes and compactions. inProgress map[*compaction]struct{} // readCompactions is a list read triggered compactions. The next compaction
s/list/list of/
iterator.go, line 177 at r1 (raw file):
} // Approximate gap in bytes between samples of data read during iteration. readBytesPeriod := 1048576 * i.readState.db.opts.Experimental.ReadSamplingRate
The 1048576 value should be in a const defined at the top. ReadSamplingRate can probably be renamed to ReadSamplingMultiplier.
iterator.go, line 197 at r1 (raw file):
func (i *Iterator) sampleRead() { topFile := i.readSampling.topFile topLevel, numOverlappingLevels := numLevels+1, 0
numLevels is already an invalid level number (7), so you could just set it to that here, and check if topLevel >= numLevels below.
iterator.go, line 200 at r1 (raw file):
if mi, ok := i.iter.(*mergingIter); ok { if len(mi.levels) > 1 { for _, iter := range mi.levels {
This code block looks like a pyramid. Might be a good idea to add a forEachLevelIter method to mergingIter that takes in a func(li *levelIter), and you define the stuff in the ok block below in a separate closure that you pass into that method.
iterator.go, line 205 at r1 (raw file):
file := li.files.Current() if file != nil { numOverlappingLevels++
You can probably stop this loop right here if numOverlappingLevels has exceeded 2. Quick optimization. I believe mi.levels is organized top-to-bottom so you should be able to do this no problem, but you can confirm that.
options.go, line 336 at r1 (raw file):
// ReadCompactionThreshold controls the frequency of read triggered // compactions by adjusting FileMetadata.AllowedSeeks.
You could mention the formula used to set AllowedSeeks here. Mostly to give the intuition that a higher threshold = more read compactions happening, as allowedSeeks would be lower. In that case, it seems less like a read compaction threshold and more like a read compaction rate, as the units are "number of bytes of file size per allowed seek".
internal/manifest/level.go, line 27 at r1 (raw file):
// LevelInt returns the int representation of a Level func LevelInt(l Level) int {
This sounds like a type in itself. Could probably be renamed to LevelToInt.
internal/manifest/version.go, line 97 at r1 (raw file):
IsIntraL0Compacting bool Atomic struct { AllowedSeeks int64
Exported vars usually have a comment.
972300e to
3ed701d
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
Reviewable status: 2 of 16 files reviewed, 9 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
compaction.go, line 1540 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
All of this should be inside the compaction picker, instead of in this db-level function. See the case of pickElisionOnlyCompaction inside compactionPickerByScore.pickAuto; I imagine pickReadTriggeredCompaction will just be another case like pickElisionOnlyCompaction at the end of that method. You can do everything except for the actual c := newCompaction(...); d.addInProgressCompaction(c); go d.compact() stuff in that method.
The advantage of this approach is that it separates compaction picking code from compaction executing code.
Done.
db.go, line 313 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
s/list/list of/
Done.
iterator.go, line 177 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
The 1048576 value should be in a const defined at the top. ReadSamplingRate can probably be renamed to ReadSamplingMultiplier.
Done.
iterator.go, line 197 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
numLevels is already an invalid level number (7), so you could just set it to that here, and check if topLevel >= numLevels below.
Done.
iterator.go, line 200 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This code block looks like a pyramid. Might be a good idea to add a forEachLevelIter method to mergingIter that takes in a func(li *levelIter), and you define the stuff in the ok block below in a separate closure that you pass into that method.
Done.
iterator.go, line 205 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
You can probably stop this loop right here if numOverlappingLevels has exceeded 2. Quick optimization. I believe mi.levels is organized top-to-bottom so you should be able to do this no problem, but you can confirm that.
Done.
options.go, line 336 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
You could mention the formula used to set AllowedSeeks here. Mostly to give the intuition that a higher threshold = more read compactions happening, as allowedSeeks would be lower. In that case, it seems less like a read compaction threshold and more like a read compaction rate., as the units are "number of bytes of file size per allowed seek"
Done.
internal/manifest/level.go, line 27 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This sounds like a type in itself. Could probably be renamed to LevelToInt.
Done.
internal/manifest/version.go, line 97 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Exported vars usually have a comment.
Done.
itsbilal
left a comment
There was a problem hiding this comment.
I like how this is taking shape! Some more, relatively minor comments that you can get to soon enough. The tests are really the only remaining major piece; after that, this should be good.
Reviewed 14 of 14 files at r2.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @aadityasondhi, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker.go, line 1338 at r2 (raw file):
) (pc *pickedCompaction) { if env.readTriggeredCompactionEnv.flushing || env.readTriggeredCompactionEnv.readCompactions == nil { return nil
Worthwhile to add a comment here on why we don't schedule read compactions if flushing is true.
compaction_picker.go, line 1349 at r2 (raw file):
if pc != nil { if pc.startLevel.level == 0 && pc.outputLevel.level == 0 { panic("intra L0 = BAD!")
I assume this was a debug statement, ha. Either way, newPickedCompaction takes care of this (it only creates non-intra-L0 compactions) so you're good.
iterator.go, line 187 at r2 (raw file):
} if i.readSampling.randvar == nil { i.readSampling.randvar = randvar.NewUniform(0, 2*samplingPeriod)
Any reason to not roll this 2* multiplier into the sampling period itself, or in the default for ReadSamplingMultiplier?
iterator.go, line 201 at r2 (raw file):
topLevel, numOverlappingLevels := numLevels, 0 if mi, ok := i.iter.(*mergingIter); ok { if len(mi.levels) > 1 {
This is unnecessary, as your closure would simply not be executed if for any reason there are no levels.
iterator.go, line 208 at r2 (raw file):
numOverlappingLevels++ if numOverlappingLevels >= 2 { // terminate the loop early if at least 2 overlapping levels are found
Nit: Comments in Pebble are usually capitalized like full sentences and with periods at the end. (Same for the other one below).
version_set.go, line 410 at r2 (raw file):
var err error newVersion, zombies, err = bve.Apply(currentVersion, vs.cmp, vs.opts.Comparer.FormatKey, vs.opts.FlushSplitBytes, vs.opts.Experimental.ReadCompactionRate)
This function signature is getting very long. This doesn't need to be done in this PR, but it would be worthwhile to have an applyOptions struct, defined in the same package as Apply (so internal/manifest), that takes all of these options.
internal/manifest/version.go, line 99 at r2 (raw file):
Atomic struct { // AllowedSeeks is used to determine if a file should be picked for // a read triggered compaction. It is decremented when read sampling
Mention when it's decremented - after every positioning operation that returns a user key (eg. Next, Prev, SeekGE, SeekLT, etc).
488a28a to
82e58b4
Compare
67f1556 to
19cbfd1
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
Reviewable status: 12 of 19 files reviewed, 5 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker.go, line 1349 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I assume this was a debug statement, ha. Either way, newPickedCompaction takes care of this (it only creates non-intra-L0 compactions) so you're good.
Done.
iterator.go, line 187 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Any reason to not roll this 2* multiplier into the sampling period itself, or in the default for ReadSamplingMultiplier?
This is because the random variable is a uniform distribution and we want the expected value to be the constant we set. We get that by multiplying by 2. LevelDB also does something similar here: https://github.com/google/leveldb/blob/master/db/db_iter.cc#L106
iterator.go, line 201 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This is unnecessary, as your closure would simply not be executed if for any reason there are no levels.
This was intended to be a bit of a micro-optimization. I am checking for at least 2 levels here so that there is a chance there is an overlap. If there is only one level then my loop underneath would run but there is no chance I find an overlap. Maybe I am just overthinking. Let me know if this makes sense, otherwise happy to remove.
version_set.go, line 410 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This function signature is getting very long. This doesn't need to be done in this PR, but it would be worthwhile to have an applyOptions struct, defined in the same package as Apply (so internal/manifest), that takes all of these options.
Going to defer this to later in a separate PR.
internal/manifest/version.go, line 99 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Mention when it's decremented - after every positioning operation that returns a user key (eg. Next, Prev, SeekGE, SeekLT, etc).
Done.
itsbilal
left a comment
There was a problem hiding this comment.
Great work on tests! I think the non-test code is in very good shape, save for the one edge case we discussed this morning (and I see you have a TODO for that already).
For some of the tests I'm not entirely convinced if maybe-compact is actually doing read compactions or if it's doing score-based compactions that happen to be what you expect, so I left some suggestions around confirming this and messing with file sizes to quieten down score-based compactions.
Reviewed 8 of 8 files at r4.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @aadityasondhi, @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker.go, line 1338 at r4 (raw file):
) (pc *pickedCompaction) { // If a flush is in-progress or expected to happen soon, it means more writes are taking place. We would // then be scheduling more write focussed compactions. In this case, skip read compactions as they are
Nit: We would "soon" be scheduling more write-based compactions.
compaction_picker.go, line 1340 at r4 (raw file):
// then be scheduling more write focussed compactions. In this case, skip read compactions as they are // lower priority. if env.readTriggeredCompactionEnv.flushing || env.readTriggeredCompactionEnv.readCompactions == nil {
len(env.readTriggeredCompactionEnv.readCompactions) == 0 is slightly more general than the nil check and more performant.
Also readTriggeredCompactionEnv seems like a very long variable name. I'd be open to just readCompactionEnv or rtCompactionEnv.
compaction_test.go, line 1705 at r4 (raw file):
case "true": d.mu.compact.flushing = true case "false":
Nit: can remove this, as default would capture it too.
compaction_test.go, line 1718 at r4 (raw file):
parts := strings.Split(line, " ") if len(parts) != 2 { panic("malformed data for set-read-compaction")
Might be better to just return an error string here, like "
compaction_test.go, line 1724 at r4 (raw file):
rc := readCompaction{ level: l, start: []byte(keys[0]),
I think you'd have to run this by ParseInternalKey, and also use the a.SET.5 format instead of a#5,SET.
compaction_test.go, line 1738 at r4 (raw file):
d.mu.Lock() d.opts.private.disableAutomaticCompactions = false d.maybeScheduleCompaction()
You might be scheduling non-read compactions here by mistake, if you aren't defining enough files in other levels to have a proper LSM shape. It would be worth setting the size parameter in your tests so you can ensure that isn't happening. You can also add a show-read-compactions command that just shows that queue, and if it's empty after this call, you can confirm that the read compaction you added earlier was taken up.
iterator.go, line 201 at r2 (raw file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
This was intended to be a bit of a micro-optimization. I am checking for at least 2 levels here so that there is a chance there is an overlap. If there is only one level then my loop underneath would run but there is no chance I find an overlap. Maybe I am just overthinking. Let me know if this makes sense, otherwise happy to remove.
Sounds good.
iterator.go, line 191 at r4 (raw file):
} if i.readSampling.randvar == nil { i.readSampling.randvar = randvar.NewUniform(0, 2*samplingPeriod)
Having a lot of conditionals can also be bad for performance. It might be a good idea to merge this with the above conditional, as you should never have any case where rand == nil but randVar != nil or vice versa. Just do this initialization inside the same if statement as above.
iterator.go, line 209 at r4 (raw file):
l := manifest.LevelToInt(li.level) if file := li.files.Current(); file != nil { if containsKey := i.cmp(file.Smallest.UserKey, i.key) <= 0 && i.cmp(file.Largest.UserKey, i.key) >= 0; containsKey {
Full-key comparisons can actually be somewhat slow. There's one optimization you can make here; you need to only check i.cmp(file.Smallest.UserKey, i.key) <= 0 if i.dir == iterPosNext || i.dir == iterPosCurForward, and i.cmp(file.Largest.UserKey, i.key) >= 0 if i.dir == iterPosPrev || i.dir == iterPosCurReverse.
Also containsKey var is unnecessary. But if this line gets really long after the suggested change, you can break that var definition out to be before the if statement, and then just do if containsKey.
iterator_test.go, line 536 at r4 (raw file):
return err.Error() } d.opts.private.forceReadSampling = true
As mentioned offline, putting this in the iterator may be cleaner. You can do that down in the iter case.
iterator_test.go, line 589 at r4 (raw file):
return runIterCmd(td, iter) case "read-compaction":
Nit: read-compactions, the plural
iterator_test.go, line 595 at r4 (raw file):
d.mu.Lock() s := fmt.Sprintf("%+v", d.mu.compact.readCompactions)
While this works, it's neater to print one read compaction per line, in a more human-friendly format.
iterator_test.go, line 599 at r4 (raw file):
return s case "pending-read-compaction":
Nit: iter-read-compactions may be a bit more intuitive of a name.
options.go, line 342 at r4 (raw file):
// // TODO(aadityasondhi): Experiment with this to find a good value ReadSamplingMultiplier float64
This can become a uint64 again now that you don't need to pass in fractions for tests.
testdata/compaction_read_triggered, line 17 at r4 (raw file):
---- maybe-compact
Add a read-compactions call that shows the state of that queue before and after each of these maybe-compact calls. Otherwise tests look good.
compaction_test.go, line 1718 at r4 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
Oops, incomplete comment. I meant like
19cbfd1 to
daa62ea
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
TFTR! I made almost all of the changes you recommended. Had a couple of comments/questions below.
Reviewable status: 11 of 19 files reviewed, 13 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker.go, line 1340 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
len(env.readTriggeredCompactionEnv.readCompactions) == 0 is slightly more general than the nil check and more performant.
Also readTriggeredCompactionEnv seems like a very long variable name. I'd be open to just readCompactionEnv or rtCompactionEnv.
env.readTriggeredCompactionEnv.readCompactions is a pointer to the list so I am doing a nil check. I think it would panic if I just check for len() without making sure it's not nil first.
I renamed it though. I agree it is a little too long.
compaction_test.go, line 1718 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Oops, incomplete comment. I meant like
usage: <level>: start.SET.1-end.SET.2
Done.
compaction_test.go, line 1724 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I think you'd have to run this by ParseInternalKey, and also use the a.SET.5 format instead of a#5,SET.
Done.
compaction_test.go, line 1738 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
You might be scheduling non-read compactions here by mistake, if you aren't defining enough files in other levels to have a proper LSM shape. It would be worth setting the size parameter in your tests so you can ensure that isn't happening. You can also add a show-read-compactions command that just shows that queue, and if it's empty after this call, you can confirm that the read compaction you added earlier was taken up.
I added the show-read-compactions command. By size, are you referring to FileMetadata.Size?
iterator.go, line 191 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Having a lot of conditionals can also be bad for performance. It might be a good idea to merge this with the above conditional, as you should never have any case where rand == nil but randVar != nil or vice versa. Just do this initialization inside the same if statement as above.
Put it in with a || to still check both. Let me know if we would rather just only check one. I feel a little uneasy not doing the null check for both even if there isn't a case where it could happen.
iterator.go, line 209 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Full-key comparisons can actually be somewhat slow. There's one optimization you can make here; you need to only check i.cmp(file.Smallest.UserKey, i.key) <= 0 if i.dir == iterPosNext || i.dir == iterPosCurForward, and i.cmp(file.Largest.UserKey, i.key) >= 0 if i.dir == iterPosPrev || i.dir == iterPosCurReverse.
Also containsKey var is unnecessary. But if this line gets really long after the suggested change, you can break that var definition out to be before the if statement, and then just do if containsKey.
That's true, thanks for pointing it out.
iterator_test.go, line 536 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
As mentioned offline, putting this in the iterator may be cleaner. You can do that down in the iter case.
Done.
iterator_test.go, line 595 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
While this works, it's neater to print one read compaction per line, in a more human-friendly format.
Done.
options.go, line 342 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This can become a uint64 again now that you don't need to pass in fractions for tests.
Done.
testdata/compaction_read_triggered, line 17 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Add a read-compactions call that shows the state of that queue before and after each of these maybe-compact calls. Otherwise tests look good.
Done.
daa62ea to
047010c
Compare
itsbilal
left a comment
There was a problem hiding this comment.
Great stuff! I imagine this is the last round of suggested fixes, and then this is good to go. The sampling tests look good. Only major suggestion is to update the way define works in the read compaction picking tests, and to test / highlight two other cases there.
Non-test code looks good to merge.
Reviewed 10 of 10 files at r5.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @aadityasondhi, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker.go, line 1340 at r4 (raw file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
env.readTriggeredCompactionEnv.readCompactions is a pointer to the list so I am doing a nil check. I think it would panic if I just check for len() without making sure it's not nil first. I renamed it though. I agree it is a little too long.
If you do len(nil) it returns 0 instead of panicking as nil is a valid 0-length slice. You can try it in Go Playground. That's why it's preferable; it handles both the nil case as well as the empty slice case.
compaction_test.go, line 1738 at r4 (raw file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
I added the show-read-compactions command. By size, are you referring to FileMetadata.Size?
Yep, I meant FileMetadata.Size. I thought runDbDefineCmd would let you pass in file sizes for each file but apparently not. It's more work than I guessed at first (but also not significantly so). Basically, runDBDefineCmd lets you pass in individual keys for each level, that it then flushes directly to the level you specify, using some custom flushing/level-setting logic.
If you ctrl+f for "parseMeta" in compaction_test.go or compaction_picker_test.go, you'll see some tests that use a custom parseMeta function that allow you to directly specify files and sometimes file sizes. You may want to copy one of those define cases verbatim, such as the one in TestCompactionFindL0Limit, as it's closer to what you want. Once you have made that change, you can do something like this:
L5
100: a.SET.5-b.SET.6 size=10
L6
010: a.SET.0-b.SET.0 size=100
Now L6 and L5 have the right relative shape, so you're guaranteed to not get a score-based compaction in the tests.
compaction_test.go, line 1697 at r5 (raw file):
return s case "set-read-compaction":
Nit: set implies you're clearing it first, and then adding all the read compactions you've specified. Since you're doing an append here, it makes sense to rename it to add-read-compaction.
compaction_test.go, line 1723 at r5 (raw file):
rc := readCompaction{ level: l, start: base.ParseInternalKey(keys[0]).UserKey,
Ah oops, I didn't realize you were just taking the UserKey anyway. Then you can go back to the old []byte() case, but ensure you don't specify sequence number and kind in the tests. I.e. it'd just be
add-read-compaction
a-b
iterator.go, line 191 at r4 (raw file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
Put it in with a || to still check both. Let me know if we would rather just only check one. I feel a little uneasy not doing the null check for both even if there isn't a case where it could happen.
Usually we're often okay with panicking if one is set but not the other, as it'd highlight an issue more quickly. But I'm also okay with a defensive programming solution like you've done, so it's okay.
iterator_test.go, line 533 at r5 (raw file):
var err error if d, err = runDBDefineCmd(td, opts); err != nil {
BTW, this use of runDBDefineCmd is okay as this test isn't influenced by file sizes / bounds too much.
testdata/compaction_read_triggered, line 6 at r5 (raw file):
a.SET.55:a b.SET.5:b L6 a.SET.55:a b.SET.5:b
I would also add a simple variant of this test case after you've made the define changes, where L5 and L6 have another file beside the a-b one (eg. e-f), that stays unaffected throughout the whole thing. Or you could just update this test case to have that second file in both levels, up to you.
testdata/compaction_read_triggered, line 41 at r5 (raw file):
Nit: too many blank lines (and the same thing below). The large comment is enough of a test case delimiter.
testdata/compaction_read_triggered, line 175 at r5 (raw file):
---- 6: 000006:[a#0,SET-b#0,SET]
I would add one more test case (after you make the define changes): where L0 has a file in the key range, and so does L5 and L6. And you add-read-compaction the range but in L4. Your logic should compact the file in L0. To make things interesting you can also use slightly different ranges; the L0 file can be a-b and size 10, the L5 file a-d and size 100, and L6 a-e and size 1000. And you can add-read-compaction L4 a-c and see what happens.
Otherwise this file looks great.
testdata/iterator_read_sampling, line 169 at r5 (raw file):
# EDGE CASE (TODO(aaditya): Investigate the performance impact of the fix for this)
I'd move this TODO to sampleRead for higher visibility, and if you think you won't be able to get to it by the end of your internship, you would also want to file an issue about this missed case in read compactions. I'd also expand the comment below with a short sentence to explain why sampling skips that file; it's because the iterator has already moved past it.
compaction_picker.go, line 1340 at r4 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
Oops nevermind, I missed the fact that it's a pointer to a slice. This is good then.
047010c to
9486b4c
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
TFTR! I made most of the changes recommended. I took a slightly different approach to address the missing test cases which I have outlined in the comments below. I also found a potential bug in the logic that checks for a file picked for read compaction that has been compacted by another compaction. I will be investigating it.
Reviewable status: 15 of 21 files reviewed, 5 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
compaction_test.go, line 1738 at r4 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Yep, I meant FileMetadata.Size. I thought runDbDefineCmd would let you pass in file sizes for each file but apparently not. It's more work than I guessed at first (but also not significantly so). Basically, runDBDefineCmd lets you pass in individual keys for each level, that it then flushes directly to the level you specify, using some custom flushing/level-setting logic.
If you ctrl+f for "parseMeta" in compaction_test.go or compaction_picker_test.go, you'll see some tests that use a custom parseMeta function that allow you to directly specify files and sometimes file sizes. You may want to copy one of those define cases verbatim, such as the one in TestCompactionFindL0Limit, as it's closer to what you want. Once you have made that change, you can do something like this:
L5
100: a.SET.5-b.SET.6 size=10
L6
010: a.SET.0-b.SET.0 size=100
Now L6 and L5 have the right relative shape, so you're guaranteed to not get a score-based compaction in the tests.
I added the tests but took a slightly different approach that made more sense to me. I added these tests in compaction_picker_test.go instead so that I could preserve testing the db related logic that included setting flush and copying a pointer for db.mu.compact.readCompactions into compactionEnv inside compaction_test.go. Open for suggestions if you know of a better way to split these up.
compaction_test.go, line 1723 at r5 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Ah oops, I didn't realize you were just taking the UserKey anyway. Then you can go back to the old []byte() case, but ensure you don't specify sequence number and kind in the tests. I.e. it'd just be
add-read-compaction
a-b
Done.
testdata/compaction_read_triggered, line 6 at r5 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I would also add a simple variant of this test case after you've made the define changes, where L5 and L6 have another file beside the a-b one (eg. e-f), that stays unaffected throughout the whole thing. Or you could just update this test case to have that second file in both levels, up to you.
Same as previous. Added to tests in compaction_picker_test.go.
testdata/compaction_read_triggered, line 175 at r5 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I would add one more test case (after you make the define changes): where L0 has a file in the key range, and so does L5 and L6. And you add-read-compaction the range but in L4. Your logic should compact the file in L0. To make things interesting you can also use slightly different ranges; the L0 file can be a-b and size 10, the L5 file a-d and size 100, and L6 a-e and size 1000. And you can add-read-compaction L4 a-c and see what happens.
Otherwise this file looks great.
Added the test in compaction_test.go. Interestingly enough, it highlights a bug in the code where base level is being set to 1 even though L1-L4 are empty. I will look into it.
testdata/iterator_read_sampling, line 169 at r5 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I'd move this TODO to sampleRead for higher visibility, and if you think you won't be able to get to it by the end of your internship, you would also want to file an issue about this missed case in read compactions. I'd also expand the comment below with a short sentence to explain why sampling skips that file; it's because the iterator has already moved past it.
Moved it to the code. I still plan to get to it but in case I do not, I will file an issue for this.
9486b4c to
d5d9287
Compare
aadityasondhi
left a comment
Fixed the "bug" in the most recent revision. As discussed offline, it was just a test setup issue.
Reviewable status: 15 of 22 files reviewed, 5 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
itsbilal
left a comment
Great work! This is good to merge. Just a couple of minor changes and then you can merge as-is without waiting for a re-review from me.
One more thing: Update the commit message and PR description to say "Fixes #29", so that that issue gets closed when you merge this PR.
Reviewed 7 of 7 files at r6.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @aadityasondhi, @jbowens, @petermattis, and @sumeerbhola)
compaction_picker_test.go, line 902 at r6 (raw file):
parts = strings.Split(fields[0], "-")
if len(parts) != 2 {
	return nil, errors.Errorf("malformed table spec: %s", s)
}
You might want to add usage: `<file-num>: start.SET.1-end.SET.2` as that makes this all more readable.
data_test.go, line 398 at r6 (raw file):
}
d.mu.Lock()
if baseLevel < 0 {
I'd remove this condition entirely (i.e. do nothing if `baseLevel` is negative), and leave `dynamicBaseLevel` at its default of `true`.
data_test.go, line 401 at r6 (raw file):
	d.mu.versions.dynamicBaseLevel = false
} else {
	d.mu.versions.dynamicBaseLevel = true
This should be `false` if you're manually setting a `baseLevel`.
testdata/compaction_picker_read_triggered, line 19 at r6 (raw file):
000011:[g#1,SET-l#2,SET]
add-read-compaction
I would add a `pick-auto` before this that doesn't pick anything, just to make it very clear that this is working as intended.
testdata/compaction_read_triggered, line 84 at r6 (raw file):
# Test case where there is a mismatch in the level of the chosen read
# compaction and the current version, but the correct target file is in L0.
# PickReadTriggeredCompaction should be able to handle this and pick the L0
# file that overlaps the same key range to compact.
I think you should move this test to `compaction_picker_read_triggered`. Two benefits: 1) it tests this "complex case" in that test file as well, and 2) you don't need to mess with `base-level`, as (I believe) that code path sets it correctly.
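To make the "complex case" above concrete, here is a minimal, self-contained sketch of the picking behavior being discussed: the level recorded when the read was sampled (L4 here) may no longer match the current version, so the picker falls back to the top-most level whose files overlap the key range. All names (`file`, `pickTopMost`) and the string-comparison key model are illustrative assumptions, not Pebble's actual types or picker code.

```go
package main

import "fmt"

// file is a hypothetical stand-in for a table's key bounds.
type file struct{ start, end string }

// overlaps reports whether f's key range intersects [start, end].
func overlaps(f file, start, end string) bool {
	return f.start <= end && start <= f.end
}

// pickTopMost scans levels from L0 downward and returns the first
// level containing a file that overlaps the queued compaction's
// key range, mirroring the "pick the L0 file" expectation above.
func pickTopMost(levels [][]file, start, end string) (int, bool) {
	for lvl, files := range levels {
		for _, f := range files {
			if overlaps(f, start, end) {
				return lvl, true
			}
		}
	}
	return -1, false
}

func main() {
	// The scenario from the review: files in L0, L5, and L6, with a
	// read compaction queued for [a, c] against a stale level (L4).
	levels := make([][]file, 7)
	levels[0] = []file{{"a", "b"}} // size 10 in the example
	levels[5] = []file{{"a", "d"}} // size 100
	levels[6] = []file{{"a", "e"}} // size 1000
	if lvl, ok := pickTopMost(levels, "a", "c"); ok {
		fmt.Printf("compact file in L%d\n", lvl) // the L0 file wins
	}
}
```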
Exciting!!! 🎉
Previously, our compactions were triggered on write-based heuristics. This change introduces compactions based on high read activity to improve read performance in read-heavy workloads. It is inspired by LevelDB's read-based compaction heuristic, which is outlined in the following issue: cockroachdb#29.

These compactions are triggered using an `AllowedSeeks` parameter on each file's `FileMetadata`. Reads are sampled based on a rate defined in `iterator.MaybeSampleRead()`. If a read is sampled, `AllowedSeeks` is decremented on the file in the top-most level containing the key. Once `AllowedSeeks` reaches 0, a compaction for the key range is scheduled.

Read-triggered compactions are only considered if no other compaction is possible at the time. This helps prioritize score-based compactions to maintain a healthy LSM shape.

The results of this change in benchmarks are outlined in the GitHub issue: cockroachdb#29.

Fixes cockroachdb#29.
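The `AllowedSeeks` mechanism the commit message describes can be sketched as a seek budget per file: each sampled read spends one unit, and an exhausted budget queues a key-range compaction (a key range, not a file handle, since the version can change before the compaction runs). The types and names below (`fileMeta`, `sampleRead`) are simplified illustrations, not Pebble's actual implementation.

```go
package main

import "fmt"

// fileMeta is a simplified stand-in for a file's metadata: its key
// bounds plus the remaining seek budget.
type fileMeta struct {
	smallest, largest string
	allowedSeeks      int
}

// readCompaction records a queued compaction by key range and the
// level observed at sampling time.
type readCompaction struct {
	level      int
	start, end string
}

// sampleRead models a sampled read landing on the top-most file
// containing the key: it decrements the file's seek budget and, once
// the budget is exhausted, returns a compaction for the key range.
func sampleRead(f *fileMeta, level int) *readCompaction {
	f.allowedSeeks--
	if f.allowedSeeks > 0 {
		return nil
	}
	return &readCompaction{level: level, start: f.smallest, end: f.largest}
}

func main() {
	f := &fileMeta{smallest: "a", largest: "b", allowedSeeks: 2}
	if rc := sampleRead(f, 5); rc == nil {
		fmt.Println("budget remaining, no compaction yet")
	}
	if rc := sampleRead(f, 5); rc != nil {
		fmt.Printf("schedule read compaction of L%d [%s, %s]\n",
			rc.level, rc.start, rc.end)
	}
}
```

In the real heuristic the queued ranges would additionally be deferred until no score-based compaction is runnable, matching the prioritization described above.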
Force-pushed from d5d9287 to 27b322c.
aadityasondhi
left a comment
Reviewable status: 18 of 21 files reviewed, 4 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
data_test.go, line 398 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I'd remove this condition entirely (i.e. do nothing if `baseLevel` is negative), and leave `dynamicBaseLevel` at its default of `true`.
Removed based on the comment below.
data_test.go, line 401 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This should be `false` if you're manually setting a `baseLevel`.
Removed based on the comment below.
testdata/compaction_picker_read_triggered, line 19 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I would add a `pick-auto` before this that doesn't pick anything, just to make it very clear that this is working as intended.
Done.
testdata/compaction_read_triggered, line 84 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I think you should move this test to `compaction_picker_read_triggered`. Two benefits: 1) it tests this "complex case" in that test file as well, and 2) you don't need to mess with `base-level`, as (I believe) that code path sets it correctly.
Done.
Previously, our compactions were triggered on write-based heuristics.
This change introduces compactions based on high read activity to improve
read performance in read-heavy workloads. It is inspired by LevelDB's
read-based compaction heuristic, which is outlined in the following issue:
#29.
These compactions are triggered using an `AllowedSeeks` parameter on
each file's `FileMetadata`. Reads are sampled based on a rate defined in
`iterator.MaybeSampleRead()`. If a read is sampled, `AllowedSeeks` is
decremented on the file in the top-most level containing the key. Once
`AllowedSeeks` reaches 0, a compaction for the key range is scheduled.
Read-triggered compactions are only considered if no other compaction is
possible at the time. This helps prioritize score-based compactions to
maintain a healthy LSM shape.
The results of this change in benchmarks are outlined in the GitHub
issue: #29.
Fixes #29.