db: check for in-progress compactions during input expansion#927
Conversation
3961a98 to
122080c
Compare
itsbilal
left a comment
There was a problem hiding this comment.
Looks solid so far, great work! I'll review the test very shortly, but until I have a suggestion to get an additional isCompacting check without much additional work.
Reviewed 3 of 6 files at r1.
Reviewable status: 3 of 6 files reviewed, 4 unresolved discussions (waiting on @aadityasondhi, @itsbilal, and @jbowens)
compaction_picker.go, line 261 at r1 (raw file):
} grow0 := pc.version.Overlaps(pc.startLevel.level, pc.cmp, sm.UserKey, la.UserKey) grow0, isCompacting := expandToAtomicUnit(pc.cmp, grow0)
I just realized two interesting things here:
- We almost always call
expandToAtomicUnitafter callingOverlaps, which grows the compaction, but doesn't have this check - If one file in an atomic compaction unit is compacting, they all must be compacting, and your change already checks the boundary atomic compaction units for isCompacting but only if we grew the compaction on that side to "complete" the ACU.
If we return isCompacting = true even if we didn't grow the compaction, we are basically able to overlay an isCompacting check on top of Overlaps without significant additional work. The change for that would have to be in expandToAtomicUnits
compaction_picker.go, line 328 at r1 (raw file):
if inputs.Empty() { // Nothing to expand. return inputs, isCompacting
For better readability, it might be better to just return inputs, false here.
compaction_picker.go, line 349 at r1 (raw file):
// include prev in the compaction. if cur.Compacting {
Moving this block to the start of this loop would get you the isCompacting = true assignment even if you don't grow the compaction. See earlier comment on the motivation behind this.
compaction_picker.go, line 372 at r1 (raw file):
if cur.Compacting { isCompacting = true
Same point as earlier
122080c to
05f94a2
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
Reviewable status: 3 of 6 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @jbowens)
compaction_picker.go, line 261 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I just realized two interesting things here:
- We almost always call
expandToAtomicUnitafter callingOverlaps, which grows the compaction, but doesn't have this check- If one file in an atomic compaction unit is compacting, they all must be compacting, and your change already checks the boundary atomic compaction units for isCompacting but only if we grew the compaction on that side to "complete" the ACU.
If we return
isCompacting = trueeven if we didn't grow the compaction, we are basically able to overlay anisCompactingcheck on top ofOverlapswithout significant additional work. The change for that would have to be inexpandToAtomicUnits
Thanks for pointing that out! I will make the change.
compaction_picker.go, line 328 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
For better readability, it might be better to just
return inputs, falsehere.
Done.
compaction_picker.go, line 349 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Moving this block to the start of this loop would get you the
isCompacting = trueassignment even if you don't grow the compaction. See earlier comment on the motivation behind this.
Done.
compaction_picker.go, line 372 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Same point as earlier
Done.
itsbilal
left a comment
There was a problem hiding this comment.
Looks great! Just some more minor changes and then this should be good to go.
Reviewed 3 of 6 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @aadityasondhi and @jbowens)
compaction_picker.go, line 321 at r2 (raw file):
// tombstones will be truncated at "b" (the largest key in its atomic // compaction unit). In the scenario here, that could result in b#1 becoming // visible when it should be deleted.
It would be worthwhile to add a sentence or two (maybe in a new paragraph) to this comment to specify what isCompacting means. Just mention that it would be true if any of the boundary files / atomic compaction units are already compacting.
compaction_picker.go, line 375 at r2 (raw file):
}) return inputs, isCompacting
The cur.Compacting check would not happen if there's no next/prev file to start with, so you could also double it up with an isCompacting || inputIter.First().Compacting || inputIter.Last().Compacting here.
compaction_picker_test.go, line 573 at r1 (raw file):
opts.EnsureDefaults() parseMeta := func(s string) *fileMetadata { parts := strings.Split(s, ",")
Our convention in other files seems to be to use space to split filemeta-level tags like this, instead of commas.
compaction_picker_test.go, line 645 at r1 (raw file):
[]byte(d.CmdArgs[0].String()), []byte(d.CmdArgs[1].String())) var compactionFlag bool
Nit: s/compactionFlag/isCompacting/ (or isCompactingFlag)
compaction_picker_test.go, line 662 at r1 (raw file):
} if compactionFlag { fmt.Fprintf(&buf, "compacting\n")
Nit: s/compacting/is-compacting
testdata/compaction_setup_inputs, line 95 at r1 (raw file):
L2 a.SET.3-c.SET.4,compacting c.SET.3-d.SET.2
Since you're starting the compaction "rooted" at a, it would be a better idea to put the compacting flag on the next file.
And to ensure you return isCompacting = true even if you didn't grow the compaction but the start/end file is compacting, you could add a case similar to this one, with a compacting a-c file, and a d-e file instead of a c-d file. And to account for the case where there's just no next file but you still return isCompacting = true. you could keep the compacting a-c file and have no other files in L2.
jbowens
left a comment
There was a problem hiding this comment.
It might be outside the scope of this project, but there's a TODO in pickFile that is pretty related:
Lines 772 to 774 in 255915d
nit: we typically use imperative mood for the commit subject line. eg:
db: check for in-progress compactions during input expansion
https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages
The changes here look great!
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @aadityasondhi and @jbowens)
0942861 to
66464f6
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
Reviewable status: 3 of 6 files reviewed, 5 unresolved discussions (waiting on @aadityasondhi and @itsbilal)
a discussion (no related file):
@jbowens Thanks for pointing me to the commit message guide, I made the change. Re: the TODO item, I took a quick look and it seems like it is covered here:
Line 835 in 66464f6
compaction_picker.go, line 321 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
It would be worthwhile to add a sentence or two (maybe in a new paragraph) to this comment to specify what
isCompactingmeans. Just mention that it would be true if any of the boundary files / atomic compaction units are already compacting.
Will do.
compaction_picker.go, line 375 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
The
cur.Compactingcheck would not happen if there's no next/prev file to start with, so you could also double it up with anisCompacting || inputIter.First().Compacting || inputIter.Last().Compactinghere.
Nice catch, will make the change
compaction_picker_test.go, line 573 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Our convention in other files seems to be to use space to split filemeta-level tags like this, instead of commas.
Thanks, will do.
testdata/compaction_setup_inputs, line 95 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Since you're starting the compaction "rooted" at a, it would be a better idea to put the
compactingflag on the next file.And to ensure you return
isCompacting = trueeven if you didn't grow the compaction but the start/end file is compacting, you could add a case similar to this one, with a compactinga-cfile, and ad-efile instead of ac-dfile. And to account for the case where there's just no next file but you still returnisCompacting = true. you could keep the compactinga-cfile and have no other files in L2.
Yeah nice catch, will fix
jbowens
left a comment
There was a problem hiding this comment.
Reviewable status: 3 of 6 files reviewed, 5 unresolved discussions (waiting on @aadityasondhi and @itsbilal)
a discussion (no related file):
Previously, aadityasondhi (Aaditya Sondhi) wrote…
@jbowens Thanks for pointing me to the commit message guide, I made the change. Re: the TODO item, I took a quick look and it seems like it is covered here:
. I am not sure if I am missing something. I can remove the TODO if you think it is being satisfied.Line 835 in 66464f6
I think the unsatisfied bit is that we might pick a file that's not compacting but has an atomic compaction unit that includes a file that is compacting. We will discard the picked compaction when we recognize that it includes compacting inputs, but it's not ideal that we don't pursue any compaction at all.
To fully address the TODO, I think we'd need to check if the file's atomic compaction unit includes any compacting files. If it does, we'd want to pick the file with the next lowest ratio.
Now that I'm describing this, it does seem like a fair bit or work, so maybe it should be out-of-scope.
itsbilal
left a comment
There was a problem hiding this comment.
Reviewable status: 3 of 6 files reviewed, 5 unresolved discussions (waiting on @aadityasondhi and @itsbilal)
a discussion (no related file):
Previously, jbowens (Jackson Owens) wrote…
I think the unsatisfied bit is that we might pick a file that's not compacting but has an atomic compaction unit that includes a file that is compacting. We will discard the picked compaction when we recognize that it includes compacting inputs, but it's not ideal that we don't pursue any compaction at all.
To fully address the TODO, I think we'd need to check if the file's atomic compaction unit includes any compacting files. If it does, we'd want to pick the file with the next lowest ratio.
Now that I'm describing this, it does seem like a fair bit or work, so maybe it should be out-of-scope.
Yeah I think that's out-of-scope for this particular issue, even though that'd be the ideal setup. I think it would be better to file a follow-up issue (as merging this would close #644 ) to be able to go back after an isCompacting check fails, and choose the next best compaction file in the level instead of giving up on that level altogether. For now I think this is good to go.
itsbilal
left a comment
There was a problem hiding this comment.
Actually, here are some test suggestions, after the last change.
Reviewed 3 of 3 files at r3.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @aadityasondhi and @itsbilal)
compaction_picker_test.go, line 662 at r3 (raw file):
} if isCompacting { fmt.Fprintf(&buf, "is-compacting\n")
Don't add a trailing \n, datadriven will take care of it. And reviewable complains about double \n's at the end.
testdata/compaction_setup_inputs, line 109 at r3 (raw file):
a.SET.5-b.SET.6 L2 a.SET.3-c.SET.4 compacting
Would also be a good idea to add a negative case:
a.SET.3-c.SET.4
d.SET.3-e.SET.2 compacting
as well as the single-file case:
a.SET.3-c.SET.4 compacting
Currently, `pickedCompaction.setupInputs` does not consider `FileMetada.compacting` when expanding the input slice. This ignores concurrent in-progress compactions. This change lets `setupInputs` return early if there is an in-progress compaction in the files picked for compaction. This avoids wasteful expansion of compaction inputs. Fix cockroachdb#644.
66464f6 to
6e61e17
Compare
aadityasondhi
left a comment
There was a problem hiding this comment.
Reviewable status: 4 of 6 files reviewed, 7 unresolved discussions (waiting on @aadityasondhi and @itsbilal)
compaction_picker_test.go, line 662 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Don't add a trailing
\n, datadriven will take care of it. And reviewable complains about double \n's at the end.
Done.
testdata/compaction_setup_inputs, line 109 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Would also be a good idea to add a negative case:
a.SET.3-c.SET.4 d.SET.3-e.SET.2 compactingas well as the single-file case:
a.SET.3-c.SET.4 compacting
Done.
itsbilal
left a comment
There was a problem hiding this comment.
We don't use bors in the Pebble repo so feel free to just use the green button to merge this into Pebble. And the next time one of us do a "vendor bump" in Cockroach to pull in the latest Pebble changes, this change will land there.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @aadityasondhi)
Currently,
pickedCompaction.setupInputsdoes not considerFileMetada.compactingwhen expanding the input slice. This ignores concurrent in-progress compactions.This change lets
setupInputsreturn early if there is an in-progress compaction in the files picked for compaction. This avoids wasteful expansion of compaction inputs.Fix #644.