backupccl: fix key rewriter race in generative split and scatter processor#96313
backupccl: fix key rewriter race in generative split and scatter processor#96313craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
| return nil, err | ||
| } | ||
| numSplitWorkers := int(spec.NumNodes) | ||
| numSecondSplitWorkers := 2 * numSplitWorkers |
There was a problem hiding this comment.
can we add a comment here about why we chose 2 * numSplitWorkers? I realize this is probably not your doing but a high level reason explaining the factor of 2 would be useful.
There was a problem hiding this comment.
done, I can't really find out why exactly the factor of 2 was chosen but I edited the TODO to perhaps tune it in the future.
| @@ -342,8 +360,8 @@ func runGenerativeSplitAndScatter( | |||
|
|
|||
| // TODO(pbardea): This tries to cover for a bad scatter by having 2 * the | |||
There was a problem hiding this comment.
can we move this TODO with the 2* and think about whether it is still applicable? If we want to punt the thinking for later can we add your or my name in the TODO 😋
There was a problem hiding this comment.
I moved the TODO to where the factor of 2 is and put my name on it.
| // scatterers contain the splitAndScatterers for the first group of split and | ||
| // scatter workers. Each worker needs its own scatterer as one cannot be used | ||
| // concurrently. | ||
| scatterers []splitAndScatterer |
There was a problem hiding this comment.
What do you think about calling this one chunkSplitAndScatterer and in the comment indicating that these are used to split and scatter importSpanChunks?
Then the one below can be called chunkEntrySplitAndScatterer or entrySplitAndScatterer? It doesn't perform scatters so maybe we mention that in the comment and describe what it does.
There was a problem hiding this comment.
done, renamed and added the comments.
| flowCtx: flowCtx, | ||
| spec: spec, | ||
| output: output, | ||
| scatterers: scatterers[:numSplitWorkers], |
There was a problem hiding this comment.
sanity checking that this creates a copy of the scatterers slice and doesn't just copy the pointer
There was a problem hiding this comment.
these do just copy the pointer but I changed the code to use two slices just so that we are not confused in the future.
| g := ctxgroup.WithContext(ctx) | ||
|
|
||
| splitWorkers := int(spec.NumNodes) | ||
| splitWorkers := len(scatterers) |
There was a problem hiding this comment.
nit: maybe we call this splitAndScatterWorkers since they do both?
| @@ -291,6 +308,7 @@ func runGenerativeSplitAndScatter( | |||
| importSpanChunksCh := make(chan scatteredChunk, splitWorkers*2) | |||
There was a problem hiding this comment.
I should've done this in your previous PR sorry, but do you think you could add a line or two above each goroutine in this method describing what each of them does?
There was a problem hiding this comment.
done, added comments.
cb7fb4a to
21dde30
Compare
|
you need a rebase to get past the failing DD tests, I ran into the same thing this morning. Since you're going to push anyways can I also bug you to change the |
…essor The generative split and scatter processor is currently causing tests to fail under race because there are many goroutines that are operating with the same splitAndScatterer, which cannot be used concurrently as the underlying key rewriter cannot be used concurrently. Modify the processor so that every worker that uses the splitAndScatterer now uses its own instance. Fixes: cockroachdb#95808 Release note: None
21dde30 to
100b2b9
Compare
|
bors r+ |
|
Build succeeded: |
The generative split and scatter processor is currently causing tests to fail under race because there are many goroutines that are operating with the same splitAndScatterer, which cannot be used concurrently as the underlying key rewriter cannot be used concurrently. Modify the processor so that every worker that uses the splitAndScatterer now uses its own instance.
Fixes: #95808
Release note: None