
kv/bulk: parallelize sending SSTs due to range bounds#79967

Merged
craig[bot] merged 2 commits into cockroachdb:master from dt:par-flush
Apr 22, 2022

Conversation

@dt
Contributor

@dt dt commented Apr 14, 2022

Previously, when the batcher determined it needed to finish one SST
and send it before starting another, it would wait for that send to
complete before moving on. When flushing a buffer that contained data
mapping to many ranges, this meant many serial flushes, e.g. flushing
512MB of data from a buffer whose keys were uniformly distributed over
a table split into 2000 ranges meant waiting for roughly 2000
sequential AddSSTable requests. When those requests were slow, for
example sometimes taking as much as 1s each or more, this became a
major bottleneck.

This change switches the batcher to send files that are ended due to a
range boundary asynchronously, queuing up the request to send and then
starting the next file while it sends, as long as memory capacity in the
monitor allows holding the extra file in memory (as these async sends
could result in using an entire extra buffer's worth of memory if they
all end up in-flight at once, which they easily could if the receivers
are queuing).

Addresses #79615

Release note (performance improvement): Bulk ingestion of unsorted data during IMPORT and schema changes uses a higher level of parallelism to send produced data to the storage layer.

@dt dt requested review from adityamaru and nvb April 14, 2022 20:26
@dt dt requested a review from a team as a code owner April 14, 2022 20:26
@cockroach-teamcity
Member

This change is Reviewable

The buffering adder took over making this split, so this was dead code.

Release note: none.
@dt dt force-pushed the par-flush branch 3 times, most recently from 30ba771 to 4b7b852 on April 21, 2022 12:53
@dt dt requested a review from a team April 21, 2022 12:53
@dt dt force-pushed the par-flush branch 2 times, most recently from f3da663 to 85318e5 on April 21, 2022 17:16
rowCounter storage.RowCounter

mu syncutil.Mutex
grp ctxgroup.Group
Contributor

Can we add a comment about which operations the group synchronizes, and when it is reset?

Contributor Author

I gave it a descriptive name instead. wdyt?

// rows written in the current batch.
rowCounter storage.RowCounter

mu syncutil.Mutex
Contributor

Can we wrap all the fields we want to lock/unlock in a:

mu struct {
    syncutil.Mutex
}

Contributor Author

Took a crack at this. It is a little tricky since we read some of these fields in BufferingAdder, but that happens after a call to Flush(), which calls Wait(), so it is mostly just the batcher itself that needs to lock.


// Flush sends the current batch, if any.
func (b *SSTBatcher) Flush(ctx context.Context) error {
if err := b.grp.Wait(); err != nil {
Contributor

Quick comment that we're waiting here for AddSSTable requests across ranges to return?

Contributor Author

think the name helps here?

@adityamaru adityamaru self-requested a review April 21, 2022 18:47
@dt dt force-pushed the par-flush branch 2 times, most recently from 5eb6274 to 8496692 on April 21, 2022 22:43
@dt
Contributor Author

dt commented Apr 21, 2022

Before and after on a large unsorted IMPORT. Note the shape, but also the y-axis labels:

[Screenshot: Screen Shot 2022-04-21 at 6 49 41 PM]

[Screenshot: Screen Shot 2022-04-21 at 6 49 33 PM]

@dt dt force-pushed the par-flush branch 3 times, most recently from f6bfa71 to 4272c8f on April 22, 2022 01:23
Contributor

@adityamaru adityamaru left a comment

LGTM sans one question.

-func (b *SSTBatcher) Close() {
+func (b *SSTBatcher) Close(ctx context.Context) {
 	b.sstWriter.Close()
 	if err := b.asyncAddSSTs.Wait(); err != nil {
Contributor

Swallowing the error is okay because we expect every user of the sstbatcher to flush before closing? Should we return an error, or maybe fatal, if we see an error here instead, so that in case there is an AddSSTable request that we have not waited for before, and it fails, we don't just carry on silently?

Contributor Author

@dt dt Apr 22, 2022

Yeah, if you don't Flush before Close, it is already the case that anything in sstWriter that hasn't been sent is lost, so you already have to Flush and check its error if you want to know whether everything you passed to Add was written successfully; this one seems moot. We only Wait here so we don't leak goroutines, and we don't care about the result.

Contributor

Makes sense, thanks

@dt
Contributor Author

dt commented Apr 22, 2022

TFTR!

bors r+

@craig
Contributor

craig bot commented Apr 22, 2022

Build failed:

@dt
Contributor Author

dt commented Apr 22, 2022

backupccl timeout with no stacks in kv/bulk

bors r+

@craig
Contributor

craig bot commented Apr 22, 2022

Build succeeded:

@craig craig bot merged commit 11a0a9f into cockroachdb:master Apr 22, 2022
@dt dt deleted the par-flush branch April 22, 2022 19:07
craig bot pushed a commit that referenced this pull request Apr 25, 2022
80467: kv/bulk: pull lastRange out of mu and always pass monitor r=dt a=dt

Follow-up work from #79967.

Pulling lastRange out of the mutex and only updating it on sync flushes simplifies the locking, since we don't need its result anyway for range-flushes (as we flushed because we were done with that range).

Additionally, the original patch added a mem monitor and used it unconditionally, however some external creators of batcher (restore, stream ingest) make their own batchers and were not populating it. Second commit fixes that.

Co-authored-by: David Taylor <tinystatemachine@gmail.com>

3 participants