release-22.1: kv/bulk: parallelize sending SSTs due to range bounds #80386
dt merged 6 commits into cockroachdb:release-22.1
Conversation
Release note: none.
The buffering adder took over making this split, so this was dead code. Release note: none.
Previously, when the batcher determined that it needed to finish one SST and send it before starting another, it would wait for that send to complete before moving on. When flushing a buffer whose data mapped to many ranges, this meant many serial flushes: e.g. flushing 512MB of data from a buffer with keys uniformly distributed over a table split into 2000 ranges meant waiting for roughly 2000 sequential AddSSTable requests. When those requests were slow, sometimes taking 1s each or more, this became a major bottleneck.

This change switches the batcher to send files that are ended due to a range boundary asynchronously: it queues up the request to send and then starts the next file while the first is in flight, as long as memory capacity in the monitor allows holding the extra file in memory (these async sends could use an entire extra buffer's worth of memory if they all end up in-flight at once, which they easily could if the receivers are queuing).

Release note (performance improvement): Bulk ingestion of unsorted data during IMPORT and schema changes uses a higher level of parallelism to send produced data to the storage layer.
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied.
Add a brief release justification to the body of your PR to justify this backport.
shermanCRL
left a comment
Approved for extraordinary backport, pending code review from @nvanbenschoten.
nvb
left a comment
Reviewed 5 of 5 files at r1, 9 of 9 files at r2, 1 of 1 files at r3, 7 of 7 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt)
pkg/kv/bulk/sst_batcher.go, line 483 at r4 (raw file):
```go
}
if reason == rangeFlush {
```
I'm struggling to understand how this interacts with the `b.mu.lastRange` state. When we issue multiple AddSSTables concurrently, the information added to `b.mu.lastRange` will be non-deterministic (the previous request may not have finished, or its result may have been clobbered). Is this ok because we're only using concurrency in the `rangeFlush` case, and we only want to perform manual splits in the `rangeSize` case where we still issue requests serially? If so, I wonder whether we could make that more clear in the code. If the state in `b.mu.lastRange` is meaningless because it's non-deterministic, should we even set it in those cases?
pkg/kv/bulk/sst_batcher.go, line 692 at r4 (raw file):
```go
}
b.mu.Lock()
b.stats.splitRetries += files - 1
```
Should `b.stats` be in the `mu` struct?
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/bulk/sst_batcher.go, line 483 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm struggling to understand how this interacts with the `b.mu.lastRange` state. When we issue multiple AddSSTables concurrently, the information added to `b.mu.lastRange` will be non-deterministic (the previous request may not have finished, or its result may have been clobbered). Is this ok because we're only using concurrency in the `rangeFlush` case, and we only want to perform manual splits in the `rangeSize` case where we still issue requests serially? If so, I wonder whether we could make that more clear in the code. If the state in `b.mu.lastRange` is meaningless because it's non-deterministic, should we even set it in those cases?
Async requests will indeed overwrite each other's `lastRange`, but I think last-write-wins is probably going to be fine, since, as you say, if we're sending async flushes at all, we're probably already in the small-ssts-to-many-ranges case where we don't send our own splits anyway.
I could just make only size-flushes update `lastRange` if you'd prefer?
pkg/kv/bulk/sst_batcher.go, line 692 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should `b.stats` be in the `mu` struct?
Hm, I'd tried to avoid moving it, since most of the fields are accessed without holding `mu`. Indeed, the reader of this field doesn't lock it, since reads are done only by the log statement that is called after a `flush()` which has called `.Wait()`, so the locking is only for write/write races.
I decided that was too confusing with the other ones and just made them atomics to avoid commenting all that; I could do the same here if you feel strongly.
nvb
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt)
pkg/kv/bulk/sst_batcher.go, line 483 at r4 (raw file):
Previously, dt (David Taylor) wrote…
Async requests will indeed overwrite each other's `lastRange`, but I think last-write-wins is probably going to be fine, since as you say, if we're sending async flushes at all, we're probably already in the small-ssts-to-many-ranges case where we don't send our own splits anyway. I could just make only size-flushes update `lastRange` if you'd prefer?
If you did that then we could pull `b.mu.lastRange` out of the mutex, right? I would find that much easier to reason about.
pkg/kv/bulk/sst_batcher.go, line 692 at r4 (raw file):
Previously, dt (David Taylor) wrote…
Hm, I'd tried to avoid moving it, since most of the fields are accessed without holding mu. Indeed, the reader of this field doesn't lock it, since reads are done only by the log statement that is called after a flush() which has called .Wait(), so the locking is only for write/write races.
I decided that was too confusing with the other ones and just made them atomics to avoid commenting all that; I could do the same here if you feel strongly.
I feel less strongly about this one. I'll defer to you.
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @nvanbenschoten)
pkg/kv/bulk/sst_batcher.go, line 483 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If you did that then we could pull `b.mu.lastRange` out of the mutex, right? I would find that much easier to reason about.
added a commit (will forward port it to master too if it looks like what you were thinking)
nvb
left a comment
Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt)
pkg/kv/bulk/sst_batcher.go, line 542 at r5 (raw file):
```go
sstBytes []byte, stats enginepb.MVCCStats, reason int,
```
Should this be the reason, or should this be a bool indicating that the addSSTable is being run concurrently? Or should this be a `trackLastRange bool`? In other words, whose responsibility should it be to make the decision about whether setting `b.lastRange` is needed and is thread-safe?
Force-pushed c46b95d to bcb710a
dt
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/bulk/sst_batcher.go line 542 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this be the reason, or should this be a bool indicating that the addSSTable is being run concurrently? Or should this be a `trackLastRange bool`? In other words, whose responsibility should it be to make the decision about whether setting `b.lastRange` is needed and is thread-safe?
Good point, makes sense.
Done.
nvb
left a comment
Reviewed 1 of 1 files at r6, 3 of 3 files at r7, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt)
Release note: none.
Force-pushed bcb710a to c0d7f68
Fixed in second commit of #80467 and picked up here too now.
Verified restore/nodeShutdown/worker passes on master with the fix mentioned above, so I think this is good to go.
blathers backport 22.1.0
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating backport branch refs/heads/blathers/backport-release-22.1.0-80386: POST https://api.github.com/repos/cockroachdb/cockroach/git/refs: 422 Reference already exists []
Backport to branch 22.1.0 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Backport:
Please see individual PRs for details.
/cc @cockroachdb/release
Addresses #79615
Release justification: high impact change motivated by customer request.