liveness: run sync disk write in a stopper task#81813
craig[bot] merged 2 commits into cockroachdb:master
Conversation
Release note: None
Force-pushed 7f36c79 to 8dc745f
Force-pushed 8dc745f to acc8755
Looks like this will resolve #81827, should have looked before opening that one. Thanks!

CI failures appear to be unrelated flake.
stevendanna
left a comment
Thanks for jumping on this! I suppose if we were worried about the impact of a disk stall on shutdown, we could do something where we give up on the operation after some amount of time. But, I buy the paragraph you wrote about mitigations.
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @stevendanna, and @tbg)
pkg/kv/kvserver/liveness/liveness.go line 1277 at r2 (raw file):
resultCs[i], _ = nl.engineSyncs.DoChan(strconv.Itoa(i), func() (interface{}, error) {
	var taskErr error
	if err := nl.stopper.RunTask(ctx, "liveness-hb-diskwrite", func(ctx context.Context) {
Looks like there is also RunTaskWithErr that would cut down on some of the boilerplate here, but it's fine as is.
stevendanna
left a comment
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker, @stevendanna, and @tbg)
nicktrav
left a comment
Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @erikgrinaker and @tbg)
Force-pushed acc8755 to 3271df3
erikgrinaker
left a comment
I suppose if we were worried about the impact of a disk stall on shutdown, we could do something where we give up on the operation after some amount of time. But, I buy the paragraph you wrote about mitigations.
Yeah, but I don't know if that's always going to be safe -- for example, in this case that means that we'll be closing Pebble while we still have in-flight writes, which is what prompted this in the first place. There's no way to cancel these writes afaik.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @nicktrav, @stevendanna, and @tbg)
pkg/kv/kvserver/liveness/liveness.go line 1277 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Looks like there is also RunTaskWithErr that would cut down on some of the boilerplate here, but it's fine as is.
Oh nice, that's better -- thanks! Definitely not "fine" as is, more like functional. :)
This patch runs the sync disk write during node heartbeats in a stopper task. The write is done in a goroutine, so that we can respect the caller's context cancellation (even though the write itself won't). However, this could race with engine shutdown when stopping the node, violating the Pebble contract and triggering the race detector. Running it as a stopper task will cause the node to wait for the disk write to complete before closing the engine.

Of course, if the disk stalls then node shutdown will now never complete. This is very unfortunate, since stopping the node is often the only mitigation to recover stuck ranges with stalled disks. This is mitigated by Pebble panic'ing the node on stalled disks, and Kubernetes and other orchestration tools killing the process after some time.

Release note: None
Force-pushed 3271df3 to b2c4afe
the write can't? I think go supports this via deadlines on the os.File handle.

Yes, but that applies to all writes, not just this one. We'd also have to propagate that through Pebble. Adding deadlines for all Pebble writes is a riskier change than I want to make here.
TFTRs! bors r=stevendanna,nicktrav

Build succeeded:
liveness: move stopper to NodeLivenessOptions

Release note: None
liveness: run sync disk write in a stopper task
This patch runs the sync disk write during node heartbeats in a stopper
task. The write is done in a goroutine, so that we can respect the
caller's context cancellation (even though the write itself won't).
However, this could race with engine shutdown when stopping the node,
violating the Pebble contract and triggering the race detector. Running
it as a stopper task will cause the node to wait for the disk write to
complete before closing the engine.
Of course, if the disk stalls then node shutdown will now never
complete. This is very unfortunate, since stopping the node is often
the only mitigation to recover stuck ranges with stalled disks. This is
mitigated by Pebble panic'ing the node on stalled disks, and Kubernetes
and other orchestration tools killing the process after some time.
Touches #81786.
Resolves #81511.
Resolves #81827.
Release note: None