storage: ensure non-expired context before each liveness update attempt#25631
storage: ensure non-expired context before each liveness update attempt#25631craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
Fixes cockroachdb#25430. Before this change, a liveness update could get stuck in an infinite loop if its context expired. This is because it would continue to retry and continue to get `errRetryLiveness` errors due to `AmbiguousResultErrors` created by `DistSender`. Release note: None
|
Do you think it makes more sense to move the check down to Review status: 0 of 1 files reviewed at latest revision, all discussions resolved. Comments from Reviewable |
|
That's where I had it before, but I moved it up because |
|
Great! Thanks for the explanation. LGTM! |
|
I'm also wondering why we are not replacing this endless for loop with a |
|
There's always a debate between whether bounded loops help avoid problems or just mask issues (like this one) and make them harder to find. I think for now we should keep this as-is. The context cancellation should at-least bound the loop to some maximum duration. |
|
Review status: 0 of 1 files reviewed at latest revision, all discussions resolved. Comments from Reviewable |
|
Should we backport this to 2.0? Is this just a test flake or could it explain some user issues like the import failures? |
|
This could be more serious than a test flake, so I think we should backport it. If the user issues you're referring to included |
|
bors r+ |
25542: changefeedccl: uri sinks, fill in record key, WITH envelope r=tschottdorf a=danhhz See commit messages for details 25631: storage: ensure non-expired context before each liveness update attempt r=nvanbenschoten a=nvanbenschoten Fixes #25430. Before this change, a liveness update could get stuck in an infinite loop if its context expired. This is because it would continue to retry and continue to get `errRetryLiveness` errors due to `AmbiguousResultErrors` created by `DistSender`. Release note: None Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
I don't have a specific issue in mind but I know we've seen some reports of liveness problems (especially in conjunction with IMPORT). |
Build succeeded |
|
This seems bad enough to backport.
On Thu, May 17, 2018, 10:56 PM craig[bot] ***@***.***> wrote:
Merged #25631 <#25631>.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#25631 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AE135Ce9nMN_FfFGSshzAlTJ5JSujLzyks5tzePtgaJpZM4UDsK3>
.
--
…-- Tobias
|
25634: backport-2.0: storage: ensure non-expired context before each liveness update attempt r=nvanbenschoten a=nvanbenschoten Backport 1/1 commits from #25631. /cc @cockroachdb/release --- Fixes #25430. Before this change, a liveness update could get stuck in an infinite loop if its context expired. This is because it would continue to retry and continue to get `errRetryLiveness` errors due to `AmbiguousResultErrors` created by `DistSender`. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Fixes #25430.
Before this change, a liveness update could get stuck in an infinite
loop if its context expired. This is because it would continue to retry
and continue to get
errRetryLivenesserrors due toAmbiguousResultErrorscreated byDistSender.Release note: None