server: improve error handling in join rpc#53934
server: improve error handling in join rpc#53934craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
4c86e84 to
0348dfd
Compare
knz
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)
pkg/server/init.go, line 286 at r1 (raw file):
gossipConnectedCh = g.Connected // Let's nil out joinCh for posterity.
s/for posterity/to prevent accidental reuse/
pkg/server/init.go, line 300 at r1 (raw file):
// We expect the join RPC to blindly retry on all errors // save for the two above. This should be unreachable code. log.Fatalf(ctx, "unexpected error: %s", err.Error())
a couple of things here:
- the right way forward is
"%v", err-- generallyerr.Error()in logging or to assemble other errors is code smell - you don't need log.Fatal here since you're still in the start-up path, you can return
errors.HandleAsAssertionFailure(err)orerrors.NewAssertionErrorWithWrappedErrf(err, "unexpected error: %v", err)
if you care about it being sent to sentry (so that we see it) you can export log.sendCrashReport and call it here, or modity cli/cli.go and add if errors.IsAssertionFailure(err) { log.SendCrashReport(... err)) in there.
(I think I'm going to send a PR for this bottom point -- we probably need that as a matter of course)
Previously we used two separate channels to relay the results of using the join RPC, one for possible error in attempting to do and one for the actual result if successful. There was no real reason for this, and the code gluing them together was a bit fragile (and hid a latent bug). Previously we could deadlock if the select loop in ServeAndWait, for a successful join attempt, drained the errCh before draining joinCh. Because the successful joinCh drain then followed up with trying to drain the errCh (already drained by now), it would just block indefinitely. We just simplify all this to using one channel to encapsulate the results of the join attempt. Release justification: bug fixes and low-risk updates to new functionality Release note: None
0348dfd to
9f9130f
Compare
irfansharif
left a comment
There was a problem hiding this comment.
Thanks for the review Raphael!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @knz and @tbg)
pkg/server/init.go, line 286 at r1 (raw file):
Previously, knz (kena) wrote…
s/for posterity/to prevent accidental reuse/
Done.
pkg/server/init.go, line 300 at r1 (raw file):
Previously, knz (kena) wrote…
a couple of things here:
- the right way forward is
"%v", err-- generallyerr.Error()in logging or to assemble other errors is code smell- you don't need log.Fatal here since you're still in the start-up path, you can return
errors.HandleAsAssertionFailure(err)orerrors.NewAssertionErrorWithWrappedErrf(err, "unexpected error: %v", err)if you care about it being sent to sentry (so that we see it) you can export
log.sendCrashReportand call it here, or moditycli/cli.goand addif errors.IsAssertionFailure(err) { log.SendCrashReport(... err))in there.(I think I'm going to send a PR for this bottom point -- we probably need that as a matter of course)
Done. Went with errors.NewAssertionErrorWithWrappedErrf.
|
Build failed (retrying...): |
|
Build succeeded: |
Previously we used two separate channels to relay the results of using
the join RPC, one for possible error in attempting to do and one for the
actual result if successful. There was no real reason for this, and the
code gluing them together was a bit fragile (and hid a latent bug).
Previously we could deadlock if the select loop in ServeAndWait, for a
successful join attempt, drained the errCh before draining joinCh.
Because the successful joinCh drain then followed up with trying to
drain the errCh (already drained by now), it would just block
indefinitely.
We just simplify all this to using one channel to encapsulate the
results of the join attempt.
Release justification: bug fixes and low-risk updates to new functionality
Release note: None