grpcinterceptor: Use ctxutil to detect context cancellation#107079
grpcinterceptor: Use ctxutil to detect context cancellation#107079craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
64ca14b to
9144c8c
Compare
|
We have two problems here. 1. Goroutine saving - are you sure?
This is not true in general. Have you opened the black box? Look at the go source code, this atomic.AddInt32(&goroutines, +1)
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err())
case <-child.Done():
}
}()2. Possible goroutine leak!
I double checked this and I don't like what I am seeing. If you look at the go source code for this internal propagateCancel" function, you see that it spawns a goroutine. That goroutine only ever terminates if either the child or parent context is canceled. So what if they don't get canceled at all? Like, if our code paths pass a context.Background or similar? At the very least with our code base, when someone spawns a goroutine they know they have to listen on the stopper ShouldQuiesce channel. This is also encouraged in the API of our This new Recommendation: remove ctxutil altogetherIf the above is confirmed, I would recommend removing the current impl of If we really need this functionality, we need to build a patch to the go runtime to do what we want (we have a patch infrastructure now! let's use it!) while ensuring that stray goroutines don't leak. |
|
Thanks for looking into this @knz, but I think your concerns can be assuaged. cockroach/pkg/util/ctxutil/context.go Lines 34 to 36 in dfe4ef8
Only if the parent context isn't cancellable. We've already established that it must be, so it takes this code path here, which doesn't spawn a goroutine:
We've established that the context must be cancellable, so the caller is responsible for avoiding goroutine leaks by calling the associated cancel function. I think the only concern here would be if we use a custom context implementation that overrides That said, I'm not super-thrilled about accessing Go internals like this. This stuff is internal for a reason. I'll give this a closer review. |
|
FWIW, I also verified that this gets rid of the superfluous goroutine. |
|
@knz -- what @erikgrinaker said.
Goroutines do not leak with this change. The behavior is no different from the default implementation of context
I have searched quite extensively on this topic; there were proposals to extend context.go somehow; there was always a pushback. |
@knz -- I don't think there are two problems here. |
There was a problem hiding this comment.
This seems fine, and gets the job done.
Somewhat uneasy about ctxutil.WhenDone(), especially considering it's otherwise unused. Can we add in some CrdbTestBuild assertions ensuring that it always does what it's supposed to, either here or separately? We'll also want to let this bake for a while before backporting.
| // [1] https://pkg.go.dev/google.golang.org/grpc#ClientConn.NewStream | ||
| finishFunc(nil /* err */) | ||
| }); err != nil && !errors.Is(err, ctxutil.ErrNeverCompletes) { | ||
| panic(err) |
There was a problem hiding this comment.
Let's propagate the error. I know this can't currently happen, but still, I don't like latent panics and the only caller has an error return path.
|
I am friendly to the change now. You have successfully changed my opinion. |
|
@erikgrinaker @knz |
19c1fb2 to
667bcf9
Compare
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
|
bors r+ |
|
Build succeeded: |
…tion Backport 1/2 commits from cockroachdb#103866 Backport 1/1 commits from cockroachdb#107079 Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
…tion Backport 1/2 commits from cockroachdb#103866 Backport 1/1 commits from cockroachdb#107079 Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None Release justification: Important performance improvement for large scale rangefeed workloads to remove inadvertently introduced goroutine.
Previous commit #76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional go routine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened.
ctxutil.WhenDone(added by #103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost.Informs #107053
Release note: None