liveness: remove a possible data race#56899
Conversation
This was caught by our grpc race provoking transport while stressing PR cockroachdb#56812. Release note: None
irfansharif
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)
pkg/kv/kvserver/liveness/liveness.go, line 1265 at r1 (raw file):
b := txn.NewBatch() key := keys.NodeLivenessKey(update.newLiveness.NodeID) if err := v.SetProto(&update.newLiveness); err != nil {
I'm not really sure I understand what's going on here. We're racing with GRPCTransportFactory under transport_race.go, and I'm not sure what that means. What exactly is holding onto v here? My reading of this, and the race in #56812, is that the GRPCTransportFactory guy is busy reading through all batch requests to "to catch any mutations of a batch passed to the transport. The deal is that batches passed to the transport are immutable the server is not allowed to mutate anything and this transport makes sure they don't."
But we're still doing that with v.SetProto. So I think this "race" (which is safe I think, we're not mutating the batch after having sent it to the transport) still exists. I think what we want to do instead is call SetProto outside the Txn closure.
tbg
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @irfansharif)
pkg/kv/kvserver/liveness/liveness.go, line 1265 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
I'm not really sure I understand what's going on here. We're racing with GRPCTransportFactory under transport_race.go, and I'm not sure what that means. What exactly is holding onto
vhere? My reading of this, and the race in #56812, is that the GRPCTransportFactory guy is busy reading through all batch requests to "to catch any mutations of a batch passed to the transport. The deal is that batches passed to the transport are immutable the server is not allowed to mutate anything and this transport makes sure they don't."But we're still doing that with
v.SetProto. So I think this "race" (which is safe I think, we're not mutating the batch after having sent it to the transport) still exists. I think what we want to do instead is callSetProtooutside the Txn closure.
I'm not sure I follow. This transport mechanism is ensuring that anything passed to gRPC is immutable. We are allocating a new value here and calling .SetProto on it before it's handed to anyone, and it is immutable from that point on. What am I missing?
|
bors r=lunevalex Irfan lmk if there's something wrong still and I'll follow up. |
|
Build succeeded: |
This was caught by our grpc race provoking transport while stressing
PR #56812.
Release note: None