ringhash: handle config updates properly#5557
Conversation
| // picker is pushed on the channel. | ||
| func (s) TestUpdateClientConnState_NewRingSize(t *testing.T) { | ||
| addrs := []resolver.Address{{Addr: testBackendAddrStrs[0]}} | ||
| cc, b, p0 := setupTest(t, addrs) |
There was a problem hiding this comment.
Nit: I know it's p0 elsewhere, but for some reason p1 and p2 make more sense in my mind than p0 and p1.
|
|
||
| // If the ring configuration has changed, we need to regenerate the ring and | ||
| // send a new picker. | ||
| var regenerateRing bool |
There was a problem hiding this comment.
In correspondence to your comment on my Outlier Detection PR:
regenerateRing := b.updateAddresses(s.ResolverState.Addresses)
if b.config == nil || b.config.MinRingSize != newConfig.MinRingSize || b.config.MaxRingSize != newConfig.MaxRingSize {
regenerateRing = true
}
b.config = newConfig
| } | ||
|
|
||
| if regenerateRing { | ||
| ring, err := newRing(b.subConns, b.config.MinRingSize, b.config.MaxRingSize) |
There was a problem hiding this comment.
This changes logic, and I don't know if this is correct. I still think it needs to write to b.ring. Otherwise it would persist a built ring map from the previous unrelated good update? Then subsequent ResolverError calls (line 318) and UpdateSubConnState() calls (line 369) call into regeneratePicker which reads b.ring, which would not be in the correct state and "synchronized" with the UpdateClientConnState that caused the SubConn error from the address list. Please triage.
There was a problem hiding this comment.
Earlier, when we were assigning to b.ring here and if err != nil, we were still calling ResolverError, but not sending a new picker. So, we would have still ended up using the old ring (from the old picker) to service RPCs. So, in that sense, i feel that the behavior has not changed. But either ways, I'm not sure if our existing behavior was right. So, I have posted on our chat room, let's see what comes out of it. Thanks.
There was a problem hiding this comment.
Ok, based on the discussion in the chat room, I've change our ring creation routine newRing() to not return an error.
| return balancer.ErrBadResolverState | ||
| } | ||
| // Successful resolution; clear resolver error and return nil. | ||
| b.resolverErr = nil |
There was a problem hiding this comment.
Again, I don't know if this breaks it, but this doesn't clear it if hits line 295, whereas in master it did. Although it feels appropriate here, but please triage if this breaks anything.
There was a problem hiding this comment.
This one looks totally fine to me.
| // TestUpdateClientConnState_NewRingSize tests the scenario where the ringhash | ||
| // LB policy receives new configuration which specifies new values for the ring | ||
| // min and max sizes. The test verifies that a new ring is created and a new | ||
| // picker is pushed on the channel. |
There was a problem hiding this comment.
Nit: we're not testing whether the picker is pushed on "the channel". We're testing whether the new picker gets forwarded to this balancer's client conn, which in this case sends on the channel for verification purposes. Would prefer rewording "is pushed on the channel" to "sent to the ClientConn".
| // getWeightAttribute() returns 1 if the weight attribute is not found | ||
| // on the address. And since this function is guaranteed to be called | ||
| // with a non-empty subConns map, weightSum is guaranteed to be | ||
| // non-zero. So, we need not worry about divide by zero error here. |
There was a problem hiding this comment.
Sorry, super super minor nit: "need not worry about a divide"
| // | ||
| // Must be called with a non-empty subConns map. | ||
| func normalizeWeights(subConns *resolver.AddressMap) ([]subConnWithWeight, float64) { | ||
| var weightSum float64 |
There was a problem hiding this comment.
What do you think of keeping this uint32, and instead of having another var to convert it like in master, you convert in the denominator float64(weightsum) to avoid cast on line 119. I don't care/don't feel strongly about this but was just proposing a suggestion.
| // is 1. | ||
| func normalizeWeights(subConns *resolver.AddressMap) ([]subConnWithWeight, float64, error) { | ||
| // | ||
| // Must be called with a non-empty subConns map. |
There was a problem hiding this comment.
I'm guessing "non-empty" encapsualtes nil and len(0)?
| @@ -291,6 +272,29 @@ func (b *ringhashBalancer) UpdateClientConnState(s balancer.ClientConnState) err | |||
| b.ResolverError(errors.New("produced zero addresses")) | |||
There was a problem hiding this comment.
This is not correct. This happens before b.subConns gets written to. Thus, line 303 won't hit guaranteed. Say you get a first update with 3 addresses, you make 3 subconns. Then, the next UpdateClientConnState comes in with 0 addresses. Then, this conditional hits, the SubConn len is still 3, and doens't set to connectivity.TransientFailure.
There was a problem hiding this comment.
That's right. I moved it to be right before we go about constructing the ring. So, if we end up with zero addresses, we save the effort of making a new ring.
|
Thanks for the review. |
zasweq
left a comment
There was a problem hiding this comment.
LGTM. I found my first correctness issue ever I'm so proud of myself :D!
Yay Yay Yay !!! |
Fixes #5460
Fix a few things in the config handling routine of the ringhash LB policy:
RELEASE NOTES: