Handle Aggregate cluster loops and dups#10274
Conversation
…tors of the current CDS aggregate. Also put in a code path to do NACK instead of handling loops as that is what the other languages do.
| if (childLb != null) { | ||
| childLb.shutdown(); | ||
| childLb = null; |
There was a problem hiding this comment.
Wouldn't the previous loop guarantee that childLB is always shut down if loopStatus is set?
There was a problem hiding this comment.
true. Removed it.
|
|
||
| // CLUSTER (aggr.) -> [cluster2 (aggr.)] | ||
| String cluster2 = "cluster-02.googleapis.com"; | ||
| update = |
There was a problem hiding this comment.
This update just swaps CLUSTER from having cluster1 to cluster2, right? Why does this need to be done in this test?
There was a problem hiding this comment.
Significantly rewrote this test as it wasn't testing what I intended it to.
|
|
||
| // cluster2 (aggr.) -> [cluster3 (EDS)] | ||
| String cluster3 = "cluster-03.googleapis.com"; | ||
| CdsUpdate update2 = |
There was a problem hiding this comment.
Why not reuse the "update" variable like before?
There was a problem hiding this comment.
I think that it is actually cleaner for each update to be in a different variable so you can see in the debugger exactly what has been done and it is easier to spot when you are accidentally forgetting to update the variable before delivering the update.
| Status unavailable = Status.UNAVAILABLE.withDescription( | ||
| "CDS error: found 0 leaf (logical DNS or EDS) clusters for root cluster " + CLUSTER); |
There was a problem hiding this comment.
Is this test for missing leaf clusters as it seems, or for duplicate children as the name applies? Where are duplicates added?
temawi
left a comment
There was a problem hiding this comment.
I'm wondering if it would be worth it to also do an integration test with a control plane and an XdsClient. The unit testing here is fundamental, but sometimes the interactions of load balancers in the LB tree can be unexpected.
This is actually pretty easy to do, take a look at FakeControlPlaneXdsIntegrationTest. You could ignore the DataPlaneRule part.
Optional of course if you feel confident that the unit tests are covering all the bases.
| private void handleClusterDiscovered() { | ||
| List<DiscoveryMechanism> instances = new ArrayList<>(); | ||
| Map<ClusterState, List<ClusterState>> parentClusters = new HashMap<>(); | ||
| Status loopStatus = null; |
There was a problem hiding this comment.
I think it might make sense to add a comment explaining that we have this loop detection built into the following logic and why it is important to have it.
| if (childClusterStates == null) { | ||
| return; | ||
| } |
There was a problem hiding this comment.
Why add this short circuit? It's better to have a single return point from a method and this change doesn't seem to affect the logic.
There was a problem hiding this comment.
Was trying to reduce nesting if { for { if { } } }. Rewrote it to use a stream instead of a for with an if.
|
Because we are using |
For duplicate children, just ignore them
For loops, Put the channels into TRANSIENT_FAILURE like when no edge policy is provided
Note, to handle loop detection being done after state objects are created, I updated FakeXdsClient watcher map value to be a collection (as is true for XdsClientImpl).
Fixes go/grpc-lb-xds-tracker row 22 (not tracked as an issue)