xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources#8794
Conversation
2838498 to
d83dd4b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #8794 +/- ##
==========================================
- Coverage 83.42% 83.36% -0.06%
==========================================
Files 418 417 -1
Lines 32897 32978 +81
==========================================
+ Hits 27443 27491 +48
- Misses 4069 4084 +15
- Partials 1385 1403 +18 🚀 New features to boost your workflow:
|
|
Please add a more descriptive title and description. This is what we recommend for our external contributors: https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md#pr-descriptions |
|
|
||
| // Replace DNS resolver with a wrapped resolver to capture ResolveNow calls. | ||
| resolveNowCh := make(chan struct{}, 1) | ||
| resolveNowCh := make(chan struct{}, 3) |
There was a problem hiding this comment.
Why do we need this change? How is a reviewer supposed to understand the motivation for this change?
There was a problem hiding this comment.
This was a flaky test which I thought was because of my changes but this is an already flaky test which pranjali is working on fixing , so reverting the change now.
There was a problem hiding this comment.
Flake reported in this issue is caused by the changes made for logical-dns.
Fixed this in adb4625.
There was a problem hiding this comment.
IIUC, the root cause described by @Pranjali-2501 was only in her fork, not on the master branch.
If this is correct, @eshitachandwani please file an issue with the logs.
There was a problem hiding this comment.
I have already reverted the change. Maybe you are looking at the wrong commit.
|
|
||
| const ( | ||
| defaultTestTimeout = 10 * time.Second | ||
| defaultTestTimeout = 20 * time.Second |
There was a problem hiding this comment.
Same here. Why do these tests need 20s to run?
There was a problem hiding this comment.
Earlier, with my changes , the test was taking around 15 secs to pass, because the EDS update was taking a tittle longer to be received.
But that doesn't seem to be the case anymore after a few other changes, so reverting the change.
There was a problem hiding this comment.
Tests usually finish in < 100 milliseconds. There are some tests that make 1000s of RPCs to verify RPC distribution which take 1-5s. A test taking 10+ seconds usually means that there's probably a deadlock somewhere which gets resolved when teardown begins.
There was a problem hiding this comment.
Right! Maybe there was some issue with my changes that were making it fail. Which I have fixed and I have already reverted this change.
+1. Also, can you describe what caused the flakiness and how does this PR fix it? I suspect it was a recent change because I saw the same test flake for the first time and filed an issue: #8803 |
|
This change does not fix the flakiness , or rather that was not what I had in mind. While looking at the test I found that it configured one set of resources , then removes CDS,EDS and then instead of adding it back again , it just configures a whole new set of LDS,RDS,CDS,EDS. |
…d correct resources (grpc#8794) This PR fixes the `TestErrorFromParentLB_ResourceNotFound` to reconfigure the same CDS and EDS resources after they have been removed instead of configuring all the new resources. Also removes unused `defaultTestShortTimeout` from [internal/xds/balancer/clustermanager/clustermanager_test.go](https://github.com/grpc/grpc-go/blob/900ffa948c031ddd50a667750f7cba8f01c66c4d/internal/xds/balancer/clustermanager/clustermanager_test.go#L51) RELEASE NOTES: None
This PR fixes the
TestErrorFromParentLB_ResourceNotFoundto reconfigure the same CDS and EDS resources after they have been removed instead of configuring all the new resources.Also removes unused
defaultTestShortTimeoutfrom internal/xds/balancer/clustermanager/clustermanager_test.goRELEASE NOTES: None