Skip to content

xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources#8794

Merged
eshitachandwani merged 6 commits into
grpc:masterfrom
eshitachandwani:minor_test_change
Jan 8, 2026
Merged

xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources#8794
eshitachandwani merged 6 commits into
grpc:masterfrom
eshitachandwani:minor_test_change

Conversation

@eshitachandwani

@eshitachandwani eshitachandwani commented Dec 26, 2025

Copy link
Copy Markdown
Member

This PR fixes the TestErrorFromParentLB_ResourceNotFound to reconfigure the same CDS and EDS resources after they have been removed instead of configuring all the new resources.

Also removes unused defaultTestShortTimeout from internal/xds/balancer/clustermanager/clustermanager_test.go
RELEASE NOTES: None

@eshitachandwani eshitachandwani added Type: Internal Cleanup Refactors, etc Area: xDS Includes everything xDS related, including LB policies used with xDS. labels Dec 26, 2025
@eshitachandwani eshitachandwani added this to the 1.79 Release milestone Dec 26, 2025
@codecov

codecov Bot commented Dec 26, 2025

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.36%. Comparing base (4046676) to head (2b00e5d).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8794      +/-   ##
==========================================
- Coverage   83.42%   83.36%   -0.06%     
==========================================
  Files         418      417       -1     
  Lines       32897    32978      +81     
==========================================
+ Hits        27443    27491      +48     
- Misses       4069     4084      +15     
- Partials     1385     1403      +18     

see 42 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@easwars

easwars commented Jan 6, 2026

Copy link
Copy Markdown
Contributor

Please add a more descriptive title and description. This is what we recommend for our external contributors: https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md#pr-descriptions


// Replace DNS resolver with a wrapped resolver to capture ResolveNow calls.
resolveNowCh := make(chan struct{}, 1)
resolveNowCh := make(chan struct{}, 3)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this change? How is a reviewer supposed to understand the motivation for this change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a flaky test which I thought was because of my changes but this is an already flaky test which pranjali is working on fixing , so reverting the change now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flake reported in this issue is caused by the changes made for logical-dns.
Fixed this in adb4625.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the root cause described by @Pranjali-2501 was only in her fork, not on the master branch.

If this is correct, @eshitachandwani please file an issue with the logs.

@eshitachandwani eshitachandwani Jan 8, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already reverted the change. Maybe you are looking at the wrong commit.


const (
defaultTestTimeout = 10 * time.Second
defaultTestTimeout = 20 * time.Second

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Why do these tests need 20s to run?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier, with my changes , the test was taking around 15 secs to pass, because the EDS update was taking a tittle longer to be received.

But that doesn't seem to be the case anymore after a few other changes, so reverting the change.

@arjan-bal arjan-bal Jan 8, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests usually finish in < 100 milliseconds. There are some tests that make 1000s of RPCs to verify RPC distribution which take 1-5s. A test taking 10+ seconds usually means that there's probably a deadlock somewhere which gets resolved when teardown begins.

@eshitachandwani eshitachandwani Jan 8, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! Maybe there was some issue with my changes that were making it fail. Which I have fixed and I have already reverted this change.

@easwars easwars assigned eshitachandwani and unassigned easwars and arjan-bal Jan 6, 2026
Comment thread logs.txt Outdated
Comment thread internal/xds/balancer/clusterresolver/resource_resolver_eds.go Outdated
@arjan-bal

Copy link
Copy Markdown
Contributor

Please add a more descriptive title and description. This is what we recommend for our external contributors: https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md#pr-descriptions

+1. Also, can you describe what caused the flakiness and how does this PR fix it? I suspect it was a recent change because I saw the same test flake for the first time and filed an issue: #8803

@eshitachandwani

Copy link
Copy Markdown
Member Author

This change does not fix the flakiness , or rather that was not what I had in mind. While looking at the test I found that it configured one set of resources , then removes CDS,EDS and then instead of adding it back again , it just configures a whole new set of LDS,RDS,CDS,EDS.
So this change adds the same CDS,EDS back after removing to justify the test's goal.

@arjan-bal arjan-bal left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please improve the PR title. Using a short but descriptive title makes things easier for people browsing through the commit history.

@eshitachandwani eshitachandwani changed the title xds: test changes xds/cdsbalancer: change TestErrorFromParentLB_ResourceNotFound to send correct resources Jan 8, 2026
@eshitachandwani eshitachandwani merged commit 9cf0a86 into grpc:master Jan 8, 2026
17 checks passed
mbissa pushed a commit to mbissa/grpc-go that referenced this pull request Feb 16, 2026
…d correct resources (grpc#8794)

This PR fixes the `TestErrorFromParentLB_ResourceNotFound` to
reconfigure the same CDS and EDS resources after they have been removed
instead of configuring all the new resources.

Also removes unused `defaultTestShortTimeout` from
[internal/xds/balancer/clustermanager/clustermanager_test.go](https://github.com/grpc/grpc-go/blob/900ffa948c031ddd50a667750f7cba8f01c66c4d/internal/xds/balancer/clustermanager/clustermanager_test.go#L51)
RELEASE NOTES: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Internal Cleanup Refactors, etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants