Cherry pick #8610, #8615 to v1.76.x #8621
Merged
… endpoints (grpc#8610)

The new `pick_first`, which is the default, doesn't shuffle the addresses at all for resolver updates that are missing the `Endpoints` field. This change fixes that. Since [gRPC automatically sets the missing `Endpoints`](https://github.com/grpc/grpc-go/blob/1059e84f885bf7ed65b3b1a4fbe914360d8ab5b1/resolver_wrapper.go#L136-L138), this bug should be uncommon in practice.

RELEASE NOTES:
* balancer/pick_first: When configured, shuffle addresses in resolver updates that lack endpoints. Since gRPC automatically adds endpoints to resolver updates, this bug should only affect implementers of custom LB policies that use pick_first for delegation but don't forward the endpoints.
…pc#8615)

Related issue: b/415354418

## Problem

On connection breakage, the pickfirst leaf balancer enters idle and returns an `Idle picker` that calls the balancer's `ExitIdle` method only the first time `Pick` is called. The following sequence of events causes the balancer to get stuck in the `Idle` state:

1. The existing connection breaks, and the SubConn [requests re-resolution and reports IDLE](https://github.com/grpc/grpc-go/blob/bb71072094cf533965450c44890f8f51c671c393/clientconn.go#L1388-L1393). In turn, PF updates the ClientConn state to IDLE with an `Idle picker`.
2. An RPC is made, triggering `balancer.ExitIdle` through the idle picker. The balancer attempts to reconnect the failed SubConn.
3. The resolver produces a new endpoint list that no longer contains the endpoint used by the existing SubConn. PF removes the existing SubConn. Since the balancer hasn't yet updated the ClientConn state to CONNECTING, pickfirst thinks that it is still in IDLE and doesn't start connecting to the new endpoints.
4. New RPC requests trigger the idle picker, but it's a no-op since it only [triggers the balancer's ExitIdle method once](https://github.com/grpc/grpc-go/blob/bb71072094cf533965450c44890f8f51c671c393/balancer/pickfirst/pickfirstleaf/pickfirstleaf.go#L663).

## Fix

This change moves the ClientConn into CONNECTING immediately when the `ExitIdle` method is called. This ensures that the balancer continues to reconnect when the resolver produces a new endpoint list.

RELEASE NOTES:
* balancer/pickfirst: Fix bug that can cause balancer to get stuck in `IDLE` state on connection failure.
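The sequence of events in this commit message can be replayed with a toy state machine. The sketch below is only a model: the `balancer` and `idlePicker` types, the `connectingOnExitIdle` flag, and the `simulate` helper are all hypothetical names invented for this example and do not mirror the actual pickfirstleaf implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type connState int

const (
	idle connState = iota
	connecting
)

// balancer is a toy model of a pick-first style balancer (an assumption for
// this sketch, not the real type).
type balancer struct {
	state                connState
	addrs                []string
	dialing              string // address being dialed, "" if none
	connectingOnExitIdle bool   // true models the behavior after the fix
}

// ExitIdle starts reconnecting. With the fix, the state moves to connecting
// right away instead of waiting for the SubConn to report it.
func (b *balancer) ExitIdle() {
	if b.state != idle {
		return
	}
	if b.connectingOnExitIdle {
		b.state = connecting
	}
	b.dialing = b.addrs[0]
}

// UpdateAddrs models a resolver update. A dial to a removed address is
// dropped, and new dials start only if the balancer is not idle.
func (b *balancer) UpdateAddrs(addrs []string) {
	b.addrs = addrs
	kept := false
	for _, a := range addrs {
		if a == b.dialing {
			kept = true
		}
	}
	if !kept {
		b.dialing = ""
	}
	if b.state == idle {
		return // an idle balancer waits for the picker to call ExitIdle
	}
	if b.dialing == "" {
		b.dialing = addrs[0]
	}
}

// idlePicker triggers ExitIdle only on the first Pick.
type idlePicker struct {
	once sync.Once
	b    *balancer
}

func (p *idlePicker) Pick() { p.once.Do(p.b.ExitIdle) }

// simulate replays steps 2 to 4 and returns the address being dialed.
func simulate(fix bool) string {
	b := &balancer{addrs: []string{"old:1"}, connectingOnExitIdle: fix}
	p := &idlePicker{b: b}
	p.Pick()                         // first RPC triggers ExitIdle
	b.UpdateAddrs([]string{"new:1"}) // resolver removes old:1
	p.Pick()                         // later RPC: no-op, Once already fired
	return b.dialing
}

func main() {
	fmt.Printf("without fix: dialing %q\n", simulate(false)) // ""
	fmt.Printf("with fix:    dialing %q\n", simulate(true))  // "new:1"
}
```

In this model, the run without the fix ends with nothing being dialed and no remaining way to trigger `ExitIdle`, while the run with the fix picks up the new address as soon as the resolver update arrives.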
Codecov Report❌ Patch coverage is
Additional details and impacted files

| Coverage Diff | v1.76.x | #8621 | +/- |
|---|---|---|---|
| Coverage | 82.12% | 81.95% | -0.18% |
| Files | 415 | 415 | |
| Lines | 40686 | 40693 | +7 |
| Hits | 33412 | 33348 | -64 |
| Misses | 5896 | 5968 | +72 |
| Partials | 1378 | 1377 | -1 |
eshitachandwani approved these changes Oct 1, 2025
Original PRs: #8610, #8615
RELEASE NOTES:
… `IDLE` state on backend address change.