
pass health check service name through LB policies via a channel arg#26441

Merged
markdroth merged 13 commits into grpc:master from
markdroth:client_channel_health_check_service_name_in_channel_args
Oct 7, 2021

Conversation

@markdroth
Member

@markdroth markdroth commented Jun 7, 2021

Instead of storing the health check service name in the client channel code and updating all of the existing subchannel wrappers whenever it changes, we pass the health check service name through the LB policies via a channel arg. This avoids the need for synchronization in the client channel code.
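A minimal sketch of the channel-arg approach described above. The type and key names here are illustrative only (they are not the real gRPC C-core API): the point is that the LB policy forwards its channel args unchanged to the subchannels it creates, so the subchannel reads the health check service name from its own args and no synchronized field in the client channel is needed.

```cpp
#include <map>
#include <string>

// Illustrative stand-in for gRPC's channel-args container.
using ChannelArgs = std::map<std::string, std::string>;

// Hypothetical key name, for illustration only.
constexpr char kHealthCheckServiceName[] =
    "grpc.internal.health_check_service_name";

// The LB policy passes its args through unchanged when creating a
// subchannel; the subchannel then looks up the health check service
// name in the args it was created with.
ChannelArgs MakeSubchannelArgs(const ChannelArgs& lb_policy_args) {
  return lb_policy_args;
}

std::string GetHealthCheckServiceName(const ChannelArgs& subchannel_args) {
  auto it = subchannel_args.find(kHealthCheckServiceName);
  return it == subchannel_args.end() ? "" : it->second;
}
```

When the service config changes the name, the new value simply flows down in the next LB policy update as part of the args, instead of being pushed into existing subchannel wrappers under a lock.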

Note that if the health check service name changes and existing subchannels report that the new name is unhealthy, we want the LB policies to stop sending them traffic. Previously, this happened automatically, because the channel updated the health check service name in the existing subchannel wrappers, so they would quickly start reporting TRANSIENT_FAILURE. However, now that the health check service name change is propagating through the LB policies via an update, this means that the new subchannel list will never report READY, and the current code would therefore never swap it in; it would just continue using the old subchannel list indefinitely, which is not what we want. So in order to make this work, I have changed both the pick_first and round_robin LB policies to swap in the new subchannel list if all of the subchannels are considered to be in TRANSIENT_FAILURE. (This change is something we were planning to do eventually anyway, to be consistent with our policy of always honoring what the control plane tells us to do, even if that causes the channel to go into TRANSIENT_FAILURE. That is also why I've changed the pick_first policy and not just the round_robin policy; the pick_first change is not actually directly related to the health check service name change, since pick_first does not support client-side health checking.)
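The swap-in rule described above can be sketched as follows. This is a simplified illustration, not the actual pick_first/round_robin code: previously a pending subchannel list was promoted only once some subchannel reported READY; with this change, a list whose members are all in TRANSIENT_FAILURE is also promoted, so the policy always honors the latest control-plane update even when doing so makes the channel unhealthy.

```cpp
#include <vector>

enum class ConnectivityState { IDLE, CONNECTING, READY, TRANSIENT_FAILURE };

// Decide whether the pending (new) subchannel list should replace the
// current one.  Swap it in if any subchannel is READY (the old rule), or
// if every subchannel is in TRANSIENT_FAILURE (the new rule), so that a
// permanently unhealthy new list still replaces the stale old list.
bool ShouldSwapInPendingList(const std::vector<ConnectivityState>& pending) {
  bool any_ready = false;
  bool all_failed = !pending.empty();
  for (ConnectivityState s : pending) {
    if (s == ConnectivityState::READY) any_ready = true;
    if (s != ConnectivityState::TRANSIENT_FAILURE) all_failed = false;
  }
  return any_ready || all_failed;
}
```

Without the "all TRANSIENT_FAILURE" clause, a health-check-induced failure on every new subchannel would leave the policy pinned to the old list forever, which is exactly the bug the paragraph above describes.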

@markdroth markdroth added the release notes: no Indicates if PR should not be in release notes label Jun 7, 2021
@stale

stale bot commented Sep 6, 2021

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 days. Thank you for your contributions!

@markdroth markdroth requested a review from donnadionne October 6, 2021 19:00
@markdroth markdroth marked this pull request as ready for review October 6, 2021 19:00
@markdroth
Member Author

The test failures all look like infrastructure issues.

@markdroth markdroth merged commit cb8dafa into grpc:master Oct 7, 2021
@markdroth markdroth deleted the client_channel_health_check_service_name_in_channel_args branch October 7, 2021 17:15
@copybara-service copybara-service bot added the imported Specifies if the PR has been imported to the internal repository label Oct 8, 2021
copybara-service bot pushed a commit that referenced this pull request Nov 7, 2025
)

Currently, the client channel maintains two different data structures for tracking subchannels:
- `subchannel_refcount_map_`, which tracks the number of subchannel wrappers per subchannel.  This is used to determine when to create and remove channelz node linkage.
- `subchannel_wrappers_`, which tracks the set of all subchannel wrappers across all subchannels.  This was originally introduced back in #20039 as part of propagating the health check service name to subchannels, but we switched to using a different approach for the health check service name in #26441.  However, in between those two, we started using this data structure for handling keepalive time in #23313 -- which is actually somewhat inefficient, since we may wind up setting the keepalive time on the underlying subchannel more than once in the case where there is more than one subchannel wrapper per subchannel (which happens frequently during updates).

This PR combines these two data structures into one: a map from subchannel to a set of subchannel wrappers for that subchannel.  This is used both for channelz node updates and keepalive propagation -- and it's more efficient for the latter, because we can now update each subchannel exactly once.
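A rough sketch of the combined data structure, with illustrative types only (the real gRPC classes differ). One map replaces both `subchannel_refcount_map_` and `subchannel_wrappers_`: the per-subchannel wrapper count is just `map[sc].size()`, and iterating over keys touches each underlying subchannel exactly once when propagating keepalive time.

```cpp
#include <map>
#include <set>

// Illustrative stand-ins for the real gRPC classes.
struct Subchannel { int keepalive_time_ms = 0; };
struct SubchannelWrapper {};

// Map from subchannel to the set of wrappers referencing it.  The key set
// drives channelz linkage (link on first wrapper, unlink when the set
// empties), replacing the old refcount map.
using WrapperMap = std::map<Subchannel*, std::set<SubchannelWrapper*>>;

// Keepalive propagation: each underlying subchannel is updated exactly
// once, no matter how many wrappers currently reference it.
void SetKeepaliveTime(WrapperMap& wrappers, int ms) {
  for (auto& [subchannel, wrapper_set] : wrappers) {
    subchannel->keepalive_time_ms = ms;
  }
}
```

With the old flat set of wrappers, a subchannel with N wrappers (common during updates) would have its keepalive time set N times; here it is set once per subchannel.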

This also paves the way for a subsequent change that will be needed as part of the MAX_CONCURRENT_STREAMS design.

Closes #40880

COPYBARA_INTEGRATE_REVIEW=#40880 from markdroth:client_channel_combine_subchannel_maps f289c20
PiperOrigin-RevId: 829602334
yuanweiz pushed a commit to yuanweiz/grpc that referenced this pull request Nov 12, 2025
sreenithi pushed a commit to sreenithi/grpc that referenced this pull request Nov 26, 2025