Option to force LocalityDirect with Zone Aware Routing #38758
Description:
Currently, when using Zone Aware Routing, Envoy prioritizes routing to hosts in the local zone but still aims to achieve an overall equal distribution of traffic across upstream hosts. This means that if the source cluster doesn't have hosts in a particular zone that the destination cluster uses, Envoy may route a portion of traffic to that remote zone in order to maintain balance.
Example Scenario:
- Source Cluster: Instances spread across zones A and B.
- Destination Cluster: Instances spread across zones A, B, and C.
In this setup, the source Envoy cluster routes traffic as follows:
- Instances in zone A will route approximately 66% of their traffic to the local zone A and 33% to remote zone C.
- Instances in zone B will act similarly and route 66% of their traffic to the local zone B and 33% to remote zone C.
This achieves an overall equal distribution of traffic but doesn't allow for explicitly keeping all traffic in the local zone.
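For reference, the 66% / 33% split above can be reproduced with a small standalone calculation. This is only a sketch of the arithmetic as I understand it, not Envoy's code: `ZoneSplit` and `zoneAwareSplit` are hypothetical names, the inputs are plain per-zone host counts, and the real implementation handles more cases (for example, no upstream hosts in the local zone).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type (not an Envoy type): how one source zone splits its traffic.
struct ZoneSplit {
  double local_fraction;                // share of traffic kept in the local zone
  std::vector<double> remote_fraction;  // share sent to each other zone (0 for the local zone)
};

// source_hosts[i] / upstream_hosts[i] are host counts in zone i;
// `local` is the index of the zone the routing Envoy runs in.
inline ZoneSplit zoneAwareSplit(const std::vector<int>& source_hosts,
                                const std::vector<int>& upstream_hosts,
                                std::size_t local) {
  assert(source_hosts.size() == upstream_hosts.size());
  assert(local < source_hosts.size());
  assert(source_hosts[local] > 0);  // the routing Envoy itself lives in the local zone

  const std::size_t n = source_hosts.size();
  double source_total = 0.0;
  double upstream_total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    source_total += source_hosts[i];
    upstream_total += upstream_hosts[i];
  }
  assert(upstream_total > 0.0);

  // Share of each cluster's hosts that sits in each zone.
  std::vector<double> local_pct(n), upstream_pct(n);
  for (std::size_t i = 0; i < n; ++i) {
    local_pct[i] = source_hosts[i] / source_total;
    upstream_pct[i] = upstream_hosts[i] / upstream_total;
  }

  ZoneSplit split{0.0, std::vector<double>(n, 0.0)};

  // "Direct" case: the local zone holds at least its fair share of upstream hosts.
  if (upstream_pct[local] > 0.0 && upstream_pct[local] >= local_pct[local]) {
    split.local_fraction = 1.0;
    return split;
  }

  // "Residual" case: keep only what the local zone can absorb, and spread
  // the rest over zones that have spare upstream capacity.
  split.local_fraction = upstream_pct[local] / local_pct[local];
  std::vector<double> residual(n, 0.0);
  double residual_total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    if (i != local && upstream_pct[i] > local_pct[i]) {
      residual[i] = upstream_pct[i] - local_pct[i];
      residual_total += residual[i];
    }
  }
  if (residual_total > 0.0) {
    for (std::size_t i = 0; i < n; ++i) {
      split.remote_fraction[i] = (1.0 - split.local_fraction) * residual[i] / residual_total;
    }
  }
  return split;
}
```

Calling `zoneAwareSplit({1, 1, 0}, {1, 1, 1}, 0)` models the scenario above with one host per zone: zone A makes up 50% of the source cluster but only a third of the destination cluster, so about two thirds of its traffic stays local and the remainder goes to zone C, the only zone with spare capacity.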
Desired Behavior:
I propose adding a configuration option that enables Envoy to send 100% of the traffic to the local zone whenever healthy hosts are available, regardless of the overall distribution of zones between the source and destination clusters.
This change would allow users to prioritize low-latency, intra-zone communication whenever possible, and would give them more explicit control than the current behavior, which always prefers an equal distribution.
If accepted, it might also be beneficial to include an option that sets a minimum number of hosts in the upstream zone (similar to min_cluster_size) for Zone Aware Routing to take effect. This would ensure that the routing logic only applies when there is sufficient capacity in the local zone. I opened a separate issue #38561 covering that and could close that in favor of this if preferred.
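To make the proposal concrete, here is a rough sketch of how the decision could look with such an option. Everything in it is an assumption for illustration: `ForceLocalOptions`, `force_local_zone`, `min_local_hosts`, and `chooseState` do not exist in Envoy, and the enum is a standalone stand-in that only borrows the state names used in this issue.

```cpp
#include <cstdint>

// Stand-in for the routing states discussed in this issue (not Envoy's header).
enum class LocalityRoutingState { NoLocalityRouting, LocalityDirect, LocalityResidual };

// Hypothetical knobs for the proposed behavior; names are illustrative only.
struct ForceLocalOptions {
  bool force_local_zone = false;     // send 100% of traffic to the local zone when possible
  std::uint32_t min_local_hosts = 1; // minimum healthy upstream hosts in the local zone
};

// local_percentage: share of source-cluster hosts in the local zone.
// upstream_percentage: share of upstream hosts in the local zone.
// healthy_local_upstream_hosts: healthy upstream hosts in the local zone.
inline LocalityRoutingState chooseState(double local_percentage,
                                        double upstream_percentage,
                                        std::uint32_t healthy_local_upstream_hosts,
                                        const ForceLocalOptions& opts) {
  // Nothing to route to locally, so zone-aware routing cannot apply.
  if (healthy_local_upstream_hosts == 0 || upstream_percentage <= 0.0) {
    return LocalityRoutingState::NoLocalityRouting;
  }
  // Proposed: skip the balance check when forced and the local zone is large enough.
  if (opts.force_local_zone && healthy_local_upstream_hosts >= opts.min_local_hosts) {
    return LocalityRoutingState::LocalityDirect;
  }
  // Existing rule: route directly only if the local zone has its fair share of upstream hosts.
  if (upstream_percentage >= local_percentage) {
    return LocalityRoutingState::LocalityDirect;
  }
  return LocalityRoutingState::LocalityResidual;
}
```

With the option off, the example scenario (50% local, 33% upstream) still yields `LocalityResidual`. With it on, the same inputs yield `LocalityDirect` as long as the local zone has at least `min_local_hosts` healthy hosts, and fall back to the existing rule otherwise.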
Relevant Links:
This is the code where LocalityDirect vs LocalityResidual is determined:
envoy/source/extensions/load_balancing_policies/common/load_balancer_impl.cc
Lines 498 to 505 in d70c259
    // If we have lower percent of hosts in the local cluster in the same locality,
    // we can push all of the requests directly to upstream cluster in the same locality.
    if (upstreamHostsPerLocality.hasLocalLocality() &&
        locality_percentages[0].upstream_percentage > 0 &&
        locality_percentages[0].upstream_percentage >= locality_percentages[0].local_percentage) {
      state.locality_routing_state_ = LocalityRoutingState::LocalityDirect;
      return;
    }