Changes in host weight do not transition load balancer between weighted and unweighted mode #14360

@snowp

Some of the load balancers (random, least request (LR), etc.) operate in one of two modes, depending on whether all the hosts within a host source (all hosts, individual localities, etc.) have equal weights (unweighted mode) or not (weighted mode). The check that decides which mode to use happens in

void EdfLoadBalancerBase::refresh(uint32_t priority) {
  const auto add_hosts_source = [this](HostsSource source, const HostVector& hosts) {
    // Nuke existing scheduler if it exists.
    auto& scheduler = scheduler_[source] = Scheduler{};
    refreshHostSource(source);

    // Check if the original host weights are equal and skip EDF creation if they are. When all
    // original weights are equal we can rely on unweighted host pick to do optimal round robin and
    // least-loaded host selection with lower memory and CPU overhead.
    if (hostWeightsAreEqual(hosts)) {
      // Skip edf creation.
      return;
    }

    scheduler.edf_ = std::make_unique<EdfScheduler<const Host>>();

which is triggered by the updateHosts calls propagated from the main thread on a config update.

The problem arises with a change that touches only host weights: as an optimization, Envoy does not trigger updateHosts calls for certain changes to the host set. The rationale is that the LB structures will eventually pick up the new host weight, since they periodically read it (it is stored in an atomic). While this holds for most host weight updates, some updates require the load balancer to create or delete the EDF scheduler, which only happens on a full update.

As a result, a weight change that should move a host source between weighted and unweighted mode is generally not visible to the load balancer until some other update triggers a full update.
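To make the failure mode concrete, here is a minimal, self-contained sketch. It is not Envoy code: `ToyHost` and `ToyBalancer` are hypothetical stand-ins that only mimic the pattern described above, where the weighted/unweighted decision is made during a refresh while the weights themselves live in atomics that can change in between.

```cpp
#include <atomic>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical host: the weight is an atomic that can be updated in place.
struct ToyHost {
  std::atomic<uint32_t> weight{1};
};

// Hypothetical balancer that decides its mode only when refresh() runs.
class ToyBalancer {
public:
  explicit ToyBalancer(std::vector<std::shared_ptr<ToyHost>> hosts)
      : hosts_(std::move(hosts)) {
    refresh();
  }

  // Stand-in for a full update: recompute whether any weight differs.
  void refresh() {
    weighted_ = false;
    for (const auto& host : hosts_) {
      if (host->weight.load() != hosts_.front()->weight.load()) {
        weighted_ = true;
      }
    }
  }

  // The mode chosen at the last refresh, not the current state of the weights.
  bool weighted() const { return weighted_; }

private:
  std::vector<std::shared_ptr<ToyHost>> hosts_;
  bool weighted_{false};
};
```

In this toy model, storing a new weight into one host leaves `weighted()` unchanged until `refresh()` is called again, which corresponds to the stale mode described in this issue.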

Solving this well is a bit tricky: at first it seems that simply checking whether each priority transitioned between weighted and unweighted would be sufficient, but the presence of localities and health status makes it more complicated. To handle this kind of update correctly, we'd want to trigger a full refresh if any of the possible HostSets (all, healthy, degraded, healthy localities, degraded localities) transitions between weighted and unweighted.
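One possible shape for that check, as a rough sketch only: compare the equal-weights state of every host source before and after the weight change, and request a full refresh if any of them flips. The names here (`Weights`, `SourceWeights`, `weightsAreEqual`, `needsFullRefresh`) and the string keys are hypothetical and chosen for illustration; they are not Envoy APIs, and a real implementation would work on the actual host sets rather than copies of the weights.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical: the weights of the hosts in one host source.
using Weights = std::vector<uint32_t>;
// Hypothetical: weights per host source ("all", "healthy", a locality, ...).
using SourceWeights = std::map<std::string, Weights>;

// Same idea as the hostWeightsAreEqual check in the excerpt above.
bool weightsAreEqual(const Weights& weights) {
  for (std::size_t i = 1; i < weights.size(); ++i) {
    if (weights[i] != weights[0]) {
      return false;
    }
  }
  return true;
}

// Returns true if any host source changed between weighted and unweighted.
bool needsFullRefresh(const SourceWeights& before, const SourceWeights& after) {
  for (const auto& entry : before) {
    const auto it = after.find(entry.first);
    if (it == after.end()) {
      return true; // Source disappeared: treat as needing a full refresh.
    }
    if (weightsAreEqual(entry.second) != weightsAreEqual(it->second)) {
      return true; // This source flipped mode.
    }
  }
  // A source that only exists in `after` also needs a full refresh.
  return before.size() != after.size();
}
```

For example, suppose three hosts have weights 1, 1, 2 and only the first two are healthy. Changing the second host's weight to 3 leaves the "all" source weighted both before and after, yet the "healthy" source goes from {1, 1} to {1, 3}, so a check limited to the priority as a whole would miss the transition.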
