Removing a locality from localityLbSetting.Distribute breaks the mesh #22837

@maruina

Bug description

Consider the following setup:

  • Multiple Service Entries with localities of the form prefix/az/cluster, for example:
      - foo/eu-west-1a/cluster1
      - foo/eu-west-1b/cluster2
  • A globally configured locality-weighted load balancing setting, as follows:
localityLbSetting:
  distribute:
  - from: '*'
    to: 
      foo/eu-west-1a/cluster1: 98
      foo/eu-west-1b/cluster2: 1
      eu-west-1/*: 1

On cluster1, the mesh works as expected and the routes get updated.

As soon as we remove one locality from the Istio ConfigMap, newly started Envoy proxies no longer become ready.

For example, with this configuration

localityLbSetting:
  distribute:
  - from: '*'
    to: 
      foo/eu-west-1b/cluster2: 99
      eu-west-1/*: 1

we get the following error:

Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 0 rejected

As soon as we add the locality back, the proxies become ready again. We can't see any error in the Pilot logs.
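The readiness message above can be read mechanically: it reports one accepted CDS update and no LDS update at all, which suggests the proxy never received listeners rather than rejecting them. The following is a minimal Python sketch that extracts those counters. The helper name `parse_xds_counts` and the regular expression are illustrative assumptions, not part of Istio or Envoy.

```python
import re

# Readiness message copied from the report above.
MESSAGE = (
    "Envoy proxy is NOT ready: config not received from Pilot "
    "(is Pilot running?): cds updates: 1 successful, 0 rejected; "
    "lds updates: 0 successful, 0 rejected"
)

# Assumed message shape: "<xds> updates: N successful, M rejected".
PATTERN = re.compile(r"(\w+) updates: (\d+) successful, (\d+) rejected")


def parse_xds_counts(message):
    """Return {xds_type: (successful, rejected)} parsed from the message."""
    return {
        name: (int(ok), int(rejected))
        for name, ok, rejected in PATTERN.findall(message)
    }


counts = parse_xds_counts(MESSAGE)
# A type with neither an accepted nor a rejected update was never delivered.
never_received = [name for name, (ok, rej) in counts.items() if ok + rej == 0]
print(counts)
print("never received:", never_received)
```

On the message from this report, only `lds` ends up in `never_received`, which matches the absence of rejection errors in the Pilot logs.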

Another interesting discovery is that the locality only needs to appear somewhere in the ConfigMap. For example, if we write something like

localityLbSetting:
  distribute:
  - from: '*'
    to: 
      foo/eu-west-1b/cluster2: 99
      eu-west-1/*: 1
  - from: 'bar'
    to: 
      foo/eu-west-1a/cluster1: 98
      foo/eu-west-1b/cluster2: 1
      eu-west-1/*: 1

all the Envoy proxies start, even though bar is a nonexistent locality.

Is this expected behavior? We couldn't find anything in the documentation mentioning this.
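One rule the Istio documentation does state for `distribute` is that the weights in each `to` map should sum to 100. The sketch below, written in Python purely for illustration, checks the three configurations from this report against that rule and lists which localities each one references. The names `CONFIGS`, `weights_ok`, and `localities` are hypothetical helpers, and the check says nothing about how Pilot actually validates or consumes the setting.

```python
# Each dict mirrors one `to` map from the configurations in this report.
FULL = {"foo/eu-west-1a/cluster1": 98, "foo/eu-west-1b/cluster2": 1, "eu-west-1/*": 1}
REDUCED = {"foo/eu-west-1b/cluster2": 99, "eu-west-1/*": 1}

CONFIGS = {
    "working": [FULL],              # original setup
    "failing": [REDUCED],           # one locality removed
    "workaround": [REDUCED, FULL],  # second rule uses from: 'bar'
}


def weights_ok(to_maps, expected_total=100):
    """True if every `to` map sums to the documented total of 100."""
    return all(sum(m.values()) == expected_total for m in to_maps)


def localities(to_maps):
    """Set of all locality keys referenced across the given `to` maps."""
    return set().union(*(m.keys() for m in to_maps))


for name, maps in CONFIGS.items():
    print(name, weights_ok(maps), sorted(localities(maps)))

# Localities present in the working config but absent from the failing one.
missing = localities(CONFIGS["working"]) - localities(CONFIGS["failing"])
print("missing from failing config:", sorted(missing))
```

All three configurations satisfy the sum rule, so the failure does not look like a weight validation problem. The only difference between the failing configuration and the other two is that foo/eu-west-1a/cluster1 no longer appears anywhere in the setting.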

Expected behavior

I would expect the mesh to keep working even if we remove a locality from the distribute settings.

Steps to reproduce the bug

See the description.

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

❯ istioctl version --remote
client version: 1.4.6
crosscellgateway version:
crossregiongateway version:
defaultgateway version:
citadel version: 1.4.3
citadel version: 1.4.3
galley version: 1.4.3
galley version: 1.4.3
pilot version: 1.4.3
pilot version: 1.4.3
policy version: 1.4.3
policy version: 1.4.3
sidecar-injector version: 1.4.3
sidecar-injector version: 1.4.3
telemetry version: 1.4.3
telemetry version: 1.4.3
thanosgateway version:
thanosgateway version:
data plane version: 1.4.3 (28 proxies)
❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-26T06:16:15Z", GoVersion:"go1.14", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.9", GitCommit:"500f5aba80d71253cc01ac6a8622b8377f4a7ef9", GitTreeState:"clean", BuildDate:"2019-11-13T11:13:04Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

How was Istio installed?

Helm chart deployed with Spinnaker.

Environment where bug was observed (cloud vendor, OS, etc)

AWS, AMI Amazon Linux 2
