Bug description
Consider the following setup:
- Multiple Service Entries with a locality of the form prefix/az/cluster, for example:
  - foo/eu-west-1a/cluster1
  - foo/eu-west-1b/cluster2
- Globally configured locality-weighted load balancing, as follows:

```yaml
localityLbSetting:
  distribute:
  - from: '*'
    to:
      foo/eu-west-1a/cluster1: 98
      foo/eu-west-1b/cluster2: 1
      eu-west-1/*: 1
```
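For reference, this is how we apply the setting. The sketch below shows the configmap layout under the assumption of a default install (configmap named `istio` in `istio-system`; the name may differ depending on Helm values). `localityLbSetting` sits at the top level of the mesh configuration stored under the `mesh` key:

```yaml
# Hedged sketch of where localityLbSetting lives (assumes the default
# `istio` configmap in `istio-system`; only relevant fields shown).
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    localityLbSetting:
      distribute:
      - from: '*'
        to:
          foo/eu-west-1a/cluster1: 98
          foo/eu-west-1b/cluster2: 1
          eu-west-1/*: 1
```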
On cluster1, the mesh works as expected and the routes are updated. However, as soon as we remove one locality from the Istio configmap, new Envoy proxies no longer become ready.
For example, with this configuration

```yaml
localityLbSetting:
  distribute:
  - from: '*'
    to:
      foo/eu-west-1b/cluster2: 99
      eu-west-1/*: 1
```

we get the following error:

```
Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 0 rejected
```
As soon as we add the locality back, the proxies become ready again. We can't see any error in the Pilot logs.
Another interesting discovery is that the locality only needs to be present somewhere in the configmap. If we write something like

```yaml
localityLbSetting:
  distribute:
  - from: '*'
    to:
      foo/eu-west-1b/cluster2: 99
      eu-west-1/*: 1
  - from: 'bar'
    to:
      foo/eu-west-1a/cluster1: 98
      foo/eu-west-1b/cluster2: 1
      eu-west-1/*: 1
```
all the Envoy proxies start, even though `bar` is a non-existent locality. Is this expected behavior? We couldn't find anything about it in the documentation.
Expected behavior
I would expect the mesh to keep working even if we remove a locality from the distribute settings.
Steps to reproduce the bug
See the description: configure `localityLbSetting.distribute` with an entry for every locality, remove one locality from the Istio configmap, then start a new pod and observe that its Envoy proxy never becomes ready.
Version (include the output of `istioctl version --remote` and `kubectl version`, and `helm version` if you used Helm)
```
❯ istioctl version --remote
client version: 1.4.6
crosscellgateway version:
crossregiongateway version:
defaultgateway version:
citadel version: 1.4.3
citadel version: 1.4.3
galley version: 1.4.3
galley version: 1.4.3
pilot version: 1.4.3
pilot version: 1.4.3
policy version: 1.4.3
policy version: 1.4.3
sidecar-injector version: 1.4.3
sidecar-injector version: 1.4.3
telemetry version: 1.4.3
telemetry version: 1.4.3
thanosgateway version:
thanosgateway version:
data plane version: 1.4.3 (28 proxies)

❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-26T06:16:15Z", GoVersion:"go1.14", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.9", GitCommit:"500f5aba80d71253cc01ac6a8622b8377f4a7ef9", GitTreeState:"clean", BuildDate:"2019-11-13T11:13:04Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
```
How was Istio installed?
Helm chart deployed with Spinnaker.
Environment where bug was observed (cloud vendor, OS, etc)
AWS, AMI Amazon Linux 2