Bug description
After installing bookinfo (or other applications), the following validation error appears in Pilot, and affected proxies remain in STALE state:
2020-01-10T23:45:25.493906Z warn ads ADS:EDS: ACK ERROR 127.0.0.1:41138 sidecar~10.56.0.90~reviews-v2-54594d974-l2gcj.bookinfo~bookinfo.svc.cluster.local-14 Internal:Proto constraint validation failed (ClusterLoadAssignmentValidationError.Endpoints[i]: ["embedded message failed validation"] | caused by LocalityLbEndpointsValidationError.LbEndpoints[i]: ["embedded message failed validation"] | caused by LbEndpointValidationError.LoadBalancingWeight: ["value must be greater than or equal to " '\x01']): cluster_name: "outbound|9080||reviews.bookinfo.svc.cluster.local"
endpoints {
locality {
region: "europe-west1"
zone: "europe-west1-b"
}
lb_endpoints {
endpoint {
address {
socket_address {
address: "10.56.0.90"
port_value: 9080
}
}
}
metadata {
filter_metadata {
key: "envoy.transport_socket_match"
value {
fields {
key: "tlsMode"
value {
string_value: "istio"
}
}
}
}
filter_metadata {
key: "istio"
value {
fields {
key: "network"
value {
string_value: "Kubernetes"
}
}
}
}
}
load_balancing_weight {
value: 1
}
}
lb_endpoints {
endpoint {
address {
socket_address {
address: "10.56.1.23"
port_value: 9080
}
}
}
metadata {
filter_metadata {
key: "envoy.transport_socket_match"
value {
fields {
key: "tlsMode"
value {
string_value: "istio"
}
}
}
}
filter_metadata {
key: "istio"
value {
fields {
key: "network"
value {
string_value: "Kubernetes"
}
}
}
}
}
load_balancing_weight {
value: 1
}
}
lb_endpoints {
endpoint {
address {
socket_address {
address: "10.56.1.24"
port_value: 9080
}
}
}
metadata {
filter_metadata {
key: "envoy.transport_socket_match"
value {
fields {
key: "tlsMode"
value {
string_value: "istio"
}
}
}
}
filter_metadata {
key: "istio"
value {
fields {
key: "network"
value {
string_value: "Kubernetes"
}
}
}
}
}
load_balancing_weight {
}
1: 1
}
load_balancing_weight {
value: 3
}
}
Note the following near the end of the error: the last endpoint has an empty load_balancing_weight followed by a stray 1: 1, which seems to be the cause of the issue.
load_balancing_weight {
}
1: 1
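To illustrate why that fragment would trip the constraint quoted in the log ("value must be greater than or equal to 1"), here is a minimal sketch. It is not Envoy's or Pilot's validation code: `check_weights` and the dict layout are hypothetical, and it assumes that an empty load_balancing_weight {} is read as a value of 0 while an absent wrapper is not checked at all.

```python
# Hypothetical sketch of the ">= 1" weight constraint from the log message.
# This is NOT Envoy's or Pilot's validation code.

def check_weights(lb_endpoints):
    """Return the addresses of endpoints whose weight would fail a '>= 1' check."""
    failing = []
    for ep in lb_endpoints:
        wrapper = ep.get("load_balancing_weight")
        if wrapper is None:
            # Assumption: a wrapper that is not set at all is not validated.
            continue
        # Assumption: an empty wrapper ({}) reads as value 0.
        if wrapper.get("value", 0) < 1:
            failing.append(ep["address"])
    return failing


# The three endpoints from the error above, reduced to address and weight.
endpoints = [
    {"address": "10.56.0.90", "load_balancing_weight": {"value": 1}},
    {"address": "10.56.1.23", "load_balancing_weight": {"value": 1}},
    {"address": "10.56.1.24", "load_balancing_weight": {}},  # empty, as in the log
]

print(check_weights(endpoints))
```

Under these assumptions only 10.56.1.24 is flagged, which matches the endpoint carrying the empty weight in the dump.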
$ istioctl ps
NAME CDS LDS EDS RDS PILOT VERSION
details-v1-69fd998859-d8sjx.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
productpage-v1-7c8647989-cw4jn.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
ratings-v1-7c688df87-wq5km.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
reviews-v1-7689bf68c4-cdn2m.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
reviews-v2-54594d974-l2gcj.bookinfo SYNCED SYNCED STALE SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
reviews-v3-58b7db6d78-wwdpf.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
tcc-gateway-775dcbfbd6-vklh9.bookinfo SYNCED SYNCED SYNCED STALE istio-pilot-6f59d5dff7-szbqs 1.5.0
vmgateway-6597b46d74-jw5td.istio-system SYNCED SYNCED SYNCED NOT SENT istio-pilot-6f59d5dff7-szbqs 1.5.0
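For anyone trying to catch this while it is happening, a small script along these lines could pick the STALE entries out of the output above. It is only a sketch: `stale_proxies` is a hypothetical helper, and it assumes the column order shown in this output (NAME, CDS, LDS, EDS, RDS) and that "NOT SENT" is the only status containing a space.

```python
# Hypothetical helper: list proxies with a STALE column in `istioctl ps` output.
# Assumes the column layout shown in this issue (NAME CDS LDS EDS RDS PILOT VERSION).

def stale_proxies(ps_output):
    """Map proxy name -> list of xDS columns reported as STALE."""
    columns = ("CDS", "LDS", "EDS", "RDS")
    result = {}
    # Skip the header line, then split each row on whitespace.
    for line in ps_output.strip().splitlines()[1:]:
        # "NOT SENT" contains a space, so join it before splitting.
        fields = line.replace("NOT SENT", "NOT_SENT").split()
        name, statuses = fields[0], fields[1:5]
        stale = [col for col, status in zip(columns, statuses) if status == "STALE"]
        if stale:
            result[name] = stale
    return result


sample = """
NAME CDS LDS EDS RDS PILOT VERSION
details-v1-69fd998859-d8sjx.bookinfo SYNCED SYNCED SYNCED SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
reviews-v2-54594d974-l2gcj.bookinfo SYNCED SYNCED STALE SYNCED istio-pilot-6f59d5dff7-szbqs 1.5.0
tcc-gateway-775dcbfbd6-vklh9.bookinfo SYNCED SYNCED SYNCED STALE istio-pilot-6f59d5dff7-szbqs 1.5.0
vmgateway-6597b46d74-jw5td.istio-system SYNCED SYNCED SYNCED NOT SENT istio-pilot-6f59d5dff7-szbqs 1.5.0
"""

for name, cols in stale_proxies(sample).items():
    print(name, cols)
```

With the rows above, this reports reviews-v2 as stale on EDS and tcc-gateway as stale on RDS.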
Expected behavior
There is no validation issue and the proxies receive the configuration normally.
Steps to reproduce the bug
I haven't been able to find a consistent way of reproducing this, and after some time the issue disappears.
Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)
$ istioctl version --remote
client version: unknown
control plane version: 8d65b5b51ac625123ce43d912dc6e3484889fdb7
data plane version: 1.5.0 (8 proxies)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-14T04:24:29Z", GoVersion:"go1.12.13", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.11-gke.14", GitCommit:"56d89863d1033f9668ddd6e1c1aea81cd846ef88", GitTreeState:"clean", BuildDate:"2019-11-07T19:12:22Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}
How was Istio installed?
Istio was installed using istioctl manifest apply.
Bookinfo and the bookinfo gateway were installed following the steps described on the website.
Environment where bug was observed (cloud vendor, OS, etc)
This appeared in a GKE cluster:
$ gcloud container clusters describe dev
addonsConfig:
kubernetesDashboard:
disabled: true
networkPolicyConfig:
disabled: true
clusterIpv4Cidr: <REDACTED>
createTime: '2019-11-18T15:48:15+00:00'
currentMasterVersion: 1.13.11-gke.14
currentNodeCount: 2
currentNodeVersion: 1.13.11-gke.14
databaseEncryption:
state: DECRYPTED
defaultMaxPodsConstraint:
maxPodsPerNode: '110'
endpoint: <REDACTED>
initialClusterVersion: 1.13.11-gke.14
instanceGroupUrls:
- <REDACTED>
labelFingerprint: a9dc16a7
legacyAbac: {}
location: europe-west1-b
locations:
- europe-west1-b
loggingService: logging.googleapis.com
maintenancePolicy:
resourceVersion: e3b0c442
masterAuth:
clusterCaCertificate: <REDACTED>
monitoringService: monitoring.googleapis.com
name: dev
network: default
networkConfig:
network: <REDACTED>
subnetwork: <REDACTED>
nodeConfig:
diskSizeGb: 100
diskType: pd-standard
imageType: COS
machineType: n1-standard-4
metadata:
disable-legacy-endpoints: 'true'
oauthScopes:
- https://www.googleapis.com/auth/devstorage.read_only
- https://www.googleapis.com/auth/logging.write
- https://www.googleapis.com/auth/monitoring
- https://www.googleapis.com/auth/service.management.readonly
- https://www.googleapis.com/auth/servicecontrol
- https://www.googleapis.com/auth/trace.append
serviceAccount: default
shieldedInstanceConfig:
enableIntegrityMonitoring: true
nodeIpv4CidrSize: 24
nodePools:
- autoscaling:
enabled: true
maxNodeCount: 5
minNodeCount: 1
config:
diskSizeGb: 100
diskType: pd-standard
imageType: COS
machineType: n1-standard-4
metadata:
disable-legacy-endpoints: 'true'
oauthScopes:
- https://www.googleapis.com/auth/devstorage.read_only
- https://www.googleapis.com/auth/logging.write
- https://www.googleapis.com/auth/monitoring
- https://www.googleapis.com/auth/service.management.readonly
- https://www.googleapis.com/auth/servicecontrol
- https://www.googleapis.com/auth/trace.append
serviceAccount: default
shieldedInstanceConfig:
enableIntegrityMonitoring: true
initialNodeCount: 2
instanceGroupUrls:
- <REDACTED>
management:
autoRepair: true
autoUpgrade: true
name: default-pool
podIpv4CidrSize: 24
selfLink: <REDACTED>
status: RUNNING
version: 1.13.11-gke.14
selfLink: <REDACTED>
servicesIpv4Cidr: <REDACTED>
status: RUNNING
subnetwork: default
zone: europe-west1-b