LbEndpointValidationError in Pilot #20087

@nacx

Bug description

After installing bookinfo (or other applications), the following validation error appears in Pilot, and affected proxies remain in STALE state:

2020-01-10T23:45:25.493906Z	warn	ads	ADS:EDS: ACK ERROR 127.0.0.1:41138 sidecar~10.56.0.90~reviews-v2-54594d974-l2gcj.bookinfo~bookinfo.svc.cluster.local-14 Internal:Proto constraint validation failed (ClusterLoadAssignmentValidationError.Endpoints[i]: ["embedded message failed validation"] | caused by LocalityLbEndpointsValidationError.LbEndpoints[i]: ["embedded message failed validation"] | caused by LbEndpointValidationError.LoadBalancingWeight: ["value must be greater than or equal to " '\x01']): cluster_name: "outbound|9080||reviews.bookinfo.svc.cluster.local"
endpoints {
  locality {
    region: "europe-west1"
    zone: "europe-west1-b"
  }
  lb_endpoints {
    endpoint {
      address {
        socket_address {
          address: "10.56.0.90"
          port_value: 9080
        }
      }
    }
    metadata {
      filter_metadata {
        key: "envoy.transport_socket_match"
        value {
          fields {
            key: "tlsMode"
            value {
              string_value: "istio"
            }
          }
        }
      }
      filter_metadata {
        key: "istio"
        value {
          fields {
            key: "network"
            value {
              string_value: "Kubernetes"
            }
          }
        }
      }
    }
    load_balancing_weight {
      value: 1
    }
  }
  lb_endpoints {
    endpoint {
      address {
        socket_address {
          address: "10.56.1.23"
          port_value: 9080
        }
      }
    }
    metadata {
      filter_metadata {
        key: "envoy.transport_socket_match"
        value {
          fields {
            key: "tlsMode"
            value {
              string_value: "istio"
            }
          }
        }
      }
      filter_metadata {
        key: "istio"
        value {
          fields {
            key: "network"
            value {
              string_value: "Kubernetes"
            }
          }
        }
      }
    }
    load_balancing_weight {
      value: 1
    }
  }
  lb_endpoints {
    endpoint {
      address {
        socket_address {
          address: "10.56.1.24"
          port_value: 9080
        }
      }
    }
    metadata {
      filter_metadata {
        key: "envoy.transport_socket_match"
        value {
          fields {
            key: "tlsMode"
            value {
              string_value: "istio"
            }
          }
        }
      }
      filter_metadata {
        key: "istio"
        value {
          fields {
            key: "network"
            value {
              string_value: "Kubernetes"
            }
          }
        }
      }
    }
    load_balancing_weight {
    }
    1: 1
  }
  load_balancing_weight {
    value: 3
  }
}

Note the following at the end of the error output. The stray 1: 1 entry after the empty load_balancing_weight block seems to be the cause of the issue.

    load_balancing_weight {
    }
    1: 1
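
For context on the snippet above: in protobuf text format, a field the printer does not recognize is rendered by its field number rather than its name, which is what 1: 1 looks like. An empty load_balancing_weight { } block carries the default value 0, which is consistent with the "value must be greater than or equal to 1" failure in the log. The following is a minimal sketch, not Istio or Envoy code: find_bad_weights is a hypothetical helper that scans a text-format dump and reports endpoints whose weight block is present but below 1. It assumes the exact two-space indentation and layout shown in this log, and the sample is a shortened copy of it.

```python
import re

# Shortened copy of the dump from the log above (metadata blocks omitted).
SAMPLE = """\
endpoints {
  lb_endpoints {
    endpoint {
      address {
        socket_address {
          address: "10.56.0.90"
          port_value: 9080
        }
      }
    }
    load_balancing_weight {
      value: 1
    }
  }
  lb_endpoints {
    endpoint {
      address {
        socket_address {
          address: "10.56.1.24"
          port_value: 9080
        }
      }
    }
    load_balancing_weight {
    }
    1: 1
  }
  load_balancing_weight {
    value: 3
  }
}
"""


def find_bad_weights(dump):
    """Return addresses of lb_endpoints whose weight block has a value below 1."""
    bad = []
    address = None        # address of the lb_endpoints block being read
    weight = 0            # value inside its load_balancing_weight block
    weight_seen = False   # an unset weight is allowed, so track presence
    in_endpoint = False
    in_weight = False
    for line in dump.splitlines():
        if line == "  lb_endpoints {":
            in_endpoint, address, weight, weight_seen = True, None, 0, False
        elif in_endpoint and line == "    load_balancing_weight {":
            # An empty wrapper message means the default value, 0.
            in_weight, weight_seen, weight = True, True, 0
        elif in_weight and line == "    }":
            in_weight = False
        elif in_weight:
            m = re.fullmatch(r"\s*value: (\d+)", line)
            if m:
                weight = int(m.group(1))
        elif in_endpoint and line == "  }":
            # End of the lb_endpoints block: apply the ">= 1" rule.
            if weight_seen and weight < 1:
                bad.append(address)
            in_endpoint = False
        elif in_endpoint:
            m = re.fullmatch(r'\s*address: "([^"]+)"', line)
            if m:
                address = m.group(1)
    return bad


print(find_bad_weights(SAMPLE))
```

Run against the full dump from the log, a helper like this would single out 10.56.1.24, the only endpoint whose load_balancing_weight block is empty.
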
$ istioctl ps
NAME                                        CDS        LDS        EDS        RDS          PILOT                            VERSION
details-v1-69fd998859-d8sjx.bookinfo        SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
productpage-v1-7c8647989-cw4jn.bookinfo     SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
ratings-v1-7c688df87-wq5km.bookinfo         SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
reviews-v1-7689bf68c4-cdn2m.bookinfo        SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
reviews-v2-54594d974-l2gcj.bookinfo         SYNCED     SYNCED     STALE      SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
reviews-v3-58b7db6d78-wwdpf.bookinfo        SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
tcc-gateway-775dcbfbd6-vklh9.bookinfo       SYNCED     SYNCED     SYNCED     STALE        istio-pilot-6f59d5dff7-szbqs     1.5.0
vmgateway-6597b46d74-jw5td.istio-system     SYNCED     SYNCED     SYNCED     NOT SENT     istio-pilot-6f59d5dff7-szbqs     1.5.0
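
Since the issue is intermittent, it can help to pick the affected proxies out of the table automatically. The following is a rough sketch under the assumption that the output keeps the column layout shown above, with columns separated by two or more spaces; stale_proxies is a hypothetical helper, not part of istioctl, and the sample is a subset of the rows above.

```python
import re

# Subset of the `istioctl ps` output shown above.
PS_OUTPUT = """\
NAME                                        CDS        LDS        EDS        RDS          PILOT                            VERSION
reviews-v1-7689bf68c4-cdn2m.bookinfo        SYNCED     SYNCED     SYNCED     SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
reviews-v2-54594d974-l2gcj.bookinfo         SYNCED     SYNCED     STALE      SYNCED       istio-pilot-6f59d5dff7-szbqs     1.5.0
tcc-gateway-775dcbfbd6-vklh9.bookinfo       SYNCED     SYNCED     SYNCED     STALE        istio-pilot-6f59d5dff7-szbqs     1.5.0
vmgateway-6597b46d74-jw5td.istio-system     SYNCED     SYNCED     SYNCED     NOT SENT     istio-pilot-6f59d5dff7-szbqs     1.5.0
"""


def stale_proxies(ps_output):
    """Map each proxy name to the xDS columns reported as STALE."""
    lines = [line for line in ps_output.splitlines() if line.strip()]
    # Split on runs of 2+ spaces so that "NOT SENT" stays a single field.
    header = re.split(r"\s{2,}", lines[0].strip())
    result = {}
    for line in lines[1:]:
        row = dict(zip(header, re.split(r"\s{2,}", line.strip())))
        stale = [col for col in ("CDS", "LDS", "EDS", "RDS") if row.get(col) == "STALE"]
        if stale:
            result[row["NAME"]] = stale
    return result


for name, columns in stale_proxies(PS_OUTPUT).items():
    print(name, columns)
```

For the table above this reports reviews-v2 as stale on EDS, matching the ACK ERROR in the Pilot log, and tcc-gateway as stale on RDS.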

Expected behavior

There is no validation issue and the proxies receive the configuration normally.

Steps to reproduce the bug

I haven't been able to find a consistent way of reproducing this, and after some time the issue disappears.

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

$ istioctl version --remote
client version: unknown
control plane version: 8d65b5b51ac625123ce43d912dc6e3484889fdb7
data plane version: 1.5.0 (8 proxies)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-14T04:24:29Z", GoVersion:"go1.12.13", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.11-gke.14", GitCommit:"56d89863d1033f9668ddd6e1c1aea81cd846ef88", GitTreeState:"clean", BuildDate:"2019-11-07T19:12:22Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}

How was Istio installed?

Istio was installed using istioctl manifest apply.
Bookinfo and the bookinfo gateway were installed following the steps described on the website.

Environment where bug was observed (cloud vendor, OS, etc)

This appeared in a GKE cluster:

$ gcloud container clusters describe dev
addonsConfig:
  kubernetesDashboard:
    disabled: true
  networkPolicyConfig:
    disabled: true
clusterIpv4Cidr: <REDACTED>
createTime: '2019-11-18T15:48:15+00:00'
currentMasterVersion: 1.13.11-gke.14
currentNodeCount: 2
currentNodeVersion: 1.13.11-gke.14
databaseEncryption:
  state: DECRYPTED
defaultMaxPodsConstraint:
  maxPodsPerNode: '110'
endpoint: <REDACTED>
initialClusterVersion: 1.13.11-gke.14
instanceGroupUrls:
- <REDACTED>
labelFingerprint: a9dc16a7
legacyAbac: {}
location: europe-west1-b
locations:
- europe-west1-b
loggingService: logging.googleapis.com
maintenancePolicy:
  resourceVersion: e3b0c442
masterAuth:
  clusterCaCertificate: <REDACTED>
monitoringService: monitoring.googleapis.com
name: dev
network: default
networkConfig:
  network: <REDACTED>
  subnetwork: <REDACTED>
nodeConfig:
  diskSizeGb: 100
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-4
  metadata:
    disable-legacy-endpoints: 'true'
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
  shieldedInstanceConfig:
    enableIntegrityMonitoring: true
nodeIpv4CidrSize: 24
nodePools:
- autoscaling:
    enabled: true
    maxNodeCount: 5
    minNodeCount: 1
  config:
    diskSizeGb: 100
    diskType: pd-standard
    imageType: COS
    machineType: n1-standard-4
    metadata:
      disable-legacy-endpoints: 'true'
    oauthScopes:
    - https://www.googleapis.com/auth/devstorage.read_only
    - https://www.googleapis.com/auth/logging.write
    - https://www.googleapis.com/auth/monitoring
    - https://www.googleapis.com/auth/service.management.readonly
    - https://www.googleapis.com/auth/servicecontrol
    - https://www.googleapis.com/auth/trace.append
    serviceAccount: default
    shieldedInstanceConfig:
      enableIntegrityMonitoring: true
  initialNodeCount: 2
  instanceGroupUrls:
  - <REDACTED>
  management:
    autoRepair: true
    autoUpgrade: true
  name: default-pool
  podIpv4CidrSize: 24
  selfLink: <REDACTED>
  status: RUNNING
  version: 1.13.11-gke.14
selfLink: <REDACTED>
servicesIpv4Cidr: <REDACTED>
status: RUNNING
subnetwork: default
zone: europe-west1-b
