[KVStoreMesh] KVStore mesh enabled but not used #42330

@george-zubrienko

Hello! We have been using clustermesh without issues in 1.15 and just upgraded to 1.16, where KVStoreMesh is enabled by default. After the upgrade, clustermesh connectivity is fine, but it seems KVStoreMesh is not used at all.

Running the kvstoremesh diagnostics outputs this:

kubectl --context ... exec -it -n kube-system deploy/clustermesh-apiserver -c kvstoremesh -- clustermesh-apiserver kvstoremesh-dbg status --verbose

KVStoreMesh:	0/0 remote clusters ready

Helm values for clustermesh:

clustermesh:
  apiserver:
    etcd:
      init:
        extraArgs: []
        extraEnv: []
        resources: {}
      lifecycle: {}
      resources: {}
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
      storageMedium: Disk
    extraArgs: []
    extraEnv: []
    extraVolumeMounts: []
    extraVolumes: []
    healthPort: 9880
    image:
      digest: sha256:7b44efa93e0428511341005e493efb8aa88efd369901c07f8832dc5b3d669a2d
      override: null
      pullPolicy: IfNotPresent
      repository: quay.io/cilium/clustermesh-apiserver
      tag: v1.16.15
      useDigest: true
    kvstoremesh:
      enabled: true
      extraArgs: []
      extraEnv: []
      extraVolumeMounts: []
      healthPort: 9881
      lifecycle: {}
      readinessProbe: {}
      resources: {}
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
    lifecycle: {}
    metrics:
      enabled: true
      etcd:
        enabled: true
        mode: basic
        port: 9963
      kvstoremesh:
        enabled: true
        port: 9964
      port: 9962
      serviceMonitor:
        annotations: {}
        enabled: false
        etcd:
          interval: 10s
          metricRelabelings: null
          relabelings: null
        interval: 10s
        kvstoremesh:
          interval: 10s
          metricRelabelings: null
          relabelings: null
        labels: {}
        metricRelabelings: null
        relabelings: null
    nodeSelector:
      kubernetes.io/os: linux
    podAnnotations: {}
    podDisruptionBudget:
      enabled: false
      maxUnavailable: 1
      minAvailable: null
    podLabels: {}
    podSecurityContext:
      fsGroup: 65532
      runAsGroup: 65532
      runAsNonRoot: true
      runAsUser: 65532
    priorityClassName: ""
    readinessProbe: {}
    replicas: 1
    resources: {}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    service:
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
        service.beta.kubernetes.io/aws-load-balancer-scheme: internal
      enableSessionAffinity: HAOnly
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      loadBalancerClass: null
      loadBalancerIP: null
      nodePort: 32379
      type: LoadBalancer
    terminationGracePeriodSeconds: 30
    tls:
      admin:
        cert: ""
        key: ""
      authMode: legacy
      auto:
        certManagerIssuerRef:
          group: cert-manager.io
          kind: ClusterIssuer
          name: data-bolt-self-signed-ca
        certValidityDuration: 1095
        enabled: true
        method: certmanager
      client:
        cert: ""
        key: ""
      enableSecrets: true
      remote:
        cert: ""
        key: ""
      server:
        cert: ""
        extraDnsNames:
        - s3.cluster-mesh.sneaksanddata.internal
        extraIpAddresses: []
        key: ""
    tolerations:
    - key: kubernetes.sneaksanddata.com/service-node-group
      operator: Equal
      value: cilium-clustermesh
    topologySpreadConstraints: []
    updateStrategy:
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0
      type: RollingUpdate
  config:
    clusters: []
    domain: mesh.cilium.io
    enabled: false
  enableEndpointSliceSynchronization: false
  enableMCSAPISupport: false
  maxConnectedClusters: 255
  useAPIServer: true

We create the cilium-clustermesh secret using ExternalSecret and PushSecret resources from the External Secrets Operator. The PushSecret pushes the contents of cilium-clustermesh-remote-cert to Vault, and the other cluster then pulls that data and creates its cilium-clustermesh secret.
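For readers unfamiliar with this kind of setup, the pulling side might look roughly like the sketch below. It is only an illustration and not the manifests actually used here: the store name `vault-backend`, the Vault key `clustermesh/remote-cluster`, and the refresh interval are hypothetical, and the `kube-system` namespace is assumed from the kubectl command above.

```yaml
# Illustrative sketch only; store name, Vault key, and interval are hypothetical.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cilium-clustermesh
  namespace: kube-system          # assumed from the kubectl command above
spec:
  refreshInterval: 1h             # hypothetical value
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend           # hypothetical store pointing at Vault
  target:
    name: cilium-clustermesh      # secret created in the pulling cluster
    creationPolicy: Owner
  dataFrom:
    - extract:
        key: clustermesh/remote-cluster   # hypothetical Vault path written by the PushSecret
```

With a layout like this, the keys of the resulting cilium-clustermesh secret are whatever was pushed to Vault from the other cluster, so the chart itself never renders the remote cluster entries (note `clustermesh.config.enabled: false` and `clusters: []` in the values above).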

From the provided docs, I am not sure what else needs to be done to activate KVStoreMesh. No errors are reported either, which makes this a lot harder to debug :)

Labels: area/agent, area/clustermesh, area/helm, kind/question