# Bug report

## General Information
The `cilium-node-init` `restartPods` functionality doesn't work on AKS with Ubuntu 18.04 images. Connectivity test pods created before `reconfigureKubelet` completes fail to become ready (pass). Looking at the `cilium-node-init` logs, the restart isn't happening.
I believe the issue is that none of the branches in this if statement match on the Azure images:
- The branch I think should match:
  `if grep -q 'docker' /etc/crictl.yaml; then` doesn't, because the file `/etc/crictl.yaml` doesn't exist, so `grep` exits non-zero.
- I think we could improve it with:
  `if [ ! -f /etc/crictl.yaml ] || grep -q 'docker' /etc/crictl.yaml; then` (check for the existence of the file first). In my testing this works for me; I can see the "Restarting kubenet managed pods" message in the `cilium-node-init` logs.
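The guard logic can be exercised locally without a cluster. A minimal sketch, using a temp path in place of `/etc/crictl.yaml` (the `needs_restart` helper and the sample containerd line are illustrative, not from the node-init script):

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/crictl.yaml so the branch can be tested locally.
CRICTL_CONF="$(mktemp -d)/crictl.yaml"

needs_restart() {
  # Original check: `grep -q 'docker' /etc/crictl.yaml` exits non-zero when the
  # file is absent, so the restart branch is silently skipped.
  # Proposed check: treat a missing file the same as a docker runtime.
  if [ ! -f "$CRICTL_CONF" ] || grep -q 'docker' "$CRICTL_CONF"; then
    echo "restart"
  else
    echo "skip"
  fi
}

needs_restart    # file absent -> prints "restart" (the AKS Ubuntu 18.04 case)
echo 'runtime-endpoint: unix:///run/containerd/containerd.sock' > "$CRICTL_CONF"
needs_restart    # containerd config, no "docker" -> prints "skip"
```

With the original `grep`-only condition, the first call would print nothing at all, which matches the missing restart seen in the logs.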
## How to reproduce the issue
- Run `az aks create` (filling in the bash vars below with your own values)
- Helm install Cilium
- Immediately apply `connectivity-check.yaml`
  - Note: this might need to be reapplied a couple of times to pick up the Cilium network policy.
  - Creating these pods before `cilium-node-init` finishes is important, so they get created under kubenet before the reconfigure changes kubenet -> CNI and restarts the kubelet. There are other ways this bug shows up, but this is the clearest way to demonstrate it.
```shell
aksargs=(
  --subscription "$SUB"
  --resource-group "$RG"
  --name "$NAME"
  --kubernetes-version 1.17.7
  --vm-set-type "VirtualMachineScaleSets"
  # Causes it to not create a public IP for the api-server
  --enable-private-cluster
  # Don't use Azure CNI; we will overwrite this later with Cilium
  --network-plugin kubenet
  --load-balancer-sku "standard"
  --vnet-subnet-id "$SUBNET_ID"
  # Not really used, but needs to be defined
  --docker-bridge-address="172.17.8.1/23"
  # Internal IPs of kubernetes services
  --service-cidr "172.17.0.0/21"
  --dns-service-ip="172.17.0.10" # They ask for it to be `.10`... sure
  --pod-cidr "172.17.32.0/19"
  --service-principal "$APPID"
  --client-secret "$APPPWD"
  # https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#generation-2-virtual-machines-preview
  # Importantly, this triggers Ubuntu 18.04 images
  --aks-custom-headers "usegen2vm=true"
)
ciliumhelmargs=(
  --version 1.8.2
  --namespace cilium
  --set config.ipam=kubernetes
  # Rewrite the kubelet config file to enable CNI with the node-init DaemonSet.
  --set global.nodeinit.enabled=true
  --set nodeinit.reconfigureKubelet=true
  --set nodeinit.removeCbrBridge=true
  # Any pods that are already running won't get the above changes; have node-init restart them.
  # This doesn't actually work right now.
  --set nodeinit.restartPods=true
  # Use Cilium native routing
  --set global.tunnel=disabled
  --set global.endpointRoutes.enabled=true
  --set global.nativeRoutingCIDR=172.17.32.0/19
)
az aks create "${aksargs[@]}"
az aks get-credentials --resource-group "$RG" --name "$NAME"
kubectl create ns cilium
helm install cilium cilium/cilium "${ciliumhelmargs[@]}"
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.8.2/examples/kubernetes/connectivity-check/connectivity-check.yaml
```