Describe the bug
Hello team,
The reconciliation process creates the new pod before deleting the old one. When the pod references a PVC in its volumes section, that ordering results in two claims on the same PV.
IMO the order of operations should be:
- remove old pod
- create new one
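For context, creating the replacement pod before removing the old one is what a Kubernetes Deployment does under its default RollingUpdate strategy. Below is a minimal sketch of a Deployment that removes the old pod first, assuming the ordering really is what triggers the misbound claim. The Deployment name, labels, container name, image tag, and mount path are illustrative assumptions, not copied from the actual podinfo manifest; only the namespace, volume name, and claim name come from this report.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo                       # assumed name
  namespace: podinfo-image-updater
spec:
  replicas: 1
  strategy:
    type: Recreate                    # terminate existing pods before creating new ones
  selector:
    matchLabels:
      app: podinfo                    # assumed label
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: podinfod              # assumed container name
          image: ghcr.io/stefanprodan/podinfo:5.0.0   # placeholder tag
          volumeMounts:
            - name: empty-dir-vol
              mountPath: /data        # hypothetical mount path
      volumes:
        - name: empty-dir-vol
          persistentVolumeClaim:
            claimName: empty-dir-pvc
```

With `Recreate` there is a short window during each rollout in which no pod is running. That is the trade-off for never having the old and new pod reference the claim at the same time.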
Steps to reproduce
The easiest way to reproduce is to follow the "Automate image updates to Git" guide, with the following additions to podinfo-deployment.yaml.
Step 1) Add a PV / PVC and attach the volume to the pod.
volumes:
  - name: empty-dir-vol
    persistentVolumeClaim:
      claimName: empty-dir-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: empty-dir-pvc
  namespace: podinfo-image-updater
spec:
  storageClassName: slow
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    type: nfs
  name: podinfoimageupdater-emptydir-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  claimRef:
    name: empty-dir-pvc
    namespace: podinfo-image-updater
  nfs:
    path: /storage_local/podinfo-image-updater/empty-dir
    server: 192.168.170.36
  storageClassName: slow
If this is confusing, the full manifest is here
Step 2) Change the image version to trigger deployment reconciliation.
Step 3) Observe the problem.
The PVC goes into the Lost state:
$ kubectl get pvc -w
NAME            STATUS   VOLUME                            CAPACITY   ACCESS MODES   STORAGECLASS   AGE
empty-dir-pvc   Bound    podinfoimageupdater-emptydir-pv   10Gi       RWO            slow           11s
empty-dir-pvc   Lost     podinfoimageupdater-emptydir-pv   0                         slow           2m26s
ikuchin@microk8s-test:~$ microk8s.kubectl describe pvc
Name:          empty-dir-pvc
Namespace:     podinfo-image-updater
StorageClass:  slow
Status:        Lost
Volume:        podinfoimageupdater-emptydir-pv
Labels:        kustomize.toolkit.fluxcd.io/name=flux-system
               kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      0
Access Modes:
VolumeMode:    Filesystem
Used By:       podinfo-9ccf96ff5-6d8nx   <----------- notice podID
Events:
  Type     Reason         Age   From                         Message
  ----     ------         ----  ----                         -------
  Warning  ClaimMisbound  26s   persistentvolume-controller  Two claims are bound to the same volume, this one is bound incorrectly
The PV goes into the Available state:
$ kubectl get pv -w
podinfoimageupdater-emptydir-pv   10Gi   RWO   Retain   Bound       podinfo-image-updater/empty-dir-pvc   slow   2m23s
podinfoimageupdater-emptydir-pv   10Gi   RWO   Retain   Available   podinfo-image-updater/empty-dir-pvc   slow   2m23s
The reason appears to be the order of the pod update operations:
$ kubectl get pod -w
NAME                       READY   STATUS              RESTARTS         AGE
podinfo-844777597c-hhj8g   1/1     Running             1 (114m ago)     11h   <----- this pod owns PVC
podinfo-9ccf96ff5-6d8nx    0/1     Pending             0                0s
podinfo-9ccf96ff5-6d8nx    0/1     Pending             0                0s
podinfo-9ccf96ff5-6d8nx    0/1     Pending             0                14s
podinfo-9ccf96ff5-6d8nx    0/1     ContainerCreating   0                14s
podinfo-9ccf96ff5-6d8nx    0/1     ContainerCreating   0                15s
podinfo-9ccf96ff5-6d8nx    1/1     Running             0                15s   <--------- this pod creates duplicate PVC
podinfo-844777597c-hhj8g   1/1     Terminating         1 (116m ago)     11h
podinfo-844777597c-hhj8g   0/1     Terminating         1 (116m ago)     11h
podinfo-844777597c-hhj8g   0/1     Terminating         1 (116m ago)     11h
podinfo-844777597c-hhj8g   0/1     Terminating         1 (116m ago)     11h
Expected behavior
The image update succeeds even when a PV/PVC is attached to the pod.
Screenshots and recordings
No response
OS / Distro
20.04.3 LTS (Focal Fossa)
Flux version
flux version 0.27.0
Flux check
$ flux check
► checking prerequisites
✔ Kubernetes 1.22.6-3+7ab10db7034594 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.17.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.22.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.16.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.20.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.21.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.21.2
✔ all checks passed
Git provider
No response
Container Registry provider
No response
Additional context
No response
Code of Conduct