br: volume-snapshot can not restore when backup a scale-in cluster #4884

@fengou1

Bug Report

What version of Kubernetes are you using?

[root@kvm-dev scale]# kubectl version
Client Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.6-eks-7d68063", GitCommit:"f24e667e49fb137336f7b064dba897beed639bad", GitTreeState:"clean", BuildDate:"2022-02-23T19:32:14Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.16-eks-ffeb93d", GitCommit:"52e500d139bdef42fbc4540c357f0565c7867a81", GitTreeState:"clean", BuildDate:"2022-11-29T18:41:42Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

What version of TiDB Operator are you using?

1.5.0.alpha.2
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

What's the status of the TiDB cluster pods?

What did you do?

  • create a cluster with 9 tikv nodes

[root@kvm-dev scale]# kubectl get po -ntidb-cluster
NAME                              READY   STATUS      RESTARTS   AGE
backup-bk-07-wsxns                0/1     Completed   0          3h51m
basic-discovery-884d5b9d5-5r89r   1/1     Running     0          2m13s
basic-pd-0                        1/1     Running     0          2m13s
basic-tidb-0                      2/2     Running     0          80s
basic-tikv-0                      1/1     Running     0          103s
basic-tikv-1                      1/1     Running     0          103s
basic-tikv-2                      1/1     Running     0          103s
basic-tikv-3                      1/1     Running     0          37s
basic-tikv-4                      1/1     Running     0          37s
basic-tikv-5                      1/1     Running     0          36s
basic-tikv-6                      1/1     Running     0          35s
basic-tikv-7                      1/1     Running     0          33s
basic-tikv-8                      1/1     Running     0          32s

  • shrink tikv to 6 nodes

[root@kvm-dev scale]# kubectl get po -ntidb-cluster
NAME                              READY   STATUS      RESTARTS   AGE
backup-bk-01-mk8ss                1/1     Running     0          59s
backup-bk-07-wsxns                0/1     Completed   0          3h55m
basic-discovery-884d5b9d5-5r89r   1/1     Running     0          5m48s
basic-pd-0                        1/1     Running     0          5m48s
basic-tidb-0                      2/2     Running     0          4m55s
basic-tikv-0                      1/1     Running     0          5m18s
basic-tikv-1                      1/1     Running     0          5m18s
basic-tikv-2                      1/1     Running     0          5m18s
basic-tikv-3                      1/1     Running     0          4m12s
basic-tikv-4                      1/1     Running     0          4m12s
basic-tikv-5                      1/1     Running     0          4m11s

  • keep the PVCs and PVs (those of the removed TiKV nodes 6-8 stay bound)

[root@kvm-dev scale]# kubectl get pvc -ntidb-cluster
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pd-basic-pd-0       Bound    pvc-24cb46f9-857b-4993-bd87-7a53657be227   30Gi       RWO            gp2            5m27s
tikv-basic-tikv-0   Bound    pvc-d551a34b-8ac0-4e37-a025-cbf530d08562   350Gi      RWO            gp3            4m57s
tikv-basic-tikv-1   Bound    pvc-2ace20a9-ec41-45c1-b009-7067cbde6bf0   350Gi      RWO            gp3            4m57s
tikv-basic-tikv-2   Bound    pvc-1a3011b4-cc13-4380-a306-1673c5096686   350Gi      RWO            gp3            4m57s
tikv-basic-tikv-3   Bound    pvc-0996a8c5-279f-47da-87ed-fdc4f0019ff6   350Gi      RWO            gp3            3m51s
tikv-basic-tikv-4   Bound    pvc-7fe86b0a-ea07-48da-84e6-5e030f9aa34e   350Gi      RWO            gp3            3m51s
tikv-basic-tikv-5   Bound    pvc-d3b26cde-412d-4bb2-aa34-2d0604d4cfde   350Gi      RWO            gp3            3m50s
tikv-basic-tikv-6   Bound    pvc-b1e77cba-60fd-42d0-be4b-5eeb60e2778a   350Gi      RWO            gp3            3m49s
tikv-basic-tikv-7   Bound    pvc-89653084-89f6-480c-943d-a17b67642800   350Gi      RWO            gp3            3m47s
tikv-basic-tikv-8   Bound    pvc-d5e771ea-bd8b-4cab-be95-84a1f9098b17   350Gi      RWO            gp3            3m46s

[root@kvm-dev scale]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                            STORAGECLASS   REASON   AGE
pvc-0996a8c5-279f-47da-87ed-fdc4f0019ff6   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-3   gp3                     2m25s
pvc-1a3011b4-cc13-4380-a306-1673c5096686   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-2   gp3                     3m31s
pvc-24cb46f9-857b-4993-bd87-7a53657be227   30Gi       RWO            Delete           Bound    tidb-cluster/pd-basic-pd-0       gp2                     3m58s
pvc-2ace20a9-ec41-45c1-b009-7067cbde6bf0   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-1   gp3                     3m31s
pvc-7fe86b0a-ea07-48da-84e6-5e030f9aa34e   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-4   gp3                     2m24s
pvc-89653084-89f6-480c-943d-a17b67642800   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-7   gp3                     2m21s
pvc-b1e77cba-60fd-42d0-be4b-5eeb60e2778a   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-6   gp3                     2m22s
pvc-d3b26cde-412d-4bb2-aa34-2d0604d4cfde   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-5   gp3                     2m23s
pvc-d551a34b-8ac0-4e37-a025-cbf530d08562   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-0   gp3                     3m28s
pvc-d5e771ea-bd8b-4cab-be95-84a1f9098b17   350Gi      RWO            Delete           Bound    tidb-cluster/tikv-basic-tikv-8   gp3                     2m17s

  • take a backup and then do the restore
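
After the scale-in above, the cluster runs 6 TiKV pods while 9 TiKV PVCs are still bound. The following is a minimal sketch for picking out the leftover PVCs by ordinal. It assumes PVC names follow the `tikv-<cluster>-tikv-<ordinal>` pattern seen in the listings; the function name `leftover_tikv_pvcs` is hypothetical and not part of TiDB Operator.

```python
import re

# Assumed naming pattern, inferred from the listings in this report
# (e.g. "tikv-basic-tikv-7"); not taken from TiDB Operator source code.
PVC_PATTERN = re.compile(r"^tikv-(?P<cluster>.+)-tikv-(?P<ordinal>\d+)$")


def leftover_tikv_pvcs(pvc_names, replicas):
    """Return TiKV PVC names whose ordinal is >= the current replica count."""
    leftovers = []
    for name in pvc_names:
        match = PVC_PATTERN.match(name)
        if match and int(match.group("ordinal")) >= replicas:
            leftovers.append(name)
    # Sort numerically by ordinal so that 10 would come after 9.
    return sorted(leftovers, key=lambda n: int(n.rsplit("-", 1)[1]))


if __name__ == "__main__":
    # PVC names as shown by `kubectl get pvc -ntidb-cluster` in this report.
    pvcs = ["pd-basic-pd-0"] + [f"tikv-basic-tikv-{i}" for i in range(9)]
    for name in leftover_tikv_pvcs(pvcs, replicas=6):
        print(name)
```

With the names from this report and `replicas=6`, the sketch selects `tikv-basic-tikv-6`, `tikv-basic-tikv-7` and `tikv-basic-tikv-8`.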

What did you expect to see?
The restore should succeed.

What did you see instead?

The restore fails, and the operator log shows:
E0216 13:10:00.989025 1 restore_controller.go:102] Restore: tidb-cluster/rt-03, sync failed, err: pvc-89653084-89f6-480c-943d-a17b67642800 pv.annotations[temporary/volume-id] not found, requeuing
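
The PV named in the error, `pvc-89653084-89f6-480c-943d-a17b67642800`, is bound to `tikv-basic-tikv-7` in the listing above, which is one of the scaled-in nodes. To see whether other PVs are in the same state, a sketch like the following could scan the parsed output of `kubectl get pv -o json` for PVs that lack the `temporary/volume-id` annotation. The annotation key comes from the log line; the helper name `pvs_missing_annotation` and the sample data are hypothetical, and this only reports the symptom, it does not confirm the root cause.

```python
import json

# Annotation key copied from the operator error message in this report.
ANNOTATION_KEY = "temporary/volume-id"


def pvs_missing_annotation(pv_list, key=ANNOTATION_KEY):
    """Return (pv_name, claim) pairs for PVs that lack the given annotation.

    `pv_list` is the dict obtained by parsing `kubectl get pv -o json`.
    """
    missing = []
    for pv in pv_list.get("items", []):
        metadata = pv.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if key not in annotations:
            claim_ref = pv.get("spec", {}).get("claimRef") or {}
            claim = f"{claim_ref.get('namespace', '')}/{claim_ref.get('name', '')}"
            missing.append((metadata.get("name", ""), claim))
    return missing


if __name__ == "__main__":
    # Illustrative input only: the annotation value "vol-EXAMPLE" is made up,
    # and the real annotation state of tikv-0's PV is not shown in this report.
    sample = json.loads("""
    {"items": [
      {"metadata": {"name": "pvc-d551a34b-8ac0-4e37-a025-cbf530d08562",
                    "annotations": {"temporary/volume-id": "vol-EXAMPLE"}},
       "spec": {"claimRef": {"namespace": "tidb-cluster", "name": "tikv-basic-tikv-0"}}},
      {"metadata": {"name": "pvc-89653084-89f6-480c-943d-a17b67642800"},
       "spec": {"claimRef": {"namespace": "tidb-cluster", "name": "tikv-basic-tikv-7"}}}
    ]}
    """)
    for name, claim in pvs_missing_annotation(sample):
        print(f"{name}\t{claim}")
```

Against a live cluster, the same function could be fed `json.load(sys.stdin)` with the `kubectl get pv -o json` output piped in.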
