NodeUnstageVolume not called because unmounter fails when vol_data.json is deleted #101911

@pohly

Description

What happened:

The upcoming csi-driver-host-path release v1.7.0 will have a check in DeleteVolume that returns an error when the volume is still attached, staged, or published (kubernetes-csi/csi-driver-host-path#260).

Some of the jobs in the csi-driver-host-path repo with Kubernetes 1.21.0 are failing because NodeUnstageVolume is not called, causing DeleteVolume to fail repeatedly until the test times out.

One example:

volume ID 3f3a132b-b25c-11eb-9aba-a6b5a7ade690 in:

Corresponds to pvc-c0971c32-17c3-427f-a606-1bbf893caf89 in:

There's one kubelet error that seems relevant:

May 11 13:30:23 csi-prow-worker2 kubelet[247]: E0511 13:30:23.786580     247 reconciler.go:193] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : UnmountVolume.NewUnmounter failed for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json]: open /var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"test-volume\" (UniqueName: \"kubernetes.io/csi/hostpath.csi.k8s.io^ca29f6e4-b25c-11eb-8da9-1e598a3983d2\") pod \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\" (UID: \"a6441c77-7c03-4d33-8a80-0a37d895f8a9\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json]: open /var/lib/kubelet/pods/a6441c77-7c03-4d33-8a80-0a37d895f8a9/volumes/kubernetes.io~csi/pvc-c0971c32-17c3-427f-a606-1bbf893caf89/vol_data.json: no such file or directory"

What you expected to happen:

NodeUnstageVolume should be called.

How to reproduce it (as minimally and precisely as possible):

Run CSI_PROW_KUBERNETES_VERSION=1.21.0 CSI_PROW_TESTS=parallel CSI_SNAPSHOTTER_VERSION=v4.0.0 ./.prow.sh in csi-driver-host-path and hope that it fails; the failure is intermittent.

Alternatively, retest in kubernetes-csi/csi-driver-host-path#289

Anything else we need to know?:

Random observation: in the two cases that I looked at, the affected volume was published twice for different pods.

Labels

kind/bug: Categorizes issue or PR as related to a bug.
sig/storage: Categorizes an issue or PR as relevant to SIG Storage.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
