volumes: fix error-handling when removing volumes with swarm enabled#45148
volumes: fix error-handling when removing volumes with swarm enabled#45148thaJeztah merged 2 commits intomoby:masterfrom
Conversation
Add additional test-cases for deleting non-existing volumes (with/without force).
With this patch:
make TEST_FILTER=TestVolumesRemove DOCKER_GRAPHDRIVER=vfs test-integration
Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemove
...
=== RUN TestVolumesRemove
=== RUN TestVolumesRemove/volume_in_use
=== RUN TestVolumesRemove/volume_not_in_use
=== RUN TestVolumesRemove/non-existing_volume
=== RUN TestVolumesRemove/non-existing_volume_force
--- PASS: TestVolumesRemove (0.04s)
--- PASS: TestVolumesRemove/volume_in_use (0.00s)
--- PASS: TestVolumesRemove/volume_not_in_use (0.01s)
--- PASS: TestVolumesRemove/non-existing_volume (0.00s)
--- PASS: TestVolumesRemove/non-existing_volume_force (0.00s)
PASS
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This comment was marked as resolved.
This comment was marked as resolved.
4f2fee7 to
53297e5
Compare
| // is enabled, in which case the remove won't return an error if no local | ||
| // volume existed, so we don't know if removal succeeded. In that case | ||
| // we always try to delete cluster volumes as well. | ||
| if errdefs.IsNotFound(err) || force { |
There was a problem hiding this comment.
Note that checking for errdefs.IsNotFound() is actually redundant here; I thought it made the code slightly more descriptive, but happy to revert that change.
| // local volume, but could be a cluster volume, so we ignore "not found" | ||
| // errors at this stage. Note that no "not found" error is produced if | ||
| // "force" is enabled. | ||
| err := v.backend.Remove(ctx, vars["name"], opts.WithPurgeOnError(force)) |
There was a problem hiding this comment.
I'm also looking if we can remove the force from the backends, and instead handle "not-exist" errors on the caller site, but looks like that may need more work ("purge" does more than only ignoring "not-exist" errors, as it also updates state in the backend to remove volumes from the store if the (remote ?) volume store (no longer) has a volume.
53297e5 to
b51a682
Compare
|
@cpuguy83 @dperny @neersighted ptal |
|
LGTM! |
| if err != nil && !errdefs.IsNotFound(err) { | ||
| return err | ||
| } |
There was a problem hiding this comment.
The conditional flow here (and through all of these blocks) feels very clunky. Is there a better way to structure this that makes the flow more obvious, even if we have mode code movement?
There was a problem hiding this comment.
Yes, I was hoping to remove the "force" from the backend entirely, and just handle notfound errors (which are the ones we want to ignore) to the caller site; #45148 (comment)
Unfortunately, there is code in the backend that still needs it (if the plugin doesn't have the volume, then our local storage still has to remove it, and not fail)
Perhaps still possible, if we would have the backend ignore the error, for what it needs to do, but make it still return it, but that still requires more changes (probably out of scope for this PR)
| if err != nil { | ||
| return err | ||
| } | ||
| if cvErr := v.cluster.RemoveVolume(vars["name"], force); cvErr != nil { |
There was a problem hiding this comment.
Can we unify the branches here?
I think we can do err = v.cluster.RemoveVolume and then do nothing else in the branch.
This of course assumes cluster.RemoveVolume will return a not found error if force=false
There was a problem hiding this comment.
Oh! You're right; I guess that would work; just overwrite the original err value if we reach this code.
|
|
||
| t.Run("non-existing volume", func(t *testing.T) { | ||
| err = client.VolumeRemove(ctx, "no_such_volume", false) | ||
| assert.Check(t, is.ErrorType(err, errdefs.IsNotFound)) |
There was a problem hiding this comment.
I'm not sure how this error check works?
errdefs.IsNotFound is a function
There was a problem hiding this comment.
Ah, yes, so is.ErrorType supports either a type or an assertion function, which errdefs.IsXXX satisfies
(also didn't know this until not too long ago)
b51a682 to
2d554cd
Compare
Commit 3246db3 added handling for removing cluster volumes, but in some conditions, this resulted in errors not being returned if the volume was in use; docker swarm init docker volume create foo docker create -v foo:/foo busybox top docker volume rm foo This patch changes the logic for ignoring "local" volume errors if swarm is enabled (and cluster volumes supported). While working on this fix, I also discovered that Cluster.RemoveVolume() did not handle the "force" option correctly; while swarm correctly handled these, the cluster backend performs a lookup of the volume first (to obtain its ID), which would fail if the volume didn't exist. Before this patch: make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration ... Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled ... === RUN TestVolumesRemoveSwarmEnabled === PAUSE TestVolumesRemoveSwarmEnabled === CONT TestVolumesRemoveSwarmEnabled === RUN TestVolumesRemoveSwarmEnabled/volume_in_use volume_test.go:122: assertion failed: error is nil, not errdefs.IsConflict volume_test.go:123: assertion failed: expected an error, got nil === RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use === RUN TestVolumesRemoveSwarmEnabled/non-existing_volume === RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force volume_test.go:143: assertion failed: error is not nil: Error response from daemon: volume no_such_volume not found --- FAIL: TestVolumesRemoveSwarmEnabled (1.57s) --- FAIL: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s) --- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s) --- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s) --- FAIL: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s) FAIL With this patch: make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration ... Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled ... make TEST_FILTER=TestVolumesRemoveSwarmEnabled DOCKER_GRAPHDRIVER=vfs test-integration ... Running /go/src/github.com/docker/docker/integration/volume (arm64.integration.volume) flags=-test.v -test.timeout=10m -test.run TestVolumesRemoveSwarmEnabled ... === RUN TestVolumesRemoveSwarmEnabled === PAUSE TestVolumesRemoveSwarmEnabled === CONT TestVolumesRemoveSwarmEnabled === RUN TestVolumesRemoveSwarmEnabled/volume_in_use === RUN TestVolumesRemoveSwarmEnabled/volume_not_in_use === RUN TestVolumesRemoveSwarmEnabled/non-existing_volume === RUN TestVolumesRemoveSwarmEnabled/non-existing_volume_force --- PASS: TestVolumesRemoveSwarmEnabled (1.53s) --- PASS: TestVolumesRemoveSwarmEnabled/volume_in_use (0.00s) --- PASS: TestVolumesRemoveSwarmEnabled/volume_not_in_use (0.01s) --- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume (0.00s) --- PASS: TestVolumesRemoveSwarmEnabled/non-existing_volume_force (0.00s) PASS Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2d554cd to
058a31e
Compare
|
Let me bring this one in, and do a cherry-pick later |
|
cherry-pick in #45155 |
integration/volumes: TestVolumesRemove: add coverage for force/no-force
Add additional test-cases for deleting non-existing volumes (with/without force).
With this patch:
volumes: fix error-handling when removing volumes with swarm enabled
Commit 3246db3 (#44224) added handling for removing
cluster volumes, but in some conditions, this resulted in errors not being
returned if the volume was in use;
This patch changes the logic for ignoring "local" volume errors if swarm
is enabled (and cluster volumes supported).
While working on this fix, I also discovered that Cluster.RemoveVolume()
did not handle the "force" option correctly; while swarm correctly handled
these, the cluster backend performs a lookup of the volume first (to obtain
its ID), which would fail if the volume didn't exist.
Before this patch:
With this patch:
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)