Storage can remove ErrNotAContainer as well#11787
Storage can remove ErrNotAContainer as well#11787openshift-merge-robot merged 1 commit intocontainers:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@edsantiago Could you see if this fixes your race condition? |
55843f9 to
c23f516
Compare
Fixes: containers#11775 [NO TESTS NEEDED] No easy way to cause this problem in CI. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
c23f516 to
c9ea2ca
Compare
|
Restarted my reproducer because I noticed you pushed new changes. FWIW, first attempt ran for 2-3 minutes without failure (prior to this, it would fail in a few seconds). Second reproducer (using c9ea2ca) running now, one minute with no failures. Looks good so far. |
|
15 minutes in, still no failures (other than CI, Docker-py and Test Bindings, which are so weird that I've just re-run on the assumption that they're both flakes). I guess this "fixes" the race, in the sense of sweeping under the rug. I will leave it to more knowledgeable devs to determine how safe this is and how likely it is to mask a real problem. Thank you for taking this on so quickly! |
| if err != nil { | ||
| logrus.Debugf("Failed to delete container %q: %v", container.ID, err) | ||
| return err | ||
| if errors.Cause(err) == storage.ErrNotAContainer || errors.Cause(err) == storage.ErrContainerUnknown { |
There was a problem hiding this comment.
What does ErrNotAContainer mean? Is the ID present in c/storage, but not as a container? That points to a potential bug to me - why is that exact 256-bit random ID still in use?
There was a problem hiding this comment.
I think at times, we attempt to remove a layer as opposed to a container, and call DeleteContainer with the id.
There was a problem hiding this comment.
@mheon Looking at the code https://github.com/containers/storage/blob/65e6ab3cdc1ea69c8a55039dc668ee3844255a92/store.go#L2449-L2532 it means that the container id does not exists, it does not mean that the id exists as image or layer.
|
LGTM |
|
/lgtm |
Fixes: #11775
[NO TESTS NEEDED] No easy way to cause this problem in CI.
Signed-off-by: Daniel J Walsh dwalsh@redhat.com
What this PR does / why we need it:
How to verify it
Which issue(s) this PR fixes:
Special notes for your reviewer: