Untag manifests for DELETE API calls by tag#1956
Untag manifests for DELETE API calls by tag#1956sergeyfd wants to merge 4 commits intodistribution:masterfrom
Conversation
|
So does this approach make sense? In my opinion an ability to delete tags would be a good improvement for the current API. |
|
Added tests and a proper (?) support for proxy mode. |
|
@dmcgowan Could you take a look? |
|
@sergeyfd sorry I have no responded. I have the same feeling that I have expressed before, without the ability to lock this operation could be racy. Your deletion logic is correct though. As for the race, there are 3 approaches we could take....
I think option 2 is interesting but don't think that is a simple change and may be a breach of contract. Users are already doing option 1 so merging this would probably just make that approach official. Ping @stevvooe @aaronlehmann do we want this feature now or should we stick with our current solutions and wait until option 3 to merge something in. |
|
Oh, how we soon forget. The reason we don't do this is documented in the API docs and the github issues that originally added deletion support. Not only is this operation racy, but it is destructive. The effect can be accomplished by calling If someone wants to carry the idea of the first-class tags and actually fix this, I would be in support of that. Also, if someone wants to actually address the issues that I've identified with deleting tags, rather than just assuming we just didn't implement it because we are lazy, that would also be great. |
|
#1566 (comment) sums up the problem and work around in the best manner possible.
For this to go in, you really need to solve the technical issues. The problems with many solutions is they don't account for the schema1 case, which includes the tag in the manifest. The other problem with existing solutions is that it doesn't account for the race conditions. However, I would ask that folks stop assuming this is some sort of glaring oversight. This is an unfortunate side-effect of the data model. There are ways to address it, but it takes some consideration to address appropriately. |
|
@dmcgowan You have to educate me here. Why deleting by digest with deleting all accompanying tags is safe and re-using the same code for deleting manifest after making sure that it has only one tag pointing to it isn't. Why same race conditions do not apply to the delete by digest? However @stevvooe is right that this approach doesn't support schema 1 manifests. I wouldn't pursue option 2. The need to put a registry into R/O mode to run GC is already bad enough and adding more features depending on that would make it even worse. Thanks for the looking into this PR anyway. And please don't think that your users do not appreciate the work you guys do. |
|
@sergeyfd Ok, so that sounds like a bug. If a tag file references a manifest that has been deleted, the API should return 404. Is it not doing that? The main reason why digests are considered safe and tags are not is simply uniqueness. When one deletes by digest, there is some guarantee that digest only has a single target. With a tag, it could have been updated in a concurrent request and the deletion would be invalid or random depending on the backend. Some of the underlying concepts are described in https://github.com/docker/docker.github.io/blob/master/registry/architecture.md#eventual-consistency. |
|
@stevvooe Sorry but you are not following. The problem with doing HEAD and then deleting by acquired digest is that it deletes manifest and all other tags that might exists in the registry and point to the same digest. This is definitely unexpected, and not well documented, side effect of deletion by digest. BTW, the proposed PR deletes manifest only if it's the only existing tag for that manifest, otherwise it deletes only a tagging link by calling Untag() Thanks for that link, I'll ready through that document. |
How is this unexpected? What should the tag point to after its target has been deleted? |
|
@stevvooe, In schema1 retrieving a digest by tag and using that digest in Delete operation would lead to deletion of just that tag. In schema 2 this same operation leads into deletion of a manifest and ALL tags pointing to that manifest. So if we tag same image as /repo/cool_image:sergeyfd and /repo/cool_image:stevvoe deleting the first one by me would delete the second one as well. This could be unexpected to you. So the problem that I am trying to state is that in schema 2 there's no easy way to get rid of a tag. One has to upload a dummy image, point a tag that needs to be deleted to that dummy image, and then delete that image. Quite cumbersome logic for something that looks simple. Unfortunately you are right. Since digests are basically immutable and tags aren't implementing support fort Untag API call in a way that I proposed does create race conditions when tag can be switched to some other image while Delete operation is in action. And I do not know how to go around this. |
|
@stevvoe I think the issue here is that if I have two soft links to the
same file, I expect to be able to delete a soft link without eliminating
the underlying file. If however soft links are my primary reference, and
there's NO remaining soft links, then I expect the file to be deleted.
…On Tue, Jan 10, 2017 at 11:57 AM, Stephen Day ***@***.***> wrote:
So if we tag same image as /repo/cool_image:sergeyfd and
/repo/cool_image:stevvoe deleting the first one by me would delete the
second one as well. This could be unexpected to you.
I don't find this unexpected. That is why I am confused. Let's say we each
create a soft link to a common file on a filesystem. If that file gets
deleted, do you still expect those links to work? No.
I took a closer look at the current state of the code. It looks like Untag
is a shallow operation on a single path (manifestTagPathSpec for
reference). For the most part, this should make the delete idempotent. At
this point, I think we can ignore the issues with schema1, as they are
grouped under this directory anyways.
Note that an implementation may not be able to delete an unreferenced
manifest with this approach.
For this to go through, the following must happen:
1. Update the specification accordingly.
2. Remove querying to TagStore.All.
3. Just issue an Untag for the target tag.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#1956 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AKIcBi4j_bXgxxWFpW2tPhuRl-dIQyhFks5rQ-KvgaJpZM4J6I-O>
.
--
Mark Galpin | Senior Solutions Architect *|* JFrog
270 E Caribbean Drive, Sunnyvale, CA 94089
U.S Toll Free: +1.888.494.2855 <%2B1.888.704.0670> *|* Tel: +1.408.329.1540
<https://goo.gl/gnqeun>
This message may contain confidential and/or proprietary information, and
is intended only for the person/entity to whom it was originally
addressed. The content of this message may contain private views and
opinions, which do not constitute a formal disclosure or commitment unless
specifically stated.
|
|
@markgalpin We need to separate what would be convenient from what is possible in the data model. The issue is that there is no locking allowing for the target to be collected when there are no more incoming links.
The problem is that this assumption is not true. We cannot be sure that the there are no remaining soft links (just like on a file system) so deleting the target file when there are no soft links will result in a race condition and data corruption. |
|
You are right so to some extent. If I delete a target than it'll make your symlink broken but it won't delete it. If I delete my symlink, that'll leave your symlink intact and still working. I am Ok with your approach but would suggest that we refuse deleting the last tag to avoid creation of unreferenced manifests. Such images will be really hard to deal with cause I don't see how one would get a digest for a such image. So if TagStore.All() returns a list with just one element than the system can respond with 405 or so. |
And this is what is racy. You can't be sure that there is still one element by the time the delete is issued. If you just allow deleting the items, then this is not a problem. I don't see any reason this is a problem: if you need list access to repo manifests, then we should probably add it. |
|
Agree that with a list access to repo manifests it's not a problem, otherwise it becomes kind of a hidden image not seen by any client that doesn't have digest for that image retrieved previously. So having GET /v2//manifests/list which would return a list of known digests makes total sense to me. |
Let's save that for another PR. |
|
Shouldn't manifests without tags get deleted by gc? It's the most natural thing I'd expect to happen. |
Why is this "the most natural thing"? What if the digest is referenced externally? |
|
Gc deletes unreferenced objects, and this manifest is unreferenced. If the digest is referenced externally, then people simply shouldn't delete the last tag. |
|
@LEW21 I think you may be oversimplifying the problem or misunderstanding the guarantees provided by digests. The whole point of a digest is that one can use it to ensure that when someone updates a tag, they will continue using the exact version with the digest. If you just delete these, you completely break this functionality. Before commenting further, I'd recommend you read information on deletes in the road map. There are lots of considerations that need to be made when making changes in this area. |
registry/handlers/images.go
Outdated
| // DeleteImageManifest removes the manifest with the given digest from the registry. | ||
| func (imh *imageManifestHandler) DeleteImageManifest(w http.ResponseWriter, r *http.Request) { | ||
| ctxu.GetLogger(imh).Debug("DeleteImageManifest") | ||
| ctxu.GetLogger(imh).Debugf("DeleteImageManifest. Tag: %s Digest: %s", imh.Tag, imh.Digest) |
There was a problem hiding this comment.
Use WithFields here to add the digest and tag, rather than formatting them into the error message. They might actually already be in the context.
|
@sergeyfd This looks like a good start. Could you modify the specification to include the changes requested here? |
|
@stevvooe I need to rebase it cause image.go was renamed into manifests.go and I have a bit of problems with that rebase but hope to figure it out over weekend. Do I need to update both api.md and api.md.tpl? I am a bit confused there. |
|
@sergeyfd You'll have to add a route descriptor in registry/api/v2 and add a section in the template describing the delete methodology. If you can't find the right section, ping me and I'll hunt it down. Sorry about the move. I had meant to do that a long time ago. |
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
|
@stevvooe I rebased it but did not update specs. I think that it all has to be done differently. With this implementation it should not be attached to DELETE v2//manifests/ because it actually never deletes a manifest. So the correct route should be DELETE v2//tags/ and this code should actually go into tags.go instead of manifests.go. Do you agree? |
Current coverage is 51.20% (diff: 81.81%)@@ master #1956 diff @@
==========================================
Files 125 125
Lines 11433 11454 +21
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 6996 5865 -1131
- Misses 3550 4843 +1293
+ Partials 887 746 -141
|
|
I agree, but there is a design problem that we'll have to tackle. The route overview is here: https://github.com/docker/distribution/blob/master/docs/spec/api.md#detail. I do agree that this should only go on tags. Unfortunately, the route for listing tags is A few possibilities here:
If you have any other ideas, I'd be interested in hearing them. |
|
I think that the best way would be moving tag list functionality into
This work shall be split into several PRs. |
|
In most REST APIs the list would be placed at |
|
@LEW21 If you analyze the API appropriately, you'll see why this creates ambiguity. The main issue is that @sergeyfd The plan looks good except for this:
This redirect can never be removed without incrementing the API version. One will never be able to access the tag named |
|
I created a new PR #2169 which covers first two steps of the plan. |
|
Closed in favor of #2169 |
This is a proposed implementation for #1954 that would help close #1811 #1859
Documentation will have to be updated as well if this is an accepted API change.