Skip to content

Untag manifests for DELETE API calls by tag#1956

Closed
sergeyfd wants to merge 4 commits intodistribution:masterfrom
sergeyfd:untag
Closed

Untag manifests for DELETE API calls by tag#1956
sergeyfd wants to merge 4 commits intodistribution:masterfrom
sergeyfd:untag

Conversation

@sergeyfd
Copy link
Copy Markdown
Contributor

This is a proposed implementation for #1954 that would help close #1811 #1859

Documentation will have to be updated as well if this is an accepted API change.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

So does this approach make sense? In my opinion an ability to delete tags would be a good improvement for the current API.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

sergeyfd commented Dec 9, 2016

Added tests and a proper (?) support for proxy mode.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

sergeyfd commented Dec 9, 2016

@dmcgowan Could you take a look?

@dmcgowan
Copy link
Copy Markdown
Collaborator

@sergeyfd sorry I have no responded. I have the same feeling that I have expressed before, without the ability to lock this operation could be racy. Your deletion logic is correct though. As for the race, there are 3 approaches we could take....

  1. Don't worry about it, it is very unlikely a specific manifest digest is tagged within that deletion windows.
  2. Allow deletion calls to occur while the registry is in read only mode. This deletion will not have a negative on any running garbage collection.
  3. Wait until we have transactional metadata, still no timeframe on getting this merged in though.

I think option 2 is interesting but don't think that is a simple change and may be a breach of contract. Users are already doing option 1 so merging this would probably just make that approach official.

Ping @stevvooe @aaronlehmann do we want this feature now or should we stick with our current solutions and wait until option 3 to merge something in.

@stevvooe
Copy link
Copy Markdown
Collaborator

Oh, how we soon forget.

The reason we don't do this is documented in the API docs and the github issues that originally added deletion support. Not only is this operation racy, but it is destructive.

The effect can be accomplished by calling HEAD on the tag and then using that digest to delete the manifest.

If someone wants to carry the idea of the first-class tags and actually fix this, I would be in support of that. Also, if someone wants to actually address the issues that I've identified with deleting tags, rather than just assuming we just didn't implement it because we are lazy, that would also be great.

@stevvooe
Copy link
Copy Markdown
Collaborator

#1566 (comment) sums up the problem and work around in the best manner possible.

For this to go in, you really need to solve the technical issues. The problems with many solutions is they don't account for the schema1 case, which includes the tag in the manifest. The other problem with existing solutions is that it doesn't account for the race conditions.

However, I would ask that folks stop assuming this is some sort of glaring oversight. This is an unfortunate side-effect of the data model. There are ways to address it, but it takes some consideration to address appropriately.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

@stevvooe Unfortunately doing HEAD and then deleting by digest doesn't provide same functionality and can be really dangerous in it turn, see here: #1811 Inability to get rid of a tag is really annoying.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

@dmcgowan You have to educate me here. Why deleting by digest with deleting all accompanying tags is safe and re-using the same code for deleting manifest after making sure that it has only one tag pointing to it isn't. Why same race conditions do not apply to the delete by digest? However @stevvooe is right that this approach doesn't support schema 1 manifests.

I wouldn't pursue option 2. The need to put a registry into R/O mode to run GC is already bad enough and adding more features depending on that would make it even worse.

Thanks for the looking into this PR anyway. And please don't think that your users do not appreciate the work you guys do.

@stevvooe
Copy link
Copy Markdown
Collaborator

@sergeyfd Ok, so that sounds like a bug. If a tag file references a manifest that has been deleted, the API should return 404. Is it not doing that?

The main reason why digests are considered safe and tags are not is simply uniqueness. When one deletes by digest, there is some guarantee that digest only has a single target. With a tag, it could have been updated in a concurrent request and the deletion would be invalid or random depending on the backend.

Some of the underlying concepts are described in https://github.com/docker/docker.github.io/blob/master/registry/architecture.md#eventual-consistency.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

@stevvooe Sorry but you are not following. The problem with doing HEAD and then deleting by acquired digest is that it deletes manifest and all other tags that might exists in the registry and point to the same digest. This is definitely unexpected, and not well documented, side effect of deletion by digest.

BTW, the proposed PR deletes manifest only if it's the only existing tag for that manifest, otherwise it deletes only a tagging link by calling Untag()

Thanks for that link, I'll ready through that document.

@stevvooe
Copy link
Copy Markdown
Collaborator

The problem with doing HEAD and then deleting by acquired digest is that it deletes manifest and all other tags that might exists in the registry and point to the same digest.

How is this unexpected? What should the tag point to after its target has been deleted?

@sergeyfd
Copy link
Copy Markdown
Contributor Author

@stevvooe, In schema1 retrieving a digest by tag and using that digest in Delete operation would lead to deletion of just that tag. In schema 2 this same operation leads into deletion of a manifest and ALL tags pointing to that manifest. So if we tag same image as /repo/cool_image:sergeyfd and /repo/cool_image:stevvoe deleting the first one by me would delete the second one as well. This could be unexpected to you. So the problem that I am trying to state is that in schema 2 there's no easy way to get rid of a tag. One has to upload a dummy image, point a tag that needs to be deleted to that dummy image, and then delete that image. Quite cumbersome logic for something that looks simple.

Unfortunately you are right. Since digests are basically immutable and tags aren't implementing support fort Untag API call in a way that I proposed does create race conditions when tag can be switched to some other image while Delete operation is in action. And I do not know how to go around this.

@markgalpin
Copy link
Copy Markdown

markgalpin commented Jan 10, 2017 via email

@stevvooe
Copy link
Copy Markdown
Collaborator

@markgalpin We need to separate what would be convenient from what is possible in the data model. The issue is that there is no locking allowing for the target to be collected when there are no more incoming links.

If however soft links are my primary reference, and
there's NO remaining soft links, then I expect the file to be deleted.

The problem is that this assumption is not true. We cannot be sure that the there are no remaining soft links (just like on a file system) so deleting the target file when there are no soft links will result in a race condition and data corruption.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

You are right so to some extent. If I delete a target than it'll make your symlink broken but it won't delete it. If I delete my symlink, that'll leave your symlink intact and still working.

I am Ok with your approach but would suggest that we refuse deleting the last tag to avoid creation of unreferenced manifests. Such images will be really hard to deal with cause I don't see how one would get a digest for a such image. So if TagStore.All() returns a list with just one element than the system can respond with 405 or so.

@stevvooe
Copy link
Copy Markdown
Collaborator

Such images will be really hard to deal with cause I don't see how one would get a digest for a such image. So if TagStore.All() returns a list with just one element than the system can respond with 405 or so.

And this is what is racy. You can't be sure that there is still one element by the time the delete is issued. If you just allow deleting the items, then this is not a problem.

I don't see any reason this is a problem: if you need list access to repo manifests, then we should probably add it.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

Agree that with a list access to repo manifests it's not a problem, otherwise it becomes kind of a hidden image not seen by any client that doesn't have digest for that image retrieved previously.

So having GET /v2//manifests/list which would return a list of known digests makes total sense to me.

@stevvooe
Copy link
Copy Markdown
Collaborator

So having GET /v2//manifests/list which would return a list of known digests makes total sense to me.

Let's save that for another PR.

@LEW21
Copy link
Copy Markdown

LEW21 commented Jan 11, 2017

Shouldn't manifests without tags get deleted by gc? It's the most natural thing I'd expect to happen.

@stevvooe
Copy link
Copy Markdown
Collaborator

Shouldn't manifests without tags get deleted by gc? It's the most natural thing I'd expect to happen.

Why is this "the most natural thing"? What if the digest is referenced externally?

@LEW21
Copy link
Copy Markdown

LEW21 commented Jan 11, 2017

Gc deletes unreferenced objects, and this manifest is unreferenced.

If the digest is referenced externally, then people simply shouldn't delete the last tag.

@stevvooe
Copy link
Copy Markdown
Collaborator

@LEW21 I think you may be oversimplifying the problem or misunderstanding the guarantees provided by digests. The whole point of a digest is that one can use it to ensure that when someone updates a tag, they will continue using the exact version with the digest. If you just delete these, you completely break this functionality.

Before commenting further, I'd recommend you read information on deletes in the road map. There are lots of considerations that need to be made when making changes in this area.

// DeleteImageManifest removes the manifest with the given digest from the registry.
func (imh *imageManifestHandler) DeleteImageManifest(w http.ResponseWriter, r *http.Request) {
ctxu.GetLogger(imh).Debug("DeleteImageManifest")
ctxu.GetLogger(imh).Debugf("DeleteImageManifest. Tag: %s Digest: %s", imh.Tag, imh.Digest)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use WithFields here to add the digest and tag, rather than formatting them into the error message. They might actually already be in the context.

@stevvooe
Copy link
Copy Markdown
Collaborator

@sergeyfd This looks like a good start. Could you modify the specification to include the changes requested here?

@sergeyfd
Copy link
Copy Markdown
Contributor Author

@stevvooe I need to rebase it cause image.go was renamed into manifests.go and I have a bit of problems with that rebase but hope to figure it out over weekend.

Do I need to update both api.md and api.md.tpl? I am a bit confused there.

@stevvooe
Copy link
Copy Markdown
Collaborator

@sergeyfd You'll have to add a route descriptor in registry/api/v2 and add a section in the template describing the delete methodology. If you can't find the right section, ping me and I'll hunt it down.

Sorry about the move. I had meant to do that a long time ago.

Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
Signed-off-by: Serge Dubrouski <sergeyfd@gmail.com>
@sergeyfd
Copy link
Copy Markdown
Contributor Author

@stevvooe I rebased it but did not update specs. I think that it all has to be done differently. With this implementation it should not be attached to DELETE v2//manifests/ because it actually never deletes a manifest. So the correct route should be DELETE v2//tags/ and this code should actually go into tags.go instead of manifests.go. Do you agree?

@codecov-io
Copy link
Copy Markdown

Current coverage is 51.20% (diff: 81.81%)

Merging #1956 into master will decrease coverage by 9.98%

@@             master      #1956   diff @@
==========================================
  Files           125        125           
  Lines         11433      11454     +21   
  Methods           0          0           
  Messages          0          0           
  Branches          0          0           
==========================================
- Hits           6996       5865   -1131   
- Misses         3550       4843   +1293   
+ Partials        887        746    -141   

Powered by Codecov. Last update 7cb0c15...3f450ad

@stevvooe
Copy link
Copy Markdown
Collaborator

I agree, but there is a design problem that we'll have to tackle.

The route overview is here: https://github.com/docker/distribution/blob/master/docs/spec/api.md#detail.

I do agree that this should only go on tags. Unfortunately, the route for listing tags is /v2/<name>/tags/list, which prevents us from adding the route /v2/<name>/tags/<reference>. (There is history behind this: list helps "affix" the route, ensuring that <name> can be parsed correctly).

A few possibilities here:

  1. Go forward with /v2/<name>/tags/<reference> and disallow list as a tag.
  2. Go forward with /v2/<name>/tags/<reference>, add _list and disallow list as a tag, slowly deprecating usage of /v2/<name>/tags/list for /v2/<name>/tags/_list. Add a redirect, as well.
  3. Allow both routes to exist but do different things for GET and DELETE.
  4. Continue supporting /v2/<name>/tags/list but allow the addition of an Accept header to disambiguate, which is only required for the value list.
  5. Just reserve the tag list and disallow it as a supported tag in the registry going forward.

If you have any other ideas, I'd be interested in hearing them.

@sergeyfd
Copy link
Copy Markdown
Contributor Author

I think that the best way would be moving tag list functionality into /v2/<name>/tags/_list. In this case we won't need to make any changes in Docker Engine to disallow any tags. This will also allow us to be consistent withe the future /v2/<name>/manifests/_list route. So this is what should be done from my point of view:

  1. Move /v2/<name>/tags/list to /v2/<name>/tags/_list with a redirect for backward compatibility. That redirect shall be removed later on.
  2. Add new route /v2/<name>/tags/<reference> for a DELETE operation. This route will support only tag references.
  3. Add new route /v2/<name>/manifests/_list for GET operation to retrieve a list of digests in a given repo. I am not sure if this route is a best choice, may be adding /v2/<name>/digests/_list would be a better option.

This work shall be split into several PRs.

@LEW21
Copy link
Copy Markdown

LEW21 commented Jan 16, 2017

In most REST APIs the list would be placed at /v2/<name>/tags/, as in https://en.wikipedia.org/wiki/Representational_state_transfer#Applied_to_Web_services

@stevvooe
Copy link
Copy Markdown
Collaborator

@LEW21 If you analyze the API appropriately, you'll see why this creates ambiguity. The main issue is that <name> can be an unrestricted number of path components. There is a massive set of test cases showing this in the code base. Having these "affixed tuples" (ie manifest/<reference>, tags/list) avoids this.

@sergeyfd The plan looks good except for this:

That redirect shall be removed later on.

This redirect can never be removed without incrementing the API version. One will never be able to access the tag named list but I don't think proposal calls for that.

@sergeyfd sergeyfd mentioned this pull request Jan 21, 2017
@sergeyfd
Copy link
Copy Markdown
Contributor Author

sergeyfd commented Jan 21, 2017

I created a new PR #2169 which covers first two steps of the plan.

@sergeyfd sergeyfd mentioned this pull request Jan 24, 2017
3 tasks
@sergeyfd
Copy link
Copy Markdown
Contributor Author

sergeyfd commented Feb 8, 2017

Closed in favor of #2169

@sergeyfd sergeyfd closed this Feb 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Should images with different tags have different digests?

7 participants