Refcount images and cleanup orphans#409
Conversation
|
I think we should have that completely optional. The registry should work without it. On a large dataset, if a refcount is not updated and the refcount index gets corrupted, what are the options to rebuild it? I think it's suitable only for small dataset by design. |
|
On Wed, Jun 04, 2014 at 08:27:01AM -0700, Sam Alba wrote:
Can do. However, the only use cases I can think of where you'd want to
Neither is particularly convincing to me, but there's no accounting
As in “docker-registry crashes while/before writing the references
Neither case is currently handled in this PR, but I can add the code
Why is that? Is the problem the long list of references for In any case, how you implement refcounting is a storage-side decision. |
To rephrase what you explain, the operation is not atomic. Walking the whole dataset means it's not suitable for big dataset. |
|
On Wed, Jun 04, 2014 at 10:07:53AM -0700, Sam Alba wrote:
To make things atomic, I'd suggest initially staging images (and their Even with atomic file-writes, you'd need some way to distinguish def put_tag(namespace, repository, tag): When touching the references files, we'd check for the provisional How does that sound? Having a transactional storage backend would make this much easier ;). |
This way we can access FileNotFoundError too, without the flake8 "H301 one import per line" raised by something like: from .exceptions import FileNotFoundError, NotImplementedError I'm not sure why we implement our own NotImplementedError when Python already has one built in, but that's probably off topic :p.
From Index._walk_storage. It's useful for creating the initial index, and also for the new storage upgrades.
Mirroring Base.get_registries. Both methods are useful for abstracting away the current nested filesystem implementation, which will hopefully not be baked into Base for much longer.
From docker_registry.tags.get_tags. Centralizing another abstraction method helps hide the current nested filesystem implementation from the coming upgrade methods.
Besides factoring out a bunch of repetitive
{self.images}/{image_id}
construction, this gives us a path to recursively remove when we want
to delete an image. Modeled after ab3805a (storage: Add
Storage.repository_path, 2014-02-12).
This is more explicit, and avoids getting clobbered by the later: data = create_tag_json(user_agent=ua)
Before this commit there was no way to remove images that were no longer needed by any tags. This commit adds a persistent list of references for each image, with both the depending image id and the namespace, repository, and tag. When we delete a tag, we traverse it's ancestors removing any obsolete references and any images that are no longer referenced. This is more complicated than just having a refcount number, but also more resilient. For example, someone will eventually have a power outage after removing a tag while the refcounts were still being adjusted. By keeping the references themselves in a list, it's easy to check for stale references and remove them. If it turns out that the removal checks are too slow, it would be easy to make them more optimistic (e.g. passing the descendant image and tags that we expect to remove down the _remove_references stack and removing them instead of checking all listed references for all ancestors). The JSON serialization for the [namespace, repository, tag] key allows us to use JSON to serialize the resulting dictionary (JSON only supports string keys). It's a bit awkward, but I wanted to make sure a single tag was only represented once (with whatever its current image was) in the references list.
The upgrade from storage v0.0 to storage v0.1 just adds tag references and cleans up unreferenced images. I think all the storage subclasses are too tightly bound to the file-based structure layed out in driver.Base. It would be nice is we exposed an API closer to the web API (add a tag, get a tag, remove a tag, add a layer, get a layer, remove a layer, ...), and left the internal details completely up to the driver. Of course, the file-like handling the the current Base class could be a FileStorage utility for easy subclassing, but it feels wrong to bake it into Base. If we do pull it out, we'd want to version both the Base API used by storage consumers. We'd leave it up to the implementations (like FileStorage) to version the backing data store they used to provide the Base API: registry core <--(Base API)--> storage driver <--> backing store Base.version and the version checks in Base._check_version allow the storage driver to manage the backing store, and are independent of the Base API used by the core. In any case, the data in the backing store persists between software upgrades, so we need *some* way to manage it as we polish the persistent format.
This is part of the internal backing-store interface, and isn't part of the public Base API.
|
After cleaning up some silly typos, the unit tests are passing again We still need to reach a consensus on where this should live (I can |
|
Would love to see this in. What's holding it back? |
|
Mainly:
Either way, if you want to try it, I guess nothing prevents you from giving it a spin from @wking branch. |
|
On Wed, Jun 25, 2014 at 08:57:25AM -0700, Olivier Gambier wrote:
Optional shouldn't be too hard, especially now that option parsing is |
|
On Wed, Jun 25, 2014 at 11:26:14AM -0700, W. Trevor King wrote:
Ping. |
|
On Tue, Aug 12, 2014 at 09:28:21AM -0700, W. Trevor King wrote:
I just saw #533, so I thought I'd ping this again. I'm still looking |
|
@wking sorry about the lag on this - will look |
|
On Tue, Aug 12, 2014 at 09:28:21AM -0700, W. Trevor King wrote:
I just saw #533, so I thought I'd ping this again. I'm still looking |
|
About the open questions: I would like this to live in its own subclass, and leave Base as-is. Also, I would be happy with separate drivers (s3-rf, file-rf - where rf=ref-counting - but any other, better, suffix can do) that would subclass both their parent (s3, file) and the refcounting base if that's possible. I really want this as standalone and "separate" as possible - and to minimize the effort on the drivers developers - and make this an explicit (simple) choice for the users. Sorry for the long wait on this, and for the bikeshedding - and thank you for your work on this @wking Best. |
|
@wking would you like me to try to convert this to an extension? |
|
On Tue, Sep 16, 2014 at 01:08:00PM -0700, Andy Goldstein wrote:
Sure. If we're not touching Base, what's your plan for injecting this I think this sort of thing would be much cleaner once we had the |
|
@wking I wasn't actually going to try to modify Storage at all. I was going to treat all the reference counting code (and the files it creates) as something solely managed by the extension (even though _references goes in .../images/[id]/). |
|
On Tue, Sep 16, 2014 at 01:48:00PM -0700, Andy Goldstein wrote:
Ah, I suppose you could tie it to the signals instead. Where would |
|
If we use your existing code in this PR, just like you've been doing it via image_references_path. |
|
On Tue, Sep 16, 2014 at 01:59:20PM -0700, Andy Goldstein wrote:
Which I added to Base in 3603066. If you're not touching Base, where |
|
I'd put image_references_path in the extension code, and use signals to react to tags being created and deleted. |
|
On Tue, Sep 16, 2014 at 02:08:49PM -0700, Andy Goldstein wrote:
I think it's going to be hard to convince me to like anything that's |
|
@wking do you think we need to worry about file locking and/or mutexes when updating image references? |
|
On Wed, Sep 17, 2014 at 12:56:01PM -0700, Andy Goldstein wrote:
Here are my previous thoughts for these issues [1,2]. |
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
|
@wking Sorry if this has been covered off (I'm having problems following the various discussions on deletion of images). Will the code as is clean up images on an s3 compatible back end (I ask because most of the scripts/cleanups I have seen in other threads look to be local file system only)? |
|
On Thu, Oct 09, 2014 at 02:52:35PM -0700, dbason wrote:
I tweaked Base a bit, but I don't think I added any new |
Add refcount extension to store image references and cleanup orphans. Based heavily on @wking's work in docker-archive#409 - this is just an adaptation to extension format. Signed-off-by: Andy Goldstein <agoldste@redhat.com>
Also add versioning to the storage backend, and lay some groundwork
for a more opaque storage driver. It shouldn't take too much more
work to factor out the file-based assumptions out of driver.Base, but
I've left that for later work. See the commit messages for details on
specific changes. And let Travis complete before merging, since I
haven't even run this locally yet ;). I'll try and get on tomorrow
and link this to some earlier discussion, but I don't have time to
look up the references now ;).