Docker users frequently run out of disk space if they build or pull many successive versions of the same image. That is because a build produces many intermediate filesystem layers (1 layer per build step), and when an image is updated, many of those layers are no longer referenced by any tag. They become unreachable by any image name (such layers are sometimes also called "dangling").
Over time unreachable layers can accumulate and occupy large amounts of disk space. And because there is no convenient way to discover and remove unreachable layers, they keep accumulating until the disk is full.
We can implement a solution in 2 steps:
1: Remove the concept of anonymous images
Currently any filesystem layer qualifies as an "image", whether it's referenced by a name and tag or not. This makes garbage collection impossible because, from the Engine's point of view, every layer is potentially useful. There is no way to tell which layers the user considers "reachable" and which they consider "unreachable".
To solve this, we must make it impossible to use a layer ID in lieu of an image name. This will force users to create proper references for every image they want to reference. And in turn it allows the Engine to cleanly distinguish reachable layers (any layer referenced by at least 1 image name) from unreachable layers (all the other layers).
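The reachability rule above can be sketched as a simple graph walk. This is only an illustration, not the Engine's implementation: the function name `reachable_layers`, the `tags` and `parents` mappings, and the assumption that each layer has at most one parent are all hypothetical.

```python
from typing import Dict, Optional, Set


def reachable_layers(tags: Dict[str, str],
                     parents: Dict[str, Optional[str]]) -> Set[str]:
    """Return every layer ID reachable from at least one image name.

    tags:    image name (e.g. "app:latest") -> ID of its top layer
    parents: layer ID -> parent layer ID, or None for a base layer
    Both mappings are assumptions made for this sketch.
    """
    reachable: Set[str] = set()
    for top_layer in tags.values():
        current: Optional[str] = top_layer
        # Walk up the parent chain; stop early at layers already visited.
        while current is not None and current not in reachable:
            reachable.add(current)
            current = parents.get(current)
    return reachable


# Four layers: "c" and "d" both sit on top of "b", which sits on "a".
parents = {"a": None, "b": "a", "c": "b", "d": "b"}
# Only "d" is referenced by an image name.
tags = {"app:latest": "d"}

# Unreachable layers are all known layers minus the reachable ones.
unreachable = set(parents) - reachable_layers(tags, parents)
```

In this example, layer `c` is the only one that no image name leads to, so it is the only unreachable layer.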
2: docker prune: prune all unreachable filesystem objects
Once the Engine has a clear definition of "unreachable", we can implement a docker prune command, following the model of git prune. This command removes all filesystem layers which are not reachable by an image with a proper name and tag.
Note: docker prune is not the same thing as docker rmi. The former applies to filesystem layers (which have an ID and filesystem state); the latter applies to images, which have a name and tags, and reference one or more filesystem layers.
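To make the distinction concrete, here is a toy model of the proposed semantics. It is a sketch under stated assumptions, not Docker's code: the `LayerStore` class and its `rmi` and `prune` methods are hypothetical names, and each layer is assumed to have at most one parent. In this model, `rmi` only drops an image name, while `prune` deletes the layers that no remaining name can reach.

```python
from typing import Dict, Optional, Set


class LayerStore:
    """Hypothetical in-memory model of names and layers (illustration only)."""

    def __init__(self) -> None:
        self.parents: Dict[str, Optional[str]] = {}  # layer ID -> parent ID
        self.tags: Dict[str, str] = {}               # image name -> top layer ID

    def rmi(self, name: str) -> None:
        # Operates on images: removes the name, leaves all layers on disk.
        del self.tags[name]

    def prune(self) -> Set[str]:
        # Operates on layers: mark everything reachable from a name...
        reachable: Set[str] = set()
        for top_layer in self.tags.values():
            current: Optional[str] = top_layer
            while current is not None and current not in reachable:
                reachable.add(current)
                current = self.parents.get(current)
        # ...then sweep away every layer that was not marked.
        removed = set(self.parents) - reachable
        for layer_id in removed:
            del self.parents[layer_id]
        return removed


store = LayerStore()
store.parents = {"a": None, "b": "a", "c": "b", "d": "b"}
store.tags = {"app:v1": "c", "app:v2": "d"}

store.rmi("app:v1")                      # the name is gone...
layers_after_rmi = set(store.parents)    # ...but all four layers remain
removed = store.prune()                  # now the orphaned layer "c" is deleted
layers_after_prune = set(store.parents)
```

Here removing the name `app:v1` frees no disk space on its own; only the later `prune` call deletes layer `c`, because layers `a` and `b` are still needed by `app:v2`.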