Proposal: Self-describing images #6805

@shykes

Different Docker users need different "routes" for distributing their images. Possible distribution methods include:

  • Publishing images publicly on https://hub.docker.com, for anyone to discover and consume
  • Publishing images privately on https://hub.docker.com, for deployment to other machines, or consumption by trusted collaborators
  • Storing images on an open-source registry, hosted on the local network
  • Saving images to the local filesystem, moving them out-of-band (for example to a sealed-off LAN appliance), and loading them back from the local filesystem
  • Saving images to a static file server such as S3, Swift or equivalent
  • Consuming images from a public mirror of the Hub operated by the local network operator
  • Consuming images from a "satellite" registry operated by an independent software vendor
  • Augmenting all of the above with custom middleware for authentication, caching, federation, peer-to-peer transfer, etc.

Most of these customizations are possible today, but they often degrade the user experience. In some cases, they require unsafe tweaks or even patching of the Docker engine, which is not recommended in production environments and puts interoperability at risk.

We must upgrade Docker's image distribution system to make these customizations not only possible, but easy to configure and use. We must do this while respecting Docker's design philosophy:

  1. Don't break interoperability. Every container and every image must work predictably on every engine.
  2. Don't break separation of concerns. Site-specific configuration must not affect application-specific considerations.

We can implement a robust solution in 3 steps.

1: separate naming and transport

Currently the name of an image is directly linked to its method of transport:

  • If the name is a valid URL, then an https connection to that URL is the sole authorized transport for this image
  • If the name is of the form repo/name, then an https connection to https://registry.hub.docker.com/*/* is the sole authorized transport for this image
  • If the name is of the form name, then an https connection to https://registry.hub.docker.com/_/** is the sole authorized transport for this image
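
To make the coupling concrete, the three rules above can be sketched as a single function that derives the transport from nothing but the name. This is only an illustration, not the engine's actual code: the function name `transport_for` is hypothetical, and the way the name is substituted into the Hub path is an assumption based on the URL patterns listed above.

```python
from urllib.parse import urlparse

# Hub host taken from the rules above.
HUB = "https://registry.hub.docker.com"


def transport_for(name: str) -> str:
    """Return the single HTTPS endpoint implied by an image name (hypothetical helper)."""
    parsed = urlparse(name)
    if parsed.scheme and parsed.netloc:
        # Rule 1: the name is itself a URL, so that URL is the only transport.
        return name
    if "/" in name:
        # Rule 2: repo/name resolves under the Hub (path layout is an assumption).
        repo, image = name.split("/", 1)
        return f"{HUB}/{repo}/{image}"
    # Rule 3: a bare name resolves under the Hub's "_" namespace (also an assumption).
    return f"{HUB}/_/{name}"
```

Because the only input is the name, there is nowhere for site-specific configuration to go: changing the transport means changing the name.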

This breaks separation of concerns. For example, if a sysadmin wants to change the URL at which an internal registry is available, all developers building on top of an image stored in that registry will need to change their Dockerfiles. In this example, we want the sysadmin to be free to change the URL (or other configuration aspects, for example authentication) without affecting the development workflow, and vice-versa.

The solution is to attach an image's name to its content, independently of transport. In practice, this means that 1) the image format should be changed to include its name, and 2) how the image was transferred should not influence its name.
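
One possible shape for this separation, sketched under stated assumptions: the name travels with the image, while the choice of transport lives in site configuration owned by the sysadmin. `SiteConfig`, its fields, `fetch_plan`, and the example URLs are all hypothetical and are not part of this proposal or of any existing Docker API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SiteConfig:
    """Hypothetical site-specific settings, owned by the sysadmin."""
    default_transport: str
    # Optional per-namespace overrides, keyed by the part of the name before "/".
    overrides: Dict[str, str] = field(default_factory=dict)

    def transport_for(self, image_name: str) -> str:
        namespace = image_name.split("/", 1)[0]
        return self.overrides.get(namespace, self.default_transport)


def fetch_plan(image_name: str, site: SiteConfig) -> Dict[str, str]:
    """Describe where an image would be fetched from; the name itself never changes."""
    return {"name": image_name, "via": site.transport_for(image_name)}


# The sysadmin moves the internal registry; a Dockerfile that says "acme/base" is untouched.
before = SiteConfig("https://hub.example", {"acme": "https://old-registry.lan.example"})
after = SiteConfig("https://hub.example", {"acme": "https://new-registry.lan.example"})
```

Here `fetch_plan("acme/base", before)` and `fetch_plan("acme/base", after)` carry the same `name` and differ only in `via`, which is the property the paragraph above asks for.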

Similar to #6802, this requires removing the concept of anonymous images. It should not be possible to designate a filesystem layer (with its randomly generated ID) as an image. An image should be defined by the combination of 1) a name (e.g. shykes/myapp), 2) a tag (e.g. latest), 3) a JSON manifest, and 4) a filesystem layer.

A strict 1-1 mapping of name to image should be enforced: 2 different names mean 2 different images (even if these images are identical in every other way).
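
As a rough sketch of what such an image definition could look like, the record below makes the name and tag mandatory and part of the image's identity. The class name `Image`, its field names, and the string representation of the manifest and layer are assumptions made for illustration; the proposal does not specify a format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Hypothetical self-describing image: all four parts are required."""
    name: str      # e.g. "shykes/myapp"
    tag: str       # e.g. "latest"
    manifest: str  # JSON manifest, kept as text in this sketch
    layer: str     # reference to the filesystem layer

    def __post_init__(self) -> None:
        # No anonymous images: a bare layer without a name and tag is rejected.
        if not self.name or not self.tag:
            raise ValueError("an image must have a name and a tag")


a = Image("shykes/myapp", "latest", "{}", "layer-abc")
b = Image("shykes/other", "latest", "{}", "layer-abc")
```

Since the generated equality compares every field, `a` and `b` are different images even though their manifest and layer are identical, and `Image("", "latest", "{}", "layer-abc")` raises `ValueError`.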

2: enforce a global namespace with cryptographic signature

3: expose a distribution driver API
