Proposal: Provenance step 1 - Transform images for validation and verification #8093

@vbatts

Background

The current image format does not allow for content-addressable images, nor
does it require image metadata to reference the content of the layer it
describes. Layer identifiers are randomly generated, requiring extra
bookkeeping to map layer IDs to the content they refer to. In a highly
distributed environment such as the Docker ecosystem, this bookkeeping is
cumbersome and complicates security.

This relates to #6805 #6959

Proposal Summary

Make images self-describing manifests containing a list of content-addressable
layers, run configuration, and signatures to identify the builder and verify
that the image meets the expectations of the installer.

Image Manifest

The image manifest file will contain all the information needed to pull,
install, validate, and run an image. It will contain a list of layers
referenced by content-addressable ID, history, runtime configuration, and
signatures. This manifest is generated by the daemon. Initially, generation
will happen when an image is published; ultimately it will happen any time an
image is built or committed. Each manifest must be signed by the client
creating the manifest on push or build. Additional signatures can be added
after the build to verify the quality of the manifest or the validity of the
builder. The history will contain fully backward-compatible metadata to allow
old-style layers and metadata to be recreated from the manifest.

Signable manifest (or payload) refers to the portions of the manifest which
will be signed by the builder. The signable manifest is a JSON dictionary
containing the layers, run configuration, and history. The entire signable
manifest will be signed, so any change, including whitespace, will require a
new signature. To aid readability, the signable manifest should be
well-formatted JSON.

Example: (totally subject to change)

{
   "name": "dmcgowan/test-image",
   "tag": "latest",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
      },
      {
         "blobSum": "tarsum+sha256:cea0d2071b01b0a79aa4a05ea56ab6fdf3fafa03369d9f4eea8d46ea33c43e5f"
      },
      {
         "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
      },
      {
         "blobSum": "tarsum+sha256:2a7812e636235448785062100bb9103096aa6655a8f6bb9ac9b13fe8290f66df"
      }
   ],
   "history": [
      "{\"id\":\"a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721\",\"parent\":\"120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16\",\"created\":\"2014-06-05T00:05:35.990887725Z\"...",
      "{\"id\":\"120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16\",\"parent\":\"42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229\",\"created\":\"2014-06-05T00:05:35.692528634Z\"...",
      "{\"id\":\"42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229\",\"parent\":\"511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158\",\"created\":\"2014-06-05T00:05:35.589531476Z\"...",
      "{\"id\":\"511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158\",\"comment\":\"Imported from -\",\"created\":\"2013-06-13T14:03:50.821769-07:00\"..."
   ],
   "schemaVersion": 1
}
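
Because the signature covers the exact bytes of the signable manifest, two
serializations with identical content but different whitespace need different
signatures. The following Python sketch illustrates this, using a SHA-256
digest as a stand-in for the bytes a signer would see. The shortened manifest
and the `payload_digest` helper are assumptions for illustration and are not
part of the proposal.

```python
import hashlib
import json

# Hypothetical, shortened manifest using the field names from the example.
manifest = {
    "name": "dmcgowan/test-image",
    "tag": "latest",
    "architecture": "amd64",
    "fsLayers": [
        {"blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}
    ],
    "history": [],
    "schemaVersion": 1,
}


def payload_digest(payload: bytes) -> str:
    """Digest of the exact bytes that would be handed to the signer."""
    return hashlib.sha256(payload).hexdigest()


# Same content, two different byte representations.
pretty = json.dumps(manifest, indent=3).encode("utf-8")
compact = json.dumps(manifest, separators=(",", ":")).encode("utf-8")

same_content = json.loads(pretty) == json.loads(compact)
same_bytes = payload_digest(pretty) == payload_digest(compact)
print(same_content, same_bytes)
```

A verifier therefore has to check the signature against the bytes as received,
not against a re-serialized copy of the parsed JSON.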

Signed manifest refers to a manifest which includes the signature as well
as the signable manifest. The signed manifest could be represented as either a
JSON Web Signature (JSON serialization, see link), in which the payload is the
base64-encoded signable manifest, or an altered version of the signable
manifest JSON that includes the signature as the last element of the JSON
dictionary, with a record of the alterations to the original signable manifest
included in the signature. Either format is fully verifiable and
tamper-evident.

http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-7.2

Example: (human readable format)

{
   "name": "dmcgowan/test-image",
   "tag": "latest",
   "architecture": "amd64",
   "fsLayers": [
      {
         "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
      },
      {
         "blobSum": "tarsum+sha256:cea0d2071b01b0a79aa4a05ea56ab6fdf3fafa03369d9f4eea8d46ea33c43e5f"
      },
      {
         "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
      },
      {
         "blobSum": "tarsum+sha256:2a7812e636235448785062100bb9103096aa6655a8f6bb9ac9b13fe8290f66df"
      }
   ],
   "history": ["v1 compatible string encoded json for each layer"],
   "schemaVersion": 1,
   "signatures": [
      {
         "header": {
            "jwk": {
               "crv": "P-256",
               "kid": "LYRA:YAG2:QQKS:376F:QQXY:3UNK:SXH7:K6ES:Y5AU:XUN5:ZLVY:KBYL",
               "kty": "EC",
               "x": "Cu_UyxwLgHzE9rvlYSmvVdqYCXY42E9eNhBb0xNv0SQ",
               "y": "zUsjWJkeKQ5tv7S-hl1Tg71cd-CqnrtiiLxSi6N_yc8"
            },
            "alg": "ES256"
         },
         "signature": "m3bgdBXZYRQ4ssAbrgj8Kjl7GNgrKQvmCSY-00yzQosKi-8UBrIRrn3Iu5alj82B6u_jNrkGCjEx3TxrfT1rig",
         "protected": "eyJmb3JtYXRMZW5ndGgiOjYwNjMsImZvcm1hdFRhaWwiOiJDbjAiLCJ0aW1lIjoiMjAxNC0wOS0xMVQxNzoxNDozMFoifQ"
      }
   ]
}
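
The "protected" value in the example is base64url-encoded JSON, so it can be
inspected directly. The Python sketch below decodes it. The `b64url_decode`
helper is a name introduced here. Reading "formatLength" and "formatTail" as
the record of where the signatures block was added to the signable manifest is
an assumption, since the proposal does not define these fields.

```python
import base64
import json


def b64url_decode(value: str) -> bytes:
    """Decode unpadded base64url, as used in JSON Web Signature fields."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)


# The "protected" value copied from the signed manifest example.
protected = (
    "eyJmb3JtYXRMZW5ndGgiOjYwNjMsImZvcm1hdFRhaWwiOiJDbjAiLCJ0aW1lIjoi"
    "MjAxNC0wOS0xMVQxNzoxNDozMFoifQ"
)

header = json.loads(b64url_decode(protected))
# Assumption: formatTail is itself base64url-encoded closing text.
tail = b64url_decode(header["formatTail"])

print(header)
print(tail)
```

The decoded header contains the keys "formatLength", "formatTail", and "time".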

Content Addressable Layers

Each layer of an image will be referenced by a checksum created from its
contents. This checksum will be used on push and pull to verify that the
contents have not been tampered with, and to prevent the layer referred to in
the manifest from being changed after signing.
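
A minimal Python sketch of checksum verification on pull follows. The examples
in this proposal use a `tarsum+sha256` identifier; this sketch hashes the raw
bytes with plain SHA-256 instead, which is a simplification for illustration.
The names `blob_sum` and `verify_layer` are hypothetical.

```python
import hashlib
import hmac


def blob_sum(data: bytes) -> str:
    """Content-derived identifier: plain SHA-256 of the bytes (not tarsum)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_layer(data: bytes, expected: str) -> bool:
    """True only if the received bytes match the checksum from the manifest."""
    return hmac.compare_digest(blob_sum(data), expected)


layer = b"example layer bytes"
expected = blob_sum(layer)             # what the manifest would reference

print(verify_layer(layer, expected))              # untouched content
print(verify_layer(layer + b"!", expected))       # modified content
```

Since the identifier is derived from the content, changing a layer changes its
identifier, and a signed manifest that lists the old identifier no longer
matches the modified layer.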

History

For auditability and assurance of the image, there will be a history section.
This history will convey the life of the image (build steps, ancestry,
prior attestations on parent images, etc.).
It will have a generic form, and it is important to note that its content is
included in the signed payload.
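
As an illustration of how ancestry could be read back from the history, the
Python sketch below parses string-encoded entries and checks that each entry's
"parent" matches the "id" of the entry that follows it, as in the manifest
example. The short IDs, the helper names, and the rule that the base layer has
no "parent" are assumptions made for this sketch.

```python
import json

# Hypothetical history: each entry is string-encoded JSON, newest layer first.
history = [
    json.dumps({"id": "c3", "parent": "b2", "created": "2014-06-05T00:05:35Z"}),
    json.dumps({"id": "b2", "parent": "a1", "created": "2014-06-05T00:05:34Z"}),
    json.dumps({"id": "a1", "comment": "Imported from -", "created": "2013-06-13T14:03:50Z"}),
]


def parse_history(entries):
    """Decode each string-encoded history entry into a dictionary."""
    return [json.loads(entry) for entry in entries]


def ancestry_is_consistent(layers):
    """Check that every layer's parent is the next entry, ending at a base layer."""
    if not layers:
        return False
    for child, parent in zip(layers, layers[1:]):
        if child.get("parent") != parent["id"]:
            return False
    # Assumption: the base layer carries no "parent" field.
    return "parent" not in layers[-1]


layers = parse_history(history)
print(ancestry_is_consistent(layers))
```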

Signature

Every client and daemon will hold a public/private key pair which can be used
to sign manifests. This allows the user on the host who publishes (or builds)
the image to sign the image manifest without sharing their keys with all users
on the host.

Verification

Note: The verification framework will be vetted in a separate proposal
review, but the following is provided for a complete picture of its role.

Verification of a manifest will be done by checking the public key used to sign
the image manifest against an authorization graph linking keys to users and
image namespaces. The authority for this graph will be a remote server which
can respond to authorization queries with signed statements, which can be
cached and imported locally for future authorizations. These signed statements
contain a chain of trust which verifies their authenticity. A root certificate
to verify this chain will ship with Docker to allow immediate verification of
these statements. Certificates are X.509 certificates, and verification uses an
X.509 chain included with the signed statement. The signed statement will be a
JSON Web Signature whose payload is a series of graph nodes and edges to be
imported, with the X.509 chain in the signature header.

http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-4.1.6
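
To make the authorization graph concrete, here is a small Python sketch that
treats authorization as reachability: a key is authorized for a namespace if a
directed path links the key to that namespace. The node naming ("key:",
"user:", "namespace:"), the edge list, and `is_authorized` are all hypothetical.
Verifying the signed statements and the X.509 chain that would deliver these
edges is left out.

```python
from collections import deque


def is_authorized(edges, key_id, namespace):
    """Return True if `namespace` is reachable from `key_id` along directed edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)

    seen = {key_id}
    queue = deque([key_id])
    while queue:
        node = queue.popleft()
        if node == namespace:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical edges, as they might be imported from cached signed statements.
edges = [
    ("key:K1", "user:alice"),
    ("user:alice", "namespace:alice/app"),
    ("key:K2", "user:bob"),
]

print(is_authorized(edges, "key:K1", "namespace:alice/app"))
print(is_authorized(edges, "key:K2", "namespace:alice/app"))
```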

Registry V2 API

(Note: this versioning applies to the API for talking to a Docker registry.)

Unlike the present /v1/... which is locked to the root of the URI path, the
/v2/... is expected to be the relative root of the path, for easier route
handling.

Manage manifest by tag

GET/PUT/DELETE /v2/images/<imgname>/<tagname>

List tags

GET /v2/images/<imgname>/tags

Download an image layer by content id

  • Performs ACL verification
  • Redirects to a temporary signed URL

GET /v2/images/<imgname>/<sumtype>/<sum>

Upload an image layer

PUT /v2/images/<imgname>/<sumtype>/<sum>

Upload an image layer

PUT /v2/images/<imgname>/<sumtype>
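
The routes above can be sketched as small path builders in Python. The
function names and the optional `root` prefix are assumptions for illustration.
The prefix reflects one reading of /v2/... being a relative root rather than
being fixed at the root of the URI path.

```python
def manifest_path(image: str, tag: str, root: str = "") -> str:
    """Path for GET/PUT/DELETE of a manifest by tag."""
    return f"{root}/v2/images/{image}/{tag}"


def tags_path(image: str, root: str = "") -> str:
    """Path for listing the tags of an image."""
    return f"{root}/v2/images/{image}/tags"


def layer_path(image: str, sum_type: str, checksum: str, root: str = "") -> str:
    """Path for downloading or uploading a layer by content id."""
    return f"{root}/v2/images/{image}/{sum_type}/{checksum}"


image = "dmcgowan/test-image"
print(manifest_path(image, "latest"))
print(tags_path(image, root="/registry"))
print(layer_path(image, "tarsum+sha256", "e3b0c44298fc1c14"))
```

The checksum passed to `layer_path` here is shortened for readability and is
not a real layer identifier.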

Compatibility

For compatibility with prior versions of Docker registries and Docker daemons,
the manifest will store the JSON metadata used in previous versions in the
history section. This history will allow recreation of the layers in the
previous format and layout. Version 2 registries can synchronize content with
version 1 registries using this content in order to ensure content is still
accessible through the version 1 API.
The rollout will happen in three phases.

Phase 1

Have a v2-capable registry and Docker daemon that can:

  • detect and push as v2 where possible
  • pull from v2 where possible
  • validate signature on pull

Phase 2

  • escrow layer checksums for layers on the daemon
  • produce a manifest at build/commit time (not just on publish)

Phase 3

  • provide a strict mode where only verified signed images can be run
  • flags to validate layers

Strictly V2

In the future, when signatures are enforced more strictly, it will become more
difficult to do this synchronization, as version 1 will not validate signatures
and version 2 manifests created from version 1 registries will not carry the
signature of the builder.

Attribution

Folks involved in this design so far
@dmcgowan @dmp42 @jlhawn @shykes and @vbatts
