Description
This is a spec for abstracting the layer ID (created by TarSum so that it is deterministic) from the image ID (which identifies the json metadata, which in turn references a particular layer ID).
Initially, a deterministic image ID will not be directly possible, as the usual TarSum calculation includes the contents of the image's json metadata, which itself includes the image's ID.
Once the file system layer has a deterministic ID, generated with a fixed-time hash (the TarSum) that does not include the json metadata, it will be possible to deduplicate layers that are exactly the same. This may happen often, as the primary differences between images will be in the json metadata (this could happen with a RUN, but also with MAINTAINER, ENV, PORT, ENTRYPOINT, CMD and VOLUMES).
Therefore, multiple images (json) could reference the same layer ID.
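To illustrate the deduplication idea, here is a minimal sketch of a content-derived layer ID. It is not the real TarSum algorithm: the choice of SHA-256, the header fields that get hashed, and the names `make_layer` and `layer_id` are all assumptions made for this example. The point is only that the ID depends on the layer's files and never on any json metadata.

```python
import hashlib
import io
import tarfile


def make_layer(files):
    """Build an in-memory tar archive from {name: bytes} (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def layer_id(layer_tar):
    """Simplified stand-in for TarSum: hash each entry's name, mode, size and
    contents, visiting entries in sorted name order. No json metadata is read."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(layer_tar), mode="r") as tar:
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            header = f"{member.name}\0{member.mode}\0{member.size}\0"
            digest.update(header.encode("utf-8"))
            if member.isfile():
                digest.update(tar.extractfile(member).read())
    return digest.hexdigest()


# Two archives holding the same files in a different order differ byte for
# byte, yet they map to one layer ID and could be stored once.
first = make_layer({"etc/motd": b"hello\n", "bin/tool": b"\x7fELF"})
second = make_layer({"bin/tool": b"\x7fELF", "etc/motd": b"hello\n"})
same_layer = layer_id(first) == layer_id(second)
```

With an ID like this, an instruction that only changes metadata (for example ENV or CMD) would produce a new json file pointing at an already stored layer.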
The next step is the image (json) ID: for it to be deterministic, the ID must not be inside the json itself. Perhaps, since the json already lives at a directory path of .//json, the ID can be inferred from its path and verified by checksumming the json together with the layer ID that it references.
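The verification step could look roughly like the sketch below. Everything here is an assumption for illustration: the function names `image_id` and `verify`, the use of SHA-256, the canonical JSON encoding, and the way the json and layer ID are combined are not part of any existing Docker API.

```python
import hashlib
import json


def image_id(metadata, layer_id):
    """Checksum the json metadata together with the layer ID it references.
    The metadata must not carry its own ID, or the result would be circular."""
    if "id" in metadata:
        raise ValueError("metadata must not embed its own ID")
    # Sorted keys and fixed separators give one byte encoding per document.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256()
    digest.update(canonical.encode("utf-8"))
    digest.update(b"\0")  # separator between the json and the layer ID
    digest.update(layer_id.encode("utf-8"))
    return digest.hexdigest()


def verify(dir_name, metadata, layer_id):
    """Check that the ID inferred from the directory name matches the checksum."""
    return dir_name == image_id(metadata, layer_id)


# Two images that differ only in metadata share a layer but get distinct IDs.
shared_layer = "0" * 64  # placeholder layer ID for the example
web = image_id({"Cmd": ["nginx"], "layer": shared_layer}, shared_layer)
shell = image_id({"Cmd": ["sh"], "layer": shared_layer}, shared_layer)
```

Because the ID is recomputed from the content, a json file that was edited or moved under the wrong directory name fails `verify` without any ID field being stored in the json.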