dockerTools.buildImage layers are 2x too big #94636
Closed
Labels: 0.kind: bug, 2.status: stale, 6.topic: docker tools
Describe the bug
When using dockerTools.buildImage, files specified as contents are added twice, which makes the image twice as big as it should be. They are first added by mkPureLayer (or mkRootLayer) directly under /, without the /nix/store/* path prefix. Then the layerClosures and newFiles handling in buildImage pulls in the very same packages under /nix/store as separate copies.
To Reproduce
Build an image ${IMG} using buildImage, then run:
docker load < result
docker run -t -i ${IMG} ls -lid /bin/bash /nix/store/*bash-interactive*/bin/bash
You'll see that the two files have different inode numbers: the same data is copied twice.
Or you can run docker history ${IMG} and see that the layers are twice the expected size.
Expected behavior
The files should be copied only once. The second path should be either a symlink or a hardlink to the first. In the hardlink case, the mkPureLayer step and the newFiles handling must happen in a single command, so that the archive it produces can share the hardlinks.
This requires some major refactoring of buildImage.
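To illustrate why hardlinks only help when both paths are archived by the same command, here is a minimal Python sketch. It does not use buildImage or Docker at all: it builds a small directory tree with one file reachable under both /bin and /nix/store, once as a plain copy and once as a hardlink, and measures the size of a single tar archive of each tree. The store path `example-pkg` and the file name `tool` are hypothetical, and the sketch assumes a filesystem that supports hardlinks.

```python
import io
import os
import shutil
import tarfile
import tempfile


def make_tree(root, place):
    """Create a store file and a second path to it under bin/ via place(src, dst)."""
    # Hypothetical store path, standing in for a real /nix/store entry.
    store_bin = os.path.join(root, "nix", "store", "example-pkg", "bin")
    os.makedirs(store_bin)
    os.makedirs(os.path.join(root, "bin"))
    src = os.path.join(store_bin, "tool")
    dst = os.path.join(root, "bin", "tool")
    with open(src, "wb") as f:
        f.write(os.urandom(1 << 20))  # 1 MiB of payload
    place(src, dst)
    return src, dst


def tar_size(root):
    """Size in bytes of one uncompressed tar archive of everything under root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(root, arcname=".")
    return len(buf.getvalue())


def compare():
    """Return {strategy: (paths share an inode, archive size in bytes)}."""
    results = {}
    for name, place in (("copy", shutil.copyfile), ("hardlink", os.link)):
        with tempfile.TemporaryDirectory() as root:
            src, dst = make_tree(root, place)
            same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
            results[name] = (same_inode, tar_size(root))
    return results


if __name__ == "__main__":
    for name, (same_inode, size) in compare().items():
        print(f"{name}: same inode={same_inode}, tar size={size} bytes")
```

In the copied tree the two paths have different inodes and the archive carries the payload twice. In the hardlinked tree the archiver sees the same inode a second time and stores only a link entry. That deduplication is per archive, which is why the two steps would need to write into the same archive.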
For added points, separate layer computation from image computation, so that when chaining multiple layers we don't need to pack and then unpack N images of up to N layers each, which consumes O(N^2) resources in both CPU time and disk space.
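The quadratic cost can be made concrete with a small counting sketch. It assumes a simplified model that is not taken from the buildImage source: the k-th image in a chain contains k layers, and building it re-processes every one of them, whereas a design that computes layers separately touches each layer once.

```python
def chained_layer_ops(n):
    """Layer pack/unpack steps when image k re-processes all k of its layers."""
    return sum(range(1, n + 1))  # 1 + 2 + ... + n = n * (n + 1) / 2


def independent_layer_ops(n):
    """Layer steps when every layer is built exactly once."""
    return n


if __name__ == "__main__":
    for n in (1, 5, 10, 50):
        print(n, chained_layer_ops(n), independent_layer_ops(n))
```

Under this model a chain of 10 layers costs 55 layer steps instead of 10, and the gap widens quadratically as the chain grows.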
For yet more points, make it so that layers that are independent of each other can be built in parallel, instead of requiring a total order of layers.
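One way to think about replacing the total order with a partial order is to group layers into "waves", where every layer in a wave depends only on layers from earlier waves. The sketch below uses Python's standard graphlib for this. The layer names and the `build_layer` callback are hypothetical, and nothing here reflects how dockerTools is actually structured.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def build_waves(deps):
    """Group layers into waves; deps maps each layer to the layers it depends on."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all layers whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


def build_all(deps, build_layer):
    """Build wave by wave, running the layers of each wave concurrently."""
    built = []
    with ThreadPoolExecutor() as pool:
        for wave in build_waves(deps):
            built.extend(pool.map(build_layer, wave))
    return built


# Hypothetical layer graph: two independent layers on a shared base.
EXAMPLE_DEPS = {
    "base": [],
    "libs-a": ["base"],
    "libs-b": ["base"],
    "app": ["libs-a", "libs-b"],
}

if __name__ == "__main__":
    print(build_waves(EXAMPLE_DEPS))
```

Here `libs-a` and `libs-b` land in the same wave and could be built at the same time, while a total order would force one to wait for the other.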
Bonus: instead of running runAsRoot commands in a virtual machine, what about using the much lighter-weight fakeroot, just like Debian does? This might even remove the need for the two vastly separate cases, mkPureLayer and mkRootLayer.
Notify maintainers
@roberth @utdemir @alexbiehl @nlewo @grahamc