COPY vs RUN --mount type=bind  #2893

@DYefimov

This is an exploratory issue for now.
The main goal is to shed some light on why the docker COPY works differently from the linux cp utility,
and possibly to arrive at a fix for that in the future.

[UPDATE]: RFC for this issue: #2917

The problem

Given a microservice architecture, I'm interested in reusing a single dockerfile target for every service I have in my application.

Consider the following setup:

  1. a ./services folder on the host with many .py files: one per service (in other words, per target image), plus several common ones
  2. a dockerfile with two stages - build_env, and service.template
  3. a compose file that reuses the same service.template target for every service while feeding it a $SOURCES build-arg (e.g. "./services/common ./services/service1.*")

If you use COPY $SOURCES /app in the dockerfile, everything works just fine for individual files, but it fails when you try to include a directory in that space-separated list of sources.
Digging through the buildkit sources, it appears that there is a flag, CopyDirContentsOnly: true, that expands the contents of any directory in my sources list, in effect stripping the top-most directory. And that is not configurable.

The linux cp utility does not do that. And frankly, it's a mystery why one would implement a generic copy some other way rather than reuse something that has proved stable for decades. While exploring the issue, I found several mentions of the reasons, like "backward compatibility"... the usual stuff. Can you please elaborate on the reasoning behind this implementation?
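To make the difference concrete, here is a minimal sketch that can be run outside Docker. It does not invoke buildkit; the loop only emulates the contents-only behaviour described above with plain cp, and the file names (util.py, service1.py) and the app_cp / app_copy directories are made up for illustration.

```shell
#!/bin/sh
set -eu

# Scratch layout mirroring the issue: one shared directory plus a per-service file.
# All names here are illustrative assumptions.
demo=$(mktemp -d)
mkdir -p "$demo/services/common" "$demo/app_cp" "$demo/app_copy"
touch "$demo/services/common/util.py" "$demo/services/service1.py"

cd "$demo"

# cp -r keeps the directory itself: the result is app_cp/common/util.py.
cp -r services/common services/service1.py app_cp/

# Emulation of the contents-only behaviour: for a directory source, copy
# what is inside it, so util.py lands directly in app_copy/ and the
# "common" level is lost.
for src in services/common services/service1.py; do
    if [ -d "$src" ]; then
        cp -r "$src"/. app_copy/
    else
        cp "$src" app_copy/
    fi
done

# Show both resulting trees side by side.
find app_cp app_copy | sort
```

With several directories in the list, the second form merges all of their contents into one flat destination, which is the behaviour the issue is about.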

The workaround

While looking for a solution for the issue, I was exploring standard methods, and there are a couple.

RUN --mount=type=bind,source=services,target=/src mkdir -p /app && tar cf - $SOURCES | tar xf - --strip-components=1 -C /app
looked really promising at first, but it causes a cache miss for every service whenever any file in the ./services directory is updated.
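For reference, the core of that tar pipe can be tried on its own. This is a simplified sketch, not the RUN line itself: it drops the bind mount and --strip-components, runs from inside the services directory, and uses a hypothetical SOURCES value and made-up file names.

```shell
#!/bin/sh
set -eu

# Scratch layout; names are illustrative assumptions.
work=$(mktemp -d)
mkdir -p "$work/services/common" "$work/app"
touch "$work/services/common/util.py" \
      "$work/services/service1.py" "$work/services/service2.py"

# Hypothetical build-arg value, relative to the services directory.
SOURCES="common service1.*"

cd "$work/services"

# $SOURCES is intentionally unquoted so the shell splits it on spaces and
# expands the glob, as it would in the RUN line.
# shellcheck disable=SC2086
tar cf - $SOURCES | tar xf - -C "$work/app"

# The "common" directory is preserved and service2.py is not copied.
find "$work/app" | sort
```

Because tar archives each listed path under its own name, directories in the list keep their top level on extraction, which is the cp-like behaviour COPY lacks.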

There are some other ways to achieve the desired functionality.
For example, outsourcing the bundling process for a service to some other entity outside of the dockerfile in question, such as pulling from git during the build process (side note: personally, I'm strongly against tightly coupling layers of complexity like this).
Or, maybe docker in docker with some tricks inside...
Or, maybe docker in docker building buildkit from sources and applying a custom patch?

You see where this is going: wouldn't a simple COPY that works the way it is "supposed to" be much nicer?

The experiment

I've run some experiments and it seems trivial to add a config option for COPY without introducing a breaking change:

"Not really a patch" here

Just a bare-bones working example:

diff --git a/frontend/dockerfile/dockerfile2llb/convert.go b/frontend/dockerfile/dockerfile2llb/convert.go
index 2dddd4ea..865fc31a 100644
--- a/frontend/dockerfile/dockerfile2llb/convert.go
+++ b/frontend/dockerfile/dockerfile2llb/convert.go
@@ -647,6 +647,7 @@ func dispatch(d *dispatchState, cmd command, opt dispatchOpt) error {
                       chown:        c.Chown,
                       chmod:        c.Chmod,
                       link:         c.Link,
+                       linux_like:   false,
                       location:     c.Location(),
                       opt:          opt,
               })
@@ -692,6 +693,7 @@ func dispatch(d *dispatchState, cmd command, opt dispatchOpt) error {
                       chown:        c.Chown,
                       chmod:        c.Chmod,
                       link:         c.Link,
+                       linux_like:   c.LinuxLike,
                       location:     c.Location(),
                       opt:          opt,
               })
@@ -1035,7 +1037,7 @@ func dispatchCopy(d *dispatchState, cfg copyConfig) error {
                       opts := append([]llb.CopyOption{&llb.CopyInfo{
                               Mode:                mode,
                               FollowSymlinks:      true,
-                               CopyDirContentsOnly: true,
+                               CopyDirContentsOnly: !cfg.linux_like,
                               AttemptUnpack:       cfg.isAddCommand,
                               CreateDestPath:      true,
                               AllowWildcard:       true,
@@ -1124,6 +1126,7 @@ type copyConfig struct {
       chown        string
       chmod        string
       link         bool
+       linux_like   bool
       location     []parser.Range
       opt          dispatchOpt
}
diff --git a/frontend/dockerfile/instructions/commands.go b/frontend/dockerfile/instructions/commands.go
index 48ebf183..5fb28b1d 100644
--- a/frontend/dockerfile/instructions/commands.go
+++ b/frontend/dockerfile/instructions/commands.go
@@ -251,6 +251,7 @@ type CopyCommand struct {
       Chown string
       Chmod string
       Link  bool
+       LinuxLike bool
}

// Expand variables
diff --git a/frontend/dockerfile/instructions/parse.go b/frontend/dockerfile/instructions/parse.go
index a04e9b9d..16e378bf 100644
--- a/frontend/dockerfile/instructions/parse.go
+++ b/frontend/dockerfile/instructions/parse.go
@@ -307,6 +307,7 @@ func parseCopy(req parseRequest) (*CopyCommand, error) {
       flFrom := req.flags.AddString("from", "")
       flChmod := req.flags.AddString("chmod", "")
       flLink := req.flags.AddBool("link", false)
+       flLinuxLike := req.flags.AddBool("linux-like-copy", false)
       if err := req.flags.Parse(); err != nil {
               return nil, err
       }
@@ -323,6 +324,7 @@ func parseCopy(req parseRequest) (*CopyCommand, error) {
               Chown:           flChown.Value,
               Chmod:           flChmod.Value,
               Link:            flLink.Value == "true",
+               LinuxLike:       flLinuxLike.Value == "true",
       }, nil
}

[Update] Small followup

For reference, issue #15858 dates back to 2015 and is still out there.
I didn't find any follow-up or any good proposal, other than introducing a new CP command (well, first ADD, then COPY, and now CP; place your bets on the next one).
Maybe COPY --linux-like-copy ... or COPY --dont_expand_dirs or something along those lines would be a good start?

And btw, at first I thought RUN --mount=type=bind should work the way proposed in #2821, but apparently it is not that smart.
