Added the BINDCONTEXT dockerfile instruction#5369

Closed
a-ba wants to merge 5 commits into moby:master from a-ba:3056-bindcontext

@a-ba

a-ba commented Apr 23, 2014

closes #3056

The BINDCONTEXT <mountpoint> instruction arranges for the context to be mounted at the location <mountpoint> inside the container during the subsequent RUN commands.

The context is mounted in read-write mode. This is convenient when building something from sources located in the context since only files that are explicitly installed will persist in the image. Everything else (sources, intermediate files,...) is wiped out when the build is over.

The command is also useful when dealing with voluminous files. Using BINDCONTEXT instead of ADD avoids making extra copies of temporary files and avoids keeping them in the final image. Depending on the case, this can mitigate #2224 and #332.

There are two outstanding questions:

  1. The command mounts the context directory directly and lets the user modify it (and possibly interfere with later Dockerfile commands). An alternative would be to write changes to an aufs layer to leave the server context intact, but this adds a dependency on aufs (and I wonder if this is really necessary).
  2. The mountpoint stays in the image after the build and is listed in the 'Volumes' section of the image config. Ideally I would prefer removing it from the config and from the filesystem (if it did not exist prior to the mount). I am not very familiar with the runtime API; is there a way to do that cleanly before the commit?
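
As a rough illustration of the behaviour described above, a Dockerfile using the proposed instruction might look like the one written out below. This is only a sketch: BINDCONTEXT exists only in this patch, so a stock docker build would reject the file, and the /context mountpoint, the package list, and the make targets are hypothetical.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for a build context (hypothetical layout).
workdir=$(mktemp -d)

# Write the sketch Dockerfile. BINDCONTEXT is the instruction proposed in
# this PR; it is not understood by a stock Docker daemon.
cat > "$workdir/Dockerfile" <<'EOF'
FROM ubuntu:12.04
RUN apt-get update && apt-get install -y build-essential
# Mount the build context read-write at /context for the following RUN steps.
BINDCONTEXT /context
# Build in the source tree; only what "make install" copies out of /context
# would persist in the image.
RUN cd /context && make && make install
EOF

cat "$workdir/Dockerfile"
```

The script only writes and prints the file; it does not invoke docker build, since that would need a daemon built from this branch.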

closes moby#3056

The `BINDCONTEXT` instruction will arrange for the context to be mounted
at the location `<mountpoint>` inside the container during the
subsequent `RUN` commands.

Docker-DCO-1.1-Signed-off-by: Anthony Baire <Anthony.Baire@irisa.fr> (github: a-ba)
@tailhook

Hi,

This is a very nice idea.

the command mounts directly the context directory and let the user modify it

It might be nice to make it a read-only mount, so that the container can't mess up the files.

@blueyed

blueyed commented Apr 24, 2014

Very nice to have this during building!

It might be nice to make a read-only mount. So that container can't mess up the files.

👍 for something like BINDCONTEXT_RO.

@tailhook

for something like BINDCONTEXT_RO

I would make BINDCONTEXT read-only by default.

@tianon
Member

tianon commented Apr 24, 2014

What about something like BINDCONTEXT /path:rw ?

I think we also need to wait for some clarification from the core
developers on the acceptance of the general idea before we get too deep
down the design hole.

@crosbymichael
Contributor

ping @shykes

This solves part of the problem, but I'm not sure if it is in line with @vbatts' and your ideas.

@derfred

derfred commented Apr 25, 2014

+1, during my container builds I need access to a private ssh key, which afterwards I don't need anymore.

Due to the layering this key gets baked into the image and I have to be careful about protecting the generated binaries. With this change the secret information is no longer in the images and the protection requirements for the whole infrastructure (private registry, hosting servers) become way simpler.

@vincentwoo
Contributor

I am also very interested in this for the "I need 2gb of build artifacts to make a 10mb binary" angle.

@thaJeztah
Member

Would it be possible to achieve the same with a VOLUME Instruction? I think (currently) using a volume during build will still add to the image size (will need to check).

Just wondering if a new instruction needs to be added to the Dockerfile vocabulary when it performs similar functionality to the existing VOLUME instruction (i.e. use files from the host's scope and keep them outside of the image).

Some way to remove a volume from within a Dockerfile would be useful in these cases.

@a-ba
Author

a-ba commented Apr 25, 2014

VOLUME+ADD do not seem to give the expected result. The context lands in the image anyway, and it still requires a copy.

# cat Dockerfile 
FROM ubuntu:12.04
VOLUME /tmp
ADD . /tmp/context
# docker build .
Uploading context  2.56 kB
Uploading context 
Step 0 : FROM ubuntu:12.04
 ---> c0fe63f9a4c1
Step 1 : VOLUME /tmp
 ---> 50ffa5596567
Step 2 : ADD . /tmp/context
 ---> 751a9e7ee82a
Successfully built 751a9e7ee82a
# docker run -t -i --rm 751a find /tmp
/tmp
/tmp/context
/tmp/context/Dockerfile

@tailhook

Some way to remove a volume from within a Dockerfile would be useful in these cases.

There should be a way to remove a volume regardless of this issue. E.g. I often mount /var/cache/ccache, which is not needed after the build.

@thaJeztah
Member

@a-ba that's what I remembered happening. The question here is; is that by design or a bug? From my point of view, content added to a volume should not end up in an image. This may be a misinterpretation on my side.

If volumes did not end up in an image, most cases could be handled by using a volume in the Dockerfile, which would keep things simple. Maybe there are other situations that are not handled and that I'm overlooking, of course.

@crosbymichael @shykes Should content added to a volume during build end up inside an image? And if so, for what reason?

Note; If the VOLUME syntax would be on par with docker run -v, this would be awesome;

VOLUME /host-path/to/large-sourcefiles:/tmp/install

@tailhook

by design or a bug?

AFAIK, by design. There is a feature request to initialize a volume from the image content. But the volume not being part of the image is its main use case. In fact, the VOLUME command is an annotation for a future docker run (similarly to CMD, for example), not something useful during the build.

Note; If the VOLUME syntax would be on par with docker run -v, this would be awesome;

A Dockerfile must not contain anything about the host file system, and it must be secure, so that a malicious Dockerfile can't mount /etc from the host system. That's by design too.

@a-ba
Author

a-ba commented Apr 25, 2014

The issue with VOLUME is that the data is only available during the
lifetime of the container and docker build spawns a new separate
container for each RUN line (and for each ADD line too). If this is a
bug, then the data should not be present even for the RUN command.

Contributor

I think it would be good to expand on what happens outside of docker build time to make sure there's no ambiguity for our more creative readers :)

What dastardly things happen if I BINDCONTEXT /etc? I presume that allows me to feel safe that the final image will have a clean /etc, just like the FROM container had.

if I BINDCONTEXT /non-existant-dir will that dir exist in the final image, but be empty?

fun feature.

@thaJeztah
Member

In fact VOLUME command is an annotation for future docker run (similarly to CMD for example)

Ok, clear, maybe this could be re-considered? Build creates intermediate containers, why not have it use a VOLUME as well?

Docker file must not contain something about host file system. And must be secure.

Agreed, a Dockerfile should not be aware of the host filesystem. Implementing the -v option for build could be an option (build -v ./sourcefiles:/tmp/install).

...So some malicious Dockerfile can't mount /etc from host system. It's by design too.

Not sure if security is a major issue here; the content of a Dockerfile is readable. IMO, docker run could be just as harmful if somebody tells you to "run my image with this command: docker run -v /etc:/some-path …"

@tailhook

Ok, clear, maybe this could be re-considered? Build creates intermediate containers, why not have it use a VOLUME as well?

I'm not in charge of that. But what's the use case? If you need some files, you put them in the context, then explicitly copy them to the image.

Not sure if security is a major issue here; the content of a Dockerfile is readable. IMO, docker run could be just as harmful if somebody tells you to "run my image with this command: docker run -v /etc:/some-path …"

The primary use case is a trusted user building an untrusted Dockerfile :)

@thaJeztah
Member

But what's the use case? If you need some files, you put them in context, then explicitly copy to the image.

Isn't that what this PR is about? Using files inside an image, without copying them to the image (or at least without them taking up space inside the image)? When running a container, volumes provide exactly that, but volumes are not available during build.

@tailhook

Isn't that what this PR is about?

This pull request allows mounting the "context", i.e. the directory where the Dockerfile is. Currently this directory is uploaded to the docker daemon in full. But sometimes the context is huge, and it's inefficient to tar it, send it to the daemon, and untar it again. Also, sometimes the context contains many files that aren't actually needed (e.g. the .git directory).

All in all you may think of this patch as an optimization of what can already be done now.

@vbatts
Contributor

vbatts commented Apr 28, 2014

I can't help but think that this functionality would be better served by a -v/--volume flag for docker build (like there is for docker run), such that you could bind-mount a volume at build time that is not part of the committed image.
@shykes thoughts?

v0.11.1

Conflicts:
	docs/sources/reference/builder.md
@tianon
Member

tianon commented Jun 16, 2014

I think the benefit of this over an explicit -v is that it could be relied upon, especially for doing heavy software builds where you don't want the source code and the build artifacts in the final image (as has been discussed in this thread previously). This would make it useful even for Automated Builds on the Hub.

@ppcherng

Is this still being worked on? I unfortunately am not technically adept enough to contribute to this effort, but I would like to voice my desire to see this feature in a future Docker release. I frequently find myself in situations where I need to build from source within the image, but using the ADD instruction to copy the source files to the container results in inflation of the image size, which is very undesirable for my applications. The alternative I am forced to use then is to run in interactive mode and use -v to mount the directory containing the source files, but this gets very tricky to automate.

@timthelion
Contributor

I'm 👍 on this. This is a good alternative to the secrets stuff that @vbatts is currently working on.

@daviddyball

Is this in the schedule to be merged? It's a great feature and would really allow us to better utilize the build cache if we didn't invalidate everything with ADD commands at the start of our Dockerfiles.

@mkscrg

mkscrg commented Aug 19, 2014

+1 on this. We've been hurting for a way to mount a volume during container build for a long time. Our current solution is a set of scripts that (poorly) approximate a Dockerfile. Our use case is caching Maven dependencies between builds and keeping them out of our images.

@a-ba
Author

a-ba commented Oct 30, 2014

Some thoughts about this PR:

  • Currently the command works on the whole context, which prevents efficient use of the cache. It could be a good idea to take two parameters, as the ADD and COPY commands do, to allow binding a subdirectory of the context.
  • Regarding read-only vs. read-write binds, I think read-write is useful for two reasons:
    • it allows building in the source tree directly (so there is no need to copy the sources or to adapt the build script to build outside the tree)
    • it provides persistence between successive RUN commands without having to store anything in the resulting images

a-ba added 2 commits October 30, 2014 20:57
…-bindcontext

Conflicts:
	builder/internals.go
	integration-cli/docker_cli_build_test.go
Docker-DCO-1.1-Signed-off-by: Anthony Baire <Anthony.Baire@irisa.fr> (github: a-ba)
@crosbymichael
Contributor

Tagged as UX for next design session.

@shykes
Contributor

shykes commented Dec 23, 2014

Design review with @icecrime

NOTE: this is only a design review. Let's please reach agreement on design before going back and forth on code details. This is to avoid wasting your time.

I agree on the idea. But the word BINDCONTEXT is really not clear. I would like to change it to ADDTMP, for 3 reasons:

  1. it makes it obvious that it's related to ADD
  2. it's nice and short
  3. TMP, even though it's short, is pretty descriptive.

I would expect the docs to say something like: "ADDTMP works exactly like ADD, except the source directory is only temporarily added to the container, and will be removed, along with any changes, at the end of the build".

I would really emphasize "works exactly like". We should make sure that all artifacts of ADD are preserved in ADDTMP unless documented otherwise. For example, the added directory should be writeable, and changes should persist across build steps; source and destination paths should be resolved following the same rules (choice of file vs. directory etc). And so on.

@thaJeztah
Member

@shykes with "exactly like ADD", does that also imply the magic, such as downloading and automatically extracting archives? I thought the plan was to "phase out" ADD, or has that decision been reverted?

Just asking, and to prevent wrong expectations.

@icecrime
Contributor

icecrime commented Jan 6, 2015

We're closing this one in favor of a later docs PR: maintainers (@crosbymichael @unclejack @tiborvass) are against a new Dockerfile verb for this purpose, and are suggesting other possibilities (such as having the context always bind-mounted in the container as part of the build process).

@icecrime icecrime closed this Jan 6, 2015
@mkscrg

mkscrg commented Jan 6, 2015

@icecrime can you ref that PR? I'd like to keep following along on this issue

@tiborvass
Contributor

@mkscrg there is none afaict, but as soon as there is one, we'll add a reference to it on this one.

@tianon
Member

tianon commented Jan 6, 2015

Always bind-mounting the context won't work. Once the context is
bind-mounted, we have to cache-bust at that point based on the entire
context, which means that if we bind-mount it by default, we're
cache-busted at line 1 every time any file in the entire context changes.
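
The cache-busting argument can be pictured with a small shell sketch. It is not how the builder is implemented: the fingerprint helper and the directory layout are hypothetical, and cksum merely stands in for whatever digest a real builder would use. It shows that a key derived from the whole context changes when an unrelated file changes, while a key derived from a single subdirectory does not.

```shell
#!/bin/sh
set -eu

# fingerprint DIR: one checksum over the names and contents of every file
# under DIR (hypothetical helper; cksum is a stand-in for a real digest).
fingerprint() {
    (cd "$1" && find . -type f | sort | xargs cksum | cksum)
}

# Scratch context with a source tree and some unrelated documentation.
ctx=$(mktemp -d)
mkdir -p "$ctx/src" "$ctx/docs"
echo 'int main(void) { return 0; }' > "$ctx/src/main.c"
echo 'first draft' > "$ctx/docs/README"

whole_before=$(fingerprint "$ctx")
src_before=$(fingerprint "$ctx/src")

# Change a file that the build steps never read.
echo 'second draft' > "$ctx/docs/README"

whole_after=$(fingerprint "$ctx")
src_after=$(fingerprint "$ctx/src")

if [ "$whole_before" != "$whole_after" ]; then
    echo "whole-context key changed: every step after the mount is rebuilt"
fi
if [ "$src_before" = "$src_after" ]; then
    echo "src-only key unchanged: steps that bind only src stay cached"
fi

rm -rf "$ctx"
```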

@crosbymichael
Contributor

@tianon good point

@tianon
Member

tianon commented Jan 7, 2015

What if we essentially copied what COPY and ADD look like?

MOUNT-CONTEXT some-dir /somewhere-in-the-image
MOUNT-CONTEXT some/sub/directory /somewhere-else
MOUNT-CONTEXT . /ALL-THE-THINGS

@crosbymichael
Contributor

@tianon
Member

tianon commented Jan 7, 2015

@crosbymichael ❤️

@a-ba
Author

a-ba commented Apr 20, 2015

I am still maintaining the patch at https://github.com/a-ba/docker/tree/3056-bindcontext (up to date with 1.6.0), since we have no alternative for the moment.

ADDTMP would be fine for us though it does not help in handling large files (but that part should rather be addressed by caching/CoW, eg. #9553)

@dagelf
Contributor

dagelf commented May 4, 2015

@a-ba 1) Is the idea for your "BINDCONTEXT" to be silently ignored if the local context is not available?
2) Is the idea to only have it available during build?
3) Would it be a big hassle to make it accept the path-in-host:path-in-image format?
4) Do you have any issues with calling it "BINDMOUNT" rather? Because isn't that in fact what it is doing? Context generally refers to metadata or something else, rather than a tree of the filesystem, whereas the word MOUNT is pretty clear in its purpose.

@a-ba
Author

a-ba commented May 4, 2015

  1. Is the idea for your "BINDCONTEXT" to be silently ignored if the local context is not available?

There is always a context: at least it contains the Dockerfile. If there is only a Dockerfile, BINDCONTEXT would not be ignored; however, it would be mostly useless.

  2. Is the idea to only have it available during build?

yes

  3. Would it be a big hassle to make it accept the path-in-host:path-in-image format?

There are two problems with letting the user bind an arbitrary directory from the machine: security (it gives root access) and side-effects (we have no guarantee that the data will not be altered after they are fingerprinted and before they are used in the build)

path-in-context:path-in-image would be ok (for binding part of the context)
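
One way to picture the path-in-context restriction is to resolve the requested source and refuse it unless it stays inside the context directory. The sketch below is not taken from the patch: inside_context is a hypothetical helper, and it only addresses the containment concern, not the risk of data changing between fingerprinting and use.

```shell
#!/bin/sh
set -eu

# inside_context CONTEXT SUBPATH (hypothetical helper)
# Succeeds only if SUBPATH is an existing directory that resolves,
# symlinks included, to a location inside CONTEXT.
inside_context() {
    ctx_real=$(cd "$1" && pwd -P) || return 1
    target_real=$(cd "$1/$2" 2>/dev/null && pwd -P) || return 1
    case "$target_real/" in
        "$ctx_real"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Scratch context with one real subdirectory and one symlink that escapes.
ctx=$(mktemp -d)
mkdir -p "$ctx/src"
ln -s / "$ctx/escape"

for sub in src ../.. escape; do
    if inside_context "$ctx" "$sub"; then
        echo "$sub: allowed"
    else
        echo "$sub: rejected"
    fi
done
```

Resolving with pwd -P before comparing means that neither `..` components nor symlinks can be used to point the bind at an arbitrary host directory.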

  4. Do you have any issues with calling it "BINDMOUNT" rather? Because isn't that in fact what it is doing? Context generally refers to metadata or something else, rather than a tree of the filesystem, whereas the word MOUNT is pretty clear in its purpose.

"Context generally refers to metadata or something else" but in the case of docker it refers to the archive we ship to the server for the build. It is used everywhere in the documentation.
http://docs.docker.com/reference/commandline/cli/#build

However, I agree that the word CONTEXT should not be in the command name, but for another reason: to make it consistent with the other commands that use the context, ADD and COPY.

@a-ba
Author

a-ba commented Oct 9, 2015

Now I am thinking of another solution: why not just mount /tmp from a temporary external volume during the whole build?

Thus, instead of writing: BINDCONTEXT /context or ADDTMP . /context

we would just write: ADD /context /tmp/context

By convention, everything in /tmp is discarded at the end of the build. It is easy to understand and it does not require introducing a new keyword.

The only downside is that it breaks old Dockerfiles where the user expects something to remain in /tmp/ in the final image (but is that a real concern?)

@thaJeztah
Member

@a-ba please see the roadmap; https://github.com/docker/docker/blob/master/ROADMAP.md

There's a feature freeze on the Dockerfile syntax, because we're in the process of moving the "builder" client-side, which will allow alternative implementations to the Dockerfile, and more flexibility.

@dagelf
Contributor

dagelf commented Oct 9, 2015

The /tmp use case would be useless if you want to have a local apt or portage repository available (which you don't want in the image). In the case of the apt repository, Ubuntu knows to fetch it from a certain path, or from the internet if that path is not available.

@briceburg

briceburg commented Apr 28, 2017

The naming of a MOUNTCONTEXT directive proposed in #30110 (comment) speaks to me. This is useful, especially for en masse copying from the context under particular ownership.

Has the builder been forked?

Development

Successfully merging this pull request may close these issues.

Proposal: Dockerfile add BIND_CONTEXT