Description
The processes, environments, and resources required to construct the artifacts that compose a desired image are typically irrelevant to its runtime. For example, the tool chain (language compilers, their libraries, …) employed to build an application, when incorporated into an image for delivery, encumbers the transport of the resultant image and potentially the execution of its derivative container(s). To avoid image pollution and facilitate delivery of a minimally sized image, the Docker BUILD environment must properly separate (isolate) the concerns of image construction from those of its runtime.
This proposal essentially recommends incorporating the docker run IMAGE command into the BUILD environment, to impart to it the same benefits of containment (isolation/encapsulation), performance, and reusability that have contributed to Docker’s success. It does so by remapping the concepts/implementation of the docker run IMAGE command to a function idiom represented by a new Dockerfile operator called “FUN” (FUNction run) that executes an already known image. A discussion of its benefits, a sketch of its syntax, an overview of its semantics, and an example are provided below.
In addition to the new FUN operator, this proposal introduces another one: DEF FUN, to declare/define the body of a transient image (function) within a Dockerfile. The body of a transient image is the set of Dockerfile commands (conceptually a Dockerfile within a Dockerfile) needed to construct it.
Benefits
- Separates concerns: Fully separates the build environment from the resultant image’s runtime by leveraging the inherent isolation mechanism of a container. Build processes run in their own containers, in whatever environment those containers support, and can affect only their own container’s file system. When BUILD completes, all transient containers can be destroyed, eliminating potential side effects when the image’s BUILD is re-executed.
- Permits encoding a Dockerfile through the widely understood function idiom. Besides the familiarity Developers have in applying functions to compose a solution, the proposal leverages less obvious/assumed mechanisms to reduce harmful coupling.
The function idiom includes an interface definition that provides a coupling point at the boundary separating the internals of a function (its implementation) from the surrounding external invocation environment. At this coupling point, the interface binds one or more external arguments to a corresponding set of variables internal to the function. This binding correlates external arguments with their internal counterparts even when their names differ. It thereby eliminates the need to keep argument names synchronized with the variable names internal to a function, and avoids binding to a function’s (or invocation environment's) implementation. Finally, since this binding occurs at each function invocation, it encourages function reuse: the same function body can be called at various locations with differing argument names and values.
- Improves Dockerfile performance, because the encapsulation provided by the function idiom and the assumed purity of encoded functions:
- Limit cache evaluation to what's necessary: the input/output argument bindings, the input file checksums defined by its invocation, and the invoked image's checksum. The individual commands representing the function's body are ignored, as they are reflected by the image's checksum.
- Realize concurrent/parallel execution of independent functions. Since a function's defined interface permits easy recognition of dependencies between it and other functions, independent invocations (functions whose inputs don't depend on another function's outputs) can be executed in parallel. This reduces the apparent (elapsed) execution time, though not CPU time, compared to running the invocations serially.
- Avoid repeated evaluations for function invocations whose input argument values are the same. In this situation, the previously computed output value(s) can simply replace the function invocation.
- Dynamically extends Dockerfile command set through shareable images via Docker Hub.
- A function’s interface limits its coupling to only those artifacts needed or produced by the function.
- Preserves the current Dockerfile semantics of first FROM and its “pull” model when assembling images.
- Adding or removing generated artifacts can be accomplished by either adding a function invocation, encoded as a few consecutive lines of cohesive code, or removing them.
- Improves a Developer’s uptake of FUN by leveraging his/her experience with the existing docker run IMAGE command.
- This function invocation method could be generalized and made available to Docker’s runtime environment.
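The caching benefits above can be illustrated with a minimal sketch. It assumes a cache key derived only from the three inputs the proposal names (argument bindings, input file checksums, and the invoked image's checksum) and assumes functions are pure. The names `invocation_cache_key` and `InvocationCache` are hypothetical and are not part of Docker.

```python
import hashlib
import json


def invocation_cache_key(image_checksum, bindings, input_file_checksums):
    """Build a cache key from the three inputs named in the proposal.

    image_checksum: checksum of the invoked image (stands in for the function body)
    bindings: argument bindings, e.g. {"--IN": [[".", "IN_SOURCE"]]}
    input_file_checksums: mapping of input file path to content checksum
    """
    payload = json.dumps(
        {
            "image": image_checksum,
            "bindings": bindings,
            "inputs": input_file_checksums,
        },
        sort_keys=True,  # make the key independent of dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InvocationCache:
    """Remember outputs of pure invocations so identical calls are not re-run."""

    def __init__(self):
        self._outputs = {}
        self.executions = 0  # how many times a function body actually ran

    def invoke(self, key, run_function):
        # Only run the function when this exact invocation was never seen.
        if key not in self._outputs:
            self.executions += 1
            self._outputs[key] = run_function()
        return self._outputs[key]
```

Note that the individual commands of the function body never enter the key; under this sketch a change to the body is visible only through a changed image checksum.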
Syntax
The syntax presented below provides a means to explore concepts. For example, words beginning with '--', like --CONTEXT, reflect keywords whose final form remains undecided.
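FUN <ImageID>
[--CONTEXT { <BuildContext>[:<BuilderPhaseContext>]
[<BuildContext>[:<BuilderPhaseContext>]... ]
| --FROM_IMAGE <InvokingImageContext>:[<BuilderPhaseContext>]
[<InvokingImageContext>[:<BuilderPhaseContext>]... ] } ]
[ { --IN <BuildContext> <FunctionContext> [<BuildContext> <FunctionContext>]...
| --IN_IMAGE <InvokingImageContext> <FunctionContext>
[<InvokingImageContext> <FunctionContext>]... } ]
[--OUT <FunctionContext> <InvokingImageContext> [<FunctionContext> <InvokingImageContext>]... ]
{--NOCOMMAND | [<COMMAND>] [<ARGS>]}
<ImageID>: see docker run IMAGE command
<BuildContext>: The files provided by the PATH or URL supplied by the initiating BUILD command.
<BuilderPhaseContext>: A build context assembled from <BuildContext> and/or files available from the <InvokingImageContext>. This assembled context conforms to the interface expected by/exposed to the image when performing BUILD processing.
<FunctionContext>: File paths/environment variables to be resolved within the function's (<ImageID>'s) body. Although a Developer could specify a value instead of an environment variable name, avoiding harmful coupling to a function's implementation requires a level of indirection and an associated resolution process, which the Dockerfile ENV provides. The use of ENV also offers a method to minimally document the function's interface via the docker inspect command.
<InvokingImageContext>: File paths/environment variables to be resolved within the context of the image being built at the moment of invocation.
<COMMAND>: see docker run IMAGE command
<ARGS>: see docker run IMAGE command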
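The binding of --IN and --OUT argument pairs described above can be sketched as follows. This is an illustration under stated assumptions, not the proposal's implementation: environments are modelled as plain dictionaries, a reference that does not name a variable is treated as a literal path, and the helper names `resolve` and `bind_arguments` are hypothetical.

```python
def resolve(reference, env):
    """Return the variable's value when the reference names one, else the reference itself."""
    return env.get(reference, reference)


def bind_arguments(in_pairs, out_pairs, function_env, invoking_env):
    """Translate FUN argument pairs into concrete (source, target) copy operations.

    --IN targets and --OUT sources are resolved inside the function's context;
    --OUT targets are resolved inside the invoking image's context.
    --IN sources are left as paths relative to the build context.
    """
    copies_in = [(src, resolve(dst, function_env)) for src, dst in in_pairs]
    copies_out = [
        (resolve(src, function_env), resolve(dst, invoking_env))
        for src, dst in out_pairs
    ]
    return copies_in, copies_out


# Values mirror the ENV declarations a function image might carry.
function_env = {"IN_SOURCE": "/src", "OUT_EXECUTABLE": "/src/build/app"}
copies_in, copies_out = bind_arguments(
    in_pairs=[(".", "IN_SOURCE")],
    out_pairs=[("OUT_EXECUTABLE", "/usr/local/bin/app")],
    function_env=function_env,
    invoking_env={},
)
```

Because the caller names only `IN_SOURCE` and `OUT_EXECUTABLE`, the function image could relocate its source directory without any change to the invoking Dockerfile.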
Semantics
FUN's behavior is presented using Dockerfile/docker operations where possible.
# Assemble the <BuilderPhaseContext> from --CONTEXT specifier(s).
# Each <BuildContext>[:<BuilderPhaseContext>] pair generates:
ADD <BuildContext> <BuilderPhaseContext>
ADD...
ADD...
...
# Each <InvokingImageContext>:[<BuilderPhaseContext>] pair identified by the
# --FROM_IMAGE keyword generates:
COPY_FROM_IMAGE <InvokingImageContext> <BuilderPhaseContext>
COPY_FROM_IMAGE ...
COPY_FROM_IMAGE ...
...
# Inherit file system from an existing image or replace below
# with DEF FUN body and protect it with a container layer.
# Use <BuilderPhaseContext> as build context.
FROM <ImageID>
# Build complete. Container reflects state right before executing "docker run IMAGE".
# Each --IN argument pair is translated to an ADD.
ADD <BuildContext> <FunctionContext>
ADD...
ADD...
...
# Each --IN_IMAGE argument pair generates:
COPY_FROM_IMAGE <InvokingImageContext> <FunctionContext>
COPY_FROM_IMAGE ...
COPY_FROM_IMAGE ...
...
# The image's ENTRYPOINT/<COMMAND> is executed along with its optional <ARGS>.
docker run [<COMMAND>] [<ARGS>]
# Each --OUT argument pair generates:
COPY_TO_IMAGE <FunctionContext> <InvokingImageContext>
COPY_TO_IMAGE ...
COPY_TO_IMAGE ...
...
RETURN
The example is intended to convey the proposed FUN semantics leveraging the experience of familiar commands. It's not a definitive implementation spec. Here's a written description of FUN's invocation:
- When one or more --CONTEXT keywords exist, assemble the <BuilderPhaseContext>. Specifying multiple --CONTEXTs creates an aggregate one from disjoint file/directory references, while overlapping references generate a build time error, at least in the situation involving the same --CONTEXT specification. If the --CONTEXT specification omits the optional [:<BuilderPhaseContext>], the resources specified by the source context, either <BuildContext> or <InvokingImageContext>, are copied to the <BuilderPhaseContext>, replicating their source path and file names. Absence of --CONTEXT produces an empty <BuilderPhaseContext>. --CONTEXT enables the expression of minimal interfaces to avoid this vulnerability.
- Allocate the function's file system and protect it with a container layer. Use <BuilderPhaseContext> as build context. For prebuilt images, run ONBUILD triggers, if they exist. For transient images, run Dockerfile commands.
- Image now reflects state immediately before executing "docker run IMAGE"
- Copy by value (--IN), all the input arguments resolving the source references within the build context of the "docker build" command and target references within the function's context. For example, target references can be environment variables established during construction of the function being called. During a function's invocation they would be expanded within the function's context and reflect their values, as established when the image (function) was built.
- Copy by value (--IN_IMAGE), all the input arguments resolving the source references within the invoking image's context (ENV variables & file system) and target references within the function's context.
- Run the function's (image's) entrypoint/specified command with the arguments provided when the function (image) was created or stated by the FUN operator. If the function invocation specifies the "--NOCOMMAND" keyword, do not execute the docker run IMAGE command. Use NOCOMMAND in situations where the ONBUILD triggers initiate the RUN command and produce all the output artifacts desired.
- Copy by value (--OUT), all the output arguments. Resolve source references within the function's context while target references are resolved within the image being built.
- Upon return, the allocated container can be destroyed either immediately or lazily. Lazy destruction would allow caching the output artifacts for situations where the function is called repeatedly (within a given Dockerfile) with the same input arguments and values. Under these circumstances, and when the function is considered pure, the semantics of FUN can be short-circuited to execute only the COPY_TO_IMAGE operations.
- FUN complete. Perform Dockerfile implicit commit to create an (intermediate/final) image.
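The first step above, assembling a build context from --CONTEXT specifiers, can be sketched as follows. The sketch assumes that "overlapping" means two destinations are equal or one contains the other, which the proposal does not define precisely. The names `assemble_context` and `ContextOverlapError` are hypothetical.

```python
import posixpath


class ContextOverlapError(Exception):
    """Raised when two --CONTEXT specifiers map to overlapping destinations."""


def _overlaps(a, b):
    # "." stands for the whole context, so it overlaps with everything.
    if a == "." or b == ".":
        return True
    return a == b or a.startswith(b + "/") or b.startswith(a + "/")


def assemble_context(specs):
    """Map each destination path in the assembled context to its source.

    specs is a list of (source, target) pairs; a target of None means the
    optional target was omitted, so the source path is replicated as-is.
    An empty list yields an empty context.
    """
    assembled = {}
    for source, target in specs:
        destination = posixpath.normpath(target if target is not None else source)
        for existing in assembled:
            if _overlaps(existing, destination):
                raise ContextOverlapError(
                    "%r overlaps with %r" % (destination, existing)
                )
        assembled[destination] = source
    return assembled
```

Rejecting overlaps up front keeps the aggregate context unambiguous: no destination can be silently overwritten by a later specifier.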
Example
Given: An image called “appCompile” already created by the following Dockerfile:
FROM ubuntu
RUN apt-get update && apt-get install -y build-essential
ENV IN_SOURCE /src
RUN mkdir $IN_SOURCE
# Generate the build script and make it executable.
RUN echo '#!/bin/bash' >/appBuild.sh ;\
echo "cd $IN_SOURCE && make" >>/appBuild.sh ;\
chmod +x /appBuild.sh
ENV OUT_EXECUTABLE $IN_SOURCE/build/app
ENTRYPOINT /appBuild.sh
Create the “app” image via this second Dockerfile:
FROM busybox
FUN appCompile --IN . IN_SOURCE --OUT OUT_EXECUTABLE /usr/local/bin/app
EXPOSE 80
ENTRYPOINT /usr/local/bin/app
An additional, more substantive Docker Hub example uses the google/golang image. That example also contrasts the Function Idiom approach with Nested/Chained Build.
Description
In addition to the FUN operator described above, the proposal would also include a mechanism to permit the construction of supporting transient functions (images) within a Dockerfile. The mechanism is similar to an inline function declaration which emerged during this discussion with Alexander Larsson.
Benefits
- Permits refactoring a Dockerfile using the widely understood function idiom. For example, a Dockerfile for an image requiring a complex tool chain can be segmented into cohesive functions. These functions can be organized to both layer concerns and reflect additional ones besides build vs runtime.
- Transient build functions and their build contexts unique to constructing a resultant image are aggregated into a single Dockerfile eliminating the effort to maintain separate Dockerfiles and build contexts.
- Reduces dependencies/reliance on resources external to the resulting image’s Dockerfile. If so inclined, the external dependency list might be reduced to just the initial FROM image request.
- Facilitates the transition from locally defined images to an external one by extracting the inline function and producing a Dockerfile from it. Any invocations in the original Dockerfile would continue to operate without changing the original Dockerfile. The same benefit applies when converting from an external function to a locally defined one.
Syntax
DEF FUN <ImageID>
# Dockerfile commands.
...
...
...
END FUN
<ImageID>: See docker run IMAGE command. Typically, it will be a human-readable label reflecting the function’s primary responsibility, given as a name with an optional tag. When the tag is omitted, it defaults to “latest”. However, it could also take any other valid image identifier, such as a short/long GUID.
Semantics
DEF FUN declares the start of a function (image) definition. When recognized, the current BUILD process writes the commands to a cached file until it detects the matching END FUN. The <ImageID> is placed into a function resolution table maintained by the current BUILD process. Whenever an image name requires resolution, to satisfy either a FUN or FROM operation, the resolution process first reviews the current local function resolution table for the given name. When a match is found, it spawns a child build process and passes the <BuilderPhaseContext> assembled by the initiating function invocation to this child. Once the child build process completes, the function (image), situated in the parent, is executed. The image generated by the child build process can be cached to satisfy future requests initiated by the same parent or a spawned (child) level. In situations where two functions share the same <ImageID>, the definition nearest to the FUN operator will be used. Local inline function definitions override any external function (image) that shares the same <ImageID>.
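The resolution order described above can be sketched as a chain of lookup tables. This is one possible reading, not a definitive one: it assumes "nearest" means the innermost enclosing build level, and it models external images as a plain dictionary. The class name `FunctionTable` is hypothetical.

```python
class FunctionTable:
    """Function resolution table for one build level, linked to its parent level."""

    def __init__(self, parent=None):
        self.parent = parent
        self.definitions = {}

    def define(self, image_id, body):
        # Record the cached DEF FUN body under its image name.
        self.definitions[image_id] = body

    def resolve(self, image_id, external_images):
        """Return ("inline", body) or ("external", image), preferring inline definitions."""
        table = self
        while table is not None:
            if image_id in table.definitions:
                return ("inline", table.definitions[image_id])
            table = table.parent  # fall back to the enclosing build level
        if image_id in external_images:
            return ("external", external_images[image_id])
        raise KeyError(image_id)
```

Under this reading, extracting an inline function into an external image leaves every FUN invocation unchanged, since only the table that satisfies the lookup differs.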
Example
The previous example, rewritten to employ DEF FUN, aggregates the two distinct Dockerfiles into a composite one:
# declare and define application compiler function:
DEF FUN appCompile
FROM ubuntu
RUN apt-get update && apt-get install -y build-essential
ENV IN_SOURCE /src
RUN mkdir $IN_SOURCE
# Generate the build script and make it executable.
RUN echo '#!/bin/bash' >/appBuild.sh ;\
echo "cd $IN_SOURCE && make" >>/appBuild.sh ;\
chmod +x /appBuild.sh
ENTRYPOINT /appBuild.sh
ENV OUT_EXECUTABLE $IN_SOURCE/build/app
END FUN
# now create the surviving image:
FROM busybox
# call inline function:
FUN appCompile --IN . IN_SOURCE --OUT OUT_EXECUTABLE /usr/local/bin/app
EXPOSE 80
ENTRYPOINT /usr/local/bin/app
An additional DEF FUN example contrasts the Function Idiom approach with Nested/Chained Build.