Proposal: Add a NOCACHE instruction to Dockerfiles #10682
duglin wants to merge 1 commit into moby:master
|
I'll definitely use it. Thank you. |
|
I'm still very strongly -1 on this (#1996 (comment)):
|
|
@tianon how does someone automatically bust the cache at a certain spot in the Dockerfile w/o touching the Dockerfile each time? |
|
Why would you need to do that, and why wouldn't that be covered more flexibly by For example, https://github.com/docker/docker/blob/f4dc496d36d31cf9ca1b3508f10954066ff7f8bc/Dockerfile#L51 and https://github.com/docker/docker/blob/f4dc496d36d31cf9ca1b3508f10954066ff7f8bc/Dockerfile#L119 and https://github.com/docker/docker/blob/f4dc496d36d31cf9ca1b3508f10954066ff7f8bc/Dockerfile#L127. Using these techniques instead ensures that everyone either gets a |
|
@tianon reading thru the issue it's clear to me that there are a number of people who want the ability to turn off caching at a certain point in the Dockerfile processing every time. Asking them to modify their Dockerfile each time, while it may be possible, isn't very user friendly. As to your Unless I'm missing something, the links you included don't show how to bust the cache each time w/o modifying the Dockerfile, and that's what I'm interested in seeing because if that's possible then this PR isn't needed (I think). I'd like to understand why you think it's so bad to have something within the Dockerfile say "stop caching every time right here"? If that's what people need, as expressed by the original issue, I'm having a hard time understanding why we wouldn't offer a nice/easy solution. It's not like it breaks some fundamental Docker philosophy, does it? Now, if we want to discuss why this feature is needed at all, then I think that would be good. While I implemented it to help scratch an itch that people had, I have to be honest and say that I still don't fully understand when people need it. But I'm willing to accept that not everyone has the same needs that I do :-) |
|
Another creative way to bust the cache would be; to bust the cache run
Not sure about the hub, does it use caching at all? Especially when it does a checkout of the GitHub repo, the files would always be newer, thus not cached. |
|
The Hub's automated builds currently don't pre-pull the image, so they're
essentially full "--no-cache" builds every single time -- AIUI, they're
discussing making that more configurable, too.
|
|
@thaJeztah the issue I can see with that is that it still requires work by the person kicking off the build. I think any solution that requires the person kicking off the build to have "extra" knowledge is probably not the right solution. As I understand it, the requirement is that the Dockerfile author wants to have control (bust it here!). |
|
@duglin well, the "cache-buster" enables the user to decide if the cache has to be busted. If I want the cache to be always busted from a specific point onwards, my usual approach is just; And then Please, don't see my comments as negative, I appreciate your attempt to solve this, I just think most cases can already be solved with the existing options, and those options are not even that hacky. |
|
@thaJeztah your first approach puts control in the hands of the user, my understanding is that the requirement is that it's in the control of the Dockerfile author. As for the "create a non-changing base image" approach... if that works for people I'll close the issue. But, it feels like a hack and makes people jump through a lot of hoops when a simple |
|
I definitely agree with @tianon. While you should design your file with the ability to run I still don't understand why you would have users of the Dockerfile needing a forced cache-bust. Shouldn't they either a) be using the image with docker pull or b) be understanding the Dockerfile (ie. know how to break the cache if they need it). I would love a concrete example where this would actually be more useful than the flag. |
|
Your "know how to break the cache if they need it" is where the ugly kludge comes in of writing in some arbitrary data into the Dockerfile that gets changed when the cache needs to be abandoned. A flag is far more user-friendly. Regex in the build command is a headache waiting to happen, imo. What's the harm in a Dockerfile flag, seriously? |
|
I think it would really help (at least for me) if someone could enumerate some of the use cases that require invalidating the cache halfway through a Dockerfile every single time it is processed. |
|
Git repo updates. Having to write a commit hash into a Dockerfile which changes and therefore triggers an actual "git clone" is an awful practice. |
|
There are at least a couple of use cases around in the various requests for this feature. At least I know you've seen a couple suggested in one of the issues we've talked it over in. |
|
@curtiszimmerman I think that's a perfectly acceptable/reasonable thing to do. I agree with @tianon that embedding the control of the caching into the Dockerfile itself feels wrong. |
|
If we want control to be in the hands of the builder then it seems like some of the solutions mentioned would work just fine - even saying "if you don't like what the Dockerfile defines then modify it". This PR is more about letting the Dockerfile author set an initial baseline for what's supposed to happen w/o requiring intervention from the builder. I'd really like to understand why it feels wrong to specify "don't use the cache" from within the Dockerfile if the authors know they'll want it run/updated/etc... every time. People are clearly asking for this so why make it harder on them to do what they want? |
it would be nice to have an example why someone would choose to do this
|
@cpuguy83 I really disagree. Dockerfiles are state transformation vehicles. I'm operating on an image, permuting through states until I get to an end state I am happy with. As an image builder, I want maximum control over this process. But what you're suggesting is that it's reasonable to force people to use a third-party state vehicle (e.g. |
|
Been thinking more about this, and it seems to me that there might be two different requirements at play here:
While these both end up tweaking the "UtilizeCache" flag during the build process, they are quite a bit different w.r.t. who is being given control. And unfortunately I don't think there's a solution that can satisfy both requirements. After all, they are different users working with a different set of tools (vi/Dockerfile vs With that in mind, I think we may need two different solutions:
Sorry for the rambling, but I think the net here is:
|
|
I'm definitely still personally -1 on that first one. I'm indifferent on the second -- in my experience, combining "rmi" and the rare "--no-cache" solves all my cache problems, as @thaJeztah summarized so nicely previously in this thread (and it's definitely even easier to solve this way now that we have "docker build -f"). IMO, the cache isn't a feature a Dockerfile author should be counting on -- it's a convenience provided to speed up subsequent builds. https://en.wikipedia.org/wiki/Cache_%28computing%29
|
|
@tianon I believe the first one is exactly related to your comment about not counting on the cache. I think people are talking about the case where the Dockerfile author knows that using a cache would result in the wrong output so relying on the builder to use --no-cache (or something else to bust it) is exactly what Also, since you're still -1 on the first one, I have to ask you to elaborate on why you think it's an invalid usecase/requirement. |
|
Yes @tianon, that's my thought as well, just more eloquently put... the cache is an implementation detail. |
|
Hello! Sorry I could not find a more exact issue, but it can solve some of the problems related to this issue: I believe strongly now that we have the new What would help is a generic global Well that is the simplest basic way to implement build-time secrets support. Because then we could BUT ALSO: being a general flag, it can actually be used for other purposes too. For example when I would also like to add that this is a much requested feature, so if anyone else can suggest a better and simpler solution than |
|
I assume a |
|
@duglin Nope. It would only apply to that specific instruction. Committing would automatically re-enable upon the next instruction. Get it? Because that way, you do not need to write 3 commands to do what can be achieved with only 1 command. For what you want that would need a more full (and coherent) feature set and look something like a total of 4 related flags:
# One-time command flags that apply on current Docker <CMD> only
<CMD> --no-commit <args>
<CMD> --commit <args>
# The toggle flags
RUN --commit-off true
<CMDS...>
RUN --commit-on true
But I'm not suggesting to actually implement those additional extra 3 flags to fill out the full feature set. But rather just leave them in reserve. Because that would be extraneous to, and unnecessary for, solving those 2 immediate problems I previously highlighted. Just Heck I really need that Running a special local secrets server is way too much hassle and totally unjustified for just 1 api key. And not actually feasible when building images on dockerhub. Whilst the other main option of a new docker secrets api would also be a much more complex solution to secrets. And at the moment such a new feature seems to be a big reach to be counting upon. Never mind the fact a fully-blown secrets API would take ages to implement / test / etc. |
|
@duglin Of course! If we do not wish for so many individual flags to maintain, then this exact same (the full feature set) can easily be implemented in just 1 flag. Setting it to one of four possible / recognised values. It would be:
<CMD> --cache=<on|off|true|false>
Again, where |
|
Ok, I am lost. Is this going to be there or not? My use case:
Now I run it with --no-cache to make sure the new war is retrieved. If I could do I would be certain it is always done right. |
No, this won't be implemented, see: #10682 (comment). However, we're finalizing a PR to allow build-time parameters (see #15182), which would allow you to do, for example:
# Define "BUILDNUMBER" arg, and (optionally) set a default
ARG BUILDNUMBER=1234
RUN wget https://foobar.example.com/builds/app-1.0.0.$BUILDNUMBER.war
And build it with;
docker build --build-arg BUILDNUMBER=319485 --tag myapp:build-319485 .
Just an example, but demonstrates the possibilities. |
|
So, just to clarify, if I'm using a service such as Google Managed VMs, the only way for me to get non-cached content from git is to alter my Dockerfile each time? https://cloud.google.com/sdk/gcloud/reference/preview/app/deploy These are built based on a supplied Dockerfile, so due to the deployment mechanisms for the Google services, I am not aware of any way to pass any build time parameters to the docker build process, so cannot use --no-cache or the suggested --build-arg or anything like that. A flag in the Dockerfile that works the same as the --no-cache would allow for dynamic deployment / scaling of the latest code, without any manipulation of the Dockerfile. |
|
@ganey that sounds like something to ask Google. For example, automated builds on Docker Hub default to |
|
@thaJeztah okay, thanks. Hopefully they'll consider it! |
|
Well I'm late to the party but honestly, this is the kind of situation that I hate to see in open-source projects. Y'all might not see a "real world" use for it or have some kind of weird, intricate ways to do it already but unless you have a real technical opposition to a feature -- for instance if the codebase makes it hard or hacky to implement -- then you should listen to your users. Anyway, here's my "real world" use case: we want to have the And, yeah, we might put this at the very end of the Functionally, the binary is always the same but it is updated regularly to adapt to the latest youtube/vimeo layout. There is no reason to force upon us a change in our code. Also, having an explicit instruction in the |
|
@toots also check my earlier example using build-time args, which would allow you to break the cache "on demand" by specifying a value for the arg (e.g. current time, current date); #10682 (comment) |
|
I really like the NOCACHE or even an ALWAYSRUN Dockerfile command for those situations. |
|
as others have mentioned, commands like My simple workaround is: edit:
ADD http://www.timeapi.org/utc/now /tmp/bustcache
RUN git pull
|
|
@paul-callahan I'm pretty sure that doesn't work since $(date) is evaluated after the cache check is done - and by the shell, not docker. |
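The point about `$(date)` can be sketched in a few lines of shell. This is a deliberately simplified model and not Docker's actual implementation: it assumes the cache lookup for a `RUN` step depends only on the literal instruction text. The `cache_key` helper and the example instruction are hypothetical.

```shell
#!/bin/sh
# Hypothetical, simplified stand-in for the builder's cache lookup:
# the key depends only on the literal instruction text.
cache_key() {
    printf '%s' "$1" | cksum | cut -d' ' -f1
}

# Single quotes keep $(date) unexpanded, which is how the builder sees the
# line; the shell inside the container only expands it if the step runs.
instruction='RUN echo $(date) && git pull'

first_build_key=$(cache_key "$instruction")
second_build_key=$(cache_key "$instruction")

if [ "$first_build_key" = "$second_build_key" ]; then
    echo "cache hit: instruction text is unchanged, so the step is not re-run"
else
    echo "cache miss: the step would be executed again"
fi
```

Under this model the two keys are always identical, which is why embedding a shell substitution in a `RUN` line does not bust the cache by itself.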
|
@paul-callahan Right, It might be worthwhile to use |
|
@duglin actually, you're right, that doesn't work.
ADD http://www.timeapi.org/utc/now /tmp/bustcache
RUN git pull
@cpuguy83 if --build-arg variable value changing causes cache invalidation for
docker build --build-arg CACHE_BUSTER=$SECONDS -t myimg .
|
|
Yeah, I'm pretty sad to see this closed. It's clear to me that there are tons of use cases where someone grabs something from the internet in a |
|
I'm also very sad to see this closed. In our use case we are pulling from a git repository very late in a larger build, and using --no-cache when building the image has two problems:
|
|
What I do is templatize my Dockerfiles (via bash heredocs) and I use a construct like: Which basically busts the cache daily and can be inserted at any point of the Dockerfile. Tweak that BUILT_AT variable as desired. |
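The construct referred to above did not survive in this copy of the thread, so the following is only a guess at its general shape: a minimal sketch of a heredoc-templated Dockerfile, assuming `BUILT_AT` has day granularity. The base image, the repository URL, and the `generate_dockerfile` function name are placeholders, not taken from the comment.

```shell
#!/bin/sh
# Assumption: day granularity, so the generated text changes once per day.
BUILT_AT=$(date -u +%Y-%m-%d)

# Unquoted heredoc delimiter: the shell expands ${BUILT_AT} at generation time.
# Base image and repository URL below are placeholders.
generate_dockerfile() {
    cat <<EOF
FROM debian:stable
RUN apt-get update && apt-get install -y git
# Steps above can stay cached; the ENV line below changes daily.
ENV BUILT_AT=${BUILT_AT}
RUN git clone https://example.com/placeholder/repo.git /src
EOF
}

dockerfile=$(generate_dockerfile)
printf '%s\n' "$dockerfile"
```

Because the text of the `ENV` line differs from one day to the next, that step and everything after it miss the cache, while the earlier steps are still reused. The output could be saved to a file and passed to `docker build -f`.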
|
@LouisKottmann @lillem4n no need to construct the Dockerfile, you can simply pass it as a build-arg. However, for the Git repo, I'd suggest using the actual commit over using a date. Using the commit would make it reproducible. Also see #10682 (comment) However if you're having a big part that does not change, and only the last part of the Dockerfile should be always rebuilt; I'd consider using a base image for the infrequently changing parts. |
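A minimal sketch of the commit-as-build-arg suggestion, assuming a Docker version that supports `ARG` and `--build-arg`. The names `GIT_COMMIT` and `myimg`, and the `build_command` helper, are hypothetical. The script only prints the command so that it can be inspected without a Docker daemon.

```shell
#!/bin/sh
# Hypothetical names: GIT_COMMIT (build arg) and myimg (image tag).
# Prints the build command instead of running it.
build_command() {
    commit=$1
    printf 'docker build --build-arg GIT_COMMIT=%s -t myimg:%s .\n' "$commit" "$commit"
}

# In a real checkout the value could come from: git rev-parse HEAD
# A fixed placeholder hash is used here so the sketch runs anywhere.
example_commit=0123456789abcdef0123456789abcdef01234567
build_command "$example_commit"
```

The Dockerfile would then declare `ARG GIT_COMMIT` just before the step that fetches the code and reference it there, so the cache is reused until the commit changes and the same commit always produces the same build.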
|
When everyone has to resort to a hacky solution to make something happen, it's a powerful argument for a new feature. |
This adds a NOCACHE instruction to Dockerfiles which will disable all
caching from that point forward. The cache is still populated, but the
look-up processing is disabled.
Closes #1996
Signed-off-by: Doug Davis <dug@us.ibm.com>