"Sending build context to daemon" performance with large projects #26816

@pwaller

For a moderately large project with the whole build cached, docker build consumes 33 CPU seconds on the client. It should be at least 50x faster than this.

The build context I'm using has 26,913 files in it across 8,870 MiB. Most of it is ignored via a .dockerignore with 12 lines in it; only 14,253 files and 1,145 MiB get sent to the docker daemon.

33 CPU seconds is considerably more than the 0.8 CPU seconds that docker-show-context uses to build a tar archive, even though docker-show-context uses the same code as docker to construct the archive!

Profiling shows that a substantial amount of time is spent compiling regular expressions. The reason for this is that the docker client compiles every .dockerignore directive as a regex for every file which is tested. This alone results in lots of needless work.
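
To make the cost concrete, here is a minimal Python sketch of the two shapes of the matching loop. It is not Docker's code (the client is written in Go); the function names `match_naive` and `match_precompiled` are hypothetical, and `fnmatch.translate` stands in for the real .dockerignore pattern rules, which it only approximates. Python's `re` module also caches recently compiled patterns internally, so the sketch counts compile calls rather than timing them.

```python
import fnmatch
import re


def match_naive(paths, patterns):
    """Recompile every pattern for every path (the costly shape)."""
    compiles = 0
    kept = []
    for path in paths:
        ignored = False
        for pat in patterns:
            # The pattern is translated and compiled again on each test.
            rx = re.compile(fnmatch.translate(pat))
            compiles += 1
            if rx.match(path):
                ignored = True
        if not ignored:
            kept.append(path)
    return kept, compiles


def match_precompiled(paths, patterns):
    """Compile each pattern once, then reuse it for every path."""
    compiled = [re.compile(fnmatch.translate(pat)) for pat in patterns]
    kept = [p for p in paths if not any(rx.match(p) for rx in compiled)]
    return kept, len(compiled)


if __name__ == "__main__":
    # Hypothetical file list and ignore patterns, for illustration only.
    paths = ["src/main.go", "build/out.o", "logs/app.log", "README.md"]
    patterns = ["build/*", "*.log"]
    print(match_naive(paths, patterns))
    print(match_precompiled(paths, patterns))
```

Both functions keep the same files; only the number of compilations differs. Assuming one pattern per .dockerignore line and every file tested against every pattern, the numbers above would give 26,913 × 12 = 322,956 compilations in the first shape against 12 in the second.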

The first thing I observe is that there is a significant pause before the docker client even starts sending the build context to the daemon, and its length depends on the size of the .dockerignore. I recall from reading the code in the past that all of the files were considered more than once in the inefficient way I described in the last paragraph.

I made an attempt (or two) to rectify this in the past, but ran out of time. It was very promising, showing potential speedups of 100x in CPU time. I don't have time to work on this myself but would very much like to see it fixed! Someone is welcome to resurrect my old pull request.

Some changes have been made which slightly reduce the amount of work done, but it is still quite bad for me on docker 1.12.1.
