Copying a symlinked wildcard directory leads to incorrect caching #2300

@aaronlehmann

In this scenario, /data is a symlink to /mnt/data. On a subsequent layer, files are created under /data. A copy operation with a source path under /mnt/data ends up copying these files, but the cache key is computed as if no files would be copied, because the contenthash logic only considers files with a /mnt/data prefix. The consequence is serious: the cached result of this copy can be returned for any similar copy whose source does not match any files.
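To make the mismatch concrete, here is a minimal sketch in Go. It is not BuildKit's contenthash code. It assumes, purely for illustration, that the file is tracked under the path it was written through (/data/d1/foo), and the helpers `resolve`, `resolveAll`, and `underPrefix` are hypothetical names. A plain prefix match on /mnt/data/d1 finds nothing, while matching after resolving the symlink finds the file that the copy actually picks up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// resolve rewrites a path through a table of directory symlinks.
// Hypothetical helper: it handles only absolute, non-nested links.
func resolve(p string, links map[string]string) string {
	for link, target := range links {
		if p == link || strings.HasPrefix(p, link+"/") {
			return target + strings.TrimPrefix(p, link)
		}
	}
	return p
}

// resolveAll applies resolve to every tracked path.
func resolveAll(tracked []string, links map[string]string) []string {
	out := make([]string, 0, len(tracked))
	for _, p := range tracked {
		out = append(out, resolve(p, links))
	}
	return out
}

// underPrefix returns the tracked paths at or below prefix, sorted.
func underPrefix(tracked []string, prefix string) []string {
	var out []string
	for _, p := range tracked {
		if p == prefix || strings.HasPrefix(p, prefix+"/") {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	links := map[string]string{"/data": "/mnt/data"} // ln -s /mnt/data /data
	tracked := []string{"/data/d1/foo"}             // assumed: tracked as written
	src := "/mnt/data/d1"                           // source of the copy

	// Prefix-only matching: what the cache key would be based on.
	fmt.Println(len(underPrefix(tracked, src)))

	// Matching after symlink resolution: what the copy really sees.
	fmt.Println(underPrefix(resolveAll(tracked, links), src))
}
```

Under these assumptions the first line printed is the number of files the hash would cover (none) and the second lists the file the copy transfers, which is exactly the disagreement described above.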

This gist contains a repro case which demonstrates the problem: https://gist.github.com/aaronlehmann/77bd093d5f904719e0a329450d7c8453

In this repro case, we first copy /mnt/data/d1 from st1. The copy is done correctly, but the contenthash logic does not realize that any files would be copied, so it returns digest.FromBytes([]byte{}) for the cache key. Then when we run the same copy with a stock alpine image as input, the copy is cached, which is completely wrong because this directory doesn't even exist in the image.
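The collision can be sketched with a toy cache in Go. This is an illustration only, not BuildKit's solver: the `cache` type and the `contentKey` and `copyOp` helpers are hypothetical, and `crypto/sha256` stands in for `digest.FromBytes`, on the assumption that it uses SHA-256 as its canonical algorithm. Because both copies hash an empty set of matched files, they produce the same key, and the second copy is served the first copy's result.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// contentKey hashes the contents of the files the hashing step matched.
// With no matches it is the digest of zero bytes, analogous to
// digest.FromBytes([]byte{}).
func contentKey(matched [][]byte) string {
	h := sha256.New()
	for _, b := range matched {
		h.Write(b)
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil))
}

// cache maps a cache key to the list of paths a copy produced.
type cache map[string][]string

// copyOp simulates a cached copy. The key is derived from the operation
// text plus the content key of the matched files; copied is what the
// copy would really transfer if it ran.
func (c cache) copyOp(op string, matched [][]byte, copied []string) ([]string, bool) {
	key := op + "@" + contentKey(matched)
	if res, ok := c[key]; ok {
		return res, true // cache hit: the copy is not executed
	}
	c[key] = copied
	return copied, false
}

func main() {
	c := cache{}
	op := "copy /mnt/data/d1 /"

	// From st1: the hash matched no files, but the copy transferred d1/foo.
	res1, hit1 := c.copyOp(op, nil, []string{"d1", "d1/foo"})
	fmt.Println(res1, hit1)

	// From stock alpine: nothing matches and nothing should be copied,
	// yet the key is identical, so the earlier result comes back.
	res2, hit2 := c.copyOp(op, nil, nil)
	fmt.Println(res2, hit2)
}
```

In this toy model the second call reports a cache hit and returns d1 and d1/foo, mirroring the `CACHED` copy step in the output below.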

This is the output from running the repro case:

Copying from st1 with a single file at /data/d1/foo:
#1 docker-image://docker.io/library/alpine:latest
#1 resolve docker.io/library/alpine:latest
#1 resolve docker.io/library/alpine:latest 1.0s done
#1 DONE 1.1s

#3 sh -c mkdir -p /mnt/data/d1 && ln -s /mnt/data /data
#3 CACHED

#4 sh -c echo abc > /data/d1/foo
#4 CACHED

#5 copy /mnt/data/d1 /
#5 CACHED

#2 sh -c apk add -U findutils
#2 CACHED

#6 sh -c find . -ls
#6 0.105   3343154      4 drwxr-xr-x   1 root     root         4096 Aug 10 18:28 .
#6 0.106   3343062      4 drwxr-xr-x   2 root     root         4096 Aug 10 18:17 ./d1
#6 0.106   3340803      4 -rw-r--r--   1 root     root            4 Aug 10 18:17 ./d1/foo
#6 DONE 0.1s

#7 exporting to client
#7 copying files 1.53MB 0.1s
#7 copying files 8.09MB 0.5s done
#7 DONE 0.5s

Copying from stock alpine image - copy should not be cached!:
#4 docker-image://docker.io/library/alpine:latest
#4 resolve docker.io/library/alpine:latest
#4 resolve docker.io/library/alpine:latest 0.2s done
#4 DONE 0.2s

#2 copy /mnt/data/d1 /
#2 CACHED

#1 sh -c apk add -U findutils
#1 CACHED

#3 sh -c find . -ls
#3 0.177   3343164      4 drwxr-xr-x   1 root     root         4096 Aug 10 18:28 .
#3 0.178   3343062      4 drwxr-xr-x   2 root     root         4096 Aug 10 18:17 ./d1
#3 0.178   3340803      4 -rw-r--r--   1 root     root            4 Aug 10 18:17 ./d1/foo
#3 DONE 0.2s

#5 exporting to client
#5 copying files
#5 copying files 8.09MB 0.4s done
#5 DONE 0.4s

cc @coryb @sipsma
