Description
When images are built by a daemon running with user namespaces, file capabilities (the security.capability extended attribute) are saved in v3 format, which includes the UID of the root user of the user namespace.
When such an image is run by another runtime, either without user namespacing or with a user-namespace setup that maps the root user to a different UID, execve(2) ignores the effective bit because the UID does not match.
For images built with the userns feature to be portable across runtimes, we need capabilities to be saved in v2 format in the image layer archives.
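To make the difference between the two formats concrete, here is a minimal Python sketch that decodes a raw security.capability value and rewrites a v3 value as v2 by dropping the trailing root UID. The constants and field layout follow the kernel's include/uapi/linux/capability.h. The function names `parse_vfs_cap` and `to_v2` are hypothetical, and this is only an illustration of the format, not how the daemon writes layers.

```python
import struct

# Constants from include/uapi/linux/capability.h
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_REVISION_3 = 0x03000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001


def parse_vfs_cap(raw):
    """Decode a raw security.capability value (v2 or v3 layout only)."""
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    revision = magic_etc & VFS_CAP_REVISION_MASK
    if revision == VFS_CAP_REVISION_2 and len(raw) == 20:
        # magic_etc, then two {permitted, inheritable} pairs (low, high words)
        p_lo, i_lo, p_hi, i_hi = struct.unpack_from("<4I", raw, 4)
        rootid = None
    elif revision == VFS_CAP_REVISION_3 and len(raw) == 24:
        # Same as v2, followed by the UID of the namespace's root user
        p_lo, i_lo, p_hi, i_hi, rootid = struct.unpack_from("<5I", raw, 4)
    else:
        raise ValueError("unsupported security.capability layout")
    return {
        "version": revision >> 24,
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": (p_hi << 32) | p_lo,
        "inheritable": (i_hi << 32) | i_lo,
        "rootid": rootid,
    }


def to_v2(raw):
    """Hypothetical helper: rewrite a v3 value as v2 by dropping rootid."""
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    if (magic_etc & VFS_CAP_REVISION_MASK) != VFS_CAP_REVISION_3:
        return raw  # already v2 (or something this sketch does not handle)
    flags = magic_etc & ~VFS_CAP_REVISION_MASK
    return struct.pack("<I", VFS_CAP_REVISION_2 | flags) + raw[4:20]


# Made-up example value: cap_net_bind_service=ep with an assumed rootid of 100000
sample_v3 = struct.pack(
    "<6I", VFS_CAP_REVISION_3 | VFS_CAP_FLAGS_EFFECTIVE, 0x400, 0, 0, 0, 100000
)
```

On a Linux host, the raw bytes for a real file can be read with `os.getxattr(path, "security.capability")`. Calling `parse_vfs_cap(to_v2(sample_v3))` yields the same permitted mask and effective flag as the v3 input, with `rootid` set to `None`.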
Steps to reproduce the issue:
A reproducer can be found here.
Describe the results you received:
End of the reproducer output:
...
default: + docker run --rm capabilities-built-with-no-userns:1.0 /bin/bash -c '(/usr/local/bin/sleep-test infinity & ); sleep 1; grep Cap /proc/$(pgrep sleep-test)/status'
default: CapInh: 00000000a80425fb
default: CapPrm: 0000000000000400
default: CapEff: 0000000000000400
default: CapBnd: 00000000a80425fb
default: CapAmb: 0000000000000000
default: + docker run --rm capabilities-built-with-userns:1.0 /bin/bash -c '(/usr/local/bin/sleep-test infinity & ); sleep 1; grep Cap /proc/$(pgrep sleep-test)/status'
default: CapInh: 00000000a80425fb
default: CapPrm: 0000000000000000
default: CapEff: 0000000000000000
default: CapBnd: 00000000a80425fb
default: CapAmb: 0000000000000000
In the output above, the second set of Cap* values should match the first. In the second case, execve(2) has ignored the effective bit on sleep-test because the root UID stored in the extended attribute does not match the runtime's.
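For readers who do not have the bit positions memorised, the masks in /proc/<pid>/status can be decoded with a short Python sketch like the one below. The capability names are listed by bit number as defined in include/uapi/linux/capability.h. The helper name `decode_caps` is hypothetical.

```python
# Capability names indexed by bit number (include/uapi/linux/capability.h).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(mask_hex):
    """Return the capability names whose bits are set in a hex mask string."""
    mask = int(mask_hex, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1]


print(decode_caps("0000000000000400"))  # ['CAP_NET_BIND_SERVICE']
print(decode_caps("0000000000000000"))  # []
```

Applied to the output above, the first run has CAP_NET_BIND_SERVICE in both CapPrm and CapEff, while the second run has neither set.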
Describe the results you expected:
I expect to be able to run images built in user-namespaced environments in non-user-namespaced environments (or with a different user-namespace owner UID) and have the effective bit for capabilities honoured by execve(2).
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:52 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:20 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.7
GitCommit: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
Output of docker info:
This is for the non-user-namespaced configuration. The user-namespaced configuration is identical, except that userns is added to the Security Options section:
Client:
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 6
Server Version: 19.03.13
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-54-generic
Operating System: Ubuntu 20.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 981.1MiB
Name: ubuntu-focal
ID: K6QA:QH7R:K6QB:MYYB:URWF:RO3V:E6A5:KCJ4:TSRS:5HPK:ZKAC:53UN
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Additional environment details (AWS, VirtualBox, physical, etc.):
N/A