Privileged docker container tries to assume more capabilities than available #42906

@smira

Description

tl;dr: docker run --privileged tries to assign all known capabilities, while dockerd itself might not hold all of them.

Background: starting with version 0.13, Talos drops two capabilities (kexec and module loading) from all processes except PID 1. Talos itself doesn't use dockerd, but if I launch a privileged pod on Kubernetes with the docker:20.10-dind image, I can't run any privileged container inside it:

/ # docker run -it --rm --privileged alpine
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown.

The problem starts in

p.Capabilities.Bounding = caps.GetAllCapabilities()

Which essentially uses the list of capabilities built here:

moby/oci/caps/utils.go

Lines 23 to 37 in 306fa44

func init() {
	last := capability.CAP_LAST_CAP
	rawCaps := capability.List()
	allCaps = make([]string, min(int(last+1), len(rawCaps)))
	capabilityList = make(map[string]*capability.Cap, len(rawCaps))
	for i, c := range rawCaps {
		capName := "CAP_" + strings.ToUpper(c.String())
		if c > last {
			capabilityList[capName] = nil
			continue
		}
		allCaps[i] = capName
		capabilityList[capName] = &c
	}
}

This list is treated as every capability present on the host, which might not be true (some capabilities might have already been dropped).

containerd's OCI code does the proper thing:

https://github.com/containerd/containerd/blob/d193dc2b8afb1467255cea5326e9807514f94c0f/pkg/cap/cap_linux.go#L123-L136
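To make the idea concrete, here is a minimal sketch of limiting the capability list to what the current process actually holds. It is not the containerd or moby code. The names parseCapMask, filterCaps, currentCaps and sampleCaps are hypothetical, the choice of which /proc/self/status field to read (for example CapEff or CapBnd) is left to the caller, and mapping "kexec + module loading" to CAP_SYS_BOOT (bit 22) and CAP_SYS_MODULE (bit 16) is my assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capInfo pairs a capability name with its bit number (hypothetical helper type).
type capInfo struct {
	Name string
	Bit  uint
}

// sampleCaps is a deliberately short list for illustration; a real
// implementation would cover every capability known to the kernel.
var sampleCaps = []capInfo{
	{"CAP_CHOWN", 0},
	{"CAP_SYS_MODULE", 16},
	{"CAP_SYS_ADMIN", 21},
	{"CAP_SYS_BOOT", 22},
}

// parseCapMask extracts the hex mask of one field (e.g. "CapEff")
// from the contents of a /proc/<pid>/status file.
func parseCapMask(status, field string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	prefix := field + ":"
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, prefix) {
			value := strings.TrimSpace(strings.TrimPrefix(line, prefix))
			return strconv.ParseUint(value, 16, 64)
		}
	}
	return 0, fmt.Errorf("field %q not found", field)
}

// filterCaps keeps only the capabilities whose bit is set in mask.
func filterCaps(known []capInfo, mask uint64) []string {
	var out []string
	for _, c := range known {
		if mask&(uint64(1)<<c.Bit) != 0 {
			out = append(out, c.Name)
		}
	}
	return out
}

// currentCaps reads /proc/self/status (Linux only) and returns the subset
// of sampleCaps that the calling process holds in the given set.
func currentCaps(field string) ([]string, error) {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		return nil, err
	}
	mask, err := parseCapMask(string(data), field)
	if err != nil {
		return nil, err
	}
	return filterCaps(sampleCaps, mask), nil
}
```

Calling currentCaps("CapEff") from the daemon would return only the capabilities it still holds, so a --privileged container would request that subset rather than every capability the kernel knows about.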

I'm happy to send a PR, but what is the best way to solve this in the docker codebase?
