Remove container fails on 1.11 due to root filesystem busy when any container mounts host /var/run - regression #21969

@dhiltgen

Something changed between commits c48439a...dd51e85 during 1.11 development: the daemon now fails to remove containers in some circumstances. I haven't yet figured out exactly what about our use case triggers the failure. Here's what I do know:

  • We've seen it fail primarily on AWS.
  • It fails with both aufs and devicemapper (tested with Debian AMIs and CentOS AMIs)
  • I have repro'd it on Debian running under KVM (but not with boot2docker+machine)
  • Our use case is a bit complicated: a container mounts the docker.sock and spawns a second container with various additional volume/host mounts, which in turn attempts to stop/remove containers that share some of the same volume mounts. Some portion of this seems to be required to trigger the failure, since running docker stop and docker rm by hand works without failure.
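
One way to narrow down which part of this setup matters is to check whether some other process still has the container's storage mounted in its own mount namespace. The following is only a diagnostic sketch, not a confirmed root cause: it assumes the standard /proc/<pid>/mountinfo layout (mount point in the fifth field), and the function names are made up for illustration.

```python
import os


def mounts_under(mountinfo_text, prefix):
    """Return the mount points in mountinfo_text that equal prefix or sit below it."""
    prefix = prefix.rstrip("/")
    hits = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a well-formed mountinfo line
        mount_point = fields[4]  # fifth field is the mount point
        if mount_point == prefix or mount_point.startswith(prefix + "/"):
            hits.append(mount_point)
    return hits


def holders(prefix, proc_root="/proc"):
    """Map PID -> matching mount points for every process whose mountinfo is readable."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "mountinfo")) as fh:
                hits = mounts_under(fh.read(), prefix)
        except OSError:
            continue  # process exited or permission denied
        if hits:
            result[int(entry)] = hits
    return result
```

Running holders("/var/lib/docker") as root right after a failed docker rm would list the PIDs that still see mounts under that directory. Mount points are reported relative to each process's own root, so a hit inside another container may show up under a different path than on the host.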

I've been attempting a git bisect on the docker/docker tree to find the exact commit that broke it, but it has been difficult: the containerd integration was going through churn during this timeframe, so many commits don't yield a testable setup for me.
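
For commits that can't be built or started, git bisect run can be told to skip them: a step script that exits with code 125 marks the current commit as untestable. A minimal sketch of such a step script is below. The build and repro commands are placeholders, since the actual build and repro steps aren't spelled out here.

```python
import subprocess
import sys

SKIP = 125  # git bisect run treats exit code 125 as "cannot test this commit, skip it"


def bisect_exit_code(build_ok, removal_failed):
    """Translate the outcome of one bisect step into an exit code for git bisect run."""
    if not build_ok:
        return SKIP  # untestable commit (e.g. broken mid-integration build)
    return 1 if removal_failed else 0  # 1 = bad commit, 0 = good commit


def run(cmd):
    """Run a command (list of arguments) and report whether it exited with status 0."""
    return subprocess.call(cmd) == 0


def main(build_cmd, repro_cmd):
    """Build the daemon, then run the repro only if the build is usable."""
    build_ok = run(build_cmd)
    # repro_cmd is expected to exit non-zero when container removal fails
    removal_failed = build_ok and not run(repro_cmd)
    return bisect_exit_code(build_ok, removal_failed)


# Hypothetical wiring for a real bisect step (both script names are placeholders):
#   sys.exit(main(["./build-and-start-daemon.sh"], ["./repro-remove-failure.sh"]))
```

Assuming c48439a is good and dd51e85 is bad, this would be driven with git bisect start dd51e85 c48439a followed by git bisect run python bisect_step.py, with the last commented line enabled in the script.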

Examples from the client's perspective:

DEBU[0000] daemon reported: Error response from daemon: Driver aufs failed to remove root filesystem 25693d520e87d334fedbfd8f1bc31748be35cae690ab3e1f8fad0c79a5ca3946: rename /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed-removing: device or resource busy 

What you see in the daemon log:

Apr 12 23:28:13 dh-manual-test1 docker[29379]: time="2016-04-12T23:28:13.135946618Z" level=error msg="Error removing mounted layer 25693d520e87d334fedbfd8f1bc31748be35cae690ab3e1f8fad0c79a5ca3946: rename /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed-removing: device or resource busy"
Apr 12 23:28:13 dh-manual-test1 docker[29379]: time="2016-04-12T23:28:13.136021604Z" level=error msg="Handler for DELETE /containers/25693d520e87d334fedbfd8f1bc31748be35cae690ab3e1f8fad0c79a5ca3946 returned error: Driver aufs failed to remove root filesystem 25693d520e87d334fedbfd8f1bc31748be35cae690ab3e1f8fad0c79a5ca3946: rename /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed /var/lib/docker/aufs/diff/7aa440b05939346200bec909079a4e280303dfebd6898c665ed119b537f7c3ed-removing: device or resource busy"
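
To collect these failures across many hosts, the interesting fields can be pulled out of the message with a regular expression. This is a rough sketch that only assumes the message shape shown in the lines above. The pattern and function name are illustrative, not part of Docker.

```python
import re

# Matches the "failed to remove root filesystem" message as it appears above.
REMOVE_ERROR = re.compile(
    r"Driver (?P<driver>\S+) failed to remove root filesystem "
    r"(?P<container>[0-9a-f]{64}): rename (?P<src>\S+) (?P<dst>\S+): "
    r'(?P<reason>[^"]+)'
)


def parse_remove_error(line):
    """Return driver, container ID, rename paths and reason, or None if no match."""
    match = REMOVE_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["reason"] = info["reason"].strip()  # client output may carry trailing spaces
    return info
```

Note that in the logs above the directory being renamed (7aa440b0...) is not named after the container ID (25693d52...), so both values are worth keeping when correlating entries.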

I'll continue my investigation and update this issue as I uncover more details.

Metadata


    Labels

    area/storage (Image Storage)
    area/storage/aufs
    kind/bug (Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.)
    priority/P2 (Normal priority: default priority applied.)
    version/1.11
