Randomly cannot start Containers with "Clean up Error! Cannot destroy container" "mkdir ...-init/merged/dev/shm: invalid argument" #22937

@KekSfabrik

Output of docker version:

# docker version
Client:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Tue Apr 26 23:30:23 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Tue Apr 26 23:30:23 2016
 OS/Arch:      linux/amd64

Output of docker info:

# docker info
Containers: 4
 Running: 3
 Paused: 0
 Stopped: 1
Images: 20
Server Version: 1.11.1
Storage Driver: overlay
 Backing Filesystem: xfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.19.0-39-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 94.29 GiB
Name: srv-0
ID: IDQ4:CJPJ:HFBJ:FGEP:XDNE:N5I6:VTZQ:O7LB:7EGT:MSAT:RAZK:74FH
Docker Root Dir: /data/docker/mnt
Debug mode (client): false
Debug mode (server): false
Username: keks
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 my.company.environment=prod
 my.company.storage=hdd
 my.company.gateway=true
 my.company.name=srv-0
Cluster store: consul://192.168.10.1:8500
Cluster advertise: 192.168.10.1:2375

The running images are consul, gliderlabs/registrator and swarm:1.2.2

Additional environment details (AWS, VirtualBox, physical, etc.):
physical (hardware -> ubuntu 14.04.4 -> docker)

Steps to reproduce the issue:

  1. docker run ubuntu:precise

Describe the results you received:
I set up the daemon on new machines with a non-default graph directory (the --graph option), among other settings (for example, it is bound only to the local network):

DOCKER_OPTS="\
-H                  unix:///var/run/docker.sock \
-H                  tcp://192.168.10.1:2375 \
--storage-driver    overlay \
--cluster-store     consul://192.168.10.1:8500 \
--cluster-advertise 192.168.10.1:2375 \
--dns               192.168.10.1 \
--dns               127.0.0.1 \
--dns-search        service.consul \
--graph             /data/docker/mnt \
--ip                192.168.10.1 \
--label             my.company.environment=prod \
--label             my.company.storage=hdd \
--label             my.company.gateway=true \
--label             my.company.name=srv-0 \
--log-driver        json-file \
--log-opt           max-size=10m \
--log-opt           max-file=9 \
"

Some images apparently work "randomly" and then stop working. In this example I tried to (re-)run an ubuntu image that had worked about 30 minutes earlier:

root@srv-0:/data/docker/bootstrap# docker run --rm -it alpine sh # there, alpine works
/ # ^C
root@srv-0:/data/docker/bootstrap# docker run --rm -it ubuntu # yet ubuntu somehow doesn't
docker: Error response from daemon: mkdir /data/docker/mnt/overlay/646f16d12c8b5060f8a9e65a5fdabcf3604a108dc6a1d8c0497f7f2689e47e4a-init/merged/dev/shm: invalid argument.
See 'docker run --help'.
root@srv-0:/data/docker/bootstrap# tail -n 2 /var/log/upstart/docker.log
time="2016-05-24T11:40:06.625413479+02:00" level=error msg="Clean up Error! Cannot destroy container ffbcc96da1d9611a8d200e094bac3f12319360fa439f5105953160e5be64cb36: No such container: ffbcc96da1d9611a8d200e094bac3f12319360fa439f5105953160e5be64cb36" 
time="2016-05-24T11:40:06.625465100+02:00" level=error msg="Handler for POST /v1.23/containers/create returned error: mkdir /data/docker/mnt/overlay/646f16d12c8b5060f8a9e65a5fdabcf3604a108dc6a1d8c0497f7f2689e47e4a-init/merged/dev/shm: invalid argument" 
root@srv-0:/data/docker/bootstrap# ll -h /data/docker/mnt/
total 28K
drwx--x--x   9 root root  131 May 23 08:30 ./
drwxr-xr-x  11 root root 4.0K May 23 08:12 ../
drwx------   5 root root 4.0K May 24 11:49 containers/
drwx------   3 root root   28 May 23 08:30 image/
drwxr-x---   3 root root   26 May 23 08:30 network/
drwx------ 133 root root  12K May 24 11:49 overlay/
drwx------   2 root root   10 May 24 11:22 tmp/
drwx------   2 root root   10 May 23 08:30 trust/
drwx------   4 root root 4.0K May 23 12:32 volumes/
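
To look at the layer named in the error, a small helper along these lines may be convenient. It is only a sketch: layer_dir_from_error is a hypothetical function name, not a docker command, and it assumes the error text has the same shape as the messages shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker): print the overlay layer directory
# named in a "mkdir .../merged/dev/shm: invalid argument" error message.
layer_dir_from_error() {
    # Keep only the path up to and including the "<id>-init" component.
    printf '%s\n' "$1" |
        sed -n 's|.*mkdir \(/.*/overlay/[0-9a-f]*-init\)/merged/dev/shm: invalid argument.*|\1|p'
}

# Example: feed it the last daemon log line, then list the layer directory.
# dir=$(layer_dir_from_error "$(tail -n 1 /var/log/upstart/docker.log)")
# [ -n "$dir" ] && ls -la "$dir"
```

The function prints nothing when the message does not match, so the listing step is skipped for unrelated log lines.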

The configuration is essentially the same on all 3 boxes (apart from the IPs), and they all show the same symptoms. One of the boxes has no internet access and is only on the 192.168.10.0/24 network with the other 2; I put the images on it using docker save <images> | ssh docker load. The same failure on another machine:

root@srv-2:/data/docker/bootstrap# docker run --rm -it ubuntu:latest
root@b031c7a80b8d:/# exit
root@srv-2:/data/docker/bootstrap# docker run --rm -it ubuntu:trusty
docker: Error response from daemon: mkdir /data/docker/mnt/overlay/e37098a0043c2bd200b919c4cd466a1cfe98a03865b08be82efa215e32e92196-init/merged/dev/shm: invalid argument.
See 'docker run --help'.

Describe the results you expected:
A container starting.

Additional information you deem important (e.g. issue happens only occasionally):
I have a big chain of images all based on my own ubuntu image (a modification/add-on built FROM ubuntu:trusty). I rebuilt the entire chain yesterday, so all images (ntp, squid, java-base and its descendants tomcat & karaf, postgresql, rabbitmq, redis, mongodb, ...) share the layers of my ubuntu base, and I pushed them to my private registry (registry:2.3). My first instinct when I failed to run anything earlier today was that I had made a mistake somewhere along that path. However, docker run ubuntu:trusty (layers "Already exists") failed while a newly pulled ubuntu:latest worked, so my next thought was that the official trusty image might be broken.
However, I can run either image on any other daemon, including one with the same Ubuntu 14.04.4, the same 3.19.0-39-generic kernel and the same overlay settings but with the default graph directory. So I'm guessing there is some bug in the graph directory change. Note that there is still a single subfolder with a socket at /var/lib/docker/network/files/.sock that was apparently generated by either compose or the daemon.
df -h and df -i report plenty of free space (1% in use on everything but /boot). The host filesystem is ext4 (Ubuntu), and the partition used for the graph directory is xfs on an LVM volume.
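
Since the graph directory sits on xfs, one check that may be worth running is whether the filesystem was created with ftype=1. This is an assumption, not a confirmed cause: the Docker storage driver documentation describes d_type support (ftype=1 on xfs) as a requirement for overlay. The sketch below reads xfs_info output on stdin. xfs_ftype is a hypothetical helper name, and the mount point in the usage line is a guess that needs replacing with the actual xfs mount.

```shell
#!/bin/sh
# Hypothetical helper: read `xfs_info` output on stdin and print the ftype
# value (0 or 1), or "absent" when the field does not appear at all
# (older xfsprogs releases do not report it).
xfs_ftype() {
    value=$(sed -n 's/.*ftype=\([01]\).*/\1/p' | head -n 1)
    printf '%s\n' "${value:-absent}"
}

# Example (mount point is an assumption; use the xfs mount holding the graph dir):
# xfs_info /data | xfs_ftype
```

A result of 0 or absent would be consistent with the backing filesystem lacking d_type support, while 1 would point the investigation elsewhere.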
