-
Notifications
You must be signed in to change notification settings - Fork 18.9k
Description
BUG REPORT INFORMATION (may be related to #19988)
Output of docker version:
Docker version 1.9.0, build 76d6bc9
# and also
Docker version 1.10.2, build c3959b1
Output of docker info:
# none, I can't reanimate the daemon anymore and all my 230+ containers are basically down
Provide additional environment details (AWS, VirtualBox, physical, etc.):
OpenStack VM, 4 cores, 8 GB mem, Ubuntu 14.04.4 LTS, Linux 2b320682-1875-4548-a1e2-29d15b11b5a8 3.19.0-49-generic #55~14.04.1hf1533043v20160201b1-Ubuntu SMP Mon Feb 1 20:41:00 UT x86_64 x86_64 x86_64 GNU/Linux.
List the steps to reproduce the issue:
Started by script since months without problems, but now failing. Also fails after Docker update and OS update. Here the launch command we use (230+ containers all started with restart policy = always):
docker daemon --api-cors-header=false --debug=false --disable-legacy-registry=false --group vcap --graph /var/vcap/store/docker/docker --host unix:///var/vcap/sys/run/docker/docker.sock --icc=true --ip-forward=true --ip-masq=true --iptables=true --ipv6=false --log-level=info --pidfile /var/vcap/sys/run/docker/docker.pid --selinux-enabled=false --host tcp://0.0.0.0:4243 --tls=false --userland-proxy=true
Describe the results you received:
INFO[0000] API listen on /var/vcap/sys/run/docker/docker.sock
INFO[0000] API listen on [::]:4243
INFO[0000] [graphdriver] using prior storage driver "aufs"
ERRO[0000] getNetworkFromStore for nid d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 failed while trying to build sandbox for cleanup: network d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 not found
ERRO[0000] getEndpointFromStore for eid 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf191e1663f failed while trying to build sandbox for cleanup: could not find endpoint 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf191e1663f: Key not found in store
WARN[0000] Failed detaching sandbox 569c2a0eefb84fb14b3e3c6cbb554163b480331b996e2ef9f7b2fa9a4e033672 from endpoint c71f3a95931e93968f28f0ab27bef203d10aa1d48b2c18b36ca56eaf023fcc10: failed to get network from store during leave: network d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 not found
WARN[0000] Failed deleting endpoint c71f3a95931e93968f28f0ab27bef203d10aa1d48b2c18b36ca56eaf023fcc10: failed to get network during Delete: network d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 not found
WARN[0000] Failed detaching sandbox 569c2a0eefb84fb14b3e3c6cbb554163b480331b996e2ef9f7b2fa9a4e033672 from endpoint 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf191e1663f: failed to get endpoint from store during leave: could not find endpoint 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf19
1e1663f: Key not found in store
WARN[0000] Failed deleting endpoint 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf191e1663f: failed to get endpoint from store during Delete: could not find endpoint 8fc41dcb6bf31938e79b193d01ce84acd7f660d8ae82afb6f31c8bf191e1663f: Key not found in store
ERRO[0000] failed to delete sandbox 569c2a0eefb84fb14b3e3c6cbb554163b480331b996e2ef9f7b2fa9a4e033672 while trying to cleanup: could not cleanup all the endpoints in container 27f42f1c8f2792639d4c96f621408b1551a9121c184f2721b6aa17444c96177f / sandbox 569c2a0eefb84fb14b3e3c6cbb554163b480331b996e2ef9f7b2fa9a4e03
3672
...
ERRO[0025] getNetworkFromStore for nid d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 failed while trying to build sandbox for cleanup: network d48a6206205146ee2e1ca9ae0ce6f5ea1c6b2b1a64d739f771bccfd0ac5970e1 not found [18/6605]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x168 pc=0x87f3d9]
goroutine 1 [running]:
github.com/docker/libnetwork.epHeap.Less(0xc20864d570, 0x2, 0x2, 0x1, 0x0, 0xc208033a70)
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/sandbox.go:968 +0x159
github.com/docker/libnetwork.(*epHeap).Less(0xc208854630, 0x1, 0x0, 0x2)
<autogenerated>:50 +0xb5
container/heap.up(0x7f835b7a4438, 0xc208854630, 0x1)
/usr/local/go/src/container/heap/heap.go:93 +0x5f
container/heap.Push(0x7f835b7a4438, 0xc208854630, 0x13ce9e0, 0xc208033a70)
/usr/local/go/src/container/heap/heap.go:54 +0x7e
github.com/docker/libnetwork.(*controller).sandboxCleanup(0xc2080a0370)
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/sandbox_store.go:226 +0xc06
github.com/docker/libnetwork.New(0xc208765bc0, 0x5, 0x8, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/controller.go:200 +0x6cb
github.com/docker/docker/daemon.(*Daemon).initNetworkController(0xc208071040, 0xc2080eb180, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/daemon/daemon_unix.go:364 +0xfd
github.com/docker/docker/daemon.NewDaemon(0xc2080eb180, 0xc20803a430, 0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/daemon/daemon.go:770 +0x2149
main.(*DaemonCli).CmdDaemon(0xc2084f5240, 0xc20800a020, 0x16, 0x16, 0x0, 0x0)
/go/src/github.com/docker/docker/docker/daemon.go:269 +0x1722
reflect.callMethod(0xc2084d7f50, 0xc2084dbce8)
/usr/local/go/src/reflect/value.go:605 +0x179
reflect.methodValueCall(0xc20800a020, 0x16, 0x16, 0x1, 0xc2084d7f50, 0x0, 0x0, 0xc2084d7f50, 0x0, 0x4dcb9f, ...)
/usr/local/go/src/reflect/asm_amd64.s:29 +0x36
github.com/docker/docker/cli.(*Cli).Run(0xc2084d7e90, 0xc20800a010, 0x17, 0x17, 0x0, 0x0)
/go/src/github.com/docker/docker/cli/cli.go:89 +0x38e
main.main()
/go/src/github.com/docker/docker/docker/docker.go:65 +0x418
goroutine 5 [syscall]:
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:21 +0x1f
created by os/signal.init·1
/usr/local/go/src/os/signal/signal_unix.go:27 +0x35
goroutine 17 [syscall, locked to thread]:
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2232 +0x1
goroutine 12 [chan receive]:
github.com/docker/docker/pkg/listenbuffer.(*defaultListener).Accept(0xc208504380, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/pkg/listenbuffer/buffer.go:73 +0xb6
net/http.(*Server).Serve(0xc2080428a0, 0x7f835b79a568, 0xc208504380, 0x0, 0x0)
/usr/local/go/src/net/http/server.go:1728 +0x92
github.com/docker/docker/api/server.(*HTTPServer).Serve(0xc208504460, 0x0, 0x0)
/go/src/github.com/docker/docker/api/server/server.go:113 +0x4d
github.com/docker/docker/api/server.func·006(0xc208504460)
/go/src/github.com/docker/docker/api/server/server.go:86 +0x163
created by github.com/docker/docker/api/server.(*Server).ServeAPI
/go/src/github.com/docker/docker/api/server/server.go:90 +0x144
goroutine 10 [chan receive]:
github.com/docker/docker/api/server.(*Server).ServeAPI(0xc208037900, 0x0, 0x0)
/go/src/github.com/docker/docker/api/server/server.go:94 +0x1b6
main.func·007()
/go/src/github.com/docker/docker/docker/daemon.go:255 +0x3b
created by main.(*DaemonCli).CmdDaemon
/go/src/github.com/docker/docker/docker/daemon.go:261 +0x1571
goroutine 11 [chan receive]:
github.com/docker/docker/daemon.func·030()
/go/src/github.com/docker/docker/daemon/debugtrap_unix.go:17 +0x5c
created by github.com/docker/docker/daemon.setupDumpStackTrap
/go/src/github.com/docker/docker/daemon/debugtrap_unix.go:20 +0x18e
goroutine 13 [chan receive]:
github.com/docker/docker/pkg/listenbuffer.(*defaultListener).Accept(0xc2085044a0, 0x0, 0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/pkg/listenbuffer/buffer.go:73 +0xb6
net/http.(*Server).Serve(0xc208042a20, 0x7f835b79a568, 0xc2085044a0, 0x0, 0x0)
/usr/local/go/src/net/http/server.go:1728 +0x92
github.com/docker/docker/api/server.(*HTTPServer).Serve(0xc208504580, 0x0, 0x0)
/go/src/github.com/docker/docker/api/server/server.go:113 +0x4d
github.com/docker/docker/api/server.func·006(0xc208504580)
/go/src/github.com/docker/docker/api/server/server.go:86 +0x163
created by github.com/docker/docker/api/server.(*Server).ServeAPI
/go/src/github.com/docker/docker/api/server/server.go:90 +0x144
goroutine 14 [select]:
github.com/docker/libnetwork.(*controller).watchLoop(0xc2080a0370)
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/store.go:399 +0x13f
created by github.com/docker/libnetwork.(*controller).startWatch
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/store.go:413 +0xf0
goroutine 39 [select]:
github.com/docker/libnetwork/osl.removeUnusedPaths()
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/osl/namespace_linux.go:73 +0x48b
created by github.com/docker/libnetwork/osl.createBasePath
/go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/osl/namespace_linux.go:58 +0xb1
Describe the results you expected:
Daemon starts, all containers start.
Provide additional info you think is important:
We haven't restarted the Docker daemon for two weeks. Maybe there are too many containers deployed now? But we had already daemons with 300+ containers running.
Maybe something went wrong during shutdown/the VM was terminated? The VM gets automatically recreated/reprovisioned and the former volume gets reattached. Could there by some issues with a hard kill?
In any case, the crash when relaunching the daemon doesn't look too good/like a real issue.
Do you have short term remedy proposals? Is there a way to force a cleanup? Or some way to start the daemon, but not all the containers (all were started with restart policy = always)?
In short, we see tons of:
* getNetworkFromStore for nid failed while trying to build sandbox for cleanup: network not found
* failed getting network for ep during sandbox delete: network not found
* failed to delete sandbox while trying to cleanup: could not cleanup all the endpoints in container / sandbox
... and then it crashes in:
- github.com/docker/libnetwork.epHeap.Less (Docker v1.9) or
- github.com/docker/libnetwork.(*sandbox).clearNetworkResources (Docker 1.10)