Closed
Labels
area/networking, kind/bug, status/confirmed, version/25.0
Description
I'm polling the Docker network endpoint /networks/my_network?verbose=true once every second. After some time (it can take many days), the Docker daemon crashes with the following error in the logs:
fatal error: concurrent map read and map write
goroutine 29805570 [running]:
github.com/docker/docker/libnetwork/networkdb.(*NetworkDB).GetTableByNetwork(0xc0012987e0, {0x8c54cc, 0x12}, {0xc001380ce0, 0x19})
/go/src/github.com/docker/docker/libnetwork/networkdb/networkdb.go:426 +0x69
github.com/docker/docker/libnetwork.(*Network).Services(0xc0020a8e00)
/go/src/github.com/docker/docker/libnetwork/agent.go:497 +0x551
github.com/docker/docker/daemon.buildServiceAttachments(0x455620?)
/go/src/github.com/docker/docker/daemon/network.go:653 +0x3f
github.com/docker/docker/daemon.(*Daemon).GetNetworks(0x44f4a0?, {0xc0034afaa0?}, {0xaf?, 0xad?})
/go/src/github.com/docker/docker/daemon/network.go:595 +0x46d
github.com/docker/docker/api/server/router/network.(*networkRouter).getNetwork(0xc0018b20c0, {0x8c913a?, 0x13?}, {0xc8b2a0, 0xc000744d20}, 0xc00110e900, 0xc0034af740?)
/go/src/github.com/docker/docker/api/server/router/network/network_routes.go:121 +0x710
github.com/docker/docker/api/server/middleware.(*ExperimentalMiddleware).WrapHandler.ExperimentalMiddleware.WrapHandler.func1({0xc94520, 0xc0034af9b0}, {0xc8b2a0?, 0xc000744d20?}, 0x373de0?, 0xc000d98870?)
/go/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0xb4
github.com/docker/docker/api/server/middleware.(*VersionMiddleware).WrapHandler.VersionMiddleware.WrapHandler.func1({0xc94520, 0xc0034af8c0}, {0xc8b2a0, 0xc000744d20}, 0x40?, 0x40?)
/go/src/github.com/docker/docker/api/server/middleware/version.go:62 +0x2ae
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1({0xc94520, 0xc0034af8c0}, {0xc8b2a0?, 0xc000744d20?}, 0xc00110e900, 0x3b13460?)
/go/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0x683
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1({0xc8b2a0, 0xc000744d20}, 0xc00110e700)
/go/src/github.com/docker/docker/api/server/server.go:55 +0x1c3
net/http.HandlerFunc.ServeHTTP(0xc94520?, {0xc8b2a0?, 0xc000744d20?}, 0xc61318?)
/usr/local/go/src/net/http/server.go:2136 +0x29
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP(0xc00218af20, {0xc831d0?, 0xc002c58fc0}, 0xc00110e500, {0xc6a080, 0xc001a2b470})
/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:217 +0x1202
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1({0xc831d0?, 0xc002c58fc0?}, 0xc5ae01?)
/go/src/github.com/docker/docker/vendor/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/handler.go:81 +0x35
net/http.HandlerFunc.ServeHTTP(0xc94520?, {0xc831d0?, 0xc002c58fc0?}, 0xc5ae28?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.HandlerFunc.ServeHTTP(0xc00110e400?, {0xc831d0?, 0xc002c58fc0?}, 0x7fcd4c6c7df0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0018903c0, {0xc831d0, 0xc002c58fc0}, 0xc00110e300)
/go/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:212 +0x1c5
net/http.serverHandler.ServeHTTP({0xc0018bd6e0?}, {0xc831d0?, 0xc002c58fc0?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc001f94000, {0xc94520, 0xc0011e0e70})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 796
/usr/local/go/src/net/http/server.go:3086 +0x5cb
goroutine 1 [semacquire, 22798 minutes, locked to thread]:
sync.runtime_Semacquire(0xc000f08a80?)
/usr/local/go/src/runtime/sema.go:62 +0x25
sync.(*WaitGroup).Wait(0xc000d9c080?)
/usr/local/go/src/sync/waitgroup.go:116 +0x48
main.(*DaemonCli).start(0xc000d9c080, 0xc0006cbf00)
/go/src/github.com/docker/docker/cmd/dockerd/daemon.go:350 +0x1cf7
main.runDaemon(...)
/go/src/github.com/docker/docker/cmd/dockerd/docker_unix.go:13
main.newDaemonCommand.func1(0xc000d88100?, {0xc0001f00e0?, 0x7?, 0x88f9be?})
/go/src/github.com/docker/docker/cmd/dockerd/docker.go:37 +0x94
github.com/spf13/cobra.(*Command).execute(0xc000ad3800, {0xc000052100, 0xe, 0xe})
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:983 +0xabc
github.com/spf13/cobra.(*Command).ExecuteC(0xc000ad3800)
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
/go/src/github.com/docker/docker/vendor/github.com/spf13/cobra/command.go:1039
main.main()
/go/src/github.com/docker/docker/cmd/dockerd/docker.go:106 +0x17b
...
<snipped, plenty more here, let me know if you're interested>
I'm also polling some other endpoints (/services, /tasks and /nodes) every 10s, unsure if that's related.
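For context, "concurrent map read and map write" is the Go runtime's fatal error for unsynchronized access to a plain map, and it aborts the whole process rather than raising a recoverable panic. Below is a minimal sketch of the usual way to guard such a map with a sync.RWMutex. The tableStore type and its methods are hypothetical and only illustrate the pattern; this is not the actual NetworkDB code, and I haven't verified which lock is missing around networkdb.go:426.

```go
package main

import (
	"fmt"
	"sync"
)

// tableStore is a hypothetical stand-in for a shared lookup table.
// It is NOT the real libnetwork NetworkDB type.
type tableStore struct {
	mu      sync.RWMutex
	entries map[string]string
}

func newTableStore() *tableStore {
	return &tableStore{entries: make(map[string]string)}
}

// Set writes one entry while holding the exclusive lock.
func (s *tableStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[key] = value
}

// Snapshot copies the map while holding the read lock, so callers can
// iterate over the result without racing against writers.
func (s *tableStore) Snapshot() map[string]string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[string]string, len(s.entries))
	for k, v := range s.entries {
		out[k] = v
	}
	return out
}

func main() {
	s := newTableStore()
	var wg sync.WaitGroup
	wg.Add(2)

	// Writer goroutine: stands in for background table updates.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			s.Set(fmt.Sprintf("ep-%d", i), "10.0.0.1")
		}
	}()

	// Reader goroutine: stands in for an API handler reading the table.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = s.Snapshot()
		}
	}()

	wg.Wait()
	fmt.Println("entries:", len(s.Snapshot()))
}
```

Without the lock in Snapshot, the same program can intermittently die with the fatal error shown above, which would match a crash that only shows up after days of polling.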
Reproduce
I'm calling the endpoint directly from a script, but I guess docker network inspect --verbose x could trigger it as well.
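A rough sketch of such a polling script, in Go. It assumes the daemon listens on the default Unix socket /var/run/docker.sock and that the network is called my_network; the helper names (newUnixClient, inspectURL, pollOnce) are made up for this example. The loop is bounded and stops on the first error so the sketch terminates; for a real reproduction the bound would need to be raised or removed, since the crash can take days to appear.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/url"
	"os"
	"time"
)

// newUnixClient returns an HTTP client that always dials the given Unix
// socket, ignoring the host part of the request URL.
func newUnixClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
		Timeout: 10 * time.Second,
	}
}

// inspectURL builds the verbose network inspect URL. The host "docker" is a
// placeholder; the connection goes through the Unix socket.
func inspectURL(network string) string {
	return "http://docker/networks/" + url.PathEscape(network) + "?verbose=true"
}

// pollOnce performs a single GET, drains the body, and returns the status code.
func pollOnce(client *http.Client, target string) (int, error) {
	resp, err := client.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, err
}

func main() {
	client := newUnixClient("/var/run/docker.sock") // assumed default socket path
	target := inspectURL("my_network")

	const maxPolls = 5 // bounded for the sketch; raise for a long-running test
	for i := 0; i < maxPolls; i++ {
		status, err := pollOnce(client, target)
		if err != nil {
			fmt.Fprintln(os.Stderr, "poll failed:", err)
			return
		}
		fmt.Println("status:", status)
		time.Sleep(time.Second) // one request per second, as in the report
	}
}
```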
Expected behavior
The daemon should not crash.
docker version
Client:
Version: 25.0.5
API version: 1.44
Go version: go1.21.8
Git commit: 5dc9bcc
Built: Tue Mar 19 15:04:17 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 25.0.5
API version: 1.44 (minimum version 1.24)
Go version: go1.21.8
Git commit: e63daec
Built: Tue Mar 19 15:05:39 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.13
GitCommit: 7c3aca7a610df76212171d200ca3811ff6096eb8
runc:
Version: 1.1.12
GitCommit: v1.1.12-0-g51d5e94
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 25.0.5
Context: default
Debug Mode: false
Server:
Containers: 33
Running: 9
Paused: 0
Stopped: 24
Images: 69
Server Version: 25.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: wsfcvwmwg65zho2l63hon1wsb
Is Manager: false
Node Address: 10.0.1.33
Manager Addresses:
10.0.1.31:2377
Runtimes: runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
runc version: v1.1.12-0-g51d5e94
init version: de40ad0
Security Options:
seccomp
Profile: builtin
Kernel Version: 6.6.33-0-lts
Operating System: Alpine Linux v3.19
OSType: linux
Architecture: x86_64
CPUs: 40
Total Memory: 31.28GiB
Name: x
ID: 84d016f3-5708-4b31-9437-3d147ee97829
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: x
Experimental: false
Insecure Registries:
10.0.1.31:5000
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
Additional Info
No response