Closed
Description
I've set up HAProxy to load-balance services (following the manuals, I set dns: "127.0.0.11" so that requests are not forwarded to the external DNS servers) and noticed hundreds of errors per second in syslog whenever a backend service goes down:
Apr 14 15:51:08 staging-manager1 dockerd[2653338]: time="2024-04-14T15:51:08.083971294Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:127.0.0.1:35393" dns-server="udp:127.0.0.11:53" error="read udp 127.0.0.1:35393->127.0.0.11:53: i/o timeout" question=";tasks.mon_prometheus.\tIN\t A" spanID=0a662e24539c4e08 traceID=3e0e421519bb2e7dcc60adf180880fb7
How can I avoid this log pollution without loading the external DNS servers with queries for a service that is down?
Reproduce
- create stack file
docker-compose.yml:
version: '3.8'
services:
  dnstest:
    image: nicolaka/netshoot:v0.12
    dns: 127.0.0.11
    entrypoint:
      - sh
      - -c
      - 'while :; do dig non-existing; sleep 1; done'
- deploy by running
docker stack deploy -c docker-compose.yml dnstest
- examine syslog flooded by errors
tail -n10 -f /var/log/syslog
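To put a number on the flood rather than eyeballing tail, something like the following might help. It is only a rough sketch: it assumes the syslog line format shown in the description (month, day and HH:MM:SS as the first three fields), and count_resolver_errors is a hypothetical helper name, not part of Docker.

```shell
#!/bin/sh
# Hypothetical helper: count resolver errors per second in a syslog file.
# Assumes each line starts with "Mon DD HH:MM:SS", as in the sample above.
count_resolver_errors() {
    logfile="${1:-/var/log/syslog}"
    awk '
    /failed to query external DNS server/ {
        # Group on the syslog timestamp (one-second resolution).
        key = $1 " " $2 " " $3
        if (!(key in count)) order[++n] = key
        count[key]++
    }
    END {
        # Print "<count> <timestamp>" in the order timestamps first appeared.
        for (i = 1; i <= n; i++) print count[order[i]], order[i]
    }
    ' "$logfile"
}
```

After sourcing the function, running count_resolver_errors /var/log/syslog prints one line per second containing the number of matching errors followed by the timestamp.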
Expected behavior
Logs should not be filled with hundreds of errors from querying a down service when the DNS resolvers are limited to the single address 127.0.0.11.
docker version
Client: Docker Engine - Community
Version: 26.0.0
API version: 1.45
Go version: go1.21.8
Git commit: 2ae903e
Built: Wed Mar 20 15:17:48 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 26.0.0
API version: 1.45 (minimum version 1.24)
Go version: go1.21.8
Git commit: 8b79278
Built: Wed Mar 20 15:17:48 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.28
GitCommit: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
runc:
Version: 1.1.12
GitCommit: v1.1.12-0-g51d5e94
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client: Docker Engine - Community
Version: 26.0.0
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.13.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.17.2
Path: /root/.docker/cli-plugins/docker-compose
Server:
Containers: 51
Running: 35
Paused: 0
Stopped: 16
Images: 92
Server Version: 26.0.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: fluentd
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: ye1tqwlj7lag2sy839fce03ca
Is Manager: true
ClusterID: kwpn459kfqifaedft9c6naknp
Managers: 1
Nodes: 3
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.1.2
Manager Addresses:
192.168.1.2:2377
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
runc version: v1.1.12-0-g51d5e94
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
no-new-privileges
Kernel Version: 5.15.0-101-generic
Operating System: Ubuntu 22.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.82GiB
Name: staging-manager1
ID: c92b7ce2-fc57-487d-8b93-6b85847c857b
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://*****:5010/
Live Restore Enabled: false
Additional Info
Initially I was on v25; upgrading to v26 did not help.
I've opened a HAProxy issue, but it seems to be a Docker edge case.
root@0b11c7683e24:/# dig mon_prometheus
;; communications error to 127.0.0.11#53: timed out
;; communications error to 127.0.0.11#53: timed out
;; communications error to 127.0.0.11#53: timed out
; <<>> DiG 9.18.24-1-Debian <<>> mon_prometheus
;; global options: +cmd
;; no servers could be reached
Here is the output of tcpdump -v -i lo udp:
tcpdump-any-port.txt
I also ran nslookup without overriding dns and got:
/ # nslookup mon_prometheus
Server: 127.0.0.11
Address: 127.0.0.11:53
** server can't find mon_prometheus: NXDOMAIN
** server can't find mon_prometheus: NXDOMAIN
OS: DigitalOcean image "Docker 25.0.3 on Ubuntu 22.04"