Description
Hello,
Sometimes my Docker Swarm cluster doesn't update the weights of IPVS destinations during a deployment. It should set weight=0 during a rollout, but it skips some endpoints, so I end up with a wrong load-balancer configuration and errors during the deployment:
```
### Old containers are waiting for graceful shutdown
-> 10.0.108.161:0 Masq 0 0 0
-> 10.0.108.162:0 Masq 1 0 3 <- Problem
-> 10.0.108.163:0 Masq 0 0 0
-> 10.0.108.164:0 Masq 0 0 0
### New containers are ready
-> 10.0.108.165:0 Masq 1 0 4
-> 10.0.108.166:0 Masq 1 0 2
-> 10.0.108.167:0 Masq 1 0 4
-> 10.0.108.168:0 Masq 1 0 5
```
The more containers in the service and the more network peers, the greater the chance of hitting the error. On services with 32 replicas spread over 20 nodes it reproduces 100% of the time.
I started investigating, and it looks like there is a race condition between the BulkSync procedure and the logic of the DisableService method.
Let me explain with an example:
1. NodeA disabled EndpointA via the DisableService method ( https://github.com/moby/libnetwork/blob/fa125a3512ee0f6187721c88582bf8c4378bd4d7/sandbox.go#L712 ), which set ServiceDisabled=true on the endpoint record locally, increased the LTime, and sent an UPDATE event to all network peers.
2. NodeA started a BulkSync with NodeB.
3. NodeB received the updated value of EndpointA from the bulkSync procedure, processed it as a CREATE event, and saved it in its local state.
4. NodeB received the UPDATE event from step 1 but ignored it, since EndpointA with the same LTime already exists in networkdb ( https://github.com/moby/libnetwork/blob/d0951081b35fa4216fc4f0064bf065beeb55a74b/networkdb/delegate.go#L174 ).
5. NodeA stopped the container, removed the endpoint, and sent a DELETE event to all network peers.
6. NodeB received the DELETE event and removed the endpoint from IPVS.
Normally, UPDATE event processing sets weight=0 to disable the endpoint, but CREATE event processing has no logic for the ServiceDisabled=true flag. Since the UPDATE event was ignored, the IPVS destination kept weight=1 until the record was deleted.
In such situations the logs show two "handleEpTableEvent ADD" entries for a single endpoint, because bulkSync processes the record as a newly added endpoint:
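The interaction above can be sketched in a few lines of Go. This is a hypothetical simplification, not the real networkdb code: `entry`, `store`, `handleCreate`, and `handleUpdate` are invented names that only model the LTime comparison and the fact that the CREATE path never reads ServiceDisabled.

```go
package main

import "fmt"

// entry models a networkdb table record: one Lamport time (LTime) plus the
// ServiceDisabled flag carried in the endpoint record.
type entry struct {
	ltime    uint64
	disabled bool
}

// store models one node's view: table entries plus the IPVS weight the
// load balancer would program for each endpoint.
type store struct {
	entries map[string]entry
	weights map[string]int
}

// handleCreate models bulk-sync delivery: it installs the record but, like
// the real CREATE path, never looks at ServiceDisabled.
func (s *store) handleCreate(key string, e entry) {
	if cur, ok := s.entries[key]; ok && cur.ltime >= e.ltime {
		return // stale or duplicate event
	}
	s.entries[key] = e
	s.weights[key] = 1 // endpoint added with full weight
}

// handleUpdate models gossip delivery: it would zero the weight of a
// disabled endpoint, but only if the event survives the LTime check.
func (s *store) handleUpdate(key string, e entry) {
	if cur, ok := s.entries[key]; ok && cur.ltime >= e.ltime {
		return // same LTime already seen via bulk sync -> event dropped
	}
	s.entries[key] = e
	if e.disabled {
		s.weights[key] = 0
	}
}

func main() {
	s := &store{entries: map[string]entry{}, weights: map[string]int{}}

	// Steady state: endpoint known with some older LTime.
	s.handleCreate("epA", entry{ltime: 100})

	// NodeA disables the endpoint and bumps the LTime, but bulk sync
	// delivers the record first, as a CREATE.
	s.handleCreate("epA", entry{ltime: 104, disabled: true})

	// The gossip UPDATE for the same change arrives with the same LTime
	// and is dropped, so the weight is never zeroed.
	s.handleUpdate("epA", entry{ltime: 104, disabled: true})

	fmt.Println(s.weights["epA"]) // prints 1: endpoint stays active in IPVS
}
```

With these assumptions, the endpoint keeps weight 1 until the DELETE event arrives, which matches the behaviour seen in the logs below.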
```
time="2021-01-19T06:24:16.721195937Z" level=debug msg="handleEpTableEvent ADD 96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6 R:{test-app_php-fpm.3.y8k0h8wumjhbvrsqd8w66s1wy test-app_php-fpm 21ivc1mb008pv9gu2if59go9k 10.0.108.8 10.0.108.72 [] [php-fpm] [fcec15cea6a3] false}"
time="2021-01-19T06:39:50.899346960Z" level=debug msg="handleEpTableEvent ADD 96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6 R:{test-app_php-fpm.3.y8k0h8wumjhbvrsqd8w66s1wy test-app_php-fpm 21ivc1mb008pv9gu2if59go9k 10.0.108.8 10.0.108.72 [] [php-fpm] [fcec15cea6a3] true}"
time="2021-01-19T06:40:51.504938421Z" level=debug msg="handleEpTableEvent DEL 96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6 R:{test-app_php-fpm.3.y8k0h8wumjhbvrsqd8w66s1wy test-app_php-fpm 21ivc1mb008pv9gu2if59go9k 10.0.108.8 10.0.108.72 [] [php-fpm] [fcec15cea6a3] true}"
```
I also collected the tableEvents log from a node. Endpoint 96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6 never changed its weight to 0 in IPVS, and the service was unstable during the deployment:
```
## Common events related to the online endpoint, received from bulk sync
2021-01-19T06:38:10.886180117Z {Type:CREATE,LTime:93557892,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:38:40.905904777Z {Type:CREATE,LTime:93557892,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:39:10.885634958Z {Type:CREATE,LTime:93557892,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:39:41.193478062Z {Type:CREATE,LTime:93557892,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
## Deploy started; node 1b906f7d2be0 set ServiceDisabled=true and updated the LTime of the endpoint record
## The CREATE event from bulk sync arrived earlier than the UPDATE event from gossip
2021-01-19T06:39:50.899209788Z {Type:CREATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
## The CREATE event ignored the ServiceDisabled=true flag and didn't update the IPVS table
## The following events are ignored because the LTime is the same as before;
## during these events the endpoint still has weight 1 in the IPVS table
2021-01-19T06:39:51.372175542Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:39:51.520065112Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:39:52.240751438Z {Type:CREATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:39:53.174523912Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:01.415950175Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:01.920180779Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:02.098081468Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:05.270438843Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:05.953138848Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:06.920765495Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:07.142009496Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:08.304224480Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:10.999052920Z {Type:CREATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:11.372201674Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:13.022040640Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:14.097906052Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:16.670842605Z {Type:UPDATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
2021-01-19T06:40:42.339699998Z {Type:CREATE,LTime:93561596,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:0}
## The application started to see timeouts and connection problems, because the container is already DOWN but still has weight 1 in IPVS
## The container shut down, the LTime changed, and the record disappeared from IPVS
2021-01-19T06:40:51.504647244Z {Type:DELETE,LTime:93561883,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:1800}
2021-01-19T06:40:51.520386920Z {Type:DELETE,LTime:93561883,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:1800}
2021-01-19T06:40:51.710730260Z {Type:DELETE,LTime:93561883,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:1800}
2021-01-19T06:40:51.791670427Z {Type:DELETE,LTime:93561883,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:1800}
2021-01-19T06:40:51.992225524Z {Type:DELETE,LTime:93561883,NodeName:1b906f7d2be0,NetworkID:f312pn3rfujkpupqzeyumtvpm,TableName:endpoint_table,Key:96309316d9ca369cfae8bd02f9ecfa4cc468e4fd60cd0e33863ec33ef4a0bdd6,ResidualReapTime:1800}
...
```
Can you help me figure out the problem?
Output of `docker version`:

```
Client: Docker Engine - Community
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        48a66213fe
 Built:             Mon Jun 22 15:45:52 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:44:23 2020
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
```
Output of `docker info`:

```
Client:
 Debug Mode: false

Server:
 Containers: 22
  Running: 17
  Paused: 0
  Stopped: 5
 Images: 20
 Server Version: 19.03.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: mn8g3iocbtq39mw8wtp1hqry6
  Is Manager: true
  ClusterID: bknezdiqnpzy2nwj4vj0imxwz
  Managers: 3
  Nodes: 25
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.10.10.1
  Manager Addresses:
   10.10.10.1:2377
   10.10.10.2:2377
   10.10.10.3:2377
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.19.0-0.bpo.8-amd64
 Operating System: Debian GNU/Linux 9 (stretch)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.24GiB
 Name: s88-swarm-manager-03
 ID: QJN7:73T3:BB2Q:ADZD:J36N:EXDQ:IUFF:73JN:3R6C:SBU7:6JOJ:S7NP
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 492
  Goroutines: 745
  System Time: 2021-01-19T11:54:20.986673668Z
  EventsListeners: 19
 Registry: https://index.docker.io/v1/
 Labels:
  provider=generic
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
```
Additional environment details (AWS, VirtualBox, physical, etc.):
Physical cluster with 22 workers and 3 managers.