Node LB not deleted after container exits itself (cause IP overlapping) #40989

@xinfengliu

Description

Since docker 18.09, there's a node LB per network on the node. When the last container of the network is finished, the node LB is removed.

This works correctly when the swarm task container is explicitly shut down. However, when the task container exits on its own (either because the task completed or because it failed), the node LB is NOT removed. This is dangerous because the stale node LB keeps its IP address, which can lead to IP overlapping.

Steps to reproduce the issue:
On a simple 1-node swarmkit cluster:

  1. Create a very small network so that IP overlapping happens quickly
    $ docker network create -d overlay --subnet 10.6.6.0/29 net0

  2. Create a simple service that will complete by itself.
    $ docker service create --network net0 --name bb --restart-max-attempts=1 --restart-delay=10s busybox sleep 5

  3. Wait 30s for the service tasks to complete. At this point no container is running, but the node LB is still there.
    $ docker network inspect --format '{{range .Containers}}{{printf "%s\t%s\n" .Name .IPv4Address}}{{end}}' net0
    net0-endpoint	10.6.6.4/29

  4. Now try to trigger an IP overlap
    $ docker service scale bb=3
    $ docker service ps bb --no-trunc

You will see "starting container failed: Address already in use"
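
To make the "no container running, but node LB still there" check less manual, here is a minimal sketch of a helper that reads the `docker network inspect` output from the steps above and reports whether the node LB is the only entry left. The function name `only_node_lb_left` is hypothetical, and it assumes the node LB always shows up as `<network>-endpoint` (as `net0-endpoint` does in the output above) and that no task container has a name ending in `-endpoint`.

```shell
#!/bin/sh
# Hypothetical helper, not part of docker.
# Reads "name<TAB>address" lines on stdin, i.e. the format printed by
#   docker network inspect --format '{{range .Containers}}{{printf "%s\t%s\n" .Name .IPv4Address}}{{end}}' net0
# Exits 0 if there is at least one entry and every entry's name ends in
# "-endpoint" (assumed to be the node LB); exits 1 otherwise.
only_node_lb_left() {
    awk -F '\t' '
        NF == 0 { next }              # ignore blank lines
        { total++ }                   # count every listed endpoint
        $1 ~ /-endpoint$/ { lb++ }    # count entries that look like the node LB
        END { exit !(total > 0 && lb == total) }
    '
}
```

Piping the inspect command for `net0` into `only_node_lb_left && echo "stale node LB on net0"` after step 3 should print the message when the issue is present. If the node LB had been removed, the list would be empty and the function would exit 1.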

Describe the results you received:
The node LB is not deleted, which can cause IP overlapping later.

Describe the results you expected:
The node LB should be deleted when the last container on this network on this node exits.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:
I can reproduce this issue on any docker 18.09 and 19.03 version.
