Container continues to run even though swarm reports it's gone already #34574

@EugeneKostrikov

Description

Service status reports are inconsistent. After a service is scaled down to 0 replicas and the update completes, its container might still be running. By "update completes" I mean that UpdateStatus is either not set at all or reports "completed", AND all tasks fetched through the API have DesiredState == Status.State. The example below is much simpler and doesn't check the tasks, but it shows the same problem: docker service ps top indicates the task is already stopped while the container is still running.
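
The task check described above could be sketched roughly like this. It is only an illustration: the helper name tasks_converged is made up for this example, and it assumes its input comes from docker service ps with a --format template that prints DesiredState and CurrentState separated by "|".

```shell
#!/bin/sh
# tasks_converged (hypothetical helper): reads "Desired|Current ..." lines on
# stdin and succeeds only if every task's current state matches its desired
# state. An empty task list counts as converged.
tasks_converged() {
    while IFS='|' read -r desired current || [ -n "$desired" ]; do
        [ -z "$desired" ] && continue
        # CurrentState looks like "Running 5 seconds ago"; keep the first word.
        state=${current%% *}
        # Compare case-insensitively.
        d=$(printf '%s' "$desired" | tr '[:upper:]' '[:lower:]')
        s=$(printf '%s' "$state" | tr '[:upper:]' '[:lower:]')
        [ "$d" = "$s" ] || return 1
    done
    return 0
}
```

It would be used as: docker service ps --format '{{.DesiredState}}|{{.CurrentState}}' top | tasks_converged. The point of this report is that such a check can succeed while the container is still running.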

Eventually this floating container dies. But this is critical for automated tests, as there's no way to tell when the container has actually stopped and been started again with the new configuration. In my real use case there's a web server in the container, and it stays reachable through the ingress port. So tests get responses with outdated configuration from a phantom process that should not exist :)

It's easiest to reproduce with a container that doesn't handle SIGTERM correctly, so that it takes an additional 10 seconds to actually kill it.

Steps to reproduce the issue:

docker service create --name top --replicas 1 --constraint 'node.role == manager' --detach=false busybox top
echo "service ps:"
docker service ps top
echo "containers ps:"
docker ps | grep top
echo "scaling down:"
docker service scale top=0
echo "service ps: "
docker service ps top
echo "containers ps: "
docker ps | grep top
docker service rm top
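
As a possible workaround for tests, one could poll the node's container list instead of trusting the service status. The sketch below is an assumption-laden illustration, not a fix: the function name wait_for_service_containers_gone is hypothetical, it only sees containers on the node where it runs (fine for the single-manager setup above), and it relies on the com.docker.swarm.service.name label that swarm sets on service containers.

```shell
#!/bin/sh
# wait_for_service_containers_gone SERVICE [TIMEOUT_SECONDS]  (hypothetical helper)
# Polls `docker ps` on the local node until no running container carries the
# swarm service label for SERVICE, or until the timeout (default 30s) expires.
wait_for_service_containers_gone() {
    service=$1
    timeout=${2:-30}
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # -q prints only container IDs; empty output means nothing is running.
        ids=$(docker ps -q --filter "label=com.docker.swarm.service.name=$service")
        if [ -z "$ids" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Containers were still present when the timeout ran out.
    return 1
}
```

In the steps above it would go right after the scale-down, e.g. wait_for_service_containers_gone top 30, before the test continues.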

Describe the results you received:

The second "service ps:" output is empty, as if no containers are running for the service. All tasks for the service have DesiredState == Status.State. If the container runs a web server, it's still accessible at the service's ingress port.

Describe the results you expected:

The second "service ps:" output reports:

  • Desired state - stopped
  • Current state - running

Tasks for the service do not have DesiredState == Status.State until the underlying container has actually stopped. The container is not accessible at the service's port.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

docker version
Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        unknown-buildtime
 OS/Arch:      darwin/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:51:55 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 15
 Running: 9
 Paused: 0
 Stopped: 6
Images: 444
Server Version: 17.06.0-ce
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 300
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: mbtc6arenmce7jopouno9509b
 Is Manager: true
 ClusterID: 5e0jdpw9ban7f1f35gnm8x1ns
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Root Rotation In Progress: false
 Node Address: 192.168.99.104
 Manager Addresses:
  192.168.99.104:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.4.74-boot2docker
Operating System: Boot2Docker 17.06.0-ce (TCL 7.2); HEAD : 0672754 - Thu Jun 29 00:06:31 UTC 2017
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 2.936GiB
Name: apps0
ID: 6JP4:LPJJ:AUYA:XZ3V:MFYK:LYNN:DMKK:DCDU:5N75:LOY6:TNVD:XSO4
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 115
 Goroutines: 236
 System Time: 2017-08-20T09:46:08.864326557Z
 EventsListeners: 6
Username: flyvictor
Registry: https://index.docker.io/v1/
Labels:
 provider=virtualbox
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
docker-machine running in VirtualBox on mac.


Labels: area/swarm, kind/bug, version/17.06
