[Proposal] Handle "docker run --rm" inside the daemon instead of the client #16575

@codablock

The problem

Currently, docker handles "--rm" on the client side: when the client exits, it makes the API calls that ask the daemon to remove the container. This works great when you work locally in a simple setup, but under some circumstances it leaves "garbage" containers around forever. This can happen, for example, when:

  • You connect through the TCP endpoints, either directly or through a tunnel, and lose the connection (see "ssh: docker run --rm", #16546, for an example)
  • Your docker client crashes
  • Your docker client gets killed for whatever reason

The problem in our environment (CI)

The last point in particular hits me hard at the moment. We currently implement CI builds based on docker. The CI system calls a simple shell script, which builds the "builder" image and then runs it. The "builder" container is meant to be temporary and should vanish when the build finishes or is forcefully stopped (e.g. from the CI system's UI). We tried to use "run --rm" to accomplish this, which works fine when the build exits by itself. When we stop the build externally, the CI system simply kills the script, which causes the "docker run --rm" process to die as well but leaves the container running until it exits by itself. This leaves us with 2 problems:

  • The container is still running. This is already bad, and it gets worse if one of our builds gets stuck, because the container then runs forever.
  • The container won't be deleted on completion.

We also want to spawn temporary containers (e.g. databases for testing) on the host docker daemon from inside the "builder" container, which makes the problem even worse, as all these containers would live forever.

Proposal/Feature request

When using "docker run", it should be possible to tell the docker daemon what to do if it loses the connection to the docker client. For example, we should be able to instruct the daemon to stop or kill the container, and optionally rm it, on connection loss. This could for example look like:

docker run -ti --on-connection-loss=stop,rm myimage mycommand
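To make the shape of the proposed flag value concrete, here is a minimal Python sketch of how a value such as "stop,rm" might be validated and turned into an ordered list of actions. The flag itself is only a proposal, and everything in the sketch is an assumption made for illustration: the function name `parse_on_connection_loss`, the allowed action names, and the rules that "stop" and "kill" exclude each other and that "rm" must come last.

```python
from typing import List

# Assumed action names, taken from the example flag value in this proposal.
ALLOWED_ACTIONS = ("stop", "kill", "rm")


def parse_on_connection_loss(value: str) -> List[str]:
    """Parse a hypothetical --on-connection-loss value into ordered actions."""
    actions = [part.strip() for part in value.split(",") if part.strip()]
    if not actions:
        raise ValueError("at least one action is required")
    for action in actions:
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
    if len(set(actions)) != len(actions):
        raise ValueError("each action may appear only once")
    # Assumption: stopping and killing are alternatives, not a sequence.
    if "stop" in actions and "kill" in actions:
        raise ValueError("'stop' and 'kill' are mutually exclusive")
    # Assumption: removal only makes sense as the final step.
    if "rm" in actions and actions[-1] != "rm":
        raise ValueError("'rm' must be the last action")
    return actions
```

With these assumed rules, "stop,rm" and "kill" are accepted, while "rm,stop" and "stop,kill" are rejected.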

I don't know much about the daemon internals or the API, so I cannot really say how to implement it in a good way. One thing that I could imagine is an API call that says "do action X with name Y after Z seconds", where a later call to the same API with the same name Y would overwrite the old action and timeout Z. This would be comparable to TTL in services like etcd. The client could then do the following for the above "docker run" example:

  • Create and start the container as usual
  • After start, do an API call that says "stop and rm container after 30 seconds"
  • Wait 15 seconds and do the API call again. Repeat forever, until exit.
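The steps above can be sketched as a small in-memory model. This is not how the docker daemon works and none of these names exist in its API; `LeaseRegistry`, `FakeClock`, and `simulate` are hypothetical, and the clock is simulated so the example runs instantly. It only illustrates the "same name overwrites the old action and timeout" idea and what happens once the client stops renewing.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Lease:
    action: Callable[[], None]  # what to run when the lease expires
    deadline: float             # absolute expiry time on the registry clock


class LeaseRegistry:
    """Hypothetical daemon-side table of 'do X with name Y after Z seconds'."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._leases: Dict[str, Lease] = {}

    def put(self, name: str, action: Callable[[], None], ttl: float) -> None:
        # A call with an existing name replaces the old action and timeout.
        self._leases[name] = Lease(action, self._clock() + ttl)

    def expire(self) -> List[str]:
        """Run and drop every lease whose deadline has passed."""
        now = self._clock()
        due = [n for n, lease in self._leases.items() if lease.deadline <= now]
        for name in due:
            self._leases.pop(name).action()
        return due


class FakeClock:
    """Manually advanced clock so the simulation does not sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def simulate(heartbeats: int) -> List[str]:
    """Client renews a 30 s lease every 15 s, then is killed without cleanup."""
    clock = FakeClock()
    registry = LeaseRegistry(clock)
    events: List[str] = []

    def cleanup() -> None:
        events.append("stop+rm builder")

    registry.put("builder", cleanup, ttl=30)      # right after container start
    for _ in range(heartbeats):
        clock.advance(15)
        registry.expire()                         # daemon-side periodic check
        registry.put("builder", cleanup, ttl=30)  # client heartbeat
    clock.advance(29)                             # client is gone, lease still valid
    registry.expire()
    events.append("still leased at +29s")
    clock.advance(1)                              # 30 s since the last renewal
    registry.expire()
    return events
```

In this model the cleanup action never fires while the client keeps renewing, and it fires 30 seconds after the last renewal regardless of how the client went away (crash, kill, or lost connection).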

"docker rm" supports the -v flag to also delete the container volumes. I would expect that the proposed feature would also support something like "rm -v".

Labels: kind/enhancement
