Refactor libcontainerd to minimize containerd RPCs (rebase) by thaJeztah · Pull Request #43967 · moby/moby

thaJeztah · 2022-08-16T13:40:07Z

just a rebase of Refactor libcontainerd to minimize containerd RPCs #43564

pushing to have a run of CI to verify I didn't screw up anything; if it's okay we can reset #43564 to this branch 👍

The daemon.containerd.Exec call does not access or mutate the container's ExecCommands store in any way, and locking the exec config is sufficient to synchronize with the event-processing loop. Locking the ExecCommands store while starting the exec process only serves to block unrelated operations on the container for an extended period of time.

Convert the Store struct's mutex to an unexported field to prevent this from regressing in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
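A minimal sketch of the locking pattern described in this commit message, not the actual moby code. `ExecConfig`, `Store`, and `StartExec` here are simplified, hypothetical stand-ins: the store's mutex is unexported and held only for map access, while the slow start step runs under the exec config's own lock.

```go
package main

import (
	"fmt"
	"sync"
)

// ExecConfig is a hypothetical stand-in for one exec's configuration.
// It carries its own lock, which is what an event loop would synchronize on.
type ExecConfig struct {
	sync.Mutex
	ID      string
	Running bool
}

// Store maps exec IDs to configs. The mutex is unexported, so code outside
// the type cannot hold it across a slow operation.
type Store struct {
	mu   sync.RWMutex
	byID map[string]*ExecConfig
}

func NewStore() *Store {
	return &Store{byID: make(map[string]*ExecConfig)}
}

// Add registers an exec config; the store lock is held only for the map write.
func (s *Store) Add(ec *ExecConfig) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.byID[ec.ID] = ec
}

// Get returns the config for id, or nil if there is none.
func (s *Store) Get(id string) *ExecConfig {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.byID[id]
}

// StartExec locks only the exec config while the (possibly slow) start
// callback runs, so other store operations are not blocked in the meantime.
func StartExec(s *Store, id string, start func() error) error {
	ec := s.Get(id)
	if ec == nil {
		return fmt.Errorf("no such exec: %s", id)
	}
	ec.Lock()
	defer ec.Unlock()
	if err := start(); err != nil {
		return err
	}
	ec.Running = true
	return nil
}

func main() {
	s := NewStore()
	s.Add(&ExecConfig{ID: "exec-1"})
	// The callback touches the store while exec-1 is starting. This would
	// deadlock if StartExec held the store lock for the whole start.
	err := StartExec(s, "exec-1", func() error {
		s.Add(&ExecConfig{ID: "exec-2"})
		return nil
	})
	fmt.Println(err, s.Get("exec-1").Running, s.Get("exec-2") != nil)
}
```

Because `mu` is unexported, a caller in another package has no way to take the store lock around a long-running call, which is the regression the commit aims to rule out.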

The OOMKilled flag on a container's state has historically behaved rather unintuitively: it is updated on container exit to reflect whether or not any process within the container has been OOM-killed during the preceding run of the container. The OOMKilled flag would be set to true when the container exits if any process within the container---including execs---was OOM-killed at any time while the container was running, whether or not the OOM-kill was the cause of the container exiting. The flag is "sticky," persisting through the next start of the container; it is only cleared once the container exits without any processes having been OOM-killed during that run.

Alter the behavior of the OOMKilled flag such that it signals whether any process in the container has been OOM-killed since the most recent start of the container. Set the flag immediately upon any process being OOM-killed, and clear it when the container transitions to the "running" state.

There is an ulterior motive for this change. It reduces the amount of state the libcontainerd client needs to keep track of and clean up on container exit. It's one less place the client could leak memory if a container were to be deleted without going through libcontainerd.

Signed-off-by: Cory Snider <csnider@mirantis.com>
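The new flag semantics can be sketched as a small state machine. This is an illustration only: the `State` type and its method names are hypothetical, not the daemon's actual container state API.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a hypothetical, simplified container state.
type State struct {
	mu        sync.Mutex
	running   bool
	oomKilled bool
}

// SetRunning marks the container as running and clears the OOM flag,
// so the flag only ever describes the most recent start.
func (s *State) SetRunning() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running = true
	s.oomKilled = false
}

// SetOOMKilled records an OOM-kill immediately, for any process in the
// container (init or exec), without waiting for the container to exit.
func (s *State) SetOOMKilled() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.oomKilled = true
}

// SetStopped marks the container as exited. It leaves the OOM flag alone,
// so the flag stays readable after exit until the next start.
func (s *State) SetStopped() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running = false
}

// OOMKilled reports whether any process was OOM-killed since the last start.
func (s *State) OOMKilled() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.oomKilled
}

func main() {
	var s State
	s.SetRunning()
	s.SetOOMKilled()
	fmt.Println("while running:", s.OOMKilled())
	s.SetStopped()
	fmt.Println("after exit:", s.OOMKilled())
	s.SetRunning()
	fmt.Println("after restart:", s.OOMKilled())
}
```

With this shape, nothing has to remember OOM events on the side and reconcile them at exit, which is the state-tracking reduction the commit message mentions.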

The containerd client is very chatty at the best of times. Because the libcontainerd API is stateless and references containers and processes by string ID for every method call, the implementation is essentially forced to use the containerd client in a way which amplifies the number of redundant RPCs invoked to perform any operation. The libcontainerd remote implementation has to reload the containerd container, task and/or process metadata for nearly every operation. This in turn amplifies the number of context switches between dockerd and containerd to perform any container operation or handle a containerd event, increasing the load on the system which could otherwise be allocated to workloads.

Overhaul the libcontainerd interface to reduce the impedance mismatch with the containerd client so that the containerd client can be used more efficiently. Split the API out into container, task and process interfaces which the consumer is expected to retain so that libcontainerd can retain state---especially the analogous containerd client objects---without having to manage any state-store inside the libcontainerd client.

Signed-off-by: Cory Snider <csnider@mirantis.com>

The existing logic to handle container ID conflicts when attempting to create a plugin container is not nearly as robust as the implementation in daemon for user containers. Extract and refine the logic from daemon and use it in the plugin executor.

Signed-off-by: Cory Snider <csnider@mirantis.com>

Attempting to delete the directory while another goroutine is concurrently executing a CheckpointTo() can fail on Windows due to file locking. As all callers of CheckpointTo() are required to hold the container lock, holding the lock while deleting the directory ensures that there will be no interference.

Signed-off-by: Cory Snider <csnider@mirantis.com>
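A small sketch of the locking rule this commit relies on, assuming a simplified `Container` type. `removeContainerDir`, `demo`, and the `config.json` file name are hypothetical and chosen for illustration; only the rule that `CheckpointTo()` callers hold the container lock comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// Container is a simplified stand-in with an embedded lock and an
// on-disk directory.
type Container struct {
	sync.Mutex
	Root string
}

// CheckpointTo writes state to disk. Callers must hold the container lock.
func (c *Container) CheckpointTo() error {
	return os.WriteFile(filepath.Join(c.Root, "config.json"), []byte("{}"), 0o600)
}

// removeContainerDir takes the same lock before deleting the directory,
// so a checkpoint can never be mid-write while the delete runs.
func removeContainerDir(c *Container) error {
	c.Lock()
	defer c.Unlock()
	return os.RemoveAll(c.Root)
}

// demo races one checkpoint against one delete and reports whether the
// directory is gone afterwards.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "ctr-")
	if err != nil {
		return false, err
	}
	c := &Container{Root: root}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		c.Lock()
		defer c.Unlock()
		// If the delete won the race, the directory is already gone and
		// this write fails; the demo ignores that error on purpose.
		_ = c.CheckpointTo()
	}()

	if err := removeContainerDir(c); err != nil {
		return false, err
	}
	wg.Wait()

	_, err = os.Stat(root)
	return os.IsNotExist(err), nil
}

func main() {
	gone, err := demo()
	fmt.Println(gone, err)
}
```

Whichever goroutine takes the lock first, the write and the delete are serialized, so the delete never encounters a file that is open for writing.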

Modifying the builtin Windows runtime to send the exited event immediately upon the container's init process exiting, without first waiting for the Compute System to shut down, perturbed the timings enough to make TestWaitConditions flaky on that platform. Make TestWaitConditions timing-independent by having the container wait for input on STDIN before exiting.

Signed-off-by: Cory Snider <csnider@mirantis.com>
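The idea of gating exit on STDIN can be sketched without a real container. This is not the actual TestWaitConditions code; `runUntilInput`, `isExited`, and `demo` are hypothetical helpers, with a goroutine standing in for the container process and an `io.Pipe` standing in for its STDIN.

```go
package main

import (
	"fmt"
	"io"
)

// runUntilInput simulates a process that blocks until it can read from
// stdin (or stdin is closed) and then exits. The returned channel is
// closed on exit.
func runUntilInput(stdin io.Reader) <-chan struct{} {
	exited := make(chan struct{})
	go func() {
		defer close(exited)
		buf := make([]byte, 1)
		_, _ = stdin.Read(buf)
	}()
	return exited
}

// isExited reports, without blocking, whether the process has exited.
func isExited(exited <-chan struct{}) bool {
	select {
	case <-exited:
		return true
	default:
		return false
	}
}

// demo returns the exit status observed before and after sending input.
func demo() (before, after bool) {
	r, w := io.Pipe()
	exited := runUntilInput(r)

	// The process cannot have exited yet: nothing has been written.
	before = isExited(exited)

	// The test decides when the process exits by sending input.
	_, _ = w.Write([]byte("\n"))
	<-exited
	after = isExited(exited)
	_ = w.Close()
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println("exited before input:", before)
	fmt.Println("exited after input:", after)
}
```

Because the exit is triggered by the test rather than by a sleep or a timeout inside the process, the ordering between setting up the wait and the process exiting no longer depends on how fast the platform reports events.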

thaJeztah · 2022-08-16T22:24:17Z

closing this one, as this was pushed to #43967

corhere added 2 commits August 16, 2022 15:27

thaJeztah mentioned this pull request Aug 16, 2022

Refactor libcontainerd to minimize containerd RPCs #43564

Merged

corhere added 4 commits August 16, 2022 16:05

thaJeztah force-pushed the containerd_overhaul_rebase branch from 2bb7483 to 16a9ec7 August 16, 2022 14:05

thaJeztah closed this Aug 16, 2022

thaJeztah deleted the containerd_overhaul_rebase branch August 16, 2022 22:24

thaJeztah wants to merge 6 commits into moby:master from thaJeztah:containerd_overhaul_rebase

2 participants
