Skip to content

Refactor libcontainerd to minimize containerd RPCs (rebase)#43967

Closed
thaJeztah wants to merge 6 commits intomoby:masterfrom
thaJeztah:containerd_overhaul_rebase
Closed

Refactor libcontainerd to minimize containerd RPCs (rebase)#43967
thaJeztah wants to merge 6 commits intomoby:masterfrom
thaJeztah:containerd_overhaul_rebase

Conversation

@thaJeztah
Copy link
Member

pushing to have a run of CI to verify I didn't screw up anything; if it's okay we can reset #43564 to this branch 👍

The daemon.containerd.Exec call does not access or mutate the
container's ExecCommands store in any way, and locking the exec config
is sufficient to synchronize with the event-processing loop. Locking
the ExecCommands store while starting the exec process only serves to
block unrelated operations on the container for an extended period of
time.

Convert the Store struct's mutex to an unexported field to prevent this
from regressing in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The OOMKilled flag on a container's state has historically behaved
rather unintuitively: it is updated on container exit to reflect whether
or not any process within the container has been OOM-killed during the
preceding run of the container. The OOMKilled flag would be set to true
when the container exits if any process within the container---including
execs---was OOM-killed at any time while the container was running,
whether or not the OOM-kill was the cause of the container exiting. The
flag is "sticky," persisting through the next start of the container;
only being cleared once the container exits without any processes having
been OOM-killed that run.

Alter the behavior of the OOMKilled flag such that it signals whether
any process in the container had been OOM-killed since the most recent
start of the container. Set the flag immediately upon any process being
OOM-killed, and clear it when the container transitions to the "running"
state.

There is an ulterior motive for this change. It reduces the amount of
state the libcontainerd client needs to keep track of and clean up on
container exit. It's one less place the client could leak memory if a
container was to be deleted without going through libcontainerd.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The containerd client is very chatty at the best of times. Because the
libcontained API is stateless and references containers and processes by
string ID for every method call, the implementation is essentially
forced to use the containerd client in a way which amplifies the number
of redundant RPCs invoked to perform any operation. The libcontainerd
remote implementation has to reload the containerd container, task
and/or process metadata for nearly every operation. This in turn
amplifies the number of context switches between dockerd and containerd
to perform any container operation or handle a containerd event,
increasing the load on the system which could otherwise be allocated to
workloads.

Overhaul the libcontainerd interface to reduce the impedance mismatch
with the containerd client so that the containerd client can be used
more efficiently. Split the API out into container, task and process
interfaces which the consumer is expected to retain so that
libcontainerd can retain state---especially the analogous containerd
client objects---without having to manage any state-store inside the
libcontainerd client.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The existing logic to handle container ID conflicts when attempting to
create a plugin container is not nearly as robust as the implementation
in daemon for user containers. Extract and refine the logic from daemon
and use it in the plugin executor.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Attempting to delete the directory while another goroutine is
concurrently executing a CheckpointTo() can fail on Windows due to file
locking. As all callers of CheckpointTo() are required to hold the
container lock, holding the lock while deleting the directory ensures
that there will be no interference.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Modifying the builtin Windows runtime to send the exited event
immediately upon the container's init process exiting, without first
waiting for the Compute System to shut down, perturbed the timings
enough to make TestWaitConditions flaky on that platform. Make
TestWaitConditions timing-independent by having the container wait
for input on STDIN before exiting.

Signed-off-by: Cory Snider <csnider@mirantis.com>
@thaJeztah thaJeztah force-pushed the containerd_overhaul_rebase branch from 2bb7483 to 16a9ec7 Compare August 16, 2022 14:05
@thaJeztah
Copy link
Member Author

closing this one, as this was pushed to #43967

@thaJeztah thaJeztah closed this Aug 16, 2022
@thaJeztah thaJeztah deleted the containerd_overhaul_rebase branch August 16, 2022 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants