Conversation
client.go (outdated)
mlaventure left a comment:
Looks good so far. I don't have the time to finish going through it all at the moment, but I'll try to get another pass at it.
client.go (outdated)
cmd/containerd-shim-oci-v1/main.go (outdated)
container_linux_test.go (outdated)
I looked at the test and it really wasn't testing much: killing the shim, then killing the process. I don't think it was a valid test, or it was once and changed sometime in the past. It felt safe to remove.
It checks that the container process (redis) is correctly cleaned up if the shim crashes or is SIGKILL'd (e.g. by the OOM killer).
It doesn't actually kill the process, it checks that it died.
ok, my bad, i'll fix it. i missed the kill 0
Seems this test is still removed in the latest commit? or did I miss a move somewhere else?
ok, so I researched this more and the test is only valid for v1 when the container was still running and containerd didn't have to reconnect to a shim again. The reason is that it sets an exit handler that runs the "cleanup on dead shim" code. However, this only runs if the shim process that containerd is doing a waitpid on returns and matches the shim pid.
I think it's safe to remove this as it doesn't really work; it only works in the tests because the daemon isn't restarted. If we want to keep it, I can do some refactoring, but it will take a little bit of work to make this work across daemon reboots with running shims. I can do that in a followup PR.
Makes sense.
I haven't yet had the time to compile and play around with the new API to see how it actually behaves.
Having a way to start a "cleanup shim" in such a scenario may be a cleaner solution for v2 though (and if that fails, things are left to the user to clean).
metrics/cgroups/metrics.go (outdated)
No timeout / cancel / Sync? We could end up with several goroutines getting the stats on the same task, no?
they are sync'd in the caller; it's hard to have a timeout for this because, under load, collection could be slower
oh, right. I missed the waitgroup, my bad.
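A minimal sketch of the waitgroup synchronization referred to above, assuming hypothetical collect and Stat names rather than the real metrics/cgroups collector: stats for each task are gathered in parallel goroutines, and the caller blocks on the WaitGroup, so collection rounds never overlap.

```go
package main

import (
	"fmt"
	"sync"
)

// Stat is a hypothetical per-task sample; the real collector reads cgroup
// stats for each containerd task.
type Stat struct {
	ID    string
	Usage uint64
}

// collect gathers one Stat per task ID concurrently. The WaitGroup is the
// synchronization point discussed in the review: the caller does not return
// until every goroutine has finished, so rounds cannot pile up on a task.
func collect(ids []string, read func(string) uint64) []Stat {
	stats := make([]Stat, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			stats[i] = Stat{ID: id, Usage: read(id)}
		}(i, id)
	}
	wg.Wait() // caller blocks here until all tasks are sampled
	return stats
}

func main() {
	stats := collect([]string{"redis", "nginx"}, func(id string) uint64 {
		return uint64(len(id)) // stand-in for a cgroup read
	})
	fmt.Println(len(stats))
}
```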
runtime/v1/linux/proc/init.go (outdated)
nit, but could we bundle exported variables together?
runtime/v2/bundle.go (outdated)
why the separation between the base dir and the last dir in the path?
base should always be created while mkdir should return an exists error if something is already there
What is the reason for doing them in two loops if the remove all happens on error anyway? Also, what does it mean if Mkdir returns an exists error on the second part? It seems odd that it would both return an exists error and delete the directory.
runtime/v2/oci/service_linux.go (outdated)
cleanup stdin if created?
runtime/v2/oci/service_linux.go (outdated)
missing cleanup of in/outw
@crosbymichael - This is a really good start. I can rebase my Windows-side changes on this and resubmit.
Codecov Report
@@            Coverage Diff             @@
##           master    #2434      +/-   ##
==========================================
- Coverage   44.76%   44.73%   -0.04%
==========================================
  Files          92       93       +1
  Lines        9483     9490       +7
==========================================
  Hits         4245     4245
- Misses       4555     4562       +7
  Partials      683      683
==========================================
runtime/v2/bundle.go (outdated)
wrapping this could be useful, I don't think the message includes the path
runtime/v2/shim.go (outdated)
Why is this command not given the bundle id anywhere?
it's not needed and comes in the Create request
What does this exec with the delete arg intended to do?
runtime/v2/util.go (outdated)
Currently name will never be absolute, but is it worth considering a case where name is in the same directory as self but not in the lookup path? In that case the name might be better found at filepath.Join(filepath.Dir(self), name). Unless there is a plan here to document setting PATH on containerd for the runtime binary installation location.
i think for now, we say that it must be in PATH. i'll remove this abs code
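Since the decision is that the shim binary must live in PATH, the lookup reduces to exec.LookPath with no directory-of-self fallback. A small sketch (resolveShim is an illustrative name, not the actual util.go code):

```go
package main

import (
	"fmt"
	"os/exec"
)

// resolveShim finds a shim binary strictly via PATH. If the binary is not
// installed in a PATH directory, the lookup fails rather than falling back
// to the directory containing the containerd binary itself.
func resolveShim(name string) (string, error) {
	return exec.LookPath(name)
}

func main() {
	_, err := resolveShim("containerd-shim-missing-v1")
	fmt.Println(err != nil)
}
```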
runtime/v2/shim.go (outdated)
nit: either Wrap or add the ID?
runtime/v2/shim.go (outdated)
Should we return or log this error?
runtime/v2/shim.go (outdated)
Do we need to check the StatusUnknown task type or have a runtime.UnknownStatus?
services/tasks/local.go (outdated)
Should we add a default case and explanation for not using now?
runtime/v2/binary.go (outdated)
Should this be trimmed? Also wondering if we should just take the stdout rather than combined output.
You previously expressed that this extra connect could cause some additional time, although mostly negligible. Could we still pass in the socket and if the returned address doesn't match just remove it and create a new connection, or would that end up being even longer?
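One way to address the trimming concern is to capture only stdout for the address and trim whitespace, keeping stderr separate for diagnostics. This is a hypothetical sketch (startShim is not containerd's actual binary.go code), with /bin/echo standing in for a real shim's start subcommand:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// startShim runs a shim-like binary and returns the address it prints.
// Only stdout is read for the address, so stray stderr output cannot
// corrupt it, and the result is trimmed because the binary will normally
// print a trailing newline.
func startShim(binary string, args ...string) (string, error) {
	cmd := exec.Command(binary, args...)
	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("%s: %w", strings.TrimSpace(stderr.String()), err)
	}
	return strings.TrimSpace(stdout.String()), nil
}

func main() {
	// /bin/echo prints its argument plus a newline, like a shim printing
	// its listening address.
	addr, err := startShim("/bin/echo", "unix:///run/shim.sock")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", addr)
}
```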
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
option go_package = "github.com/containerd/containerd/runtime/v2/runc/options;options";

message Options {
    bool no_pivot_root = 1;
I know this isn't popular, but let's document the intent of these fields.
// each container and allows reattaching to the IO and receiving the exit status
// for the container processes.
service Task {
    rpc State(StateRequest) returns (StateResponse);
It would be good to document what each method does.
I don't see anything wildly controversial here. It would be good to document the methods and fields used in GRPC.
@stevvooe I was going to do documentation in a separate PR. I'll take care of proto comments there as well. I wanted to make a document for shim authors, "Building a shim 101".
@crosbymichael Will take a look at this PR tomorrow. Sorry for the delay.
This readme specifies shim api and authoring docs. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Updated the description and added a readme for shim authors.
The API LGTM overall. One thing: can we include required events? e.g. TaskExit (required), OOMEvents (optional/required?), etc.
@Random-Liu ok, updated the README with events.
it's merged! 🎉 🙌👏👏
This implements the shimv2 API proposed in #2426. The tests are using the new io.containerd.oci.v1 runtime under the new TaskManager implementing the v2 APIs.

Runtime v2
Runtime v2 introduces a first class shim API for runtime authors to integrate with containerd.
The shim API is minimal and scoped to the execution lifecycle of a container.
Binary Naming
Users specify the runtime they wish to use when creating a container.
The runtime can also be changed via a container update.
> ctr run --runtime io.containerd.runc.v1

When a user specifies a runtime name, io.containerd.runc.v1, they specify the name and version of the runtime. This will be translated by containerd into a binary name for the shim.
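The translation can be sketched as follows; binaryName is an illustrative helper, not containerd's actual implementation, assuming the convention of joining the last two dotted components under the containerd-shim- prefix:

```go
package main

import (
	"fmt"
	"strings"
)

// binaryName turns a runtime name like "io.containerd.runc.v1" into the
// shim binary name "containerd-shim-runc-v1" by joining the last two
// dotted components under the containerd-shim- prefix.
func binaryName(runtime string) string {
	parts := strings.Split(runtime, ".")
	if len(parts) < 2 {
		return ""
	}
	name, version := parts[len(parts)-2], parts[len(parts)-1]
	return fmt.Sprintf("containerd-shim-%s-%s", name, version)
}

func main() {
	fmt.Println(binaryName("io.containerd.runc.v1")) // containerd-shim-runc-v1
}
```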
io.containerd.runc.v1 -> containerd-shim-runc-v1

containerd keeps the containerd-shim-* prefix so that users can ps aux | grep containerd-shim to see running shims on their system.

Shim Authoring
This section is dedicated to runtime authors wishing to build a shim.
It will detail how the API works and different considerations when building a shim.
Commands
Container information is provided to a shim in two ways.
The OCI Runtime Bundle and on the Create rpc request.

start

Each shim MUST implement a start subcommand. This command will launch new shims.
The start command MUST accept the following flags:
-namespace the namespace for the container
-id the id of the container
-address the address of containerd's main socket
-publish-binary the binary path to publish events back to containerd

The start command, as well as all binary calls to the shim, has the bundle for the container set as the cwd.

The start command MUST return an address to a shim for containerd to issue API requests for container operations.
The start command can either start a new shim or return an address to an existing shim based on the shim's logic.
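A minimal sketch of parsing these flags with Go's standard flag package; startConfig and parseStartFlags are illustrative names, not the shim package's real API:

```go
package main

import (
	"flag"
	"fmt"
)

// startConfig holds the flags the start (and delete) subcommands receive.
type startConfig struct {
	Namespace     string
	ID            string
	Address       string
	PublishBinary string
}

// parseStartFlags parses the flag set required of every shim's start
// subcommand, as listed above.
func parseStartFlags(args []string) (*startConfig, error) {
	var c startConfig
	fs := flag.NewFlagSet("start", flag.ContinueOnError)
	fs.StringVar(&c.Namespace, "namespace", "", "namespace for the container")
	fs.StringVar(&c.ID, "id", "", "id of the container")
	fs.StringVar(&c.Address, "address", "", "address of containerd's main socket")
	fs.StringVar(&c.PublishBinary, "publish-binary", "", "binary path to publish events back to containerd")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return &c, nil
}

func main() {
	c, err := parseStartFlags([]string{
		"-namespace", "default",
		"-id", "redis",
		"-address", "/run/containerd/containerd.sock",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Namespace, c.ID)
}
```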
delete

Each shim MUST implement a delete subcommand. This command allows containerd to delete any container resources created, mounted, and/or run by a shim when containerd can no longer communicate over rpc.
This happens if a shim is SIGKILL'd with a running container.
These resources will need to be cleaned up when containerd loses the connection to a shim.
This is also used when containerd boots and reconnects to shims.
If a bundle is still on disk but containerd cannot connect to a shim, the delete command is invoked.
The delete command MUST accept the following flags:
-namespace the namespace for the container
-id the id of the container
-address the address of containerd's main socket
-publish-binary the binary path to publish events back to containerd

The delete command will be executed in the container's bundle as its cwd.

Host Level Shim Configuration
containerd does not provide any host level configuration for shims via the API.
If a shim needs configuration from the user with host level information across all instances, a shim specific configuration file can be set up.
Container Level Shim Configuration
On the create request, there is a generic *protobuf.Any that allows a user to specify container level configuration for the shim. A shim author can create their own protobuf message for configuration, and clients can import and provide this information as needed.
I/O
I/O for a container is provided by the client to the shim via fifo on Linux, named pipes on Windows, or log files on disk.
The paths to these files are provided on the Create rpc for the initial creation and on the Exec rpc for additional processes.

Containers that are to be launched with an interactive terminal will have the terminal field set to true. Data is still copied over the files (fifos, pipes) in the same way as non-interactive containers.

Root Filesystems
The root filesystems for the containers are provided on the Create rpc.
Createrpc.Shims are responsible for managing the lifecycle of the filesystem mount during the lifecycle of a container.
Each mount is described by the mount protobuf message, which carries the filesystem type, source, and mount options.

Shims are responsible for mounting the filesystem into the rootfs/ directory of the bundle. Shims are also responsible for unmounting the filesystem.
During a delete binary call, the shim MUST ensure that the filesystem is also unmounted. Filesystems are provided by the containerd snapshotters.
Other
Unsupported rpcs
If a shim does not or cannot implement an rpc call, it MUST return a github.com/containerd/containerd/errdefs.ErrNotImplemented error.

ttrpc
ttrpc is the only currently supported protocol for shims.
It works with standard protobufs and GRPC services as well as generating clients.
The only difference between grpc and ttrpc is the wire protocol.
ttrpc removes the http stack in order to save memory and binary size to keep shims small.
It is recommended to use ttrpc in your shim but grpc support is also in development.
Closes #2426