
[Proposal] Shim service #5742

@mxpv


There are a couple of existing proposals that require changes to the shim interface.

Changes to the shim's task interface are a sensitive topic: external vendors depend on it, so we want to keep backward compatibility. However, we still want to be able to move forward and offer new features.
From past experience (Snapshotters), optional interfaces are a good way to go when we want to add new functionality.
This way a client can check whether a particular interface is implemented and, if so, use the new functionality.

Problem

In the current implementation, any change on the shim side must also be reflected on the containerd side.
Currently the task service is how clients interact with a shim.

So if we add sandboxes, we add or extend the task service to support that.
If we add port forwarding, this also needs to be reflected on the containerd side.
In the end we must still end up with a clean interface for clients to work with.

Every extension makes the daemon side more sophisticated and harder to maintain.
In my case, for the sandbox API, I needed to:

  • Generate both TTRPC and GRPC code.
  • Provide a proxy layer for the client (client -> containerd -> shim).
  • Refactor existing runtime code to support both call flows (regular tasks and sandboxed ones).
  • Take care of different shim lifetimes depending on whether it is a sandbox or a task shim (today a shim's lifetime = task lifetime; that's no longer the case for sandboxes, where one shim can run multiple tasks).

So the containerd API must:

  • Support all potential shim extensions and offer a client API to use them.
  • Be flexible enough to support permutations of extensions (sandbox or task, w/ or w/o port forwarding, depending on the VMM behind it?).

So the goal of this proposal is to keep things simple and easy to extend.

Proposal

We push the shim logic down and let clients manage the flow of calls to the shim server.
Instead, containerd offers a new shim service with the following endpoints:

type shimService interface {
    StartShim() // call shim --start
    DeleteShim() // call shim --delete
    GetConn(shim_id) // Get connection from lookup table
}

So containerd will take care of creating and removing shim instances.
Once a shim is started, containerd will track its connection object.
In addition, containerd will take care of proper cleanup: if a shim dies, containerd will remove it from the lookup table and delete any associated data (basically preserving the logic we already have today).
A client can request an existing connection via GetConn.

Everything else happens on the client side.

The client asks containerd to start a new shim server, and containerd returns a connection handle.
From there, the client can check which interfaces are implemented (or required to be implemented) and define the call flow.

Examples

A few examples of how this can be used (in pseudocode):

1. Regular containers

conn := client.StartShim(runtimeName);

taskService := NewTaskClient(conn);
task = taskService.CreateTask();

…

client.DeleteShim(task_id)

2. Sandboxed containers

conn := client.StartShim(runtimeName);

taskService := NewTaskClient(conn);
sandboxService, err := NewSandboxClient(conn);
if err == ErrNotSupported {
    panic("Requested shim doesn't support sandbox extension");
}

sandboxService.CreateVM();
sandboxService.StartVM();

task = taskService.CreateTask(req);

sandboxService.ResizeVM();

…

task = taskService.StopTask(req);

sandboxService.StopVM();

3. Port forwarding

conn := client.StartShim(runtimeName);

taskService := NewTaskClient(conn);
portForward = NewPortForwardClient(conn)

task = taskService.CreateTask(req);

4. Sandbox + port forwarding

You get the idea.

Pros & Cons

+ Nice and clean containerd-side implementation
+ containerd is not required to know all implementation details (or they can be introduced as independent services)
+ Easy to add new shim extensions (even 3rd party)
+ Much simpler sandbox implementation (in the current one, runtime integration was the most painful part).
+ We keep backward compatibility with current v2 shims

- Migration from the current mature code base to the new one can be painful.
- Lots of API changes.

Path forward

Both the task and shim services can co-exist side by side until we are ready to migrate to the new implementation and deprecate the old one in 2.0.
