Skip to content

update containerd and port executor over to the container client library#2275

Merged
aaronlehmann merged 35 commits intomoby:masterfrom
ijc:containerd
Jun 30, 2017
Merged

update containerd and port executor over to the container client library#2275
aaronlehmann merged 35 commits intomoby:masterfrom
ijc:containerd

Conversation

@ijc
Copy link
Copy Markdown
Contributor

@ijc ijc commented Jun 21, 2017

This switches to the client library interfaces for pulling images, producing the spec and stdio as well as some of the lifecycle ops.

Unported are events/wait and tear down, I'm seeing some strangeness with those which I am still trying to diagnose, so for now these are using Client.TaskService (which is still an improvement) but still making direct gRPC calls. Port is now complete.

This is a pretty big change, I expect the overall diff will be pretty unreviewable, I've tried to break the commits down into individual logical changes with decent commit messages which ought to be a bit more tractable (although not always by much).

@codecov
Copy link
Copy Markdown

codecov bot commented Jun 21, 2017

Codecov Report

Merging #2275 into master will decrease coverage by 0.53%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2275      +/-   ##
==========================================
- Coverage   60.96%   60.42%   -0.54%     
==========================================
  Files         126      125       -1     
  Lines       20391    20394       +3     
==========================================
- Hits        12431    12323     -108     
- Misses       6589     6688      +99     
- Partials     1371     1383      +12

}

func withNamespace(ctx context.Context) context.Context {
return namespaces.WithNamespace(ctx, "default")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there value to using a swarm-specific namespace instead of the default namespace?

cc @stevvooe

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan was to make it a swarmd command line option (sooner rather than later, I just didn't get round to it). I'm not sure what the best default is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the commit I just pushed.

// directly attaches stdin to /dev/null.
if devNull == nil {
if devNull, err = os.Open(os.DevNull); err != nil {
return errors.Wrap(err, "opening dev null")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"opening null device"?

if p, ok := mountPropagationReverseMap[m.BindOptions.Propagation]; ok {
propagation = p
} else {
log.G(ctx).Warningf("unknown bind mount propagation, using %q", propagation)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space after the ,

return err
}
log.G(ctx).Errorf("EXIT STATUS %v", exitStatus)
if err := r.adapter.remove(ctx); err != nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not correct to remove the container automatically on shutdown. We only remove it when the task is removed from the assignment set. This way old containers are kept around as long as the corresponding historic task is retained.

Remove should be called by reconcileTaskState once the task is no longer in the assignment set. With default settings, you may have to cycle through 5 tasks for a particular slot (or scale down) to see a removal happen.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was still here (nb: it's the same behaviour as current master, since this series also moves the remove from r.adapter.shutdown into r.adapter.remove) while I figured out the strangeness I mentioned in the PR comment, I worked those out yesterday so I'll clean up those commits and append them here today.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gone with the batch of changes which reworks Wait() and Shutdow().

@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jun 22, 2017

Force push to fix Aaron's two nits, added "containerd: log tag which fits in with the others" (was using "ID", which stood out like a sore thumb, now uses "container.id" which is a better fit)

@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jun 22, 2017

Added a pile more patches which rework Wait(), Shutdown() and ContainerStatus(), that was all the remaining open coded gRPC calls.

@ijc ijc changed the title update containerd and (partially) port executor over to the container client library update containerd and port executor over to the container client library Jun 22, 2017
@aaronlehmann
Copy link
Copy Markdown
Collaborator

lint error: agent/exec/containerd/adapter.go:282:6: exported type Status should have comment or be unexported

Ian Campbell added 19 commits June 22, 2017 18:50
Needed to add namespacing and to use the new runtime name.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This is v1.0.0-rc6-12-g372ad78 and corresponds to what the vendored containerd
is using.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
The containerd client library contains a containerd.Container which will be a
more appropriate use of the name.

The existing spec function is renamed makeSpec() but after the switch to the
client library the function becomes so simple it can be inlined into create.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
…e instead

The containerd client library contains a containerd.Task which will be a more
appropriate use of the name.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This simplifies a bunch of stuff, with more potential improvements yet to come.

Hardcodes the default namespace for now.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Rootfs and Mounts handling is not converted yet and will follow.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This means we no longer need to unpack the rootfs manually.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Changes/simplifications:
 - no anon volume support, they don't really make sense in a cluster
   environment;
 - mount over the image provided mounts instead of trying to dedup;
 - don't worry about dedup of the requested mounts either;

With this there is no longer any need to obtain the image config on the client
side.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
It is so simple now a separate function is not that useful.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Now that rootfs and stdio are handled by containerd there is no need for this.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
It is called from controller.Prepare and this seems to better reflect its
purpose.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Here "prepared" means the prepared() method has been called and therefore both
c.container and c.task can be assumed to be non-nil.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This potentially allows callers to try and cleanup partial state.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Ian Campbell added 8 commits June 22, 2017 18:50
Existing tags as "node.id" and "task.id" etc.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
It's just an event, not a fatal one (at least not at this level)

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Dropping the exit code. The exitError type implements the ExitCoder interface
so callers can extract it if they want.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
The containerd Task struct contains a lot of info which callers are not
interested in and which is not provided by the Task.Status() method, which we
will shortly switch to in preference to an open coded InfoRequest gRPC.

Belt-and-braces checks that ID and Pid match will go away as part of
Task.Status switch.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
It's unexpected and unnecessary, the orchestrator will call the Shutdown() or
Remove() methods as needed.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Idempotency is not a requirement of the interface, just let it return failed
(presumably some sort of "already dead" or "not found" type error).

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
func NewExecutor(sock string, genericResources []*api.GenericResource) (exec.Executor, error) {
// TODO(ijc), configurable namespace.
// TODO(ijc), default to swarmd?
client, err := containerd.New(sock, containerd.WithDefaultNamespace("default"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have something nice here for the namespace. k8s is using k8s.io. Perhaps, swarmkit.docker.com, but I am not sure if we are spinning off this project. swarmkit is probably sufficient.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to think of swarmd the standalone daemon process as somewhat separate to the swarmkit toolkit. So I think I might take a two layered approach here and have cmd/swarmd/main.go define a --containerd-namespace option, defaulting to "swarmd" which is passed as an argument to NewExecutor here.

In NewExecutor if the requested namespace is "" then it'll default to swarmkit. Or maybe I'll make it an error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the commit I just added, I decided to make NewExecutor insist on an application provided namespace and had swarmd default to swarmd.

c.exitStatus = makeExitError(rsp.ExitStatus, "")
} else {
c.log(ctx).Debug("Task already deleted, error=%s", c.exitStatus)
var err error
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declaration seems unnecessary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, looks like it is there in master and this is just moving it around, it actually disappears entirely later in the series, in containerd: Use Task.Delete() in adapter.shutdown().

A downside which I hadn't anticipated of reviewing commit by commit is that GH marks comments such as this as "Outdated" right as they are made, so they are collapsed in the webui and I only notice them via the email notification.

}

if sig == syscall.SIGTERM {
// We've tried as hard as we can.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't retry with SIGKILL?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my intention but I mixed up my TERMs and my KILLs. Will fix.

Ian Campbell added 4 commits June 23, 2017 09:50
This means we need to send the signals ourselves, since Delete waits for the IO
to complete. Previously we were just lucky (and probably leaking some goroutines).

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This function checks for valid options and for duplicates, but we have just
constructed opts from a static set above and can trivially verify that
everything is ok.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Requires a newer github.com/pkg/errors due to:
    undefined: "github.com/docker/swarmkit/vendor/github.com/pkg/errors".WithStack
So update to 645ef00459ed84a119197bfb8d8205042c6df63d which is v0.8.0 as used
in current containerd.

Despite substantial changes to the containerd grpc interface the client library
isolated us from all that so no code changes required!

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jun 23, 2017

Force push is s/SIGTERM/SIGKILL/ in the "big guns" bit of adapter.shutdown().

When inspecting a system with `ctr` by default you will now need to export
`CONTAINERD_NAMESPACE=swarmd`.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jun 23, 2017

CI failure is:

cd "$WORKDIR" && TMPDIR=/tmpfs make ci

🐳 fmt
🐳 bin/swarmd

command cd "$WORKDIR" && TMPDIR=/tmpfs make ci took more than 10 minutes since last output

Which hopefully can't be my fault!

@aaronlehmann
Copy link
Copy Markdown
Collaborator

It passed on the second attempt.

return c.exitStatus
case <-ctx.Done():
return ctx.Err()
case <-time.After(timeout):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would bet set to StopGracePeriod, but I don't think that value should apply here.

        // StopGracePeriod the grace period for stopping the container before
        // forcefully killing the container.

If it was set to 0, or a very small value, this would return false "task is unkillable" errors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you are talking specifically about this timeout once we have brought out the big guns, for the initial attempt using StopGracePeriod is correct and proper I think (and I think you think so too, just checking).

Looking at the equivalent code on the docker side it blocks indefinitely once it has resorted to SIGKILL. In general that is probably fine, very little will survive a SIGKILL for very long but blocking in things like uninterruptible sleep on I/O can take a considerable length of time to die. I think this is something swarmkit can (and apparently does when running against docker) tolerate so the correct answer is to wait here with no timeout, other than the ctx.Done().

…own.

There is no good choice of timeout here and SIGKILL is always eventually fatal,
although the death can be deferred in some cases by long uninterruptible
sleeps.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@aaronlehmann
Copy link
Copy Markdown
Collaborator

LGTM

1 similar comment
@stevvooe
Copy link
Copy Markdown
Contributor

LGTM

@aaronlehmann aaronlehmann merged commit 08322ef into moby:master Jun 30, 2017
@ijc ijc deleted the containerd branch June 30, 2017 10:45
andrewhsu pushed a commit to docker-archive/docker-ce that referenced this pull request Jul 14, 2017
- moby/swarmkit#2266 (support for templating Node.Hostname in docker executor)
- moby/swarmkit#2281 (change restore action on objects to be update, not delete/create)
- moby/swarmkit#2285 (extend watch queue with timeout and size limit)
- moby/swarmkit#2253 (version-aware failure tracking in the scheduler)
- moby/swarmkit#2275 (update containerd and port executor to container client library)
- moby/swarmkit#2292 (rename some generic resources)
- moby/swarmkit#2300 (limit the size of the external CA response)
- moby/swarmkit#2301 (delete global tasks when the node running them is deleted)

Minor cleanups, dependency bumps, and vendoring:
- moby/swarmkit#2271
- moby/swarmkit#2279
- moby/swarmkit#2283
- moby/swarmkit#2282
- moby/swarmkit#2274
- moby/swarmkit#2296 (dependency bump of etcd, go-winio)

Signed-off-by: Ying Li <ying.li@docker.com>
Upstream-commit: 4509a00
Component: engine
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Mar 16, 2020
- moby/swarmkit#2266 (support for templating Node.Hostname in docker executor)
- moby/swarmkit#2281 (change restore action on objects to be update, not delete/create)
- moby/swarmkit#2285 (extend watch queue with timeout and size limit)
- moby/swarmkit#2253 (version-aware failure tracking in the scheduler)
- moby/swarmkit#2275 (update containerd and port executor to container client library)
- moby/swarmkit#2292 (rename some generic resources)
- moby/swarmkit#2300 (limit the size of the external CA response)
- moby/swarmkit#2301 (delete global tasks when the node running them is deleted)

Minor cleanups, dependency bumps, and vendoring:
- moby/swarmkit#2271
- moby/swarmkit#2279
- moby/swarmkit#2283
- moby/swarmkit#2282
- moby/swarmkit#2274
- moby/swarmkit#2296 (dependency bump of etcd, go-winio)

Signed-off-by: Ying Li <ying.li@docker.com>
Upstream-commit: 4509a00
Component: engine
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants