Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

agent: add support for guest-hooks#365

Closed
eguzman3 wants to merge 1 commit intokata-containers:masterfrom
eguzman3:guest-hooks
Closed

agent: add support for guest-hooks#365
eguzman3 wants to merge 1 commit intokata-containers:masterfrom
eguzman3:guest-hooks

Conversation

@eguzman3
Copy link
Copy Markdown
Contributor

@eguzman3 eguzman3 commented Sep 12, 2018

Adds support for running OCI hooks within the guest.
A 'drop-in' path (guest_hook_path) is specified in
the cli configuration file and if set, the agent will
look for OCI hooks in this directory and inject them
into the container life cycle.

Old Discussion: #346, #347
Runtime Change: kata-containers/runtime#720

Fixes #348

Signed-off-by: Edward Guzman eguzman@nvidia.com

eguzman3 added a commit to eguzman3/runtime that referenced this pull request Sep 12, 2018
Blocked on (kata-containers/agent#365), will need to revendor.
Adds support for running OCI hooks within the guest. A 'drop-in'
path (guest_hook_path) is specified in the cli configuration file
and if set, the agent will look for OCI hooks in this directory
and inject them into the container life cycle.

Fixes kata-containers#348

Signed-off-by: Edward Guzman <eguzman@nvidia.com>
eguzman3 added a commit to eguzman3/runtime that referenced this pull request Sep 13, 2018
Blocked on (kata-containers/agent#365), will need to revendor.
Copy link
Copy Markdown
Member

@bergwolf bergwolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eguzman3 Thanks for the updates. It looks mostly good. I only have one comment inline.

Another question, since you are relying on the rootfs layout, is there a matching osbuilder PR that allows you to build a rootfs with customized hooks? Or do you simply add the hooks by hand between the rootfs creation and image creation steps? I'm asking because we might want to test this infrastructure by adding a simple hook and making sure it executes in CI, so that we can make sure the special use case is not broken by some future change.

a.sandbox.guestHooksPresent = false

if req.GuestHookPath != "" {
a.sandbox.scanGuestHooks(req.GuestHookPath)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it is silently ignored if the path doesn't exist or if there are no hooks in the hook path, which is mostly because a user have installed a wrong rootfs image. I wonder if you want to return failure here since otherwise users have no idea if the hooks are executed or not unless they somehow check it in their container. OTOH, if you return failure here, such errors can be found early when creating the sandbox.

Copy link
Copy Markdown
Contributor Author

@eguzman3 eguzman3 Sep 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, since you are relying on the rootfs layout, is there a matching osbuilder PR that allows you to build a rootfs with customized hooks? Or do you simply add the hooks by hand between the rootfs creation and image creation steps?

Currently I am using scripts to add the hooks and other files needed for my use case between rootfs and image creation.

I wonder if you want to return failure here since otherwise users have no idea if the hooks are executed or not unless they somehow check it in their container.

What about the case where a user wants to swap between images (some with hooks, some without) without having to change the configuration option?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @bergwolf - if a user switches images, I'd say they should expect to have to update the config if one has hooks and the other doesn't.

As an absolute minimum, can you add a log call showing if the user requested hooks or not (if GuestHookPath is set or not we should log something).

I do wonder if we want to add either a second bool field in the proto file (bool require_hooks = n), or maybe even a kernel cmdline option for the agent to consume (the former is cleaner though):

That way, the agent can clearly establish whether the user wants hooks or not allowing the value (and validity) of GuestHookPath to be more clearly determined.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we log this call for now. I was thinking if we would want to support using a different rootfs for different pods in the future similar to how we support different kernels for k8s @bergwolf @jodh-intel

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amshinde In that case, I think the GuestHookPath config should also be passed by spec (e.g., through labels or annotations).

@eguzman3 I would prefer a fail-early strategy to let users know that they have pointed to a wrong rootfs image. For your mentioned use case, right now users have to remove GuestHookPath if they switch to a rootfs image w/o hooks. In future we might support to specify all kata configs through labels or annotations, and then there is no dependency on a fixed cli configuration and each pod can be customized.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

@WeiZhang555 WeiZhang555 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good,

This sounds like a good feature, it can be useful. Though I didn't reads all your previous discussions.

}

// ChangeToBundlePath changes the cwd to the bundle path defined in the OCI spec
func ChangeToBundlePath(spec *specs.Spec) (string, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These utils functions might be better to move out from grpc package.

You know, this package can be vendored later to kata-runtime, because kata-runtime will be client and communicate with kata-agent via GRPC protocol, they share same GRPC protocol definitions.

So if these functions do no help to runtime, which means it can't help kata-runtime do GRPC client operations, it might be better to move out from this package.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the new PR :)

}

defer f.Close()
err = json.NewEncoder(f).Encode(&spec)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: spec is already a pointer, why need &spec here?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @WeiZhang555 :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the new PR


// FindHooks searches guestHookPath for any OCI hooks for a given hookType
func FindHooks(guestHookPath, hookType string) (hooksFound []specs.Hook) {
hooksPath := path.Join(guestHookPath, hookType)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So prestart hook will be in guestHookPath/prestart and poststart hook will be in guestHookPath/poststart,
and guestHookPath is specified by kata-runtime cli configuration, right?

I think we need some documents to teach user, we can't expect user can find this hidden advanced feature.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 ;)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I feel this should be documented in the configuration.toml template, so in the PR against kata-containers/runtime. Do you agree?

@WeiZhang555
Copy link
Copy Markdown
Member

This implementation looks good, but there's one missing part is that we can't specify arguments for each hook, I can't be sure that if there will be any requirements for this regarding to some unknown scenarios.

Copy link
Copy Markdown

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eguzman3 - This is getting close now! 😄 Just a few comments remaining. As they show, we do need to make sure it is really easy to determine whether:

  • hooks were enabled.
  • the agent found hooks in the requested location.
  • the hooks were valid.
  • the hooks were usable.
  • the hooks ran.


// FindHooks searches guestHookPath for any OCI hooks for a given hookType
func FindHooks(guestHookPath, hookType string) (hooksFound []specs.Hook) {
hooksPath := path.Join(guestHookPath, hookType)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 ;)

return err
}

func isValidHook(file os.FileInfo) bool {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice clean function. However, I think it would be worth considering "polluting it" a bit with some log calls to help debug hook setup issues (basically a log call for every failure scenario).

a.sandbox.guestHooksPresent = false

if req.GuestHookPath != "" {
a.sandbox.scanGuestHooks(req.GuestHookPath)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @bergwolf - if a user switches images, I'd say they should expect to have to update the config if one has hooks and the other doesn't.

As an absolute minimum, can you add a log call showing if the user requested hooks or not (if GuestHookPath is set or not we should log something).

I do wonder if we want to add either a second bool field in the proto file (bool require_hooks = n), or maybe even a kernel cmdline option for the agent to consume (the former is cleaner though):

That way, the agent can clearly establish whether the user wants hooks or not allowing the value (and validity) of GuestHookPath to be more clearly determined.

}

defer f.Close()
err = json.NewEncoder(f).Encode(&spec)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @WeiZhang555 :)

Copy link
Copy Markdown

@jodh-intel jodh-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eguzman3 - This is getting close now! 😄 Just a few comments remaining. As they show, we do need to make sure it is really easy to determine whether:

  • hooks were enabled.
  • the agent found hooks in the requested location.
  • the hooks were valid.
  • the hooks were usable.
  • the hooks ran.

@jodh-intel
Copy link
Copy Markdown

Hi @eguzman3 - do you think you'll get a chance to look over the final review comments some time this week? As mentioned, this is very close now I think 😄

@jodh-intel
Copy link
Copy Markdown

Ping @eguzman3 😄

@flx42
Copy link
Copy Markdown
Contributor

flx42 commented Oct 1, 2018

@jodh-intel I will take over the patch since Edward finished his internship. Sorry for the delay!

@jodh-intel
Copy link
Copy Markdown

Hi @flx42 - no problem! Do let us know if we can help in any way.

flx42 added a commit to flx42/agent that referenced this pull request Oct 2, 2018
Adds support for running OCI hooks within the guest. A 'drop-in'
path (guest_hook_path) is specified in the cli configuration file
and if set, the agent will look for OCI hooks in this directory
and inject them into the container life cycle.

Fixes: kata-containers#348
Replaces: kata-containers#365

Co-authored-by: Edward Guzman <eguzman@nvidia.com>
Co-authored-by: Felix Abecassis <fabecassis@nvidia.com>
Signed-off-by: Edward Guzman <eguzman@nvidia.com>
Signed-off-by: Felix Abecassis <fabecassis@nvidia.com>
flx42 added a commit to flx42/agent that referenced this pull request Oct 3, 2018
Adds support for running OCI hooks within the guest. A 'drop-in'
path (guest_hook_path) is specified in the cli configuration file
and if set, the agent will look for OCI hooks in this directory
and inject them into the container life cycle.

Fixes: kata-containers#348
Replaces: kata-containers#365

Co-authored-by: Edward Guzman <eguzman@nvidia.com>
Co-authored-by: Felix Abecassis <fabecassis@nvidia.com>
Signed-off-by: Edward Guzman <eguzman@nvidia.com>
Signed-off-by: Felix Abecassis <fabecassis@nvidia.com>
@grahamwhaley
Copy link
Copy Markdown
Contributor

This one is superseded by #393 , so I'll mark it as DNM so we don't try to review or accidentally merge it ;-)

@amshinde
Copy link
Copy Markdown
Member

amshinde commented Oct 8, 2018

Closing this PR, as it has been superseded by #393.

@amshinde amshinde closed this Oct 8, 2018
@grahamwhaley
Copy link
Copy Markdown
Contributor

Nudging the CIs that failed. One is due to the nested glitch. The other had what look like maybe real errors (but anything doing apt could be an upstream timeout type error). I'll respin to see if that is reproducible or transient:

Summarizing 2 Failures:

[Fail] package manager update test check apt-get update [It] should not fail 
/tmp/jenkins/workspace/kata-containers-agent-ubuntu-16-04-PR-initrd/go/src/github.com/kata-containers/tests/integration/docker/package_manager_test.go:41

[Fail] CPUs and CPU set updating cpus and cpuset of a running container [It] should have the right number of vCPUs 
/tmp/jenkins/workspace/kata-containers-agent-ubuntu-16-04-PR-initrd/go/src/github.com/kata-containers/tests/integration/docker/cpu_test.go:432

Ran 202 of 214 Specs in 952.725 seconds
FAIL! -- 200 Passed | 2 Failed | 0 Pending | 12 Skipped --- FAIL: TestIntegration (979.86s)
FAIL

jshachm pushed a commit to jshachm/agent that referenced this pull request Nov 22, 2018
Adds support for running OCI hooks within the guest. A 'drop-in'
path (guest_hook_path) is specified in the cli configuration file
and if set, the agent will look for OCI hooks in this directory
and inject them into the container life cycle.

Fixes: kata-containers#348
Replaces: kata-containers#365

Co-authored-by: Edward Guzman <eguzman@nvidia.com>
Co-authored-by: Felix Abecassis <fabecassis@nvidia.com>
Signed-off-by: Edward Guzman <eguzman@nvidia.com>
Signed-off-by: Felix Abecassis <fabecassis@nvidia.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants