implement NRI plugin server to inject management CDI devices#1498
implement NRI plugin server to inject management CDI devices#1498
Conversation
933aa39 to
446d2f0
Compare
94a00c8 to
cbd8cf9
Compare
cbd8cf9 to
96b61e9
Compare
elezar
left a comment
There was a problem hiding this comment.
Thanks @tariq1890.
As discussed yesterday, I think the approach is good in general. It would be good to define what level of control we want to afford users w.r.t configuring runtimes. (e.g. Do we want to allow them to enable NRI (and CDI) and configure the runtime classes?).
| // nriCDIDeviceKey is the prefix of the key used for CDI device annotations | ||
| nriCDIDeviceKey = nriCDIDeviceDomain + "/container" | ||
| // defaultNRISocket represents the default path of the NRI socket | ||
| defaultNRISocket = "/var/run/nri/nri.sock" |
There was a problem hiding this comment.
Let's not define this here and instead ensure that we pass in the default value. (We probably want to add the socket as a CLI flag so that we COULD override it.
Nit: We also have api.DefaultSocketPath that we could use.
| return cdiDevices, nil | ||
| } | ||
|
|
||
| func getCDIDeviceFromAnnotation(annotations map[string]string, ctr string) string { |
There was a problem hiding this comment.
nit: Technically we're just getting the annotation value here -- which MAY be a comma-separated list of devices. Do we want to rename the function?
There was a problem hiding this comment.
Does getCDIDeviceNamesFromAnnotation work? Open to suggestions here
There was a problem hiding this comment.
I have changed the method names, please check now
There was a problem hiding this comment.
I've renamed the function
| nriPluginAnnotationKey := fmt.Sprintf("%s.%s", nriCDIDeviceKey, ctr) | ||
| if value, ok := annotations[nriPluginAnnotationKey]; ok { | ||
| return value | ||
| } | ||
|
|
||
| // If there isn't an exact match, we look for a wildcard character match | ||
| for annotationKey := range annotations { | ||
| if after, found := strings.CutPrefix(annotationKey, fmt.Sprintf("%s.", nriCDIDeviceKey)); found { | ||
| if after == "*" { | ||
| return annotations[annotationKey] | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return "" |
There was a problem hiding this comment.
We could introduce a new constant that is equal to "nvidia.cdi.k8s.io/container." to simplify this quite a bit. Also note that we need not use strings.CutPrefix, since we known the full key for the "wildcard" annotation beforehand. (It's also possible to use a recursive implementation, but this is not required).
| nriPluginAnnotationKey := fmt.Sprintf("%s.%s", nriCDIDeviceKey, ctr) | |
| if value, ok := annotations[nriPluginAnnotationKey]; ok { | |
| return value | |
| } | |
| // If there isn't an exact match, we look for a wildcard character match | |
| for annotationKey := range annotations { | |
| if after, found := strings.CutPrefix(annotationKey, fmt.Sprintf("%s.", nriCDIDeviceKey)); found { | |
| if after == "*" { | |
| return annotations[annotationKey] | |
| } | |
| } | |
| } | |
| return "" | |
| nriPluginAnnotationKeyPrefix := nriCDIDeviceKey + "." | |
| containerNriPluginAnnotationKey := nriPluginAnnotationKeyPrefix + ctr | |
| if value, ok := annotations[containerNriPluginAnnotationKey]; ok { | |
| return value | |
| } | |
| if ctr == "*" { | |
| return "" | |
| } | |
| // If there isn't an exact match, we look for the key that matches ALL containers, indicated by the container name '*': | |
| return getDeviceFromAnnoations(annotations, "*") |
There was a problem hiding this comment.
We could introduce a new constant that is equal to "nvidia.cdi.k8s.io/container." to simplify this quite a bit. Also note that we need not use strings.CutPrefix
Agreed. I have made the changes to directly lookup the necessary keys instead of looping through the entire keyset of the map. Thanks for the suggestion!
| // defaultRuntimeName specifies the NVIDIA runtime to be use as the default runtime if setting the default runtime is enabled | ||
| defaultRuntimeName = "nvidia" | ||
| defaultHostRootMount = "/host" | ||
| defaultNRIPluginIdx = "10" |
There was a problem hiding this comment.
I based this off the device-injector implementation from the upstream nri samples
There was a problem hiding this comment.
Here are the docs for the example plugin in the upstream nri repo where they instruct users to provide 10 as the value.
cmd/nvidia-ctk-installer/main.go
Outdated
| if err != nil { | ||
| return fmt.Errorf("unable to setup runtime: %v", err) | ||
| // if NRI is enabled, we don't need to modify the container runtime config TOML. | ||
| if !o.runtimeOptions.EnableNRI { |
There was a problem hiding this comment.
The behaviour of this main funtion should NOT depend on the runtimeOptions. If we wanted to do a quick return in runtime.Setup based on the option, we can do that there.
954d67e to
b2d93d8
Compare
|
@elezar It looks like
I am thinking of just using So examples of supported annotation k-v pairs are as follows: |
87ca585 to
3bb5bce
Compare
|
@elezar Thanks for the thorough review! I have addressed your comments and the PR is ready for another round of review |
@tariq1890 👍 We have a similar convention with the addition that an omitted |
3bb5bce to
0a2bc02
Compare
|
@klihub Even better! Thank you for this suggestion. Always happy to remove code and reuse library code instead :) |
d3c4df6 to
4a28997
Compare
9078cfe to
f15030c
Compare
| err = runtime.Cleanup(c, &o.runtimeOptions, o.runtime) | ||
| if err != nil { | ||
| return fmt.Errorf("unable to cleanup runtime: %v", err) | ||
| if !o.enableNRIPlugin { |
There was a problem hiding this comment.
Sure let's keep these separate for now. I can rebase #1521 once we merge this.
elezar
left a comment
There was a problem hiding this comment.
Thanks @tariq1890 (I just realised that I had a bunch of pending comments here).
This looks good now.
| } | ||
|
|
||
| func parseCDIDevices(pod *api.PodSandbox, key, container string) []string { | ||
| cdiDeviceNames, ok := plugin.GetEffectiveAnnotation(pod, key, container) |
There was a problem hiding this comment.
Ok. So we've switched to using the format as implemented in the NRI plugin directly? That's better. We may need to update our documentation / implementation to use nvidia.cdi.k8s.io/pod for the cases where all containers need access to the device.
There was a problem hiding this comment.
Correct. Yes, I will make sure that this is reflected in our documentation
| func (p *Plugin) Start(ctx context.Context, nriSocketPath, nriPluginIdx string) error { | ||
| pluginOpts := []stub.Option{ | ||
| stub.WithPluginIdx(nriPluginIdx), | ||
| stub.WithLogger(toNriLogger{p.logger}), |
There was a problem hiding this comment.
Not for this PR: Note that the stub has a WithOnClose() option that allows one to specify a callback when connection is lost. Does it make sense to at least pass a function that logs so that we can collect data on when / how often this happens?
There was a problem hiding this comment.
Sure, I can look into that
| // if NRI is enabled, we don't need to modify the container runtime config TOML. | ||
| if opts.EnableNRI { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
As I have called out, I don't know whether this is something that we want to do in the long term, but I'm OK to have this be the case for the time being. (If we also get the changes from #1521 in then a user can explicitly control this -- or at least opt back in to configuring things).
Not a blocker though.
There was a problem hiding this comment.
I have since moved this conditional out of the method, but agreed that we would no longer need that conditional after #1521 is merged
| err = runtime.Cleanup(c, &o.runtimeOptions, o.runtime) | ||
| if err != nil { | ||
| return fmt.Errorf("unable to cleanup runtime: %v", err) | ||
| if !o.enableNRIPlugin { |
There was a problem hiding this comment.
Sure let's keep these separate for now. I can rebase #1521 once we merge this.
f15030c to
bcfff63
Compare
cdesiniotis
left a comment
There was a problem hiding this comment.
@tariq1890 thanks for the work on this! This lgtm, just left a few minor typos / nits.
bcfff63 to
e31f356
Compare
Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
e31f356 to
93cb491
Compare
This PR adds support for an NRI Plugin which will intercept
CreateContainerevents to inject CDI devices to management containers.Here is a summary of the changes:
CreateContainerevents, check if the reserved annotation (for e.g. -"nvidia.cdi.k8s.io/container.{{container-name}}") exists in the podruntime.Setupmethod is skipped. Runtime config TOML will not be modified and the runtime will no longer need to be restartedcontainerd v1.7.30) where NRI and CDI are disabled by default, they will be have to be enabled out of a band as a pre-requisiteFuture Scope