-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Support multiple cni plugin bin dirs #11311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Skipping CI for Draft Pull Request. |
ffe6a8c to
dadc06f
Compare
dadc06f to
4065f14
Compare
be0abf4 to
d248d2d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trying to find the reason we never did this..
anyhow I don't see a good reason to not support it outside we've been talking about switching to rpc and this would be an intermediate step in that case.
Let's add the validation and do some integration testing to make sure it's all tied in well with the go-cni and cni libs should work as cni already has the plugin dir list but still let's test it
@mikebrow Thanks for the review and suggestion, will add tests to verify the config-side change and the new behavior of supporting multiple bin dirs. Also will mark |
I remember us talking about this years ago. I’ll take a look at this once I’m home |
|
This looks good. Not sure why we didn't do this earlier? Since this function already takes a slice of string. https://github.com/containerd/go-cni/blob/f4736bb1d1b10293d9c484cbf79254a5efa1e020/opts.go#L44 |
d248d2d to
a1d7be0
Compare
a1d7be0 to
089224b
Compare
|
/retest |
|
Hi @mikebrow , I added tests checking that the two fields cannot be used together, and that we migrated bin_dir to bin_dirs during config loading. Currently we use
PTAL, let me know if you have any other comments. thanks! |
| cni.WithPluginConfDir(dir), | ||
| cni.WithPluginMaxConfNum(max), | ||
| cni.WithPluginDir([]string{c.config.NetworkPluginBinDir})) | ||
| cni.WithPluginDir(c.config.NetworkPluginBinDirs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikebrow actually I just realized one thing: we cannot switch the default here yet, as the old field is not yet removed.
Say if a user only uses bin_dir, we will actually set this default bin_dirs, causing both fields being used and fail our validation. (it's kind of similar to we cannot set the default config_path yet as we haven't removed the deprecated mirros. #6488).
Giving this, how about let's keep the default value unchanged? And do the switch when we remove the old field? (we will still add a deprecation warning saying bin_dir is deprecated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait... actually the opposite also stands. If we keep the default bin_dir unchanged. If a user set up bin_dirs instead. We will end up in the same situation 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
essentially we cannot figure out a field is set by default or explicitly by user.
We can do something like this:
- move the default to
bin_dirshere (keep this change). - if
bin_diris not empty, log a deprecation warning. then check iflen(bin_dirs) == 1 && bin_dirs[0] == "/opt/cni/bin"
2.1. if yes, that meansbin_dirsis from default and we can proceeed by merging the two.
2.2. if no, that means the user set up bothbin_dirandbin_dirs, we should error in this case.
WDYT @mikebrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way it works.. just to confirm...
if the config.toml is unset (file is not present in the path) we go to the compiled in defaults.
if the config.toml is set we use what we get from the config.toml (no matter what version is used as defined in the config) and migrate to the latest version
if the request is made to generate the config defaults as output containerd config default we output that
the concern about "default" should not be an issue unless the field doesn't work when unset... but even then we should not be .. "merging" defaults over unset fields with the exception of there is no config.toml at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so .. if bin_dir is set in a prior version.. migrate to bin_dirs[] to become the current version ..
pls check my math :-) may require a version bump for the config for 2.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so .. if bin_dir is set in a prior version.. migrate to bin_dirs[] to become the current version
yep that's right, no issue if a user migrates from verson 2 to 3, as bin_dirs doesn't exist there. but the issue is in latest config version 3, we have both bin_dir and bin_dirs. And there is no migration if version is already latest.
Say a user has bin_dir = /my/cni/bin in their version=3 config. Previously it will just overwrite the bin_dir's default value.
Since we move default from bin_dir to bin_dirs, now the user has below at the end of config loading:
bin_dir = /my/cni/bin` // from user's config
bin_dirs = ['opt/cni/bin'] // from our new default.
This will break our config validation which doesn't allow both fields non-empty. To the user it shouldn't break, becault they still only set up bin_dir only. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the config.toml is set we use what we get from the config.toml (no matter what version is used as defined in the config) and migrate to the latest version
It will be:
- migrate toml content first
- get the default config value
- insert toml content into the config struct (e.g., overwrite default values)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so .. if bin_dir is set in a prior version.. migrate to bin_dirs[] to become the current version
yep that's right, no issue if a user migrates from verson 2 to 3, as
bin_dirsdoesn't exist there. but the issue is in latest config version3, we have bothbin_dirandbin_dirs. And there is no migration if version is already latest.Say a user has
bin_dir = /my/cni/binin theirversion=3config. Previously it will just overwrite thebin_dir's default value.Since we move default from
bin_dirtobin_dirs, now the user has below at the end of config loading:bin_dir = /my/cni/bin` // from user's config bin_dirs = ['opt/cni/bin'] // from our new default.This will break our config validation which doesn't allow both fields non-empty. To the user it shouldn't break, becault they still only set up
bin_dironly. :)
nod.. needs a new config "version" to migrate to otherwise user will have to do a manual migration.. to 3.0 containerd 2.0 to 3.0++ containerd 2.1 .. surprised we didn't do a bump yet missed we don't have a 4.0 config for 2.1 or some other process to handle new fields minor migrations for exclusive fields.. @samuelkarp @dmcgowan ... so hold off on merging till we get a migration solution from 2.0 to 2.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could hold off on the migration feature until there is a bump.. keeping it a "manual" operation .. leave the default as it was.. originally.. add the new field description in to the docs with the deprecation.. indicating the old field will be migrated with the next version of the config.. something to that effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikebrow could you PTAL the latest patch? especially this file: https://github.com/containerd/containerd/pull/11311/files#diff-e8d6dbb2c75752c173c570eabb9961d5b0f3273b682788f2d7eab2f5ffd5d7d6
This should handle the case where a user-set bin_dir unexpectedly conflict with the default bin_dirs.
I think this will make the change backward compatible with 2.0, i.e., regardless if user set bin_dir or not, there is no behavior difference. And when user uses this new bin_dirs, we can still enforce the restriction that you cannot set bin_dir/bin_dirs at the same time.
bc48d42 to
1e40278
Compare
|
/test pull-containerd-node-e2e |
|
/test pull-containerd-node-e2e |
1 similar comment
|
/test pull-containerd-node-e2e |
|
/test pull-containerd-node-e2e |
|
/retest |
|
/test pull-containerd-node-e2e |
mikebrow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found a small nit
To make it as DEPRECATED, this PR does the following: 1. Changes config default to use `NetworkPluginBinDirs`; 2. Mark `NetworkPluginBinDir` as deprecated (in config version 3); 3. Add config migration from 2 to 3, which migrates `bin_dir` in version 2 to `bin_dirs` in version 3. Signed-off-by: Jin Dong <djdongjin95@gmail.com> [wip] add deprecation warning Signed-off-by: Jin Dong <djdongjin95@gmail.com>
1e40278 to
42effa3
Compare
mikebrow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
/test pull-containerd-node-e2e |
containerd 2.1.0 Welcome to the v2.1.0 release of containerd! The first minor release of containerd 2.x focuses on continued stability alongside new features and improvements. This is the first time-based released for containerd. Most the feature set and core functionality has long been stable and harderened in production environments, so now we transition to a balance of timely delivery of new functionality with the same high confidence in stability and performance. * Add no_sync option to boost boltDB performance on ephemeral environments ([containerd#10745](containerd#10745)) * Add content create event ([containerd#11006](containerd#11006)) * Erofs snapshotter and differ ([containerd#10705](containerd#10705)) * Update CRI to use transfer service for image pull by default ([containerd#8515](containerd#8515)) * Support multiple cni plugin bin dirs ([containerd#11311](containerd#11311)) * Support container restore through CRI/Kubernetes ([containerd#10365](containerd#10365)) * Add OCI/Image Volume Source support ([containerd#10579](containerd#10579)) * Enable Writable cgroups for unprivileged containers ([containerd#11131](containerd#11131)) * Fix recursive RLock() mutex acquisition ([containerd/go-cni#126](containerd/go-cni#126)) * Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123)) * Retry last registry host on 50x responses ([containerd#11484](containerd#11484)) * Multipart layer fetch ([containerd#10177](containerd#10177)) * Enable HTTP debug and trace for transfer based puller ([containerd#10762](containerd#10762)) * Add support for unpacking custom media types ([containerd#11744](containerd#11744)) * Add dial timeout field to hosts toml configuration ([containerd#11106](containerd#11106)) * Expose Pod assigned IPs to NRI plugins ([containerd#10921](containerd#10921)) * Support multiple uid/gid mappings ([containerd#10722](containerd#10722)) * Fix race between serve and immediate shutdown on the server ([containerd/ttrpc#175](containerd/ttrpc#175)) * Update FreeBSD defaults and re-organize platform defaults ([containerd#11017](containerd#11017)) * Postpone cri config deprecations to v2.2 ([containerd#11684](containerd#11684)) * Remove deprecated dynamic library plugins ([containerd#11683](containerd#11683)) * Remove the support for Schema 1 images ([containerd#11681](containerd#11681)) Please try out the release binaries and report any issues at https://github.com/containerd/containerd/issues. * Derek McGowan * Phil Estes * Akihiro Suda * Maksym Pavlenko * Jin Dong * Wei Fu * Sebastiaan van Stijn * Samuel Karp * Mike Brown * Adrien Delorme * Austin Vazquez * Akhil Mohan * Kazuyoshi Kato * Henry Wang * Gao Xiang * ningmingxiao * Krisztian Litkey * Yang Yang * Archit Kulkarni * Chris Henzie * Iceber Gu * Alexey Lunev * Antonio Ojea * Davanum Srinivas * Marat Radchenko * Michael Zappa * Paweł Gronowski * Rodrigo Campos * Alberto Garcia Hierro * Amit Barve * Andrey Smirnov * Divya * Etienne Champetier * Kirtana Ashok * Philip Laine * QiPing Wan * fengwei0328 * zounengren * Adrian Reber * Alfred Wingate * Amal Thundiyil * Athos Ribeiro * Brian Goff * Cesar Talledo * ChengyuZhu6 * Chongyi Zheng * Craig Ingram * Danny Canter * David Son * Fupan Li * HirazawaUi * Jing Xu * Jonathan A. Sternberg * Jose Fernandez * Kaita Nakamura * Kohei Tokunaga * Lei Liu * Marco Visin * Mike Baynton * Qiyuan Liang * Sameer * Shiming Zhang * Swagat Bora * Teresaliu * Tony Fang * Tõnis Tiigi * Vered Rosen * Vinayak Goyal * bo.jiang * chriskery * luchenhan * mahmut * zhaixiaojuan * **github.com/Microsoft/hcsshim** v0.12.9 -> v0.13.0-rc.3 * **github.com/cilium/ebpf** v0.11.0 -> v0.16.0 * **github.com/containerd/cgroups/v3** v3.0.3 -> v3.0.5 * **github.com/containerd/containerd/api** v1.8.0 -> v1.9.0 * **github.com/containerd/continuity** v0.4.4 -> v0.4.5 * **github.com/containerd/go-cni** v1.1.10 -> v1.1.12 * **github.com/containerd/imgcrypt/v2** v2.0.0-rc.1 -> v2.0.1 * **github.com/containerd/otelttrpc** ea5083fda723 -> v0.1.0 * **github.com/containerd/platforms** v1.0.0-rc.0 -> v1.0.0-rc.1 * **github.com/containerd/ttrpc** v1.2.6 -> v1.2.7 * **github.com/containerd/typeurl/v2** v2.2.2 -> v2.2.3 * **github.com/containernetworking/cni** v1.2.3 -> v1.3.0 * **github.com/containernetworking/plugins** v1.5.1 -> v1.7.1 * **github.com/containers/ocicrypt** v1.2.0 -> v1.2.1 * **github.com/davecgh/go-spew** d8f796af33cc -> v1.1.1 * **github.com/fsnotify/fsnotify** v1.7.0 -> v1.9.0 * **github.com/go-jose/go-jose/v4** v4.0.4 -> v4.0.5 * **github.com/google/go-cmp** v0.6.0 -> v0.7.0 * **github.com/grpc-ecosystem/grpc-gateway/v2** v2.22.0 -> v2.26.1 * **github.com/klauspost/compress** v1.17.11 -> v1.18.0 * **github.com/mdlayher/socket** v0.4.1 -> v0.5.1 * **github.com/moby/spdystream** v0.4.0 -> v0.5.0 * **github.com/moby/sys/user** v0.3.0 -> v0.4.0 * **github.com/opencontainers/image-spec** v1.1.0 -> v1.1.1 * **github.com/opencontainers/runtime-spec** v1.2.0 -> v1.2.1 * **github.com/opencontainers/selinux** v1.11.1 -> v1.12.0 * **github.com/pelletier/go-toml/v2** v2.2.3 -> v2.2.4 * **github.com/petermattis/goid** 4fcff4a6cae7 **_new_** * **github.com/pmezard/go-difflib** 5d4384ee4fb2 -> v1.0.0 * **github.com/prometheus/client_golang** v1.20.5 -> v1.22.0 * **github.com/prometheus/common** v0.55.0 -> v0.62.0 * **github.com/sasha-s/go-deadlock** v0.3.5 **_new_** * **github.com/smallstep/pkcs7** v0.1.1 **_new_** * **github.com/stretchr/testify** v1.9.0 -> v1.10.0 * **github.com/tchap/go-patricia/v2** v2.3.1 -> v2.3.2 * **github.com/urfave/cli/v2** v2.27.5 -> v2.27.6 * **github.com/vishvananda/netlink** v1.3.0 -> 0e7078ed04c8 * **github.com/vishvananda/netns** v0.0.4 -> v0.0.5 * **go.etcd.io/bbolt** v1.3.11 -> v1.4.0 * **go.opentelemetry.io/auto/sdk** v1.1.0 **_new_** * **go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc** v0.56.0 -> v0.60.0 * **go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp** v0.56.0 -> v0.60.0 * **go.opentelemetry.io/otel** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/exporters/otlp/otlptrace** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/metric** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/sdk** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/otel/trace** v1.31.0 -> v1.35.0 * **go.opentelemetry.io/proto/otlp** v1.3.1 -> v1.5.0 * **golang.org/x/crypto** v0.28.0 -> v0.36.0 * **golang.org/x/exp** aacd6d4b4611 -> 2d47ceb2692f * **golang.org/x/mod** v0.21.0 -> v0.24.0 * **golang.org/x/net** v0.30.0 -> v0.38.0 * **golang.org/x/oauth2** v0.22.0 -> v0.27.0 * **golang.org/x/sync** v0.8.0 -> v0.14.0 * **golang.org/x/sys** v0.26.0 -> v0.33.0 * **golang.org/x/term** v0.25.0 -> v0.30.0 * **golang.org/x/text** v0.19.0 -> v0.23.0 * **golang.org/x/time** v0.3.0 -> v0.7.0 * **google.golang.org/genproto/googleapis/api** 5fefd90f89a9 -> 56aae31c358a * **google.golang.org/genproto/googleapis/rpc** 324edc3d5d38 -> 56aae31c358a * **google.golang.org/grpc** v1.67.1 -> v1.72.0 * **google.golang.org/protobuf** v1.35.1 -> v1.36.6 * **k8s.io/api** v0.31.2 -> v0.32.3 * **k8s.io/apimachinery** v0.31.2 -> v0.32.3 * **k8s.io/apiserver** v0.31.2 -> v0.32.3 * **k8s.io/client-go** v0.31.2 -> v0.32.3 * **k8s.io/cri-api** v0.31.2 -> v0.32.3 * **k8s.io/kubelet** v0.31.2 -> v0.32.3 * **k8s.io/utils** 18e509b52bc8 -> 3ea5e8cea738 * **sigs.k8s.io/json** bc3834ca7abd -> 9aa6b5e7a4b3 * **sigs.k8s.io/structured-merge-diff/v4** v4.4.1 -> v4.4.2 * **tags.cncf.io/container-device-interface** v0.8.0 -> v1.0.1 * **tags.cncf.io/container-device-interface/specs-go** v0.8.0 -> v1.0.0 Previous release can be found at [v2.0.0](https://github.com/containerd/containerd/releases/tag/v2.0.0) * `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`: ✅Recommended. Dynamically linked with glibc 2.35 (Ubuntu 22.04). * `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`: Statically linked. Expected to be used on Linux distributions that do not use glibc >= 2.35. Not position-independent. In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases) and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too. See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.
Fix #10710, this PR adds a new cri CNI config
bin_dirsthat is similar tobin_dirbut accepts a list of bin dirs. This PR includes:commit 1: support his new
bin_dirsfield:bin_dirsfield;ValidateRuntimeConfig, make sure only one ofbin_dirandbin_dirsis set.commit 2: make
bin_dirsas default, and migratebin_dirtobin_dirs:bin_dirsinstead ofbin_dir.bin_diras DEPRECATED.bin_dirin version 2 tobin_dirsin version 3.bin_dir.