Consolidate network namespace handling#29993
Conversation
ti-mo
left a comment
There was a problem hiding this comment.
Thanks for picking this up! I've renamed the PR to something more descriptive and left a few preliminary comments.
b0b40c3 to
ea1d9eb
Compare
|
Back from break, should be good for review. Still working on test failures. |
64309b4 to
a6c76f2
Compare
|
/test |
|
Resynced w/main - I'm good if you all are. |
squeed
left a comment
There was a problem hiding this comment.
Looks basically good; I'd like to see it split in to a few more commits. One for the new API, one to move all test code, one for trivial changes, non-trivial changes in individual commits, and lastly one to remove old code. It would make reviewing a lot easier. Thanks!
|
Commits cad2332, 0d0b0b9, e979d13 do not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
@squeed I at least split into 3, that helps a bit:
(mergability check will fail but HTH, ty) |
|
@bleggett Thanks for picking this back up, I fell sick a few weeks ago and didn't manage to continue. I'm taking another look at this now, there's an issue with Also, the I'll push this up soon. Edit: I've added the |
|
/test |
|
The |
|
/test |
1 similar comment
|
/test |
Ah yep, this was an overzealous attempt to fully drop the final explicit |
|
@bleggett Already done, take a look at the current Looks like CI is still failing even after fixing |
👍 Feel free to beat me to it if you want, otherwise I'll get to it in a day or two. |
|
Can I get someone to I can't repro any unit test failures vis a vis |
|
/test |
|
@ti-mo et al - tests seem good to me locally, and the conformance tests are passing now. I note that one each of the gateway API and ginkgo subtests are failing, I do not think that's related to this, but am happy to be corrected. This should be good to merge if everyone is happy. |
|
@bleggett Thanks for taking a look! What did you end up changing to address the failures? Rebasing onto main to pick up #31017 for https://github.com/cilium/cilium/actions/runs/8072988990/job/22055814369. |
This allows surfacing unexpected errors, but ignores interfaces that are already absent. Signed-off-by: Timo Beckers <timo@isovalent.com>
This was already redundant when running Cilium inside a container, since the nsfs instance inside the container is bound to the container's lifecycle. Running Cilium outside of a container is currently rather involved and not officially supported. Remove the netns cleanup code. When the container exits, cilium-health and its enclosing namespace also disappears. Follow-up commits will remove the code in package netns being called here. Signed-off-by: Timo Beckers <timo@isovalent.com>
The previous netns package had a few problems. It shelled out to iproute2, and it depended on both containernetworking/plugins/pkg/ns and vishvananda/netns, which lead to some idiosyncratic API. This commit addresses these issues and takes care of some much-needed API cleanup: - Create a new netns with New() - Open an existing pinned netns with OpenPinned() - Execute code within the netns with ns.Do() - Close with Close() Pinning network namespaces is not supported, as there is currently little reason for doing so. In case the requirement pops up again later, it can always be added. All tests now use anonymous (non-pinned) network namespaces, and the netns created for cilium-health also no longer leaves an entry in /var/run/netns. Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io> Co-authored-by: Timo Beckers <timo@isovalent.com>
|
/test |
Absolutely nothing, just resynced to |
|
TY all - apologies this took a bit of time 😅 |
Fixes: #27449
This is the first step towards fixing #27449 and #27450 - Removing dependence on
ip netnsfrom the internal netns lib, and consolidating netns creation/handling into one lib/spot, while not requiring mount pins for every netns creation, the way we currently do - most things (incl. tests) rarely need this.This PR removes a few netns libs from the repo:
containernetworking/plugins/pkg/nsvishvananda/netnsfrom all but one spot (pkg/datapath/linux/devices_controller.go, where a sister lib wants a typecast)in favor of an internal lib
"github.com/cilium/cilium/pkg/netns"which avoids pinning netnses, and only has a dep on theunixpackage.General goals of the
github.com/cilium/cilium/pkg/netnspackage:sys/unixLooking for:
pkg/netnsAPI surface/impl/general approachGenerally the intent is to make sure that code that creates netnses is solely responsible for closing them.
Code that is running out-of-process from the process that is the netns owner sometimes does need to run some closure in the netns context, but should not be burdened with netns lifecycle concerns, or be responsible for closing/cleanup, in virtually all cases. The API surface is designed to discourage the current overreliance on named-pin-based disposal/cleanup.
cc @ti-mo to make sure I interpreted his intent correctly.