libnet: add support for custom interface names#49155
Conversation
32cc271 to
115f4f1
Compare
8fb2d3e to
8042945
Compare
27e6740 to
8e26804
Compare
|
CI failures look legit |
d05cb6e to
59498e2
Compare
| ctx := setupTest(t) | ||
| apiClient := testEnv.APIClient() | ||
|
|
||
| master := "dm-dummy0" |
There was a problem hiding this comment.
The comment at the top should probably go here, as this is where that part of the test happens?
| master := "dm-dummy0" | |
| // verify the driver automatically provisions the 802.1q link (dm-dummy0.60) | |
| master := "dm-dummy0" |
There was a problem hiding this comment.
Also for fixed values, I started to use const more; I think it sometimes help glancing over a test to identify "dynamic" values (may change during the test) and fixtures;
| master := "dm-dummy0" | |
| // verify the driver automatically provisions the 802.1q link (dm-dummy0.60) | |
| const master = "dm-dummy0" |
There was a problem hiding this comment.
Ah, yeah I had these changes parked yesterday while I was trying to fix unobvious concurrency issues in the osl package. I should have put this PR back in draft mode.
I removed these comments as they don't say anything useful. It was copy/pasted from other tests -- we should probably remove them in those other tests too.
4838111 to
7e6b766
Compare
| // append it to dstPrefix. | ||
| // | ||
| // It's safe to call concurrently. | ||
| func (n *Namespace) createInterface(ctx context.Context, targetNs netns.NsHandle, srcName, dstPrefix, dstName string, options ...IfaceOption) (*Interface, netlink.Link, error) { |
There was a problem hiding this comment.
This new method is introduced to make sure the Namespace is locked between the call to generateIfaceName and append(n.iFaces, i), because generateIfaceName uses n.iFaces to generate a unique dstName.
However, the underlying network link represented by Interface need to have its final dstName set before appending because Namespace.RemoveInterface() references it and could be run concurrently with AddInterface.
waitForBridgePort and n.advertiseAddrs could introduce a lot of delay if things go wrong. So instead of locking for the whole lifespan of AddInterface, this method will append earlier than before.
moby/libnetwork/osl/interface_linux.go
Line 716 in 1463c99
|
|
||
| path := n.path | ||
| isDefault := n.isDefault | ||
| nlh := n.nlHandle |
There was a problem hiding this comment.
Not sure if we're all going to agree on this code style.
I think that by not making copies of Namespace properties, we clearly signal that: 1. unaltered versions of Namespace properties are used; 2. they're safe to access concurrently.
There was a problem hiding this comment.
Looks good - I think we can take it further (not in this PR), to end up with comments on struct members that clearly state which need mutex protection.
There was a problem hiding this comment.
I don't think there's a test that shows what's expected to happen when an explicit interface name clashes with an already allocated name ... perhaps worth adding?
# docker network create n0
# docker run --rm -ti --network n0 --name c1 busybox
# docker network create n1
# docker network connect --driver-opt com.docker.network.endpoint.ifname=eth0 n1 c1
Error response from daemon: failed to add interface vethbdb03af to sandbox: error renaming interface "vethbdb03af" to "eth0": file exists
Because we don't have a "join order" for networks, a command like this will sometimes work, and sometimes won't ... just depending on which network connection happens to handled first.
# docker run --rm -ti --network name=n0,driver-opt=com.docker.network.endpoint.ifname=eth0 --network name=n1 --name c1 alpine
So, if I want the "n0" endpoint to be called "eth0" ... I either have to connect it first, then separately connect others (without specifying multiple networks in the run), or explicitly name all the interfaces, or use a different default prefix for the network. Maybe that's a limitation we need to document?
Perhaps it's also worth adding examples like the ones above of how to use the new endpoint option in the changelog note (or, just a mention that it goes in the driver-opt option)?
|
|
||
| path := n.path | ||
| isDefault := n.isDefault | ||
| nlh := n.nlHandle |
There was a problem hiding this comment.
Looks good - I think we can take it further (not in this PR), to end up with comments on struct members that clearly state which need mutex protection.
thaJeztah
left a comment
There was a problem hiding this comment.
Just leaving as comment; I tried digging in, but it's a lot of complexity and I'm wondering if we (also see some of my comments) can get this done without changing so many signatures (especially because it appears some of those already have the information available in some form or another)
libnetwork/osl/interface_linux.go
Outdated
| // AddInterface adds an existing Interface to the sandbox. The operation will | ||
| // rename from the Interface SrcName to DstName as it moves, and reconfigure | ||
| // the interface according to the specified settings. If dstPrefix is provided, | ||
| // but not dstName, AddInterface will auto-generate a unique suffix and append | ||
| // it to dstPrefix. |
There was a problem hiding this comment.
Nit: we should try to avoid future tense more for these; this part confused me, because dstName is not an argument here; is this because it's passed through options?;
if
dstPrefixis provided, but notdstName, AddInterface will auto-generate a unique suffix and append it todstPrefix.
Perhaps something like this would work (Ideally we'd mention SomeOption here to set the name, instead of referring to IfaceOption.dstName 🤔);
// AddInterface adds an existing Interface to the sandbox. The operation renames
// the Interface SrcName to DstName as it moves, and reconfigures the interface
// according to the specified settings. By default, AddInterface auto-generates
// a unique interface using the given dstPrefix as a prefix, unless a custom
// interface-name is passed as an option and set through [IfaceOption.dstName].Some confusing logic in the code as well, but I probably need to look closer at that to understand what exactly it does, because it looks like for the default network, dstName is ignored?
if n.isDefault {
i.dstName = i.srcName
} else {
i.dstName = fmt.Sprintf("%s%d", dstPrefix, n.nextIfIndex[dstPrefix])
n.nextIfIndex[dstPrefix]++
}There was a problem hiding this comment.
Woops looks like this comment slipped in the wrong commit - that's why you're getting confused by dstName being mentioned here.
I rewrote the comment to use present tense, but since this function is left untouched until the next (2nd) commit I rather removed that change from the 1st one.
Some confusing logic in the code as well, but I probably need to look closer at that to understand what exactly it does, because it looks like for the default network, dstName is ignored?
Yes, this function is confusing... isDefault here isn't "the default network" but the "default network namespace" - in other words, the "host" netns / network. There might be good reasons for that but I don't know tbh.
There was a problem hiding this comment.
Gotcha! More stuff to untangle in our backlog 😂
| } | ||
|
|
||
| if err := d.Join(ctx, nid, epid, sb.Key(), ep, sb.Labels()); err != nil { | ||
| if err := d.Join(ctx, nid, epid, sb.Key(), ep, ep.generic, sb.Labels()); err != nil { |
There was a problem hiding this comment.
See my other comment as well; it looks like the information needed is all captured in the endpoint itself (which is passed and must implement the driverapi.JoinInfo interface; that interface already has a InterfaceNames() method (but not sure if that's related); just wondering if instead of passing 2 separate properties from the same ep, we could just pass ep and have it expose the information that's needed (either through the existing JoinInfo interface method, or perhaps that needs to either be expanded, or a separate GenericInfoProvider (whatever) interface that can be implemented optionally?
(I notice sandbox has a Labels() method, although I don't think it's currently part of an interface)
moby/libnetwork/driverapi/driverapi.go
Lines 147 to 152 in 250378a
There was a problem hiding this comment.
it looks like the information needed is all captured in the endpoint itself (which is passed and must implement the driverapi.JoinInfo interface; that interface already has a InterfaceNames() method (but not sure if that's related)
It definitely is related, but data flows the other way around. That is, netdrivers call InterfacesNames().SetNames(srcName, dstPrefix, dstName) on driverapi.JoinInfo to set the interface names.
Libnetwork's sbJoin method does the following:
- Call the netdriver
Joinmethod with theEndpointpassed as 2nd argument (implementor ofdriverapi.JoinInfo). The driver sets a bunch of values on the libnet'sEndpointInterfacethrough the setters defined indriverapi.JoinInfo. - Then call
populateNetworkResourceswhich builds a list ofosloptions based on the values set onEndpointInterfaceat the previous step, and then call theAddInterface()method defined in theoslpackage with those options.
So, to implement your suggestion, I'd have to either break the unidirectional nature of driverapi.JoinInfo, or add yet another interface. I think both are equally bad and undesirable - and long term plan is to get rid of those interfaces.
(I notice sandbox has a Labels() method, although I don't think it's currently part of an interface)
The custom interface name is an endpoint option, not a sandbox option.
There was a problem hiding this comment.
The custom interface name is an endpoint option, not a sandbox option.
Ah, sorry, I meant "add a similar accessor to the Endpoint interface, like the Sandbox interface has, but as discussed in our call, we should consider all these internal, and have the flexibility of changing things / shuffle things around later.
| func GetIfname(opts map[string]interface{}) string { | ||
| if opts == nil { | ||
| return "" | ||
| } | ||
| opt, ok := opts[Ifname] | ||
| if !ok { | ||
| return "" | ||
| } | ||
| ifname, ok := opt.(string) | ||
| if !ok { | ||
| return "" | ||
| } | ||
| return ifname | ||
| } |
There was a problem hiding this comment.
This can be a one-liner;
| func GetIfname(opts map[string]interface{}) string { | |
| if opts == nil { | |
| return "" | |
| } | |
| opt, ok := opts[Ifname] | |
| if !ok { | |
| return "" | |
| } | |
| ifname, ok := opt.(string) | |
| if !ok { | |
| return "" | |
| } | |
| return ifname | |
| } | |
| func GetIfname(opts map[string]interface{}) string { | |
| ifName, _ := opts[Ifname].(string) | |
| return ifName | |
| } |
e.g.; https://go.dev/play/p/9QdxgApB7qE
package main
import "testing"
func GetIfname(opts map[string]interface{}) string {
opt, _ := opts["com.docker.network.endpoint.ifname"].(string)
return opt
}
func TestMaybeNil(t *testing.T) {
ifName := GetIfname(nil)
t.Log(ifName)
ifName = GetIfname(map[string]interface{}{"com.docker.network.endpoint.ifname": true})
t.Log(ifName)
ifName = GetIfname(map[string]interface{}{"com.docker.network.endpoint.ifname": "eth0"})
t.Log(ifName)
}There was a problem hiding this comment.
But with the above taken into account, I'm even wondering if we should just inline it into where it's used 🙈
There was a problem hiding this comment.
Right 🤦
As for inlining, I don't have strong opinions but I feel like having a dedicated accessor that takes care of casting the option somewhat hides the ugliness of these map[string]interface{} and make callers' code slightly easier to read.
There was a problem hiding this comment.
Yeah, it's fine for now to keep a Method / Function for this. We need to have a close look at all the fuzzy genericopts in a wider scope, and see if we can reduce the back-and-forth changing of strongly typed arguments to a generic one (and back! in some cases)
| // restore interfaces | ||
| for iface, opts := range interfaces { | ||
| i, err := newInterface(n, iface.SrcName, iface.DstPrefix, opts...) | ||
| i, err := newInterface(n, iface.SrcName, iface.DstPrefix, iface.DstName, opts...) |
There was a problem hiding this comment.
This commit doesn't compile, because newInterface signature was not modified (yet)?
I'm actually wondering; dstName is an optional option, and because of that probably should not be a positional argument (which more commonly are used for required arguments); is there a reason we cannot pass interface name as part of opts ... ?
There was a problem hiding this comment.
This commit doesn't compile, because newInterface signature was not modified (yet)?
Rebase issue - this should be fixed now.
I'm actually wondering; dstName is an optional option, and because of that probably should not be a positional argument (which more commonly are used for required arguments); is there a reason we cannot pass interface name as part of opts ... ?
Both dstPrefix and dstName are now optional params - either one has to be provided.
|
|
||
| for _, i := range tbox.Interfaces() { | ||
| err = s.AddInterface(context.Background(), i.SrcName(), i.DstName(), | ||
| err = s.AddInterface(context.Background(), i.SrcName(), i.dstPrefix, i.DstName(), |
There was a problem hiding this comment.
These don't compile in this commit, because AddInterface signature was not (yet) changed
Perhaps an idea; could we change newInterface() to accept a name only, not a prefix ? It looks like the current logic is;
- call
newInterface(), which constructs the interface, but only "partially" - after it's created, we get the prefix (or name), and based on that, we patch it up
It feels more organic to feed the interface-name into it when constructing (which could even be i.DstName() to take an argument, to enumerate suffixes), e.g. i.DstName(n.SuffixEnumerator) (where SuffixEnumerator would give the next suffix to use
There was a problem hiding this comment.
These don't compile in this commit, because AddInterface signature was not (yet) changed
Rebase issue - this should be fixed now.
It feels more organic to feed the interface-name into it when constructing (which could even be i.DstName() to take an argument, to enumerate suffixes), e.g. i.DstName(n.SuffixEnumerator) (where SuffixEnumerator would give the next suffix to use
Namespace.RestoreInterfaces(), which is the other function calling newInterface, does something wildly different with the output of newInterface(). That code is fragile IMO, so I don't think this PR is the right place to change what newInterface() does, and I think your suggestion for SuffixEnumerator would actually bring more unnecessary complexity.
libnetwork/osl/interface_linux.go
Outdated
| if strings.HasPrefix(i.DstName(), prefix) { | ||
| s := i.DstName()[len(prefix):] | ||
| if v, err := strconv.Atoi(s); err == nil { | ||
| suffixes = append(suffixes, v) | ||
| } | ||
| } |
There was a problem hiding this comment.
This can use CutPrefix to do it all at once;
| if strings.HasPrefix(i.DstName(), prefix) { | |
| s := i.DstName()[len(prefix):] | |
| if v, err := strconv.Atoi(s); err == nil { | |
| suffixes = append(suffixes, v) | |
| } | |
| } | |
| if s, ok := strings.CutPrefix(i.DstName(), prefix); ok { | |
| if v, err := strconv.Atoi(s); err == nil { | |
| suffixes = append(suffixes, v) | |
| } | |
| } |
- at the end? i.e.;
- use
ethas prefix - then use
eth-as prefix
Which, when checking for eth as prefix would also contain interfaces prefixed with eth-, but now we'd be parsing, e.g. eth-1 as strconv.Atoi(-1), so -1 ?
The "pick first available slot" further down would then pick 0;
suffixes := []int{-1, 0, 1}
for i := 0; i < len(suffixes); i++ {
if i != suffixes[i] {
return prefix + strconv.Itoa(i)
}
}There was a problem hiding this comment.
Right, negative suffixes are now ignored.
libnetwork/osl/interface_linux.go
Outdated
| if len(suffixes) == 0 { | ||
| return prefix + "0" | ||
| } | ||
|
|
||
| sort.Ints(suffixes) | ||
|
|
||
| for i := 0; i < len(suffixes); i++ { | ||
| if i != suffixes[i] { | ||
| return prefix + strconv.Itoa(i) | ||
| } | ||
| } |
There was a problem hiding this comment.
This last bit of code looks to be to inject a suffix in the first possible slot; is it important to have consecutive numbers, or would it be fine to just pick "max + 1" ?
I was also playing to see if we could skip this part, which I think we could if we know the first loop already produced only consecutive numbers; basically, if the highest number matches the number of interfaces, I think we can tell "they must've been consecutive"
eth-1 -> suffix=-1)
func (n *Namespace) generateIfaceName(prefix string) string {
var suffixes []int
var maxSuffix int
for _, i := range n.iFaces {
if s, ok := strings.CutPrefix(i.DstName(), prefix); ok {
if v, err := strconv.Atoi(s); err == nil {
if v > maxSuffix {
maxSuffix = v
}
suffixes = append(suffixes, v)
}
}
}
if len(suffixes) == 0 {
return prefix + "0"
}
// maximum suffix matches length of interfaces, which means that
// all interfaces with the given prefix were numbered without
// gaps; pick the next one.
if len(suffixes) == maxSuffix {
return prefix + strconv.Itoa(maxSuffix+1)
}
sort.Ints(suffixes)
for i := 0; i < len(suffixes); i++ {
if i != suffixes[i] {
return prefix + strconv.Itoa(i)
}
}
return prefix + strconv.Itoa(len(suffixes))
}There was a problem hiding this comment.
This last bit of code looks to be to inject a suffix in the first possible slot; is it important to have consecutive numbers, or would it be fine to just pick "max + 1" ?
That could become an issue if "max + 1" overflows the maximum number of characters for interface names (IFNAMSIZ = 16 chars), eg. prefix = 'mylonglongif' / highest suffix = '9999'.
By looking for the lowest slot, this would be handled correctly (granted there's no 9999 networks attached to that container, in which case the issue isn't on our side).
I added a comment to clarify this.
I was also playing to see if we could skip this part, which I think we could if we know the first loop already produced only consecutive numbers; basically, if the highest number matches the number of interfaces, I think we can tell "they must've been consecutive"
Actually that code doesn't test if there's no gaps in suffixes. Doing that would be slightly more involved, at which point I'm not sure this gives any benefit given the added complexity. I'd prefer to keep this code simple, stupid.
| }{ | ||
| {prefix: "test", names: []string{"test0", "test1"}, want: "test2"}, | ||
| {prefix: "test", names: []string{"test0", "test2"}, want: "test1"}, | ||
| {prefix: "test", names: []string{"test2"}, want: "test0"}, |
There was a problem hiding this comment.
See my other comment; not sure how realistic it is to happen, but we should probably have test-cases with mixed prefixes, and one that has a - in the prefix; for example, this will fail, and pick eth0;
| {prefix: "test", names: []string{"test2"}, want: "test0"}, | |
| {prefix: "test", names: []string{"test2"}, want: "test0"}, | |
| {prefix: "eth", names: []string{"eth-0", "eth-1", "eth0"}, want: "eth1"}, |
There was a problem hiding this comment.
Done - added a test case for negative numbers.
There was an edge case around -0 being treated as 0 - it's now handled correctly.
| containerVethPrefix = network.config.ContainerIfacePrefix | ||
| } | ||
| if err := iNames.SetNames(endpoint.srcName, containerVethPrefix); err != nil { | ||
| if err := iNames.SetNames(endpoint.srcName, containerVethPrefix, netlabel.GetIfname(epOpts)); err != nil { |
There was a problem hiding this comment.
Overall, I'm a bit on the fence on the approach; I know we have the labels as "fuzzy options", but making it part of the option-type seems to be digging in even deeper (map[string]interface{} being the API), and I'm wondering if we can have a more targeted approach (WithInterfaceName(name), or WithInterfaceName(prefix, name) (if we want that option to either set prefix or name), where we parse the labels early on 🤔
There was a problem hiding this comment.
I'm not sure to see where this hypothetical option would be placed / passed around as that doesn't seem to fit anywhere in the current design.
Basically, epOpts being a set of (mostly) user-defined options passed to the netdriver isn't that terrible since these are handled exclusively by the netdriver (or should be), and libnetwork should assume no knowledge about their semantics and how / if they're going to be handled.
If you're suggesting that this option is passed by libnetwork to the netdriver, then libnetwork has to know about that label and that would break that ^. Plus, it'd go against your suggestion to change less signatures.
If you're suggesting that it should be passed by the netdriver to libnetwork, that's probably overkill for passing just a single option. I'm not a big fan of coming up with a new design for the sake of a single option with no clear plans to reuse the same mechanism for other options. We can look into this in the future but I doubt it'd be used by anything else.
7e6b766 to
3e2c9e6
Compare
|
Force pushing some changes to check if the integration test |
|
Okay, so the issue I was seeing locally was caused by 'Docker VMM'. Tests should be green now.
Done.
Done. I added a recommendation regarding colliding prefixes. |
Before this commit, `Interface.dstName` was used to store the dest ifname prefix, and then the final ifname. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
To support this, a new netlabel is added: `com.docker.network.endpoint.ifname`. It gives the ability to specify the interface name to be set by netdrivers when the interface is added / moved into the container's network namespace. All builtin netdrivers support it. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
At first glance, it seemed like the Namespace lock was held to make local copies of Namespace properties, but all those properties all safe to access concurrently. So, un-alias those props and reduce the scope of the mutex lock. This helps understanding where locking is really needed. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
| // can be added dynamically. | ||
| type Namespace struct { | ||
| path string | ||
| path string // path is the absolute path to the network namespace. It is safe to access it concurrently. |
There was a problem hiding this comment.
As none of these are exported, it would be possible to shuffle things round, and perhaps we should look if we can split fields into immutable fields (set when constructing), and mutable fields (requiring the my mutex)
3e2c9e6 to
5a703c2
Compare
Fixes:
- What I did
To support this a new netlabel is added:
com.docker.network.endpoint.ifname.It gives the ability to specify the interface name to be set by netdrivers when the interface is added / moved into the container's network namespace.
All builtin netdrivers support it.
- How I did it
First commit refactor
libnetwork/oslto add a new fielddstPrefix. So far, its fielddstNamewas used to store both the name prefix and the auto-generated interface name. When an interface was created,dstNamewould take the prefix and later on that field would be mutated to take the final name. Now, thedstPrefixis stored separately and thedstNameis the final interface name (either auto-generated, or statically defined).The second commit is the actual feature implementation.
The third commit refactors how iface names are generated. We now scan all the interfaces tied to a
Namespaceto find the first gap in numeric suffixes. This allows users to statically define ifnames that match generated names. For instance, a user can seteth2for a specific endpoint, and libnetwork will use [0, 1] and [3, ...] for dynamic ifname assignment.- How to verify it
- Description for the changelog