Use alternative network interface names (Network interface naming changed in 28.0.0)

### Description

Related to:
- https://github.com/docker/compose/issues/12776
- https://github.com/moby/moby/pull/47406
- https://github.com/moby/moby/pull/49155
- https://github.com/docker/compose/issues/12740
- https://github.com/moby/moby/pull/48936

To summarise some of the discussion on https://github.com/docker/compose/issues/12776:
- Before moby 28.0.0, for a given compose config, interface names in containers were always assigned to the same network endpoints.
- In moby 28.0.0:
  - That changed unexpectedly: interface names became unpredictable by default.
  - New option `com.docker.network.endpoint.ifname` was added, making it possible to explicitly assign an interface name.
  - New option `GwPriority` was added to make it possible to determine which network provides a container's default gateway.

This issue is for discussion of that change and next steps.

#### How the change came about

Multiple network endpoints can be described in the API's [container create](https://docs.docker.com/reference/api/engine/version/v1.49/#tag/Container/operation/ContainerCreate) request. But field `EndpointsConfig` is an unordered map, not a list. So, no predictable order for interface naming (`eth0`, `eth1`, ...) can be implied by a create request.
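To illustrate why a map can't imply a naming order, here's a minimal sketch. It's not daemon code: `assignInOrder`, `mapOrder` and `sortedOrder` are hypothetical helpers, and the network names are made up. Go leaves map iteration order unspecified, so the first assignment can differ between runs, while the sorted one can't.

```go
package main

import (
	"fmt"
	"sort"
)

// assignInOrder hands out eth0, eth1, ... in the order the networks are visited.
func assignInOrder(order []string) map[string]string {
	names := make(map[string]string, len(order))
	for i, network := range order {
		names[network] = fmt.Sprintf("eth%d", i)
	}
	return names
}

// mapOrder returns the map's keys in Go's map iteration order,
// which the language leaves unspecified.
func mapOrder(endpoints map[string]struct{}) []string {
	order := make([]string, 0, len(endpoints))
	for network := range endpoints {
		order = append(order, network)
	}
	return order
}

// sortedOrder returns the same keys in lexicographic order.
func sortedOrder(endpoints map[string]struct{}) []string {
	order := mapOrder(endpoints)
	sort.Strings(order)
	return order
}

func main() {
	// Hypothetical stand-in for the EndpointsConfig map in a create request.
	endpoints := map[string]struct{}{"backend": {}, "frontend": {}, "metrics": {}}

	// May differ from run to run.
	fmt.Println("map order:   ", assignInOrder(mapOrder(endpoints)))
	// Always the same.
	fmt.Println("sorted order:", assignInOrder(sortedOrder(endpoints)))
}
```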

(Compose switched from using separate `NetworkConnect` to supplying multiple endpoints in the create request. But, that had no effect on network interface naming, because the connect calls were made before the container was started. Endpoints were accumulated internally in the same way.)

Interface names are allocated by libnetwork, during a call to `populateNetworkResources`. The code has changed a bit but, [before](https://github.com/moby/moby/blob/3e03c979dab82bd8dcc3c80c7f2ec81e7bf846b2/daemon/container_operations.go#L516-L531) and [since](https://github.com/moby/moby/blob/19ccb75c628251f3a5885c6da6827fcc1970e873/daemon/container_operations.go#L432-L442) 28.0.0, the daemon adds network connections to a container during `sbJoin`, by iterating over a temporary map that's populated from the API's map. There's a call to `populateNetworkResources` during that map iteration, so there's definitely no predictable ordering there: endpoints have been through two Go maps.

But, there's another call to `populateNetworkResources` from `SetKey` - [before](https://github.com/moby/moby/blob/b7186bdfc8537f43769199ecef8a74884c4993b1/libnetwork/sandbox_linux.go#L175-L179) and [since](https://github.com/moby/moby/blob/56a7817b2d8e4e20d6aa81e5d43d072717c71519/libnetwork/sandbox_linux.go#L201-L205) 28.0.0. That call is made by iterating over libnetwork's list of endpoints `sb.Endpoints` - which is ordered.

Before 28.0.0, `SetKey` was a callback from the OCI prestart hook. Now, it's called once container task creation is complete and the container's configuration can be inspected (making it possible to check whether to allocate IPv6 addresses). The call to `sbJoin` now happens after `SetKey`.

Before 28.0.0, the `SetKey` call was made later than the `sbJoin` and it did the work - `sbJoin` returned early because there was no OS Sandbox yet.

Since 28.0.0, the `sbJoin` call is made later than `SetKey` and it does the work - `SetKey` takes no action (in this case) because `sb.Endpoints` has not yet been populated by `sbJoin` calls.

So, before 28.0.0, `populateNetworkResources` was called in `sb.Endpoints` order. Since 28.0.0, it's called in map iteration order, which is effectively random.

#### The old order

But, the `sb.Endpoints` order has also changed ... the Endpoints are stored in gateway priority order, defined by `Endpoint.Less` [before](https://github.com/moby/moby/blob/80d00132170eeed69f2c05deb74a7202f4b37975/libnetwork/sandbox.go#L677-L683) 28.0.0, and [since](https://github.com/moby/moby/blob/7c52c4d92e4fe584e2d25209a54c7d07c24baee1/libnetwork/sandbox.go#L651-L659).

Before 28.0.0, in `sb.Endpoints` - non-gateway, external, dual-stack networks came first - then IPv4-only, internal and gateway networks. Networks with the same properties were sorted lexicographically.

Since 28.0.0, there's a way to set gateway priority explicitly and that takes priority. Then dual stack networks are preferred over single stack (because we now have IPv6-only networks as well as IPv4-only), and the other rules were unchanged.

Ordering by gateway priority doesn't make sense as a way to determine network interface naming. While it would have made things stable for a fixed set of network connections, adding another network would have re-ordered things, and therefore renamed interfaces, apparently unpredictably - lexicographical ordering would sometimes have helped but not always (depending on network configurations). It wasn't even good for gateway selection, which is why we added a way to be explicit about gateway priority.
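The renaming effect can be sketched with a simplified comparator. This is an illustration only, not the real `Endpoint.Less`: the `endpoint` type, `less` and `nameInterfaces` are hypothetical, the rule for gateway networks is left out, and only the external/internal, dual-stack/IPv4-only and lexicographic rules described above are modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// endpoint is a hypothetical, simplified stand-in for a libnetwork endpoint.
type endpoint struct {
	Network   string
	Internal  bool
	DualStack bool
}

// less approximates the pre-28.0.0 rules described above: external before
// internal, dual-stack before IPv4-only, then lexicographic by network name.
func less(a, b endpoint) bool {
	if a.Internal != b.Internal {
		return !a.Internal
	}
	if a.DualStack != b.DualStack {
		return a.DualStack
	}
	return a.Network < b.Network
}

// nameInterfaces sorts a copy of the endpoints and assigns eth0, eth1, ...
// in sorted order.
func nameInterfaces(eps []endpoint) map[string]string {
	sorted := append([]endpoint(nil), eps...)
	sort.SliceStable(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	names := make(map[string]string, len(sorted))
	for i, ep := range sorted {
		names[ep.Network] = fmt.Sprintf("eth%d", i)
	}
	return names
}

func main() {
	eps := []endpoint{{Network: "backend"}, {Network: "frontend"}}
	fmt.Println(nameInterfaces(eps))
	// map[backend:eth0 frontend:eth1]

	// Adding one dual-stack network renames both existing interfaces.
	eps = append(eps, endpoint{Network: "metrics", DualStack: true})
	fmt.Println(nameInterfaces(eps))
	// map[backend:eth1 frontend:eth2 metrics:eth0]
}
```

Here lexicographic ordering would have put `metrics` last and left the existing names alone, but the dual-stack rule moves it to the front.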

As @akerouanton [described here](https://github.com/compose-spec/compose-spec/pull/552#issuecomment-2583288043) - we shouldn't conflate gateway priority and endpoint naming.

#### What to do about it?

(As discussed in the networking maintainers call, 6th May.)

The only way to keep interface naming stable for those who were relying on it pre-28.0.0 would be to preserve the original `Endpoint.Less` ordering indefinitely, even though it's not really fit for purpose - names may change on container restart, and adding a new network to the config may change all of the names.

We now have a way to explicitly name interfaces for users who need predictable names.

In a given configuration, using the new interface naming option to assign the names that would have been assigned before 28.0.0 means interfaces will reliably have the same names before and after 28.0.0. (There's no need to supply different compose configuration files, because the interface-naming option will just be ignored by pre-28.0.0 builds. So, backwards compatibility is possible for configurations that need it.)

Release 28.0.0 shipped on 20th Feb, and the issue was reported on 25th April. So, either there are few affected users, or most affected users looked at the release notes and realised they could use the new interface naming option to solve the problem.

Because the interface naming order isn't (and wasn't) guaranteed across container restarts and seemingly innocuous configuration changes, it's probably better to be explicitly unpredictable - to encourage use of explicit naming when the naming matters. We can support that indefinitely, more easily and reliably than supporting the old/unintended ordering.

So - there's no plan to restore the old sort order.

We considered going the other way: without explicit names, perhaps interface numbering should be completely random, to make it clear to users that naming has to be configured to be stable. But being able to rely on a container with a single interface having an `eth0` is reasonable, so we don't need to change that.

@corhere suggested using [alternative network names](https://lwn.net/Articles/794289/) to give interfaces a second name, without the usual netdev restrictions ... a container can only have a single endpoint in each network, so we'd use the network name (if possible, else not assign an alternative name, to avoid collisions or ambiguity). Then, with no additional config needed, interfaces would have predictable names ... that enhancement seems worth investigating.

### Reproduce

n/a

### Expected behavior

_No response_

### docker version

```bash
28.0.0
```

### docker info

```bash
n/a
```

### Additional Info

_No response_
