
Use saner default address pools#47737

Open
iBug wants to merge 2 commits into moby:master from iBug-forks:default-address-pools

Conversation

@iBug commented Apr 22, 2024

What I did

Replaced the default address pools with a saner set.

Previously the default pools covered literally the entire 172.16.0.0/12 range and the entire 192.168.0.0/16 range, creating subnets as large as /16 or /20 from these spaces. In reality, few use cases could use up even a tiny portion of such large address spaces. Not to mention that Linux bridge has a 1024-port limit, rendering any subnet larger than /22 impractical.
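To put numbers on the sizes involved, here is an illustrative sketch using Python's standard ipaddress module (not part of the PR):

```python
import ipaddress

# The old defaults handed out subnets as large as /16 (65,536 addresses),
# while a Linux bridge tops out at 1024 ports; a /22 already covers that.
print(ipaddress.ip_network("172.17.0.0/16").num_addresses)  # 65536
print(ipaddress.ip_network("172.17.0.0/22").num_addresses)  # 1024
```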

In practice, the previous default pools are most likely to cause trouble when users spin up multiple Docker Compose projects: by default each Compose project creates (at least) one network from these pools, so the 172.16.0.0/12 range is quickly exhausted after just a few Compose projects. This is already sufficient to cause breakage when the ranges overlap with a VPN or corporate network (1, 2, ...).

This PR changes the default address pools to 172.18.0.0/16 split into /24 subnets, yielding 256 available networks for Compose projects (and anything else that lets docker network create allocate subnets) while minimizing conflicts with existing network facilities.

I understand that "default-address-pools" already exists in daemon.json, but why not provide a sane default value that works better for most people, instead of forcing every sysadmin to repeat the setting on every Docker daemon installation?
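For reference, the behavior this PR proposes as a default can already be opted into today via the existing "default-address-pools" setting in /etc/docker/daemon.json; this fragment mirrors the proposed pool:

```json
{
  "default-address-pools": [
    { "base": "172.18.0.0/16", "size": 24 }
  ]
}
```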

How I did it

Replaced the default address pools with a saner set.

How to verify it

Unit tests are also updated and they should pass.

Description for the changelog

Replaced the default address pools with a saner set.

A picture of a cute animal (not mandatory but encouraged)


Signed-off-by: iBug <git@ibugone.com>
@akerouanton (Member)

Thanks for your contribution!

We discussed exactly that last week with one of my colleagues. I also think it doesn't make sense to allocate a whole /16 to bridge networks, since the underlying interface doesn't support more than 1024 ports and in most cases fewer than 254 containers are connected to the same network.

However, it's not a good idea to remove all the other prefixes. We can't predict which ones will overlap with other networking software / network setups. For instance, 192.168.0.0/16 is often used in local networks, but 172.16.0.0/12 could be found too. Hence why both prefixes are currently included. So please, don't change the list of address pools.

For the Size change, I think it'd be a breaking change for the small portion of users who do use the default bridge network to run more than 254 containers. So we'll have to discuss that further.

We're also in the process of redesigning the default IPAM allocator to fix bugs related to IPv6. One interesting idea emerged: give users the ability to specify what subnet size they want. Maybe we'll need to wait for this to be implemented before we can change the default subnet size.

@iBug (Author) commented Apr 22, 2024

it's not a good idea to remove all the other prefixes

So I've restored the previous pool range but reduced the subnet mask from (16 or 20) to 22. This should have minimal impact on existing users while mostly resolving the original issue of over-allocating from the RFC 1918 ranges. If 172.18.0.0/16 is available on a particular host, the first 64 subnets shouldn't escape this range.
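A quick check of that arithmetic (an illustrative sketch using Python's stdlib ipaddress module, not part of the PR): splitting 172.18.0.0/16 into /22 subnets yields exactly 64 pools, which is why the first 64 allocations stay inside that range.

```python
import ipaddress

# Splitting a /16 into /22 blocks yields 2**(22-16) = 64 subnets.
subnets = list(ipaddress.ip_network("172.18.0.0/16").subnets(new_prefix=22))
print(len(subnets))             # 64
print(subnets[0], subnets[-1])  # 172.18.0.0/22 172.18.252.0/22
```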

the small portion of users who do use the default bridge network to run more than 254 containers

I didn't touch the default 172.17.0.0/16 range - it remains a /16. Only future IPAM allocations have their subnet size reduced to /22. This is the code from the first commit of my PR:

```go
localScopeDefaultNetworks = []*NetworkToSplit{
    {"172.17.0.0/16", 16}, // <-- remains 16
    {"172.18.0.0/16", 24},
}
```

@vvoland added the area/networking label Apr 22, 2024
@thaJeztah added the area/networking/ipam label Apr 22, 2024
@thaJeztah (Member)

Looks like the second commit is missing a DCO sign-off, causing CI to fail.

@akerouanton (Member)

We discussed this PR during the last libnet maintainers call. We think it's a good idea in general, however we won't merge it right away for a few reasons:

  1. While /22 aligns with the kernel limit for bridge interfaces, it's not easy for non-network-savvy users to read, or to identify which subnet an IP address is part of;
  2. Thus it'd be better to use /24 by default. However, with the current allocator, there's no way for users to override that without relying on static allocation; if we changed that default right now, we would break some users;
  3. With the new IPAM allocator we're working on, this constraint should be lifted: users will be able to specify what subnet size they want and still benefit from dynamic allocation;
  4. We'd prefer the default subnet size to be hardcoded once, instead of per address pool.

Signed-off-by: iBug <git@ibugone.com>
@iBug force-pushed the default-address-pools branch from 9abfa8b to 1a6ba91 on April 27, 2024 at 11:35
@iBug (Author) commented Apr 27, 2024

Looks like the second commit is missing a DCO sign-off

@thaJeztah I force-pushed that commit away. Additionally, I changed the split target for 172.17.0.0/16 from 16 to 22, so now all subnets have a /22 mask.

it's not easy for non network savvy users to read and to identify what subnets an IP address is part of

@akerouanton That only happens if the user attaches more than 254 containers to the same subnet; otherwise all containers will have their IP addresses in the same /24 block (e.g. 172.17.4.2/22 to 172.17.4.255/22). I don't see how that's "not easy to identify", nor do I find it likely that a "non-network-savvy user" would do this in the first place.
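That claim is easy to verify with an illustrative Python sketch (stdlib ipaddress, not part of the PR): the first 254 host addresses of a /22 all fall inside a single /24 block.

```python
import ipaddress

# The first 254 hosts of 172.17.4.0/22 (.4.1 through .4.254) all sit inside
# the 172.17.4.0/24 block, so small deployments look exactly like a /24.
net = ipaddress.ip_network("172.17.4.0/22")
first_hosts = list(net.hosts())[:254]
block = ipaddress.ip_network("172.17.4.0/24")
print(all(h in block for h in first_hosts))  # True
```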

@yorickdowne

I think /22 makes a lot of sense as well. Non-network-savvy users are unlikely to use more than 254 containers in one subnet, or even to care what a container's IP is.

Users with a medium familiarity with networking are OK with the host range being e.g. .4.2 through .7.254, and those would be the only ones even attempting to connect that many containers.

@thompson-shaun (Contributor)

Might close in favor of #50114 assuming that covers all features in this PR as well.

@danegsta (Contributor)

Might close in favor of #50114 assuming that covers all features in this PR as well.

The main difference is that 50114 keeps the existing defaults but allows users to request a specific subnet size from the default pools (so I could ask for a /24 or /22 if I knew that was all I needed, while the pool could still support larger subnets if required).

@thompson-shaun thompson-shaun moved this from Discussing to Parked in 🔦 Maintainer spotlight Sep 25, 2025
@thompson-shaun (Contributor)

@akerouanton did we have an alternative for this?

cc @robmry

@akerouanton (Member)

@thompson-shaun Yeah, I have a WIP branch implementing a daemon-global default subnet size, but I didn't want to add it to the v29 milestone to avoid overloading it. Since it's a breaking change, it needs to be put behind a feature flag until v30. I'll add it to my to-do list so I don't forget about it 🙈
