[DNM] Implements IP utilization tracking for networks #50450

Closed
aepifanov wants to merge 9 commits into moby:master from aepifanov:dev/ip-utilization/main

Contributor

aepifanov commented Jul 18, 2025

- What I did

This PR introduces IP utilization tracking for Docker networks by extending the network inspection API to include IPAM state. This enhancement allows users to monitor how many IP addresses are allocated per subnet and IP range pool within a network.
It covers only IPv4.

- How I did it

Design IP-utilization

  1. Added IP allocation counters to track usage within:
    • The entire subnet
    • The defined IP range pool
  2. Introduced new API types to represent network State, including nested IPAM state information
  3. Implemented support and test coverage for the following network drivers:
    • bridge
    • macvlan
    • ipvlan
    • overlay

Example output from docker network inspect:

> docker network inspect my
[
    {
        "Name": "my",
....
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "192.168.10.0/24",
                    "IPRange": "192.168.10.128/31",
                    "Gateway": "192.168.10.1",
                    "AuxiliaryAddresses": {
                        "my-host": "192.168.10.128"
                    }
                }
            ]
        },
        "State": {
            "IPAM": {
                "192.168.10.0/24": {
                    "AllocatedIPsInSubnet": 4,
                    "AvailableIPsInIPRangePool": 1
                }
            }
        },
....
    }
]
  • AllocatedIPsInSubnet: Number of addresses allocated in the subnet. If the count exceeds the range of uint64, it is capped at the maximum uint64 value.

  • AvailableIPsInIPRangePool: Number of addresses still available for allocation in the IP range pool.
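
For readers who want to consume the new section programmatically, here is a minimal sketch of decoding it from the inspect JSON. The Go type names (ipamState, networkState, inspectEntry) are hypothetical and chosen for this illustration; only the JSON field names are taken from the example output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ipamState mirrors the per-subnet object shown in the example output.
// The type name is hypothetical; only the JSON field names come from the PR.
type ipamState struct {
	AllocatedIPsInSubnet      uint64
	AvailableIPsInIPRangePool uint64
}

// networkState is an illustrative stand-in for the new "State" object.
type networkState struct {
	IPAM map[string]ipamState
}

// inspectEntry holds only the fields of one inspect result that this sketch reads.
type inspectEntry struct {
	Name  string
	State *networkState
}

// sample is a trimmed version of the example output.
const sample = `[{"Name":"my","State":{"IPAM":{"192.168.10.0/24":{"AllocatedIPsInSubnet":4,"AvailableIPsInIPRangePool":1}}}}]`

// decodeInspect parses the JSON array printed by docker network inspect.
func decodeInspect(data []byte) ([]inspectEntry, error) {
	var out []inspectEntry
	err := json.Unmarshal(data, &out)
	return out, err
}

func main() {
	nets, err := decodeInspect([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, n := range nets {
		if n.State == nil {
			continue // State is absent when the response does not carry it
		}
		for subnet, st := range n.State.IPAM {
			fmt.Printf("%s %s: allocated=%d available-in-range=%d\n",
				n.Name, subnet, st.AllocatedIPsInSubnet, st.AvailableIPsInIPRangePool)
		}
	}
}
```

The map is keyed by subnet CIDR, so a client can look up a specific subnet directly instead of scanning a list.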

Depends on:

IP utilization in Swarm

- How to verify it

Run docker network inspect <network> and verify the new State.IPAM section reflects real-time allocation counts for each subnet configuration.

- Human readable description for the release notes

Implements IP utilization tracking for Docker networks.
Adds IP allocation counters to network inspect responses.

- A picture of a cute animal (not mandatory but encouraged)

image

@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch from f523212 to bd0d363 Compare July 18, 2025 22:00
@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch 4 times, most recently from 458003d to 381897c Compare July 22, 2025 15:47
@corhere corhere added this to the 29.0.0 milestone Jul 22, 2025
@corhere corhere added kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny area/networking/ipam Networking impact/api impact/changelog impact/documentation labels Jul 22, 2025
Contributor

corhere left a comment

How come you need both the PoolData.allocatedIPsInPool counters and (*addrset.AddrSet).Selected()? Aren't all reserved IPs added to an AddrSet? I would have expected .Selected() to be sufficient.

Contributor Author

aepifanov commented

@corhere

> How come you need both the PoolData.allocatedIPsInPool counters and (*addrset.AddrSet).Selected()? Aren't all reserved IPs added to an AddrSet? I would have expected .Selected() to be sufficient.

allocatedIPsInPool counts the IP addresses allocated specifically from the IP range pool
allocatedIPsInSubnet tracks allocations from the entire subnet

Contributor

corhere commented Jul 23, 2025

> allocatedIPsInPool counts the IP addresses allocated specifically from the IP range pool
> allocatedIPsInSubnet tracks allocations from the entire subnet

Could you please elaborate further? I still do not understand why the information from the AddrSet is incomplete.

Contributor Author

aepifanov commented

> > allocatedIPsInPool counts the IP addresses allocated specifically from the IP range pool
> > allocatedIPsInSubnet tracks allocations from the entire subnet
>
> Could you please elaborate further? I still do not understand why the information from the AddrSet is incomplete.

allocatedIPsInSubnet matches 1:1 with .Selected() from the AddrSet, as it counts all IPs allocated from the entire subnet.
allocatedIPsInPool is calculated based on matches against the configured ip-range, which is defined at the PoolData level and stored as a children map:

// PoolData contains the configured pool data
type PoolData struct {
	addrs    *addrset.AddrSet
	children map[netip.Prefix]struct{}
	// Whether to implicitly release the pool once it no longer has any children.
	autoRelease bool
}
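
To make the distinction between the two counters concrete, here is a minimal sketch that counts allocations against a subnet and against an ip-range inside it. It is not the PR's implementation: the function countAllocated and the example address list are assumptions made for illustration, with the addresses chosen so the result lines up with the example inspect output.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Example data (assumed for illustration): a /24 subnet with a /31 ip-range.
var (
	exampleSubnet  = netip.MustParsePrefix("192.168.10.0/24")
	exampleIPRange = netip.MustParsePrefix("192.168.10.128/31")
	// Hypothetical set of reserved addresses: network, gateway,
	// one auxiliary address, and broadcast.
	exampleAllocated = []netip.Addr{
		netip.MustParseAddr("192.168.10.0"),
		netip.MustParseAddr("192.168.10.1"),
		netip.MustParseAddr("192.168.10.128"),
		netip.MustParseAddr("192.168.10.255"),
	}
)

// countAllocated returns how many allocated addresses fall inside the subnet
// and how many of those also fall inside the ip-range.
func countAllocated(allocated []netip.Addr, subnet, ipRange netip.Prefix) (inSubnet, inRange uint64) {
	for _, a := range allocated {
		if !subnet.Contains(a) {
			continue
		}
		inSubnet++
		if ipRange.Contains(a) {
			inRange++
		}
	}
	return inSubnet, inRange
}

func main() {
	inSubnet, inRange := countAllocated(exampleAllocated, exampleSubnet, exampleIPRange)
	// Size of the ip-range: 2^(address bits - prefix bits).
	rangeSize := uint64(1) << (exampleIPRange.Addr().BitLen() - exampleIPRange.Bits())
	fmt.Println("allocated in subnet:", inSubnet)
	fmt.Println("available in ip-range:", rangeSize-inRange)
}
```

With this assumed address list the sketch reports 4 addresses allocated in the subnet and 1 still available in the /31 range. Because both numbers are derived from the same set of addresses, there is a single source of truth for allocations.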

Contributor

corhere commented Jul 23, 2025

Okay, I think I understand. I am not entirely thrilled with there being two sources of truth for the IP address allocations, but you may have had a good reason to take this approach. Which alternative solutions did you consider and why did the approach of maintaining a running counter win out?

@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch 3 times, most recently from c90744c to 9972c42 Compare July 25, 2025 16:07
Contributor Author

aepifanov commented

@corhere I've made the following changes since your last review:

  1. removed IPv6 support
  2. reversed the counter for the ip-range pool from used to available
  3. extended the integration tests to use aux-addresses
  4. changed the State.IPAM format to:
        "State": {
            "IPAM": {
                "192.168.10.0/24": {
                    "AllocatedIPsInSubnet": 5,
                    "AvailableIPsInIPRangePool": 1
                }
            }
        },

@aepifanov aepifanov requested a review from corhere July 25, 2025 17:14
Contributor

corhere left a comment

It does not look like the tests cover all the corner cases.

Comment on lines +181 to +196
bits := as.pool.Addr().BitLen() - as.pool.Bits()
// If there are no bitmaps, all addresses are unselected.
if len(as.bitmaps) == 0 {
	return 0
}
bms := uint64(1)
// If the subnet is bigger than Bitmap's capacity, calculate the number of bitmaps we have.
if bits > maxBitsPerBitmap {
	bms = 1 << (bits - maxBitsPerBitmap)
}

// Calculate the number of selected addresses in each bitmap.
for _, bm := range as.bitmaps {
	if bms > 0 {
		bms--
	}
Contributor

What's the purpose of the bits and bms variables? Neither one is used in the summation.

Contributor Author

Outdated code. Removed.

Comment on lines +182 to +185
// If there are no bitmaps, all addresses are unselected.
if len(as.bitmaps) == 0 {
	return 0
}
Contributor

Nit: premature optimization. The loop body would execute zero times, leading to the correct answer of zero addresses selected without any wasted cycles.


func newPoolData(pool netip.Prefix) *PoolData {
h := addrset.New(pool)
func newPoolData(pool, sub netip.Prefix) *PoolData {
Contributor

What is the rationale behind changing the signature of newPoolData?

Contributor Author

The idea was to handle the usedInRangePool counter during PoolData initialization, which also required the sub info.
That code is now outdated.

bits := pool.Addr().BitLen() - pool.Bits()
if !pool.Addr().Is4() || bits > 1 {
h.Add(pool.Addr())
pd.RequestAddress(pool, netip.Prefix{}, netip.Addr{}, nil)
Contributor

How is this equivalent to pd.addrs.Add(pool.Addr()) when you aren't passing pool.Addr() as a parameter?

Contributor Author

After re-implementing the usedInIPRangePool calculation, this call has been reverted to the original.

children map[netip.Prefix]struct{}

// The number of addresses allocated in the pool.
allocatedIPsInPool uint64
Contributor

From our earlier discussions I understand that this is not actually the count of addresses allocated in the nw as a whole, since that would be duplicating (PoolData).addrs.Selected(). What is it counting, then? I don't see how a single integer could be used to report how many addresses are available for dynamic allocation in a specific sub without interference from other subs under the same nw shared with other container networks.

Contributor Author

Nice catch!
I've re-implemented the usedInIPRangePool handling to directly count the selected addresses in the bitmap for a specified range, which makes it possible to get the actual IPAM state for any range.

// and should no longer be modified.
func (a *Allocator) GetAllocatedIPs(poolID string) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error) {
log.G(context.TODO()).Debugf("GetAllocatedIPs(%s)", poolID)
k, err := PoolIDFromString(poolID)
Contributor

I don't see k.ChildSubnet consumed anywhere. How can the count of addresses allocated in that range be looked up without knowing which range to count?

Contributor Author

In the current implementation the range is taken from the PoolID which contains it.

if !cidr.Addr().Is4() {
	return 0, fmt.Errorf("IPAM state only supports IPv4 CIDRs, got %s", cidr)
}
return uint64(1) << (32 - cidr.Bits()), nil
Contributor

We can do better than hardcode a constant:

Suggested change
- return uint64(1) << (32 - cidr.Bits()), nil
+ return uint64(1) << (cidr.Addr().BitLen() - cidr.Bits()), nil
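
As a standalone illustration of the suggested expression, the sketch below derives the address count from the prefix itself rather than from a hardcoded 32. The helper name addrCount and its string parameter are assumptions for this example. The saturation at the maximum uint64 value is also an assumption, modelled on the capping behaviour mentioned in the PR description, since the shift would overflow for very wide host parts.

```go
package main

import (
	"fmt"
	"math"
	"net/netip"
)

// addrCount returns the number of addresses covered by the CIDR, computed as
// 2^(address bits - prefix bits). It saturates at math.MaxUint64 when the
// host part is 64 bits or wider, because the shift would overflow uint64.
func addrCount(cidr string) (uint64, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return 0, fmt.Errorf("parsing %q: %w", cidr, err)
	}
	hostBits := prefix.Addr().BitLen() - prefix.Bits()
	if hostBits >= 64 {
		return math.MaxUint64, nil
	}
	return uint64(1) << hostBits, nil
}

func main() {
	for _, cidr := range []string{"192.168.10.0/24", "192.168.10.128/31", "2001:db8::/32"} {
		n, err := addrCount(cidr)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-20s %d\n", cidr, n)
	}
}
```

BitLen() returns 32 for IPv4 and 128 for IPv6 addresses, so the same expression stays correct if the IPv4-only restriction is ever lifted.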

Contributor Author

Done

as, err := newAddrSpace(tc.predefined)
assert.NilError(t, err)

err = as.allocateSubnet(tc.subnet, tc.ipRange)
Contributor

Where are you testing the cases with multiple ip-range allocations under a single subnet? Be sure to cover both the disjoint ranges (e.g. --subnet 192.168.0.0/24 --ip-range 192.168.0.0/25, --subnet 192.168.0.0/24 --ip-range 192.168.0.128/25) and overlapping ranges (e.g. --subnet 192.168.0.0/24 --ip-range 192.168.0.0/25, --subnet 192.168.0.0/24 --ip-range 192.168.0.64/26) cases
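
The two cases requested here can be sketched as follows. This is only an illustration of what such test cases need to distinguish, not the tests added to the PR: the helpers countIn and overlaps, and the list of allocated addresses, are hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// sharedAllocated is a hypothetical set of addresses allocated in 192.168.0.0/24.
var sharedAllocated = []string{"192.168.0.1", "192.168.0.70", "192.168.0.130"}

// countIn returns how many allocated addresses fall inside ipRange.
// Inputs are strings for brevity; the MustParse helpers panic on malformed input.
func countIn(allocated []string, ipRange string) uint64 {
	r := netip.MustParsePrefix(ipRange)
	var n uint64
	for _, s := range allocated {
		if r.Contains(netip.MustParseAddr(s)) {
			n++
		}
	}
	return n
}

// overlaps reports whether two ranges share at least one address.
func overlaps(a, b string) bool {
	return netip.MustParsePrefix(a).Overlaps(netip.MustParsePrefix(b))
}

func main() {
	cases := []struct{ name, a, b string }{
		{"disjoint", "192.168.0.0/25", "192.168.0.128/25"},
		{"overlapping", "192.168.0.0/25", "192.168.0.64/26"},
	}
	for _, tc := range cases {
		fmt.Printf("%s: overlaps=%v count(a)=%d count(b)=%d\n",
			tc.name, overlaps(tc.a, tc.b),
			countIn(sharedAllocated, tc.a), countIn(sharedAllocated, tc.b))
	}
}
```

In the overlapping case the address 192.168.0.70 is counted in both ranges, which is why a per-range figure has to be derived from the shared subnet state rather than from an independent running counter.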

Contributor Author

Added test-cases.

@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch 4 times, most recently from 6f5873e to 391b429 Compare August 4, 2025 12:33
@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch 5 times, most recently from 1369797 to e6bf31d Compare August 5, 2025 12:22
@aepifanov aepifanov requested a review from corhere August 5, 2025 15:10
@aepifanov aepifanov changed the title Implements IP utilization tracking for networks [DNM] Implements IP utilization tracking for networks Aug 5, 2025
Comment on lines +161 to +189
netState := &network.NetworkState{}
if n.State != nil {
	if err := json.Unmarshal(n.State.Value, netState); err != nil {
		log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID)
	}
}
if netState.IPAM == nil {
	netState = nil
}
Contributor

The daemon instance converting the swarmapi.Network to a network.Inspect might not be the same process, let alone the same version, as the one that produced the swarmapi.Network value. Could you please future-proof the conversion to gracefully handle network state values of unexpected types? Having the consumer assume a particular schema of the Any value hampers our flexibility to change how the network state is encoded.
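
One possible shape for such a defensive conversion is sketched below: dispatch on the type URL before touching the payload, and treat anything unrecognized as "no state". This is an assumption about how it could be done, not the approach the PR settled on. The anyValue type is a minimal stand-in for a protobuf Any, decodeState is a hypothetical helper, and the type URL string is the one used elsewhere in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// anyValue is a minimal stand-in for a protobuf Any: a type URL plus opaque bytes.
type anyValue struct {
	TypeURL string
	Value   []byte
}

// networkState is an illustrative subset of the state carried in the value.
type networkState struct {
	IPAM map[string]json.RawMessage
}

// networkStateTypeURL is the type URL used for the network state in this PR.
const networkStateTypeURL = "types.docker.com/NetworkState"

// decodeState returns (nil, nil) when there is no state or when the value has
// a type this consumer does not know, so that a newer or older producer does
// not break the conversion. Only a known type with a bad payload is an error.
func decodeState(a *anyValue) (*networkState, error) {
	if a == nil || a.TypeURL != networkStateTypeURL {
		return nil, nil
	}
	var st networkState
	if err := json.Unmarshal(a.Value, &st); err != nil {
		return nil, fmt.Errorf("decoding %s: %w", a.TypeURL, err)
	}
	if len(st.IPAM) == 0 {
		return nil, nil
	}
	return &st, nil
}

func main() {
	known := &anyValue{TypeURL: networkStateTypeURL, Value: []byte(`{"IPAM":{"192.168.10.0/24":{}}}`)}
	unknown := &anyValue{TypeURL: "example.invalid/SomethingElse", Value: []byte{0x00, 0x01}}
	for _, v := range []*anyValue{known, unknown, nil} {
		st, err := decodeState(v)
		fmt.Println(st != nil, err)
	}
}
```

Keying the decision on the type URL leaves room to introduce a differently encoded state under a new URL without older consumers misinterpreting it.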

Comment on lines +224 to +232
for remaining > 0 {
	if cur.count <= remaining {
		selected += uint64(bits.OnesCount32(cur.block)) * cur.count
		remaining -= cur.count
		cur = cur.next
	} else {
		selected += uint64(bits.OnesCount32(cur.block)) * remaining
		remaining = 0
	}
}
Contributor

Recent versions of Go now have convenient max and min builtins.

Suggested change
- for remaining > 0 {
- 	if cur.count <= remaining {
- 		selected += uint64(bits.OnesCount32(cur.block)) * cur.count
- 		remaining -= cur.count
- 		cur = cur.next
- 	} else {
- 		selected += uint64(bits.OnesCount32(cur.block)) * remaining
- 		remaining = 0
- 	}
- }
+ for remaining > 0 {
+ 	n := min(cur.count, remaining)
+ 	selected += uint64(bits.OnesCount32(cur.block)) * n
+ 	remaining -= n
+ 	cur = cur.next
+ }
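
A self-contained version of the min-based loop is sketched below so it can be run in isolation. The sequence type is a simplified stand-in for the run-length-encoded bitmap nodes (a 32-bit block repeated count times), and onesInFirstBlocks is a hypothetical name. The sketch also stops when the list ends, which the snippet under review does not show. The min builtin requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"math/bits"
)

// sequence is a simplified run-length-encoded node: block repeated count times.
type sequence struct {
	block uint32
	count uint64
	next  *sequence
}

// onesInFirstBlocks counts the set bits in the first `blocks` 32-bit blocks
// of the list starting at head.
func onesInFirstBlocks(head *sequence, blocks uint64) uint64 {
	var selected uint64
	remaining := blocks
	for cur := head; cur != nil && remaining > 0; cur = cur.next {
		// Consume either the whole run or only what is still needed.
		n := min(cur.count, remaining)
		selected += uint64(bits.OnesCount32(cur.block)) * n
		remaining -= n
	}
	return selected
}

func main() {
	tail := &sequence{block: 0x1, count: 3}
	head := &sequence{block: 0xffffffff, count: 2, next: tail}
	fmt.Println(onesInFirstBlocks(head, 3))
}
```

Taking the minimum of the run length and the remaining block count collapses the two branches of the original loop into one path.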

Comment on lines +241 to +242
// CalculateSelected calculates the number of selected bits in the range [start, end].
func (h *Bitmap) CalculateSelected(start, end uint64) (uint64, error) {
Contributor

Bikeshedding: name functions based on their outputs and side effects, not on irrelevant implementation details. It makes no difference to the caller whether the number of set bits is calculated on the fly, cached, memoized, indexed, etc., so long as it gets the correct answer.

Suggested change
- // CalculateSelected calculates the number of selected bits in the range [start, end].
- func (h *Bitmap) CalculateSelected(start, end uint64) (uint64, error) {
+ // OnesCount calculates the number of selected bits in the range [start, end].
+ func (h *Bitmap) OnesCount(start, end uint64) (uint64, error) {

Comment on lines +665 to +666
func hiBlockMask(bitPos uint64) uint32 {
	if bitPos >= uint64(blockLen) {
		return 0
	}
	if bitPos == 0 {
		return blockFull
	}
	return (blockFirstBit >> (bitPos - 1)) - 1
}
Contributor

Suggested change
- func hiBlockMask(bitPos uint64) uint32 {
- 	if bitPos >= uint64(blockLen) {
- 		return 0
- 	}
- 	if bitPos == 0 {
- 		return blockFull
- 	}
- 	return (blockFirstBit >> (bitPos - 1)) - 1
- }
+ func hiBlockMask(bitPos uint64) uint32 {
+ 	return blockFull >> bitPos
+ }
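
The equivalence of the two versions can be checked mechanically. The sketch below assumes blockLen is 32, blockFull is 0xffffffff and blockFirstBit is 1<<31; those values are not shown in the excerpt, so treat them as assumptions. It relies on the Go rule that shifting an unsigned value by its width or more yields zero.

```go
package main

import "fmt"

// Assumed constant values; the excerpt does not show their definitions.
const (
	blockLen      = 32
	blockFull     = uint32(0xffffffff)
	blockFirstBit = uint32(1) << 31
)

// hiBlockMaskBranchy is the original version with explicit edge cases.
func hiBlockMaskBranchy(bitPos uint64) uint32 {
	if bitPos >= uint64(blockLen) {
		return 0
	}
	if bitPos == 0 {
		return blockFull
	}
	return (blockFirstBit >> (bitPos - 1)) - 1
}

// hiBlockMaskShift is the suggested single-shift version.
func hiBlockMaskShift(bitPos uint64) uint32 {
	return blockFull >> bitPos
}

// masksAgree compares both versions for every bitPos in [0, limit].
func masksAgree(limit uint64) bool {
	for p := uint64(0); p <= limit; p++ {
		if hiBlockMaskBranchy(p) != hiBlockMaskShift(p) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("agree on 0..100:", masksAgree(100))
	fmt.Printf("%#08x %#08x %#08x\n", hiBlockMaskShift(0), hiBlockMaskShift(1), hiBlockMaskShift(32))
}
```

Both edge cases in the branchy version fall out of the shift semantics, so the single expression covers them without special handling.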

assert.Check(t, is.Equal(uint32(0xffffffff), loBlockMask(33)))
}

func TestCalculateSelected(t *testing.T) {
Contributor

I would trust the implementation a lot more if CalculateSelected was tested by setting up the preconditions through the public interface of bitmap. By directly constructing the sequences it's possible that an incorrect assumption in the code could be carried through to the test cases such that the tests don't catch the bug. This test as written would not catch if you mixed up the bit-endianness of the blocks, for instance.
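
A sketch of the kind of test this asks for, using a toy bitmap rather than the libnetwork Bitmap: state is built only through the public Set method, and the range count is compared against a bit-by-bit count for every possible range. The bitmap type, its most-significant-bit-first layout and the helper agreesWithBruteForce are all assumptions made for the illustration.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

const full = uint32(0xffffffff)

// bitmap is a toy fixed-size bitmap; bit 0 is the most significant bit of
// blocks[0]. It is not the libnetwork Bitmap, only a stand-in for the sketch.
type bitmap struct {
	blocks []uint32
	n      uint64
}

func newBitmap(n uint64) *bitmap {
	return &bitmap{blocks: make([]uint32, (n+31)/32), n: n}
}

// Set marks bit i as selected. Out-of-range positions are ignored.
func (b *bitmap) Set(i uint64) {
	if i < b.n {
		b.blocks[i/32] |= uint32(1) << (31 - i%32)
	}
}

// IsSet reports whether bit i is selected.
func (b *bitmap) IsSet(i uint64) bool {
	return i < b.n && b.blocks[i/32]&(uint32(1)<<(31-i%32)) != 0
}

// OnesCount returns the number of selected bits in the range [start, end].
func (b *bitmap) OnesCount(start, end uint64) (uint64, error) {
	if start > end || end >= b.n {
		return 0, errors.New("invalid range")
	}
	var total uint64
	first, last := start/32, end/32
	for i := first; i <= last; i++ {
		mask := full
		if i == first {
			mask &= full >> (start % 32) // drop bits before start
		}
		if i == last {
			mask &= full << (31 - end%32) // drop bits after end
		}
		total += uint64(bits.OnesCount32(b.blocks[i] & mask))
	}
	return total, nil
}

// agreesWithBruteForce builds a bitmap through Set only, then compares
// OnesCount against a bit-by-bit count for every possible range.
func agreesWithBruteForce(n uint64, set []uint64) bool {
	b := newBitmap(n)
	for _, i := range set {
		b.Set(i)
	}
	for start := uint64(0); start < n; start++ {
		var want uint64
		for end := start; end < n; end++ {
			if b.IsSet(end) {
				want++
			}
			got, err := b.OnesCount(start, end)
			if err != nil || got != want {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(agreesWithBruteForce(70, []uint64{0, 31, 32, 33, 69}))
}
```

Because the expected values come from Set and IsSet rather than from hand-built blocks, a mask that had the bit order reversed in OnesCount would produce a mismatch on ranges that start or end mid-block.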


ipamState := map[string]network.IPAMState{}
for _, ii := range ipam4Infos {
cidr, is, err := defaultipam.GetIPAMStateForPoolID(ii.PoolID, ipam)
Contributor

Why is generic libnetwork code for any network calling a function from the default IPAM driver? What happens if the IPAM driver for the network is not defaultipam?

go.mod Outdated

replace github.com/moby/moby/client => ./client

replace github.com/moby/swarmkit/v2 v2.0.0 => github.com/aepifanov/swarmkit/v2 v2.0.0-20250805114649-a1269cd7a2a8
Contributor

We can't merge this PR with the Swarmkit dependency replaced with your fork. Please convert this PR to draft.

@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch 4 times, most recently from 3be6f3b to e1cbc0e Compare August 7, 2025 13:17
Comment on lines +178 to +183
if env.Version > 1 && env.Data != nil {
	log.G(context.TODO()).WithError(err).Warnf("Try to unmarshal it anyway since the version is higher %d than supported %d and might be backward compatible for network %s", env.Version, 1, n.ID)
	if err := json.Unmarshal(env.Data, netState); err != nil {
		log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID)
	}
}
Contributor

What's the point of versioning the type if you're just going to proceed anyway? Why would we bump the version if it's backwards-compatible with v1? And if the type is not backwards-compatible, it's a different data type by definition, so changing the type name would be the more appropriate action than bumping the version.

netState := &network.NetworkState{}
if n.State != nil {
env := network.StringifyEnvelope{}
if err := json.Unmarshal(n.State.Value, &env); err != nil {
Contributor

How does the StringifyEnvelope help to enable us to switch to another codec, e.g. protobuf or msgpack, for the state value in the future? It looks like this will lock us into JSON forever.

if err := json.Unmarshal(n.State.Value, &env); err != nil {
log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID)
} else if env.Type != network.NetworkStateType {
log.G(context.TODO()).WithError(err).Warnf("network state for network %s has unexpeted message type %s", n.ID, env.Type)
Contributor

What can we infer about how this manager node relates to the leader when this code path is taken? Is there something actionable we could include in the log message?

Comment on lines +1012 to +1014
if len(state.IPAM) != 0 {
	stringifyState, err := json.Marshal(&state)
	if err != nil {
		return errors.Wrapf(err, "failed to marshal network state for network %s", net.ID)
	}
	// Marshal the network state to a JSON string and store it in the network's State field.
	// This allows avoiding further changes to the Swarm API and preserves backward compatibility in future releases.
	envelopedState := networktypes.StringifyEnvelope{
		Type:    networktypes.NetworkStateType,
		Version: 1,
		Data:    stringifyState,
	}
	stringifyEnvelopeState, err := json.Marshal(&envelopedState)
	if err != nil {
		return errors.Wrapf(err, "failed to marshal network state for network %s", net.ID)
	}
	net.State = &gogotypes.Any{
		TypeUrl: "types.docker.com/NetworkState",
		Value:   stringifyEnvelopeState,
	}
}
Contributor

Please co-locate the marshalling code in the same package as the corresponding unmarshalling code. I recommend putting it in daemon/cluster/convert/network.go, the same file as the unmarshaller and same package as all the other code which converts between Engine and Swarm API types.

EnableIPv4 bool // EnableIPv4 represents whether IPv4 is enabled
EnableIPv6 bool // EnableIPv6 represents whether IPv6 is enabled
IPAM IPAM // IPAM is the network's IP Address Management
State *NetworkState `json:",omitempty"` // State represents the state of the network
Contributor

New fields added to API responses need to be gated on the API version so that people developing clients are forced to set the API version correctly and therefore have their clients fail fast when connecting to incompatible engines. Please modify the API endpoints to clear this field on API versions less than the latest (v1.52)
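
A rough sketch of what such gating could look like is given below. Every name in it is hypothetical (inspectResponse, gateState, apiVersionLess), and the version comparison is hand-rolled so the example stays self-contained; an actual change would use whatever version helper the API router already relies on.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// networkState and inspectResponse are trimmed, hypothetical stand-ins for
// the API types; only the field being gated is modelled.
type networkState struct {
	IPAM map[string]struct{}
}

type inspectResponse struct {
	Name  string
	State *networkState
}

// stateMinAPIVersion is the API version named in the review comment.
const stateMinAPIVersion = "1.52"

// splitVersion parses "major.minor"; malformed parts are treated as 0.
func splitVersion(v string) (major, minor int) {
	majStr, minStr, _ := strings.Cut(v, ".")
	major, _ = strconv.Atoi(majStr)
	minor, _ = strconv.Atoi(minStr)
	return major, minor
}

// apiVersionLess reports whether API version a is lower than b, comparing
// the components numerically so that "1.9" sorts before "1.10".
func apiVersionLess(a, b string) bool {
	aMaj, aMin := splitVersion(a)
	bMaj, bMin := splitVersion(b)
	if aMaj != bMaj {
		return aMaj < bMaj
	}
	return aMin < bMin
}

// gateState clears the State field for clients that negotiated an API
// version older than the one that introduced it.
func gateState(resp *inspectResponse, requestVersion string) {
	if apiVersionLess(requestVersion, stateMinAPIVersion) {
		resp.State = nil
	}
}

func main() {
	for _, v := range []string{"1.51", "1.52"} {
		resp := &inspectResponse{Name: "my", State: &networkState{}}
		gateState(resp, v)
		fmt.Printf("API %s: state present = %v\n", v, resp.State != nil)
	}
}
```

Clearing the field at the endpoint keeps older clients seeing the response shape they were written against.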

EnableIPv4 bool // EnableIPv4 represents whether IPv4 is enabled
EnableIPv6 bool // EnableIPv6 represents whether IPv6 is enabled
IPAM IPAM // IPAM is the network's IP Address Management
State *NetworkState `json:",omitempty"` // State represents the state of the network
Contributor

Please update the Swagger spec and API changelog with this addition

@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch from e1cbc0e to 4ffa31b Compare August 8, 2025 14:56
Adds a new method:

- CalculateSelected(start, end uint64) (uint64, error)

    Returns the number of selected bits within the specified bit range
    (start to end, inclusive). Useful for calculating IP utilization
    within a given range.

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds the following new methods:

- Selected() uint64

    Returns the number of selected addresses in the set.

- CalculateSelectedInRange(netip.Prefix) (uint64, error)

    Returns the number of selected addresses in the ip-range.

- CalculateSelected() (uint64, error)

    Returns the number of selected addresses in the set.

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds a new method:

- GetAllocatedIPs(nw, ipr netip.Prefix) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error)

    Returns the number of addresses allocated in both the subnet and the ip-range pool.

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
@aepifanov aepifanov force-pushed the dev/ip-utilization/main branch from 4ffa31b to 9401ebf Compare August 11, 2025 10:07
Extend network inspection with State object which includes IPAM state with ip utilization details.

Add a new method to the Ipam interface:

- GetAllocatedIPs(poolID string) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error)

    Returns the number of addresses allocated in both the subnet and the ip-range pool.

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds support for inspecting overlay networks with a State object
that includes IPAM state and IP utilization details.

To enable this, the cnmallocator is extended with a new
UpdateNetworkState() method, used in a callback from the Swarm API
server to populate IPAM data in the network State object. This
State object is passed to the API server as a blob, avoiding
further changes to the Swarm API and preserving backward
compatibility in future releases.

Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>

# Conflicts:
#	go.mod
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Contributor

corhere commented Sep 15, 2025

@corhere corhere closed this Sep 15, 2025
Labels

area/networking/ipam Networking impact/api impact/changelog impact/documentation kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny


3 participants