[DNM] Implements IP utilization tracking for networks#50450
[DNM] Implements IP utilization tracking for networks#50450aepifanov wants to merge 9 commits intomoby:masterfrom
Conversation
f523212 to
bd0d363
Compare
458003d to
381897c
Compare
corhere
left a comment
There was a problem hiding this comment.
How come you need both the PoolData.allocatedIPsInPool counters and (*addrset.AddrSet).Selected()? Aren't all reserved IPs added to an AddrSet? I would have expected .Selected() to be sufficient.
|
Could you please elaborate further? I still do not understand why the information from the AddrSet is incomplete. |
moby/daemon/libnetwork/ipams/defaultipam/structures.go Lines 18 to 25 in addc373 |
|
Okay, I think I understand. I am not entirely thrilled with there being two sources of truth for the IP address allocations, but you may have had a good reason to take this approach. Which alternative solutions did you consider and why did the approach of maintaining a running counter win out? |
c90744c to
9972c42
Compare
|
@corhere I've done the following changes from the last your review:
|
corhere
left a comment
There was a problem hiding this comment.
It does not look like the tests cover all the corner cases.
| bits := as.pool.Addr().BitLen() - as.pool.Bits() | ||
| // If there are no bitmaps, all addresses are unselected. | ||
| if len(as.bitmaps) == 0 { | ||
| return 0 | ||
| } | ||
| bms := uint64(1) | ||
| // If the subnet is bigger than Bitmap's capacity, calculate the number of bitmaps we have. | ||
| if bits > maxBitsPerBitmap { | ||
| bms = 1 << (bits - maxBitsPerBitmap) | ||
| } | ||
|
|
||
| // Calculate the number of selected addresses in each bitmap. | ||
| for _, bm := range as.bitmaps { | ||
| if bms > 0 { | ||
| bms-- | ||
| } |
There was a problem hiding this comment.
What's the purpose of the bits and bms variables? Neither one is used in the summation.
There was a problem hiding this comment.
Outdated code. Removed.
| // If there are no bitmaps, all addresses are unselected. | ||
| if len(as.bitmaps) == 0 { | ||
| return 0 | ||
| } |
There was a problem hiding this comment.
Nit: premature optimization. The loop body would execute zero times, leading to the correct answer of zero addresses selected without any wasted cycles.
|
|
||
| func newPoolData(pool netip.Prefix) *PoolData { | ||
| h := addrset.New(pool) | ||
| func newPoolData(pool, sub netip.Prefix) *PoolData { |
There was a problem hiding this comment.
What is the rationale behind changing the signature of newPoolData?
There was a problem hiding this comment.
The idea was to handle usedInRangePool counter during the PoolData initialization steps, which required the sub info as well.
But it's outdated code for now.
| bits := pool.Addr().BitLen() - pool.Bits() | ||
| if !pool.Addr().Is4() || bits > 1 { | ||
| h.Add(pool.Addr()) | ||
| pd.RequestAddress(pool, netip.Prefix{}, netip.Addr{}, nil) |
There was a problem hiding this comment.
How is this equivalent to pd.addrs.Add(pool.Addr()) when you aren't passing pool.Addr() as a parameter?
There was a problem hiding this comment.
After re-implementation of usedInIPRangePool calculation it's returned back.
| children map[netip.Prefix]struct{} | ||
|
|
||
| // The number of addresses allocated in the pool. | ||
| allocatedIPsInPool uint64 |
There was a problem hiding this comment.
From our earlier discussions I understand that this is not actually the count of addresses allocated in the nw as a whole, since that would be duplicating (PoolData).addrs.Selected(). What is counting, then? I don't see how a single integer could be used to report how many addresses are available for dynamic allocation in a specific sub without interference from other subs under the same nw shared with other container networks.
There was a problem hiding this comment.
Nice catch!
I've re-implemented usedInIPRangePool handling method to the direct calculation of the selected addresses in a bitmap for a specified range, which allows to get the actual IPAM state for any range.
| // and should no longer be modified. | ||
| func (a *Allocator) GetAllocatedIPs(poolID string) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error) { | ||
| log.G(context.TODO()).Debugf("GetAllocatedIPs(%s)", poolID) | ||
| k, err := PoolIDFromString(poolID) |
There was a problem hiding this comment.
I don't see k.ChildSubnet consumed anywhere. How can the count of addresses allocated in that range be looked up without knowing which range to count?
There was a problem hiding this comment.
In the current implementation the range is taken from the PoolID which contains it.
daemon/libnetwork/network.go
Outdated
| if !cidr.Addr().Is4() { | ||
| return 0, fmt.Errorf("IPAM state only supports IPv4 CIDRs, got %s", cidr) | ||
| } | ||
| return uint64(1) << (32 - cidr.Bits()), nil |
There was a problem hiding this comment.
We can do better than hardcode a constant:
| return uint64(1) << (32 - cidr.Bits()), nil | |
| return uint64(1) << (cidr.Addr().BitLen() - cidr.Bits()), nil |
| as, err := newAddrSpace(tc.predefined) | ||
| assert.NilError(t, err) | ||
|
|
||
| err = as.allocateSubnet(tc.subnet, tc.ipRange) |
There was a problem hiding this comment.
Where are you testing the cases with multiple ip-range allocations under a single subnet? Be sure to cover both the disjoint ranges (e.g. --subnet 192.168.0.0/24 --ip-range 192.168.0.0/25, --subnet 192.168.0.0/24 --ip-range 192.168.0.128/25) and overlapping ranges (e.g. --subnet 192.168.0.0/24 --ip-range 192.168.0.0/25, --subnet 192.168.0.0/24 --ip-range 192.168.0.64/26) cases
6f5873e to
391b429
Compare
1369797 to
e6bf31d
Compare
daemon/cluster/convert/network.go
Outdated
| netState := &network.NetworkState{} | ||
| if n.State != nil { | ||
| if err := json.Unmarshal(n.State.Value, netState); err != nil { | ||
| log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID) | ||
| } | ||
| } | ||
| if netState.IPAM == nil { | ||
| netState = nil | ||
| } |
There was a problem hiding this comment.
The daemon instance converting the swarmapi.Network to a network.Inspect might not be the same process, let alone the same version, as the one that produced the swarmapi.Network value. Could you please future-proof the conversion to gracefully handle network state values of unexpected types? Having the consumer assume a particular schema of the Any value hampers our flexibility to change how the network state is encoded.
| for remaining > 0 { | ||
| if cur.count <= remaining { | ||
| selected += uint64(bits.OnesCount32(cur.block)) * cur.count | ||
| remaining -= cur.count | ||
| cur = cur.next | ||
| } else { | ||
| selected += uint64(bits.OnesCount32(cur.block)) * remaining | ||
| remaining = 0 | ||
| } | ||
| } |
There was a problem hiding this comment.
Recent versions of Go now have convenient max and min builtins.
| for remaining > 0 { | |
| if cur.count <= remaining { | |
| selected += uint64(bits.OnesCount32(cur.block)) * cur.count | |
| remaining -= cur.count | |
| cur = cur.next | |
| } else { | |
| selected += uint64(bits.OnesCount32(cur.block)) * remaining | |
| remaining = 0 | |
| } | |
| } | |
| for remaining > 0 { | |
| n := min(cur.count, remaining) | |
| selected += uint64(bits.OnesCount32(cur.block)) * n | |
| remaining -= n | |
| cur = cur.next | |
| } |
daemon/libnetwork/bitmap/sequence.go
Outdated
| // CalculateSelected calculates the number of selected bits in the range [start, end]. | ||
| func (h *Bitmap) CalculateSelected(start, end uint64) (uint64, error) { |
There was a problem hiding this comment.
Bikeshedding: name functions based on their outputs and side effects, not on irrelevant implementation details. It makes no difference to the caller whether the number of set bits is calculated on the fly, cached, memoized, indexed, etc., so long as it gets the correct answer.
| // CalculateSelected calculates the number of selected bits in the range [start, end]. | |
| func (h *Bitmap) CalculateSelected(start, end uint64) (uint64, error) { | |
| // OnesCount calculates the number of selected bits in the range [start, end]. | |
| func (h *Bitmap) OnesCount(start, end uint64) (uint64, error) { |
| func hiBlockMask(bitPos uint64) uint32 { | ||
| if bitPos >= uint64(blockLen) { | ||
| return 0 | ||
| } | ||
| if bitPos == 0 { | ||
| return blockFull | ||
| } | ||
| return (blockFirstBit >> (bitPos - 1)) - 1 | ||
| } |
There was a problem hiding this comment.
| func hiBlockMask(bitPos uint64) uint32 { | |
| if bitPos >= uint64(blockLen) { | |
| return 0 | |
| } | |
| if bitPos == 0 { | |
| return blockFull | |
| } | |
| return (blockFirstBit >> (bitPos - 1)) - 1 | |
| } | |
| func hiBlockMask(bitPos uint64) uint32 { | |
| return blockFull >> bitPos | |
| } |
| assert.Check(t, is.Equal(uint32(0xffffffff), loBlockMask(33))) | ||
| } | ||
|
|
||
| func TestCalculateSelected(t *testing.T) { |
There was a problem hiding this comment.
I would trust the implementation a lot more if CalculateSelected was tested by setting up the preconditions through the public interface of bitmap. By directly constructing the sequences it's possible that an incorrect assumption in the code could be carried through to the test cases such that the tests don't catch the bug. This test as written would not catch if you mixed up the bit-endianness of the blocks, for instance.
daemon/libnetwork/network.go
Outdated
|
|
||
| ipamState := map[string]network.IPAMState{} | ||
| for _, ii := range ipam4Infos { | ||
| cidr, is, err := defaultipam.GetIPAMStateForPoolID(ii.PoolID, ipam) |
There was a problem hiding this comment.
Why is generic libnetwork code for any network calling a function from the default IPAM driver? What happens if the IPAM driver for the network is not defaultipam?
go.mod
Outdated
|
|
||
| replace github.com/moby/moby/client => ./client | ||
|
|
||
| replace github.com/moby/swarmkit/v2 v2.0.0 => github.com/aepifanov/swarmkit/v2 v2.0.0-20250805114649-a1269cd7a2a8 |
There was a problem hiding this comment.
We can't merge this PR with the Swarmkit dependency replaced with your fork. Please convert this PR to draft.
3be6f3b to
e1cbc0e
Compare
daemon/cluster/convert/network.go
Outdated
| if env.Version > 1 && env.Data != nil { | ||
| log.G(context.TODO()).WithError(err).Warnf("Try to unmarshal it anyway since the version is higher %d than supported %d and might be backward compatible for network %s", env.Version, 1, n.ID) | ||
| if err := json.Unmarshal(env.Data, netState); err != nil { | ||
| log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID) | ||
| } | ||
| } |
There was a problem hiding this comment.
What's the point of versioning the type if you're just going to proceed anyway? Why would we bump the version if it's backwards-compatible with v1? And if the type is not backwards-compatible, it's a different data type by definition, so changing the type name would be the more appropriate action than bumping the version.
daemon/cluster/convert/network.go
Outdated
| netState := &network.NetworkState{} | ||
| if n.State != nil { | ||
| env := network.StringifyEnvelope{} | ||
| if err := json.Unmarshal(n.State.Value, &env); err != nil { |
There was a problem hiding this comment.
How does the StringifyEnvelope help to enable us to switch to another codec, e.g. protobuf or msgpack, for the state value in the future? It looks like this will lock us into JSON forever.
daemon/cluster/convert/network.go
Outdated
| if err := json.Unmarshal(n.State.Value, &env); err != nil { | ||
| log.G(context.TODO()).WithError(err).Warnf("Failed to unmarshal network state for network %s", n.ID) | ||
| } else if env.Type != network.NetworkStateType { | ||
| log.G(context.TODO()).WithError(err).Warnf("network state for network %s has unexpeted message type %s", n.ID, env.Type) |
There was a problem hiding this comment.
What can we infer about how this manager node relates to the leader when this code path is taken? Is there something actionable we could include in the log message?
| if len(state.IPAM) != 0 { | ||
| stringifyState, err := json.Marshal(&state) | ||
| if err != nil { | ||
| return errors.Wrapf(err, "failed to marshal network state for network %s", net.ID) | ||
| } | ||
| // Marshal the network state to a JSON string and store it in the network's State field. | ||
| // This allows avoiding further changes to the Swarm API and preserves backward compatibility in future releases. | ||
| envelopedState := networktypes.StringifyEnvelope{ | ||
| Type: networktypes.NetworkStateType, | ||
| Version: 1, | ||
| Data: stringifyState, | ||
| } | ||
| stringifyEnvelopeState, err := json.Marshal(&envelopedState) | ||
| if err != nil { | ||
| return errors.Wrapf(err, "failed to marshal network state for network %s", net.ID) | ||
| } | ||
| net.State = &gogotypes.Any{ | ||
| TypeUrl: "types.docker.com/NetworkState", | ||
| Value: stringifyEnvelopeState, | ||
| } | ||
| } |
There was a problem hiding this comment.
Please co-locate the marshalling code in the same package as the corresponding unmarshalling code. I recommend putting it in daemon/cluster/convert/network.go, the same file as the unmarshaller and same package as all the other code which converts between Engine and Swarm API types.
| EnableIPv4 bool // EnableIPv4 represents whether IPv4 is enabled | ||
| EnableIPv6 bool // EnableIPv6 represents whether IPv6 is enabled | ||
| IPAM IPAM // IPAM is the network's IP Address Management | ||
| State *NetworkState `json:",omitempty"` // State represents the state of the network |
There was a problem hiding this comment.
New fields added to API responses need to be gated on the API version so that people developing clients are forced to set the API version correctly and therefore have their clients fail fast when connecting to incompatible engines. Please modify the API endpoints to clear this field on API versions less than the latest (v1.52)
| EnableIPv4 bool // EnableIPv4 represents whether IPv4 is enabled | ||
| EnableIPv6 bool // EnableIPv6 represents whether IPv6 is enabled | ||
| IPAM IPAM // IPAM is the network's IP Address Management | ||
| State *NetworkState `json:",omitempty"` // State represents the state of the network |
There was a problem hiding this comment.
Please update the Swagger spec and API changelog with this addition
e1cbc0e to
4ffa31b
Compare
Adds a new method:
- CalculatedSelected(start, end uint64) (uint64, error)
Returns the number of selected bits within the specified bit range
(start to end, inclusive). Useful for calculating IP utilization
within a given range.
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds the following new methods:
- Selected() uint64
Returns the number of selected addresses in the set.
- CalculateSelectedInRange(netip.Prefix) (uint64, error)
Returns the number of selected addresses in the ip-range.
- CalculateSelected() (uint64, error)
Returns the number of selected addresses in the set.
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds a new method:
- GetAllocatedIPs(nw, ipr netip.Prefix) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error)
Returns the number of addresses allocated in both the subnet and the ip-range pool.
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
4ffa31b to
9401ebf
Compare
Extend network inspection with State object which includes IPAM state with ip utilization details.
Add a new method to the Ipam interface:
- GetAllocatedIPs(poolID string) (allocatedIPsInSubnet, allocatedIPsInPool uint64, err error)
Returns the number of addresses allocated in both the subnet and the ip-range pool.
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
Adds support for inspecting overlay networks with a State object that includes IPAM state and IP utilization details. To enable this, the cnmallocator is extended with a new UpdateNetworkState() method, used in a callback from the Swarm API server to populate IPAM data in the network State object. This State object is passed to the API server as a blob, avoiding further changes to the Swarm API and preserving backward compatibility in future releases. Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com> # Conflicts: # go.mod
Signed-off-by: Andrey Epifanov <aepifanov@mirantis.com>
9401ebf to
32d4174
Compare
- What I did
This PR introduces IP utilization tracking for Docker networks by extending the network inspection API to include IPAM state. This enhancement allows users to monitor how many IP addresses are allocated per subnet and IP range pool within a network.
It covers only IPv4.
- How I did it
Design IP-utilization
State, including nestedIPAMstate informationbridgemacvlanipvlanoverlayExample output from
docker network inspect:AllocatedIPsInSubnet: Number of addresses allocated in the subnet. If the value exceeds
uint64, it is capped at the maximumuint64value.AvailableIPsInIPRangePool: Number of available addresses for allocating in the IP range pool.
Depends on:
IP utilization in Swarm
- How to verify it
Run
docker network inspect <network>and verify the newState.IPAMsection reflects real-time allocation counts for each subnet configuration.- Human readable description for the release notes
- A picture of a cute animal (not mandatory but encouraged)