Fix possible overlapping IPs #42432
Conversation
@dperny could you update the commit message with your description?
Do we need to check if e.nodeObj is not nil here? (I guess it would panic in that case)
Also wondering if there's concurrency here: is a (RW)Mutex needed for accessing e.nodeObj? (No idea, but putting it here as a comment 😅)
Good catch on the nil-check. e.nodeObj will be nil on the first call to Configure. Incredibly embarrassed to have missed that.
I've added a comment explaining why nodeObj does not need to be guarded by a mutex.
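To make the fix concrete, here is a minimal sketch of the nil guard and the diffing in Configure. The type and field names (`Node`, `Attachments`) are illustrative stand-ins, not the actual moby/swarmkit types:

```go
package main

import "fmt"

// Node is a simplified stand-in for the swarmkit node object.
type Node struct {
	ID          string
	Attachments []string // network IDs the node is attached to
}

type executor struct {
	// nodeObj is only ever read and written from inside Configure, which
	// the agent calls serially, so no mutex is needed to guard it.
	nodeObj *Node
}

// Configure records the latest node object and reports attachments that
// disappeared since the previous call. On the very first call, e.nodeObj
// is nil, so we must not dereference it before checking.
func (e *executor) Configure(node *Node) []string {
	var removed []string
	if e.nodeObj != nil {
		current := make(map[string]bool, len(node.Attachments))
		for _, a := range node.Attachments {
			current[a] = true
		}
		for _, a := range e.nodeObj.Attachments {
			if !current[a] {
				removed = append(removed, a)
			}
		}
	}
	e.nodeObj = node
	return removed
}

func main() {
	e := &executor{}
	fmt.Println(e.Configure(&Node{ID: "n1", Attachments: []string{"net1", "net2"}})) // first call: nil previous node
	fmt.Println(e.Configure(&Node{ID: "n1", Attachments: []string{"net2"}}))         // net1 was removed
}
```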
@dperny needs a rebase, now that libnetwork was moved into this repository
- "github.com/docker/libnetwork"
+ "github.com/docker/docker/libnetwork"
A node is no longer using its load balancer IP address when it no longer has tasks that use the network that requires that load balancer. When this occurs, the swarmkit manager will free that IP in IPAM, and may reassign it.

When a task shuts down cleanly, it attempts removal of the networks it uses, and if it is the last task using those networks, this removal succeeds and the load balancer IP is freed. However, this behavior is absent if the container fails: removal of the networks is never attempted.

To address this issue, I amend the executor. Whenever a node load balancer IP is removed or changed, that information is passed to the executor by way of the Configure method. By keeping track of the set of node NetworkAttachments from the previous call to Configure, we can determine which, if any, have been removed or changed.

At first, this seems to create a race, by which a task can be attempting to start while the network is removed right out from under it. However, this is already addressed in the controller: the controller will attempt to recreate missing networks before starting a task.

Signed-off-by: Drew Erny <derny@mirantis.com>
@thaJeztah I bumped this and changed the libnetwork import path. It should be good to go.
thaJeztah
left a comment
LGTM
sorry, thought I already approved
@cpuguy83 @AkihiroSuda ptal
- What I did
Fix a possibility where overlapping IP addresses could exist as a result of the node failing to clean up its old load balancer IPs.
Fixes #40989, supersedes #41062.
closes #41062
- How I did it
A node is no longer using its load balancer IP address when it no longer has tasks that use the network that requires that load balancer. When this occurs, the swarmkit manager will free that IP in IPAM, and may reassign it.
When a task shuts down cleanly, it attempts removal of the networks it uses, and if it is the last task using those networks, this removal will succeed:
https://github.com/dperny/docker/blob/abe0a10c6839ce301bc5a0e43bde4633af01f50d/daemon/cluster/executor/container/controller.go#L378-L384
However, this behavior is absent if the container fails. Removal of the network is never attempted.
To address this issue, I amend the executor. Whenever a node load balancer IP is removed or changed, that information is passed to the executor by way of the Configure method. By keeping track of the set of node NetworkAttachments from the previous call to Configure, we can determine which, if any, have been removed or changed.
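The bookkeeping described above can be sketched roughly as follows. The `Attachment` type and its fields are simplified stand-ins for swarmkit's NetworkAttachment, not the real moby code:

```go
package main

import "fmt"

// Attachment is a simplified stand-in for a node NetworkAttachment.
type Attachment struct {
	NetworkID string
	Address   string // the load balancer IP assigned on this network
}

// attachmentsToRemove compares the attachments from the previous call to
// Configure with the current ones, returning those that are gone or whose
// address changed; the executor can then clean up their load balancer IPs.
func attachmentsToRemove(prev, curr []Attachment) []Attachment {
	current := make(map[string]string, len(curr))
	for _, a := range curr {
		current[a.NetworkID] = a.Address
	}
	var stale []Attachment
	for _, a := range prev {
		addr, ok := current[a.NetworkID]
		if !ok || addr != a.Address {
			// Attachment was removed, or its IP was reassigned in IPAM.
			stale = append(stale, a)
		}
	}
	return stale
}

func main() {
	prev := []Attachment{{"net1", "10.0.0.2"}, {"net2", "10.0.1.2"}}
	curr := []Attachment{{"net2", "10.0.1.5"}} // net1 gone, net2's IP changed
	for _, a := range attachmentsToRemove(prev, curr) {
		fmt.Printf("detach %s on %s\n", a.Address, a.NetworkID)
	}
}
```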
At first, this seems to create a race, by which a task can be attempting to start while the network is removed right out from under it. However, closer inspection of the code shows that this is already addressed in the controller:
https://github.com/dperny/docker/blob/abe0a10c6839ce301bc5a0e43bde4633af01f50d/daemon/cluster/executor/container/controller.go#L207-L225
The controller will retry creating the network for as long as the network is missing.
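The recovery path in the linked controller code can be paraphrased as the loop below. `Backend` and its methods are hypothetical stand-ins for the daemon's network backend, not the actual interfaces:

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical stand-in for the daemon's network backend.
type Backend interface {
	FindNetwork(id string) error          // returns an error if the network is missing
	CreateManagedNetwork(id string) error // recreates a swarm-managed network
}

// ensureNetworks recreates any of the task's networks that were removed
// out from under it, so the task can still start even after the executor
// cleaned up a stale load balancer attachment.
func ensureNetworks(b Backend, networkIDs []string) error {
	for _, id := range networkIDs {
		if err := b.FindNetwork(id); err != nil {
			// Network is missing: create it again before starting the task.
			if err := b.CreateManagedNetwork(id); err != nil {
				return fmt.Errorf("recreating network %s: %w", id, err)
			}
		}
	}
	return nil
}

// fakeBackend tracks which networks exist, for demonstration only.
type fakeBackend struct{ networks map[string]bool }

func (f *fakeBackend) FindNetwork(id string) error {
	if !f.networks[id] {
		return errors.New("network not found")
	}
	return nil
}

func (f *fakeBackend) CreateManagedNetwork(id string) error {
	f.networks[id] = true
	return nil
}

func main() {
	b := &fakeBackend{networks: map[string]bool{"net2": true}}
	if err := ensureNetworks(b, []string{"net1", "net2"}); err != nil {
		panic(err)
	}
	fmt.Println(b.networks["net1"]) // net1 was recreated
}
```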
- How to verify it
I'm not certain how to write a sane test for this. I've mostly verified it by logic.
- Description for the changelog
Fixed a bug where swarm could have duplicate IPs when the last task on a network on a node failed.