Skip to content

Allow macvlan endpoint to start with parent down#49630

Merged
akerouanton merged 1 commit intomoby:masterfrom
robmry:macvlan_parent_down
Mar 13, 2025
Merged

Allow macvlan endpoint to start with parent down#49630
akerouanton merged 1 commit intomoby:masterfrom
robmry:macvlan_parent_down

Conversation

@robmry
Copy link
Contributor

@robmry robmry commented Mar 11, 2025

- What I did

When macvlan's parent interface is down it's not possible to send NA messages - so, it's not possible to start a container with an endpoint in that network. Allow that, to make it possible to start a standby-container in a high availability setup where the interface is down until it's needed.

- How I did it

Ignore ENETUNREACH from an NA send if no multicast route is found.

- How to verify it

New integration test. Without the fix, it fails with ...

=== FAIL: arm64.integration.network.macvlan TestParentDown (0.46s)
    macvlan_test.go:697: assertion failed: error is not nil: Error response from daemon: failed to set up container networking: failed to add interface vethfeda1c6 to sandbox: failed to advertise addresses: write ip ::1->ff02::1: sendmsg: network is unreachable

- Human readable description for the release notes

- Allow container startup when an endpoint is attached to a macvlan network where the parent interface is down.

When a macvlan's parent interface is down it's not possible
to send NA messages. So, ignore the error.

Signed-off-by: Rob Murray <rob.murray@docker.com>
@robmry robmry added this to the 28.0.2 milestone Mar 11, 2025
@robmry robmry self-assigned this Mar 11, 2025
@robmry robmry marked this pull request as ready for review March 11, 2025 17:01
@robmry robmry requested a review from akerouanton March 11, 2025 17:01
Copy link
Contributor

@vvoland vvoland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, but would appreciate a second review from @akerouanton

return true
}
log.G(ctx).WithField("", maxWait).Warn("No route for neighbour advertisement")
return false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a container is attached to multiple macvlan interfaces whose parent is down when the container starts, the current implementation will incur 200ms of delay per down parent.

It's good to not barf out, but I'm wondering if we should have a mechanism to skip this polling altogether, while still trying to send gratuituous ARPs/NAs 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can possibly remove the polling completely - as the comment for waitForMcastRoute says ...

// In CI, the NA send failed with "write ip ::1->ff02::1: sendmsg: network is unreachable".
// That error has not been seen since addition of the check that the veth's parent bridge port
// is forwarding, so that may have been the issue. But, in case it's a timing problem that's
// only less-likely because of delay caused by that check, make sure the route exists.

In CI the failures weren't predictable, just flaky. So, if we do remove it, perhaps the flakes will come back at a lower rate. So, particularly for a patch release, I don't think it's worth the risk of removing it - we may not see test failures before it ships.

(For the use-case described in the issue, a 200ms delay in startup of a standby server probably isn't a big issue. I think it's probably quite unusual to start a container with lots of 'down' macvlan interfaces. We can investigate further if it is an issue though.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, particularly for a patch release, I don't think it's worth the risk of removing it

Agree. We can probably remove this polling, but that's no big deal anyway.

apiClient := testEnv.APIClient()

const tap = "dummytap0"
res := icmd.RunCommand("ip", "tuntap", "add", "mode", "tap", tap)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genuine question: why a tap instead of a dummy (like other tests)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the fix, the test passes with a dummy interface - I didn't investigate why.

The interfaces end up with different flags though ...

25: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 92:83:68:73:fd:f3 brd ff:ff:ff:ff:ff:ff
27: tap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 4e:2a:e8:1f:95:48 brd ff:ff:ff:ff:ff:ff

@akerouanton akerouanton merged commit d1ecb3b into moby:master Mar 13, 2025
167 checks passed
@robmry robmry deleted the macvlan_parent_down branch April 2, 2025 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

28.0.1: With the link down on a macvlan interface, container won't start

3 participants