nftables: preserve filter-FORWARD drop policy on restart/upgrade#50576
nftables: preserve filter-FORWARD drop policy on restart/upgrade#50576robmry wants to merge 2 commits intomoby:masterfrom
Conversation
The daemon sets the host's filter-FORWARD firewall chain policy to DROP if it enables IP forwarding. With iptables, that policy is preserved over a daemon restart (the chain is not deleted, it's a built-in chain). But, with nftables, the daemon's table is re-created on restart, so the filter-FORWARD drop policy is lost. So, when setting the policy to "drop" after enabling IP forwarding, record that in the datastore. Every subsequent daemon restart will then set the policy to drop (for iptables or nftables). To stop the daemon from setting the policy, it must be started with option '--ip-forward-no-drop'. Signed-off-by: Rob Murray <rob.murray@docker.com>
086e527 to
fa2d5c4
Compare
fa2d5c4 to
b5683d1
Compare
| // If there are no old rules to clean up, return nil. | ||
| func NewCleaner(ctx context.Context, config firewaller.Config) firewaller.FirewallCleaner { | ||
| clean := func(ipv iptables.IPVersion, enabled bool) bool { | ||
| clean := func(ipv iptables.IPVersion, enabled bool) (bool, bool) { |
There was a problem hiding this comment.
(bool, bool) 🙈
This could use a helper struct to describe each bool.
(or at least use named returns)
There was a problem hiding this comment.
I've named the returns.
When a pre-29.0 daemon has been running, and it set the iptables filter-FORWARD policy to DROP because it enabled IP forwarding, if the daemon is stopped, upgraded to >=29.0 and nftables is enabled... The iptables policy will still be DROP. The newly started daemon won't need to enable IP forwarding, and there won't be a record in the datastore indicating that it needs to be set. So, when migrating from iptables to nftables, check the iptables filter-FORWARD policy. If it's DROP, set the nftables policy to drop, and note that in the datastore for future restarts. Then, set the iptables policy to ACCEPT, so that it doesn't drop packets for published ports (because the allow rules for them will only be in the nftables chain). Signed-off-by: Rob Murray <rob.murray@docker.com>
b5683d1 to
44b22f5
Compare
akerouanton
left a comment
There was a problem hiding this comment.
As I explained to you yesterday, I'm uneasy with this change, and with the whole idea of setting a default policy ourselves. The heuristic we use proved to be brittle in the past, this is getting better with this PR, but I think there are still edge-cases, some being probably hard to solve.
If we fail to set the default policy in some legitimate cases, this could be considered a security issue on our side, and worth a CVE. But if we end up in this situation, that would be caused by us drawing a target on our back… I don't like that.
treating the introduction of nftables support as an opportunity to make a significant change.
Do you think we'll have a better opportunity? From my standpoint, nftables support being introduced, and experimental, this is the best opportunity we've to try alternatives and gather feedback through GH discussion (or by any other means).
If we decide to test something else in the next release, wouldn't it be surprising for nftables users that we change how the default policy is set? Is this likely to delay nftables GA? IMO it'd be less surprising to do it right now.
But, it's been moby/libnetwork#2450 and moby/libnetwork#2460 due to fallout including this issue in the Debian tracker.
As you mentioned, that revert doesn't mention whether an equivalent of --ip-forward-no-drop was considered, but now we've it. So maybe it's not as bad as before to set the default DROP policy unconditionally.
If that default policy breaks something, it's either because another virtualization / containerization software is installed, or because the machine is multi-homed and routing across interfaces stops working. Depending on the install order, the system state at first startup, etc… that breakage could happen once the machine is rebooted, so maybe weeks, months, or even years after the Engine was installed, making the whole thing harder to root cause.
As I tried to explain yesterday, if we're in one of these scenarios, then there's a high probability that there's a sysadmin in the chair. Let them decide what they want to do instead of trying to be smart and decide in place of them through a dubious (and brittle) heuristic.
Deciding what the default policy is (or even if one is needed) takes sysadmins to consider the host's interfaces, containerization / virt software installed, which flows coming from VMs and destined to containers should be allowed or denied, etc. These aren't under Engine's control, so we can't take these under consideration.
By setting the default DROP policy unconditionally, we're at least very consistent, and break everything on first start. But, these excerpts from the Debian issue you linked to are a good reason to actually not set any default policy at all:
HUH??? Since when is it okay for software that is not advertised as firewall software to modify the sysadmin's implicit or explicit iptables setup for IP packets completely unrelated to said software?
Docker is hostile to other applications
So…
Alternatively, go the other way - there's certainly a case to be made that Docker shouldn't enable IP forwarding on the host, or change the firewall policy to drop
I think enabling IP forwarding automatically isn't really the issue here — it's required by the bridge driver to operate.
However, we don't depend on the default policy anymore, so the only reason we still set it automatically is to ensure that we don't create security issues on users' system. For the iptables backend, it'd be pretty bad to change that behavior now because some users may currently depend on us setting it (knowingly or not), so we'd be creating security holes on upgrade.
We could consider that it's not our job to set the default policy, and if users are facing one of the risky scenarios I listed above, they must be knowledgeable enough to exercise due diligence, and carefully choose the nft rules they want for the rest of their system. FWIW that's what libvirt, lxc, podman / CNI, etc. do (at least, AFAIK): enable forwarding, but don't set any default policy.
If we want to experiment with this approach, or the other one (unconditionally set the default DROP policy), we may still want to store whether the Engine enabled forwarding such that we can switch to your approach in a later version.
| if ipv == firewaller.IPv4 { | ||
| return ic.filterForwardDrop4 | ||
| } | ||
| return ic.filterForwardDrop4 |
There was a problem hiding this comment.
| return ic.filterForwardDrop4 | |
| return ic.filterForwardDrop6 |
| if !t.Exists("filter", "FORWARD", "-j", DockerForwardChain) && | ||
| !t.Exists("filter", "FORWARD", "-j", isolationChain1) { | ||
| return false | ||
| return false, false |
There was a problem hiding this comment.
So, if I upgrade the Engine and switch to nftables backend, but between Engine stop and re-start, decide to flush my chains for whatever reason (maybe I want to switch something else from iptables-nft to nft), the Engine is going to restart in nftables mode and won't set the default DROP policy.
Is this desirable?
| if err := nftApply(ctx, table); err != nil { | ||
| return err | ||
| } | ||
| return nil |
| ipvKey = ipForwarding6 | ||
| } | ||
| if err := d.storeUpdate(ctx, &filterForwardDrop{ipv: ipvKey}); err != nil && !errors.Is(err, datastore.ErrKeyModified) { | ||
| log.G(ctx).WithError(err).Error("Cannot persist IP forwarding enabled flag") |
There was a problem hiding this comment.
Let's say the partition storing /var/lib/docker is full, the Engine starts and sets ip_forward=1. Then this call to storeUpdate fails with ENOSPC, user figures out what's going on and fixes the issue, and restarts their Engine (without rebooting). They will end up with no default policy.
Another scenario, the Engine crashes after it sets ip_forward=1, but before this operation completes. This will end up with default policy too.
Is this desirable ?
| return err | ||
| } | ||
|
|
||
| return d.initStore() |
There was a problem hiding this comment.
If I upgrade to v29.0, stays on the iptables fwbackend at first, but manually set the default policy to ACCEPT and then decide to switch to nftables, migrateFilterForwardDrop will not touch the default policy, but initStore() will re-set DROP.
Is this acceptable?
| // the daemon is first started. On restarts with nftables, the policy is lost when | ||
| // the nftables table is reconstructed. So, it needs to be re-set here.) | ||
| func (d *driver) initForwardingPolicy(ctx context.Context) error { | ||
| ffds, err := d.store.List(&filterForwardDrop{}) |
There was a problem hiding this comment.
Users deleting /var/lib/docker, or /var/lib/docker/network, or local-kv.db is probably rare but not unhearded of (see this issue for instance). If this happens, ip_forward will be enabled, and there won't be any persisted state, so this code path will consider that the default policy should be ACCEPT.
Is that acceptable?
|
To be clear, if other maintainers think it's okay to take this route, then fine - let's do that. |
- What I did
See the linked issue for a description of the problems.
One option would be to always set the filter-FORWARD policy to "drop" for nftables - treating the introduction of nftables support as an opportunity to make a significant change. But, it's been tried before and reverted due to fallout including this issue in the Debian tracker. Before #48594, when ip6tables was enabled, the policy for IPv6 was always set to DROP - and that caused problems when we enabled ip6tables by default. So, that PR aligned the IPv6 behaviour with the long standing IPv4 behaviour. (However, #48594 also added daemon option
--ip-forward-no-drop, which could now be used to deal with those problems. So, a new install of Docker would potentially break things on the host but, after some debugging, that problem could be fixed by manually setting policy back to accept, adding daemon option--ip-forward-no-drop, and restarting it. (But, I think Docker might be in the bin before then in many cases!)Alternatively, go the other way - there's certainly a case to be made that Docker shouldn't enable IP forwarding on the host, or change the firewall policy to drop, unless it's told to. So, the daemon could error out if configured with ip forwarding enabled (the default), but forwarding isn't enabled in the kernel. It could then offer separate options for the user to tell it to enable forwarding, and set the firewall's forwarding policy to "drop". But, that's quite a significant change for existing users, on a host where the daemon had previously been enabling forwarding, it would appear to be fine after upgrade, then just fail to run after a reboot. So, this needs more consideration, but it's probably where we should be heading.
For the time being at least - try to preserve the existing behaviour ... only set the firewall policy to drop after enabling IP forwarding, and don't set the policy if forwarding was already enabled. But, when setting the policy to drop, remember that for the next daemon restart - it'll then be set to drop on each subsequent daemon restart (including after a reboot, even if IP forwarding was enabled by something else before dockerd started that time). That deals with the nftables restart case, and makes it less susceptible to changes in startup ordering... if Docker's messing with the policy and that's unwanted, it'll always do that so problems should be easier to track down. To stop it from setting the policy to drop on subsequent restarts, option
--ip-forward-no-dropwill be needed.- How I did it
First commit ...
Set filter-FORWARD drop on restart after enabling IP forwarding
The daemon sets the host's filter-FORWARD firewall chain policy to DROP if it enables IP forwarding.
With iptables, that policy is preserved over a daemon restart (the chain is not deleted, it's a built-in chain). But, with nftables, the daemon's table is re-created on restart, so the filter-FORWARD drop policy is lost.
So, when setting the policy to "drop" after enabling IP forwarding, record that in the datastore. Every subsequent daemon restart will then set the policy to drop (for iptables or nftables).
To stop the daemon from setting the policy, it must be started with option '--ip-forward-no-drop'.
Second commit ...
Migrate iptables filter-FORWARD policy on upgrade
When a pre-29.0 daemon has been running, and it set the iptables filter-FORWARD policy to DROP because it enabled IP forwarding, if the daemon is stopped, upgraded to >=29.0 and nftables is enabled...
The iptables policy will still be DROP. The newly started daemon won't need to enable IP forwarding, and there won't be a record in the datastore indicating that it needs to be set.
So, when migrating from iptables to nftables, check the iptables filter-FORWARD policy. If it's DROP, set the nftables policy to drop, and note that in the datastore for future restarts. Then, set the iptables policy to ACCEPT, so that it doesn't drop packets for published ports (because the allow rules for them will only be in the nftables chain).
- How to verify it
Manual test:
iptables -L FORWARD,ip6tables -L FORWARD).filter-FORWARDchains is "drop" (nft list ruleset).Updated unit and integration tests.
- Human readable description for the release notes