nftables: never enable IP forwarding on the host#50646
Conversation
efcbb53 to
bff3648
Compare
4b920c3 to
ed345d2
Compare
|
@AkihiroSuda - one of the commits here enables the IP forwarding sysctls in rootlesskit's netns ... With The dockerd-rootless change here enables forwarding whether using iptables or nftables. I don't think there's any harm in that, there aren't any non-Docker interfaces to forward between (apart from the netns's main Not sure whether I've put the sysctl commands in the best place? |
akerouanton
left a comment
There was a problem hiding this comment.
One nit, otherwise LGTM. Thanks!
ed345d2 to
53853b4
Compare
austinvazquez
left a comment
There was a problem hiding this comment.
Changes LGTM on green CI run. At a glance, the current failures are all related to known flaky test or unknown CI issue.
|
Needs rebase for CI fixes, then g2g. 👍🏼 |
53853b4 to
5b7dc6d
Compare
thaJeztah
left a comment
There was a problem hiding this comment.
code looks sane to me; left some ramblings (as usual). Not merging as I wasn't sure if you wanted additional review from Cory
LGTM
| // to iptables built-in chains. So, if cleanup is needed, give the cleaner to | ||
| // the nftabler. Then, it'll use it to delete old rules as networks are restored. | ||
| fw.(firewaller.FirewallCleanerSetter).SetFirewallCleaner(iptabler.NewCleaner(ctx, config)) | ||
| fw.SetFirewallCleaner(iptabler.NewCleaner(ctx, config)) |
There was a problem hiding this comment.
Not for this PR, just that my curiosity led me there; this looks to be the only place where we call fw.SetFirewallCleaner to a non nil value, and the constructor is above.
Which made me curious if this should be done as part of the constructor, possibly with a DisableCleaner(), DestroyCleaner, StopCleaning() (or whatever) on the Nftabler type.
There was a problem hiding this comment.
As soon as I get a chance - I'll keep the cleaner in the bridge driver so the firewaller implementation(s) don't need to call it (review comment from @akerouanton) ... it's a bit simpler this way, because the current firewaller has all the context for deleting old rules, the bridge driver will need to fish it out somehow. But, Albin's right, if we ever add a new firewaller it shouldn't really need to call the cleaner itself.
| line, err := os.ReadFile(path) | ||
| if err != nil { | ||
| return false, err | ||
| return false, fmt.Errorf("reading '%s': %w", path, err) |
There was a problem hiding this comment.
Curious; was information missing in the error, or did you pre-emptively add this? Because os.ReadFile (and other file operations from stdlib) should usually return a fs.PathError / os.PathError, which already decorates the error with "Op" (e.g. open ), Path, and the error; https://github.com/golang/go/blob/768c51e368e05739854c1413fdf7fd129d01e482/src/io/fs/fs.go#L250-L255
// PathError records an error and the operation and file path that caused it.
type PathError struct {
Op string
Path string
Err error
}There was a problem hiding this comment.
Oh, probably preemptive - I'll get rid of it.
| f.ffd = ipv | ||
| func (f *ffDropper) FilterForwardDrop(_ context.Context, ipv firewaller.IPVersion) error { | ||
| switch ipv { | ||
| case firewaller.IPv4: |
There was a problem hiding this comment.
Ah! The other IPVersion type 😂
Not for this PR, but it got me curious; is this always either/or, or could be "both"? Considering that the other type has that as an option (so if it could set both values at once);
moby/daemon/libnetwork/types/types.go
Lines 17 to 23 in 0b1b5bf
There was a problem hiding this comment.
For the firewaller, it's always either/or ... with iptables/ip6tables the rules are always separate. I decided to keep it that way in nftables too (although it has an inet type as well as ip/ip6, so one table can deal with both families).
But, for us, there's not much advantage in trying to combine them. The rules for bridge networks can be configured quite differently for v4/v6 ... for example, NAT for v4 and direct routing for v6.
Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
For nftables only, never enable IP forwarding on the host. Instead, return an error on network creation if forwarding is not enabled, required by a bridge network, and --ip-forward=true. If IPv4 forwarding is not enabled when the daemon is started with nftables enabled and other config at defaults, the daemon will exit when it tries to create the default bridge. Otherwise, network creation will fail with an error if IPv4/IPv6 forwarding is not enabled when a network is created with IPv4/IPv6. It's the user's responsibility to configure and secure their host when they run Docker with nftables. Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
5b7dc6d to
2fff6b4
Compare
- What I did
When Docker is using iptables and it enables IP forwarding, unless daemon option
--ip-forward-no-dropis true, it sets the iptables filter-FORWARD policy to DROP.When running with nftables - don't do that, raise an error if forwarding is needed but not enabled, unless
--ip-forward=false.On a host without IPv4 forwarding enabled, nftables firewall backend, and other settings at defaults - the daemon will fail to start when it tries to create the default bridge. It will fail to restore other networks, but those errors are only logged, not sure why. In other cases, for example
--bridge=noneor creating an IPv6 network when IPv6 global forwarding is disabled, network creation will fail with an error.Log a warning when migrating from iptables to nftables when the iptables policy is DROP, because it'll drop packets that have been accepted by nftables rules (so, for example, published ports won't work).
In 29.0, nftables is opt-in, and will be documented as "experimental". So, if we get push-back on this or come up with better ideas, we'll be able to change it.
Docs will need an update to say it's the user's responsibility to update their firewall rules to block unwanted forwarding between non-Docker network interfaces.
- How I did it
check-config.shto report on the IP forwarding sysctls.- How to verify it
And, updated tests.
- Human readable description for the release notes