Skip to content

Conversation

@robmry
Copy link
Contributor

@robmry robmry commented Oct 7, 2024

- What I did

The current behaviour for IPv4, in setupIPForwarding, when the daemon runs with "iptables": true is:

  • if the daemon is running with --ip-forward (the default), and
  • if the kernel's net.ipv4.ip_forward = 0 when the daemon starts, then
  • enable net.ipv4.ip_forward and set the FORWARD chain's default policy to DROP.

The current behaviour for IPv6, when "ip6tables": true is to unconditionally set the filter FORWARD chain's policy to DROP. Because ip6tables has been enabled by-default in 27.x releases, this has caused issues on hosts that relied on an ACCEPT policy to make other things work. (For example, if ip6tables wasn't explicitly disabled, even if no IPv6 networks were created, dockerd set the ip6tables filter-FORWARD policy to DROP).

- How I did it

Still set the filter-FORWARD policy to DROP if enabling IP forwarding.

  • This happens on startup for IPv4, when the default bridge network is configured. For IPv6, it happens when an IPv6 network is created.
  • If IP forwarding was enabled by something else before docker starts, the policy is left as it was. This is the old behaviour for IPv4, but a change for IPv6.
  • Added daemon option --ip-filter-forward-drop=<bool> ... the default is true and means that the policy will be set to drop if the daemon enables IP forwarding (as described above). If false docker will not update the policy:
    •   --ip-forward                              Enable IP forwarding in system configuration (default true)
        --ip-forward-no-drop                      Do not set the filter-FORWARD policy to DROP when enabling IP forwarding
      

- How to verify it

New integration tests.

- Description for the changelog

- docker will now only set the `ip6tables` policy for the `FORWARD` chain in the `filter` table to `DROP` if
  it enables IP forwarding on the host itself (sysctls `net.ipv6.conf.all.forwarding` and
  `net.ipv6.conf.default.forwarding`). This is now aligned with existing IPv4 behaviour.
  - If IPv6 forwarding is enabled on your host, but you were depending on docker to set the ip6tables
    filter-FORWARD policy to `DROP`, you may need to update your host's configuration to make sure it
    is secure.
  - If docker enables IP forwarding on your host and you do not want it to set the filter-FORWARD policy
    to `DROP`, use daemon command line option `--ip-forward-no-drop` or `"ip-forward-no-drop": true`
    in `daemon.json`.
- Direct routed access to container ports that are not exposed using `-p`/`--publish` is now blocked
   in the `DOCKER` chain.
  - If the default filter-FORWARD policy was previously left at `ACCEPT` on your host, and direct routed
    access to a container's unpublished ports from a remote host is still required, options are:
    - Publish the ports you need.
    - Re-create the network with `--gateway_mode_ipv4=routed` and/or `--gateway_mode_ipv6=routed`.
      - No NAT rules will be set up to map host addresses/ports to containers, but published ports will be
        accessible from remote hosts if routing has been set up in the network.
    - (FIXME - mention `nat-unprotected`, to be implemented in an upcoming PR.)
- Container ports published to host addresses will continue to be accessible via those host
  addresses, using NAT or the userland proxy.
- Unpublished container ports continue to be directly accessible from the docker host via the
  container's IP address.

@robmry
Copy link
Contributor Author

robmry commented Oct 18, 2024

Hmm, apologies - some last minute second thoughts about this ...

If we do enable forwarding for IPv4, and don't set the policy to DROP - this PR means the bridge networks will be protected ... but enabling forwarding might expose other-stuff on the host. We'd have to let users know they need to set the policy to DROP, if needed. But, a lot won't know whether it's needed, or won't notice the warning.

Perhaps we should continue to set the policy to DROP if we enable IPv4 forwarding - and do the same for IPv6; only set the policy if we enable forwarding, rather than unconditionally if ip6tables=true.

Then we'd need to document that "if you don't want us to break other things by setting the DROP policy, you need to enable ip forwarding before starting the daemon"?

We could also add daemon options to say what the filter-FORWARD policies should be for v4/v6 (or, alternatively and maybe-better, whether to leave them alone). Then, there'd be no need for users to work out how to manually configure IP forwarding before starting the daemon.

This PR will still be a big improvement because an ACCEPT policy will no longer leave bridge networks exposed, but setting the policy to DROP would be s safer default - even if the default breaks things. We can write some loud documentation about what sort of things might break with a DROP policy, and how to stop docker from setting it.

@robmry robmry force-pushed the 48365_iptables_forward_policy branch from 5933022 to 25cc9ae Compare October 22, 2024 10:57
@robmry robmry changed the title Don't change the iptables filter-FORWARD policy Don't depend on the iptables filter-FORWARD policy Oct 22, 2024
@robmry robmry marked this pull request as draft October 22, 2024 11:00
@robmry robmry force-pushed the 48365_iptables_forward_policy branch from 25cc9ae to 045c8e1 Compare October 22, 2024 11:37
akerouanton
akerouanton previously approved these changes Oct 28, 2024
}

// Enable IPv4 forwarding only if it is not already enabled
if ipv4ForwardData[0] != '1' {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can early-return if the sysctl is already set to 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


// When enabling ip_forward set the default policy on forward chain to drop.
if wantFilterForwardDrop {
if err := setFilterForwardDrop(iptables.IPv4); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should invert the logic - set the default policy, and then toggle the sysctl. This would remove the need for recovery code.

Copy link
Contributor Author

@robmry robmry Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the sysctl will fail if, for example, "/proc/sys/net" is mounted read-only and, by the time this is happening, if iptables is enabled we'll already have tried to mess with the rules (so the sysctl setting is perhaps more likely to fail).

Swapping them would just mean having to undo the policy change if the sysctl update fails. So I'm not sure it'd help - and it's probably best to do the thing that's most likely to fail first, because then the other thing won't toggle. (But perhaps I missed the point?!)

}()

// Get current IPv6 all forwarding setup
ipv6ForwardDataAll, err := os.ReadFile(ipv6ForwardConfAll)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be benefitial here too to read both files, early-return if both are already set, and if not then start with ipt default policy and then set both unconditionnally. That'd make the code simpler I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above - we'd still need to undo anything done so-far, so I'm not sure it'd help?

But I've put the undo-sysctl code in the right place now (oops), so perhaps it makes more sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's existing code, but curious; we fail this function if we fail to read, but we don't return an error when failing to update it? Could that have situations where we would not actually be updating the config? (Are those save to be ignored?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like for IPv4 we have a separate utility (configureIPForwarding()). Wondering if (not for this PR), we should have that utility handle the "changed" or "not changed" case, e.g. something like;

func configureIPForwarding(file string, enable bool) (changed bool, _ error) {
	data, err := os.ReadFile(file)
	if err != nil {
		return false, fmt.Errorf("cannot read IP forwarding setup: %w", err)
	}
	if enabled := data[0] == '1'; enabled == enable {
		return false, nil
	}
	var val byte
	if enable {
		val = '1'
	}
	if err := os.WriteFile(file, []byte{val, '\n'}, 0o644); err != nil {
		return false, fmt.Errorf("cannot write IP forwarding setup: %w", err)
	}
	return true, nil
}

That would return a boolean to indicate we changed anything, but an error when failing to either read or update (but that depends a bit on my previous comment).

The above wraps the errors, so we could detect os.IsnotFound() situations (if those are ok to be ignored).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yes ... things won't work properly if IPv6 forwarding is needed and it can't be enabled, and I can't see why the error handling should be different to IPv4 - so I'll do that, thank you.


func TestSetupIPForwarding(t *testing.T) {
// Read current setting and ensure the original value gets restored
defer netnsutils.SetupTestOSContext(t)()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// host, it also sets the iptables filter-FORWARD policy to DROP (unless it's
// told not to).
func TestFilterForwardPolicy(t *testing.T) {
skip.If(t, testEnv.DaemonInfo.OSType == "windows", "no bridge driver or iptables on Windows")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
skip.If(t, testEnv.DaemonInfo.OSType == "windows", "no bridge driver or iptables on Windows")

That's not needed, we're in bridge_linux_test.go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@akerouanton akerouanton dismissed their stale review October 28, 2024 16:48

Wrong button

@robmry robmry force-pushed the 48365_iptables_forward_policy branch 3 times, most recently from 4d3d6e8 to 053c454 Compare October 29, 2024 10:22
}()

// Get current IPv6 all forwarding setup
ipv6ForwardDataAll, err := os.ReadFile(ipv6ForwardConfAll)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's existing code, but curious; we fail this function if we fail to read, but we don't return an error when failing to update it? Could that have situations where we would not actually be updating the config? (Are those save to be ignored?).

}()

// Get current IPv6 all forwarding setup
ipv6ForwardDataAll, err := os.ReadFile(ipv6ForwardConfAll)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like for IPv4 we have a separate utility (configureIPForwarding()). Wondering if (not for this PR), we should have that utility handle the "changed" or "not changed" case, e.g. something like;

func configureIPForwarding(file string, enable bool) (changed bool, _ error) {
	data, err := os.ReadFile(file)
	if err != nil {
		return false, fmt.Errorf("cannot read IP forwarding setup: %w", err)
	}
	if enabled := data[0] == '1'; enabled == enable {
		return false, nil
	}
	var val byte
	if enable {
		val = '1'
	}
	if err := os.WriteFile(file, []byte{val, '\n'}, 0o644); err != nil {
		return false, fmt.Errorf("cannot write IP forwarding setup: %w", err)
	}
	return true, nil
}

That would return a boolean to indicate we changed anything, but an error when failing to either read or update (but that depends a bit on my previous comment).

The above wraps the errors, so we could detect os.IsnotFound() situations (if those are ok to be ignored).

Comment on lines 63 to 70
if err := os.WriteFile(ipv6ForwardConfDefault, []byte{'1', '\n'}, forwardConfPerm); err != nil {
log.G(context.TODO()).Warnf("Unable to enable IPv6 default forwarding: %v", err)
}
configUpdated = true
defer func() {
if retErr != nil {
if err := os.WriteFile(ipv6ForwardConfDefault, []byte{'0', '\n'}, forwardConfPerm); err != nil {
log.G(context.TODO()).WithError(err).Error("Disabling IPv6 default forwarding failed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WriteFile error above is ignored (only logged), but here we try to write to it again in the defer. Wondering if the first write should be returned as well (there's nothing to "undo" if we failed to write, and it's also likely that if we failed to write the first time that it'll fail the second time?)

The configUpdated also may be incorrect here, because if we failed to write, then no config was updated 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yes - I've aligned IPv4/IPv6 error handling as you suggested above.

return procSetting
}

func writeIPForwardingSetting(t *testing.T, chars []byte) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This utility is only used in a single test; possibly (with the other idea of having a utility for this), we could reuse that utility in the tests as well;

_, err := configureIPForwarding(ipv4ForwardConf, true)
if err != nil {
    t.Fatalf(err)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"os"
)

const disableIPv6ConfPerm = 0o644
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use the same const as for IPv4 (forwardConfPerm) or even consider removing the const entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

"UserlandProxyPath": config.BridgeConfig.UserlandProxyPath,
"Rootless": config.Rootless,
"EnableIPForwarding": config.BridgeConfig.EnableIPForward,
"EnableFilterForwardDrop": config.BridgeConfig.EnableFilterForwardDrop,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have docs for this in man/dockerd.8.md wondering if we should look at that in a follow-up 🤔

Is this an option that can be "live reloaded" ? (i.e., enabled/disabled when reloading the daemon config?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the manpage.

For config reload ... it's not, like --ip-forward and --iptables/--ip6tables. I think they're all similar in that they affect host config when a network's created, and that host config won't be un-done if the daemon config is changed.

Perhaps someone might want to fix the filter-FORWARD policy manually after creating a network, then reload daemon config so that it's not changed again when the next network is created. But, I think the same applies to IP forwarding and iptables and it's probably best that they all have the same behaviour?

flags.BoolVar(&conf.BridgeConfig.EnableIP6Tables, "ip6tables", true, "Enable addition of ip6tables rules")
flags.BoolVar(&conf.BridgeConfig.EnableIPForward, "ip-forward", true, "Enable net.ipv4.ip_forward")
flags.BoolVar(&conf.BridgeConfig.EnableIPForward, "ip-forward", true, "Enable IP forwarding in system configuration")
flags.BoolVar(&conf.BridgeConfig.EnableFilterForwardDrop, "ip-filter-forward-drop", true, "Set the filter-FORWARD policy to DROP when enabling IP forwarding")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs an update in man/dockerd.8.md and the corresponding daemon reference docs in docker/cli

Also curious; should enabling this without the other option be an error-condition? Or is it fine to ignore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the manpage.

I think it's ok to ignore - it says it'll only do-something if the daemon enables forwarding itself, and it won't if --ip-forward=false (or if iptables/ip6tables is disabled).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't look at this closely; I just realised we're introducing a new boolean flag, but with true as default. I realise we already have some those, but if we consider this default to be true in the foreseeable future, then perhaps we should have false as default and name the flag accordingly.

Having false makes the UX more usable (instead of --boolean-flag=false), but also can help prevent issues when reading config from daemon.json where a missing field would default to "false" when unmarshaling.

(just a quick comment; about to jump in a call)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yes - the closely related --iptables, --ip6tables and --ip-forward flags all default to true ... I guessed that was probably deliberate! They're all "batteries included" flags ... you need to leave them enabled to make things work, unless you know about the innards and have a good reason to turn them off.

So, I'm not sure a default-false option fits well with those others and, if we do change it, it'd need to be called something like --no-ip-filter-forward-drop. Then it's a bit too double-negative-y, "--no-ip-filter-forward-drop=false" means it will set the filter-FORWARD policy to DROP if it needs to.

(Unless anyone has a better idea for the flag name, it is already a bit cumbersome ... but I thought it was good to start it with "--ip..." so that it's close to those other flags in help text, and I think it'll still make sense for native nftables.)

I don't think there's an issue with merging CLI and daemon.json options, in the same way as for the other options - no option is allowed to be specified in both, if it's in the JSON it'll override the CLI's default. If not present in the JSON or the CLI, the CLI's default wins.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further discussion in Slack, I've changed the option to --ip-forward-no-drop.

No engineers were harmed in the making of this decision!

@robmry robmry force-pushed the 48365_iptables_forward_policy branch 2 times, most recently from 211a4d0 to b632994 Compare November 6, 2024 14:22
@robmry robmry force-pushed the 48365_iptables_forward_policy branch 2 times, most recently from 1ea9b75 to 41c86b8 Compare November 7, 2024 18:43
@robmry
Copy link
Contributor Author

robmry commented Nov 7, 2024

Rebased - and added a check of config.EnableIPv4 before enabling IPv4 forwarding, now that flag has been merged to master.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but found some minor issues in the fourth commit; so ok to merge after those are addressed

man/dockerd.8.md Outdated
[**--insecure-registry**[=*[]*]]
[**--ip**[=*0.0.0.0*]]
[**--ip-forward**[=**true**]]
[**--ip-filter-forward-drop**[=**true**]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you forgot updating this one 🙈

man/dockerd.8.md Outdated
**--ip-forward-no-drop**=**true**|**false**
When **false**, the default, if Docker enables IP forwarding itself (see
**--ip-forward**), and **--iptables** or **--ip6tables** are enabled, it
will also set the default policy for the FORWARD chain in the iptables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit; avoid future tense; this could be "it also sets the ..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - there are a few existing examples of that in the text, I guess we should tidy those too (--default-address-pool, --ip-forward, --ipv6, --userns-remap, zfs.fsname).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, we could make a pass at doing so. We also still need to change our packaging to use the one from this repository in stead of the copy that still lives in docker/cli 🙈

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, there are two docker.md.8's! So we should be keeping them in step for now?

man/dockerd.8.md Outdated
Comment on lines 304 to 305
When **true**, and when IP forwarding is already enabled, Docker will
not modify the default policy of the FORWARD chain.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here; "Docker does not modify ..."

Also make it Linux-only, as the bridge driver is Linux only and
all of the tests had skips for Windows.

Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
Before this change, for IPv4:
- sysctl net.ipv4.ip_forward was enabled during bridge driver
  initialisation, if:
  - not already enabled
  - ip-forward=true, and
  - iptables=true.
- the filter-FORWARD chain's policy was set to DROP, if the daemon
  updated the sysctl.
- if setting the policy failed, the sysctl change was reverted.

But, for IPv6:
- sysctls net.ipv6.conf.[default|all].forwarding were both enabled
  when creating the first IPv6-enabled network, if:
  - they weren't already enabled,
  - ip-forward=true, and
  - ip6tables=true.
- the filter-FORWARD chain's policy was set to DROP when creating
  an IPv4 enabled bridge network (inc. the default bridge), if:
  - ip6tables=true.
  - (this happened whether or not the daemon would ever enable
    IPv6 forwarding, or even create an IPv6 network.)

The bridge driver no longer needs the default policy to be DROP to
implement its own port-filtering rules. But, enabling IP forwarding
without setting the filter-FORWARD policy to DROP would potentially
be a security risk.

This change aligns IPv4 and IPv6 behaviours:
- only try to set the sysctls when creating a bridge network that
  needs them (for IPv4, that's still during daemon init because
  the default bridge is IPv4 enabled).
- only check/set the filter-FORWARD policy after updating sysctls.
- if the filter-FORWARD policy can't be set, treat it as an error
  and revert sysctl changes.

We enabled ip6tables by default in 27.0. Setting the filter-FORWARD
policy to DROP even when no IPv6 enabled network was created
caused issues for some users. In particular, those running with
iptables=false suddenly got the IPv6 DROP policy enabled (which
broke unrelated services on the host). This change solves that by
only setting the policy when necessary.

Signed-off-by: Rob Murray <rob.murray@docker.com>
The daemon no longer depends on the iptables/ip6tables filter-FORWARD
chain's policy being DROP in order to implement its port filtering
rules.

However, if the daemon enables IP forwarding in the host's system
config, by default it will set the policy to DROP to avoid potential
security issues for other applications/networks.

If docker does need to enable IP forwarding, but other applications
on the host require filter-FORWARD's policies to be ACCEPT, this
option can be used to tell the daemon to leave the policy unchanged.
(Equivalent to enabling IP forwarding before starting the daemon,
but without needing to do that.)

Signed-off-by: Rob Murray <rob.murray@docker.com>
Signed-off-by: Rob Murray <rob.murray@docker.com>
@robmry robmry force-pushed the 48365_iptables_forward_policy branch from 41c86b8 to 3dea9fd Compare November 11, 2024 12:13
@thaJeztah thaJeztah merged commit e53b1fa into moby:master Nov 11, 2024
@robmry robmry deleted the 48365_iptables_forward_policy branch November 12, 2024 12:09
aevesdocker pushed a commit to docker/docs that referenced this pull request Feb 20, 2025
## Description

Updates for moby 28.0 networking.

## Related issues or tickets

Series of commits ...
- Fix description of 'inhibit_ipv4'
- Not changed in moby 28.0, updated to clarify difference from (new)
IPv6-only networks.
- Updates to default bridge address config
  - moby/moby#48319
- Describe IPv6-only network config
  - moby/moby#48271
  - docker/cli#5599
- Update description of gateway modes
  - moby/moby#48594
  - moby/moby#48596
  - moby/moby#48597
- Describe gateway selection in the networking overview.
  - docker/cli#5664
- Describe gateway mode `isolated`
  - moby/moby#49262

## Reviews

<!-- Notes for reviewers here -->
<!-- List applicable reviews (optionally @tag reviewers) -->

- [ ] Technical review
- [ ] Editorial review
- [ ] Product review

---------

Signed-off-by: Rob Murray <rob.murray@docker.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Docker IPv6 (ip6tables) support breaks unrelated bridges

3 participants