18.09 breaks containers name resolution for non default networks on systems with systemd-resolved #38243

@superseed

Description

Briefly:

After upgrading from 18.06 to 18.09 on a system with systemd-resolved, containers attached to a bridge network other than the default docker bridge can't resolve some names. Those names are the ones that can only be resolved through a nameserver that is not "the first one".

More details from investigating:

Some background first: I've been using non-default bridge networks almost exclusively because of the known bad interaction between systemd-resolved's /etc/resolv.conf and the default docker network. In my case, I have a 192.168.* nameserver accessible through one interface, and another nameserver accessible through a vpn interface, resolving names internal to the vpn. With the default bridge, I would invariably end up with 8.8.8.8 in my container's /etc/resolv.conf, which meant no internal name resolution. With bridges other than the default, however, this was fixed perfectly thanks to docker's amazing internal DNS feature, which delegated to my host's resolver, systemd-resolved on 127.0.0.53, as instructed by my host's /etc/resolv.conf.
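To illustrate why the default bridge ends up with 8.8.8.8, here is a minimal Python sketch of the behavior described above. It is not docker's actual code: the function name `effective_nameservers` and the single-entry fallback list are assumptions made for illustration. The idea is that a loopback resolver such as 127.0.0.53 is not reachable from inside the container's network namespace, so it gets dropped, and a public default is substituted when nothing is left.

```python
import ipaddress

# Assumed fallback for illustration; the report only mentions 8.8.8.8.
FALLBACK = ["8.8.8.8"]


def effective_nameservers(resolv_conf_text, fallback=FALLBACK):
    """Return the non-loopback nameservers found in resolv.conf text.

    If every nameserver is a loopback address (or none is listed),
    return a copy of the fallback list instead.
    """
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue  # skip comments, "search", "options", blank lines
        address = fields[1].split("%")[0]  # drop an IPv6 zone id if present
        try:
            if ipaddress.ip_address(address).is_loopback:
                continue  # e.g. 127.0.0.53, useless inside the container
        except ValueError:
            continue  # ignore malformed addresses
        servers.append(fields[1])
    return servers or list(fallback)
```

With a host file that only lists 127.0.0.53, the function returns the fallback, which matches the symptom on the default bridge: the vpn nameserver never reaches the container.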

Now, as an aside, I see that the bad interaction between systemd-resolved and the default network has been "fixed" in 18.09 by copying systemd-resolved's stub resolv.conf instead of the one in /etc/ with 127.0.0.53 and stripping the local resolver. This is nice, but it won't solve my problem (two interfaces with one nameserver each), because the stub resolv.conf is incomplete due to limitations of this config format: it ends up containing the first nameserver (192.168.*) and not the vpn interface one. I get half of the cookies, basically. But it's an improvement anyway, and I know it's impossible to fix completely without changing the default network's behavior (using the internal DNS).

To get to the point: I don't know if docker's internal DNS behavior was borked while implementing the aforementioned fix, but I notice that name resolution requests emitted by the internal DNS (so, coming from a container on a non-default bridge network) are now no longer addressed to 127.0.0.53 as they should be (as per my host's /etc/resolv.conf), but rather to the 192.168.* nameserver, as per the incomplete stub resolv.conf. This means name resolution for names accessible only through the second nameserver is broken. In essence: the internal DNS uses the wrong nameserver; it should use the one in /etc/resolv.conf.
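For anyone trying to confirm the same situation on their host, a small Python sketch along these lines can show which nameservers each file advertises. The path /run/systemd/resolve/resolv.conf is an assumption on my part for the file referred to above as the stub resolv.conf; adjust it if your system differs. The helper names are hypothetical.

```python
from pathlib import Path

HOST_RESOLV_CONF = "/etc/resolv.conf"
# Assumed location of the systemd-resolved generated file; not stated in the report.
RESOLVED_RESOLV_CONF = "/run/systemd/resolve/resolv.conf"


def parse_nameservers(text):
    """Return the addresses of all 'nameserver' lines, in file order."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_local_stub(servers):
    """True if any listed nameserver is in 127.0.0.0/8 (e.g. 127.0.0.53)."""
    return any(server.startswith("127.") for server in servers)


def read_nameservers(path):
    """Parse a resolv.conf style file, returning [] if it does not exist."""
    file = Path(path)
    return parse_nameservers(file.read_text()) if file.exists() else []
```

Calling `read_nameservers(HOST_RESOLV_CONF)` and `read_nameservers(RESOLVED_RESOLV_CONF)` on the host and comparing the two lists should show whether /etc/resolv.conf points at the local resolver while the other file lists only some of the per-interface nameservers.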

Final remark

Could we please have a daemon.json option to activate the same behavior on the default bridge as non-default bridges? I understand that breakage needs to be avoided and hence this behavior can't be had by default, but it would be nice to be able to ask the daemon for it. In my case, it would make building images much smoother :)

Steps to reproduce the issue:

  1. Upgrade 18.06 to 18.09 on a systemd-resolved based system with multiple nameservers (on different interfaces?)
  2. Start a container on a non-default bridge network
  3. Try resolving names accessible only from one or the other nameserver
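Step 3 can be scripted from inside the container with a short Python sketch like the one below. The hostnames in the usage note are placeholders, not real names from my setup, and `check_names` is a hypothetical helper; it relies only on the standard library's `socket.getaddrinfo`.

```python
import socket


def check_names(names, resolve=socket.getaddrinfo):
    """Try to resolve each name and return a {name: resolved?} mapping.

    `resolve` defaults to socket.getaddrinfo and can be swapped out
    for testing without network access.
    """
    results = {}
    for name in names:
        try:
            resolve(name, None)  # port is irrelevant for a lookup
            results[name] = True
        except socket.gaierror:
            results[name] = False  # resolution failed for this name
    return results
```

For example, `check_names(["host.lan.example", "pypi.vpn.example"])` (placeholder names) run in a container on a non-default bridge network would be expected to report the vpn-only name as unresolved on 18.09 if the container is affected.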

Describe the results you received:

In my case, it notably breaks pip install, because the internal pypi is only resolvable through the vpn nameserver.

Describe the results you expected:

Not having a name resolution error!

Output of docker version:

Client:
 Version:           18.09.0-ce
 API version:       1.39
 Go version:        go1.11.2
 Git commit:        4d60db472b
 Built:             Fri Nov  9 00:05:34 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.09.0-ce
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.11.2
  Git commit:       4d60db472b
  Built:            Fri Nov  9 00:05:11 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

Containers: 30
 Running: 1
 Paused: 0
 Stopped: 29
Images: 462
Server Version: 18.09.0-ce
Storage Driver: btrfs
 Build Version: Btrfs v4.19 
 Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b.m
runc version: 079817cc26ec5292ac375bb9f47f373d33574949
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.19.2-arch1-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.43GiB
Name: tomfoolery
ID: 5OC7:CGAE:42HJ:OE2W:Y5HG:GFUR:4QBD:XKF3:5V5S:OXDQ:PEUR:5PP6
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
