Refactor 'resolv.conf' generation.#47041
Conversation
e33623e to
1a0ce49
Compare
corhere
left a comment
There was a problem hiding this comment.
We have a perfect opportunity to put the new ResolvConf and related code into an internal package, which would free us from any compatibility promises. We should do that!
libnetwork/resolvconf/resolvconf.go
Outdated
| s.Overrides = append(s.Overrides, "options") | ||
| } | ||
|
|
||
| templText := `{{if .Comments}}{{with .Md.HeaderComment}}{{.}} |
There was a problem hiding this comment.
Thoughts on moving the template to its own file and embedding it into the program?
There was a problem hiding this comment.
Done ... although I'm not really sure it's an improvement, now it's easier to miss changes to members of ResolvConf that are used by the template (because they're not searchable-for in the same file), and those don't show up until the template's used.
libnetwork/resolvconf/resolvconf.go
Outdated
| // WriteFile generates content and writes it to path. If hashPath is non-zero, it | ||
| // also writes a file containing a hash of the content, to enable UserModified() | ||
| // to determine whether the file has been modified. | ||
| func (rc *ResolvConf) WriteFile(path, hashPath string, perm os.FileMode) error { |
There was a problem hiding this comment.
Aside from wrapping a call to rc.Generate(), this method is only tangentially related to the ResolvConf type.
| func (rc *ResolvConf) WriteFile(path, hashPath string, perm os.FileMode) error { | |
| func WriteFile(content []byte, path, hashPath string, perm os.FileMode) error { |
There was a problem hiding this comment.
I put it in the package, wrapping Generate(), for a couple of reasons ...
The old exported package functions use it - so, putting it outside the package would have meant duplicating the generate and write.
But, mostly, sandbox_dns_unix code depends on not-writing the hash file after it's been set up for the internal nameserver ... after that happens, it's never updated. (If it was, it'd break if the container was later connected to the default network.) That wasn't at all obvious, so I wanted to make it explicit by using a write method that requires a missing hash-file path - rather than just a missing update.
Perhaps the mistake was exporting Generate(). For now, I've un-exported it - but that can change if we want to make the package internal and remove the old methods.
There was a problem hiding this comment.
The old exported package functions use it - so, putting it outside the package would have meant duplicating the generate and write.
So? What's the big deal writing a half-dozen lines of similar code in two places?
contents, err := rc.Generate(true)
if err != nil { return err }
if err := somepkg.WriteFile(contents, path, hashPath, 0o644); err != nil {
return err
}
sandbox_dns_unixcode depends on
This isn't sandbox_dns_unix code. It is a distinct exported package. Code that is not sandbox_dns_unix may have no need for writing a digest of a resolv.conf file to disk. It may not even want to write the generated resolv.conf to disk at all! Maybe it wants to write it into a tarball it's writing, or return it in an RPC response.
I would be much less picky about the shape of the API if this code becomes exclusively an implementation detail of sandbox_dns_unix such that we would be free to make breaking changes at any time to type ResolvConf, its methods, or related functions.
libnetwork/resolvconf/resolvconf.go
Outdated
| rc.options = fields[1:] | ||
| default: | ||
| // Copy anything that's not a recognised directive. | ||
| rc.other += line + "\n" |
There was a problem hiding this comment.
Iterative string concatenation is O(n^2). Why not make other a []string and concatenate when rendering the template?
libnetwork/sandbox_dns_unix.go
Outdated
| if len(sb.extDNS) == 0 { | ||
| sb.setExternalResolvers(currRC, resolvconf.IPv4, false) | ||
| // Work out whether ndots has been set from host config or overrides. | ||
| _, sb.ndotsSet = rc.Option("ndots:") |
There was a problem hiding this comment.
The default is equivalent to ndots:1, so should we treat an explicit ndots:1 the same as not setting the option at all?
There was a problem hiding this comment.
Any value in the host's resolv.conf acts as an override for our ndots:0. So, treating ndots:1.
sb.ndotsSet is used in the resolver here ...
Lines 425 to 429 in 43ffb1e
... which looks like a mistake.
The comment says If the user sets ndots > 0 explicitly but the test is r.backend.NdotsSet(), which is true for ndots:0.
It's another thing for a follow-up fix.
| // fmt.Print(string(content)) | ||
| assert.Check(t, is.DeepEqual(string(content), tc.expContent)) | ||
| assert.Check(t, is.DeepEqual(rc.ExtNameServers(), tc.expExtServers, | ||
| cmp.Comparer(func(a, b ExtDNSEntry) bool { return a == b }))) |
There was a problem hiding this comment.
You shouldn't have to supply a custom Comparer unless you need to customize the definition of equality to be something other than == (a.k.a. (reflect.Value).Equal)
There was a problem hiding this comment.
Without it ...
resolvconf_unix_test.go:1143: assertion failed: cannot handle unexported field at {[]resolvconf.ExtDNSEntry}[0].Addr.addr:
"net/netip".Addr
consider using a custom Comparer; if you control the implementation of type, you can also consider using an Exporter, AllowUnexported, or cmpopts.IgnoreUnexported
There was a problem hiding this comment.
Oops, my bad, of course DeepEqual can't use direct == comparison as it wouldn't be able to make a field-by-field diff without recursively comparing fields. Setting a custom comparer for ExtDNSEntry would also prevent it from doing a field-by-field diff of its fields. The documented right answer is cmpopts.EquateComparable(netip.Addr{})
There was a problem hiding this comment.
Ahh - that's handy, thank you ... it's new in go-cmp 0.6.0 - even more handily, that's in master as of last week. So, I'll rebase and do that.
|
Thank you for the review @corhere.
Are you thinking we should remove the old So, we could ditch those functions, it seems unlikely anyone's using the package as it was. But, I'm not sure what the opportunity is... nothing's really changed, we could have made it internal at any point? |
1a0ce49 to
0042ecc
Compare
I would not mind if we removed the old |
a7ed9a3 to
20842f7
Compare
Got it... I've moved the new stuff to I'm not sure if using the same name is a bad idea, but it seems like the right name so I kept it. Happy to change it though. |
20842f7 to
5c875fd
Compare
corhere
left a comment
There was a problem hiding this comment.
Package name libnetwork/internal/resolvconf is exactly what I would have suggested 👍
There was a problem hiding this comment.
I was hoping that we'd get syntax highlighting if this was moved into its own file... but Go templates is not one of the supported syntaxes in https://github.com/github-linguist/linguist so we won't get highlighting on GitHub regardless of file name.
You made a good point about locality. I phrased my review comment as a question because I genuinely was unsure whether it was the right pick and wanted to start a conversation. I support you if you want to move the template back inline.
There was a problem hiding this comment.
Ok, yes - I've put it back inline.
| // SetHeader stores a comment to be included at the top of the generated | ||
| // resolv.conf file. No formatting or checking is done on the string, it must | ||
| // include a '#' (or a ';') in the first column of each line. |
There was a problem hiding this comment.
| // SetHeader stores a comment to be included at the top of the generated | |
| // resolv.conf file. No formatting or checking is done on the string, it must | |
| // include a '#' (or a ';') in the first column of each line. | |
| // SetHeader sets the content to be included verbatim at the top of the | |
| // generated resolv.conf file. No formatting or checking is done on the | |
| // string. It must be valid resolv.conf syntax. (Comments must have '#' | |
| // or ';' in the first column of each line). |
| // OverrideNameServers replaces the current set of nameservers. Invalid | ||
| // addresses are silently discarded (on the assumption that addresses | ||
| // have already been validated, higher up the stack). | ||
| func (rc *ResolvConf) OverrideNameServers(nameServers []string) { |
There was a problem hiding this comment.
| func (rc *ResolvConf) OverrideNameServers(nameServers []string) { | |
| func (rc *ResolvConf) OverrideNameServers(nameServers []netip.Addr) { |
I'd really like to refactor stringly-typed IP addresses out of libnetwork. I figure we can accomplish this iteratively by writing new code to accept only pre-parsed addresses and parsing at the boundary between existing code and new code.
| } | ||
| rc.nameServers = filtered | ||
| if len(rc.nameServers) == 0 { | ||
| log.G(context.TODO()).Info("No non-localhost DNS nameservers are left in resolv.conf. Using default external servers") |
There was a problem hiding this comment.
Is the daemon log necessary, given that a comment will be injected into the rendered resolv.conf with essentially the same message?
There was a problem hiding this comment.
The log message is from the original resolvconf, and it'll still appear for FilterResolvDNS - which doesn't include comments in its output. So, I think it's best to leave it in.
| if errors.Is(err, fs.ErrNotExist) { | ||
| return false, nil | ||
| } | ||
| return false, errors.Wrapf(err, "Failed to read hash file %s", rcHashPath) |
There was a problem hiding this comment.
There was a problem hiding this comment.
Removed the capital letters.
| } | ||
| } | ||
|
|
||
| func a2s(addrs []netip.Addr) []string { |
There was a problem hiding this comment.
It's unfortunate the standard library does not have a generic slices.Map function. (Yet. See https://go.dev/issue/61898) So I whipped up #47209 for you.
var a2s = sliceutil.Mapper(netip.Addr.String)|
|
||
| content, err := os.ReadFile(path) | ||
| assert.NilError(t, err) | ||
| // fmt.Print(string(content)) |
There was a problem hiding this comment.
That's useful; don't comment it out!
| // fmt.Print(string(content)) | |
| t.Logf("%s", content) |
There was a problem hiding this comment.
It was useful for grabbing the expected output when writing the test - but it'll show up in the DeepEqual output if the test fails ... done anyway.
libnetwork/resolvconf/resolvconf.go
Outdated
| // | ||
| // Its functions are implemented in terms of 'libnetwork/internal/resolvconf', which is specialised | ||
| // for use by libnetwork's use-cases. |
There was a problem hiding this comment.
| // | |
| // Its functions are implemented in terms of 'libnetwork/internal/resolvconf', which is specialised | |
| // for use by libnetwork's use-cases. |
What are you doing? The package doc comment is externally visible! Documenting implementation details makes them part of the API contract!
libnetwork/resolvconf/resolvconf.go
Outdated
| defaultPath = "/etc/resolv.conf" | ||
| // alternatePath is a path different from defaultPath, that may be used to resolve DNS. See Path(). | ||
| alternatePath = "/run/systemd/resolve/resolv.conf" | ||
| newrc "github.com/docker/docker/libnetwork/internal/resolvconf" |
There was a problem hiding this comment.
Avoid renaming imports except to avoid a name collision. There is nothing else named resolvconf in this scope.
| newrc "github.com/docker/docker/libnetwork/internal/resolvconf" | |
| "github.com/docker/docker/libnetwork/internal/resolvconf" |
There was a problem hiding this comment.
Ok, I thought referring to (other) resolvconf in resolvconf might be confusing. But, done.
|
|
||
| // with IPv6 enabled, and no non-localhost servers, Google defaults (both IPv4+IPv6) should be added | ||
| ns0 = "\nnameserver 8.8.8.8\nnameserver 8.8.4.4\nnameserver 2001:4860:4860::8888\nnameserver 2001:4860:4860::8844" | ||
| ns0 = "nameserver 8.8.8.8\nnameserver 8.8.4.4\nnameserver 2001:4860:4860::8888\nnameserver 2001:4860:4860::8844\n" |
There was a problem hiding this comment.
You could apply strings.TrimSpace to result.Content to make the tests not be a change-detector for differences in leading and trailing whitespace.
There was a problem hiding this comment.
Done - rejigged that test as a list of testcases.
5c875fd to
b225411
Compare
b225411 to
210fe47
Compare
corhere
left a comment
There was a problem hiding this comment.
Partial review; I probably missed things
fa05a0d to
3faac62
Compare
corhere
left a comment
There was a problem hiding this comment.
Fully reviewed! Overwriting the host's resolv.conf notwithstanding, everything else looks good to me.
| // Replace the test host's resolv.conf with one that only contains a loopback address. | ||
| // TODO(robmry) - there must be a better way! |
There was a problem hiding this comment.
Have resolvconf.Path() return the value of a magic environment variable if set and start a test daemon up with that env var set? Or (this might be too clever to be practical) add an env var to set a "sysroot" path which gets prepended to all hardcoded host-absolute paths which would need to be overridden in tests. I stole this idea from GCC and clang; it's part of their strategy for supporting cross-compilation without the hassle of chroots.
Obviously, the environment variable would be for integration-testing purposes only. We have prior art for using environment variables this way.
Lines 196 to 199 in cff4f20
Another possible approach would be to unshare a mount namespace, bind-mount over /etc/resolv.conf and run a test daemon in that namespace. But the environment variable override idea is easier to implement and reason about.
| Force: true, | ||
| }) | ||
|
|
||
| assert.Check(t, is.Equal(result.Stdout.String(), `# Generated by Docker Engine. |
There was a problem hiding this comment.
This might be a good place to use https://pkg.go.dev/gotest.tools/v3@v3.5.1/golden instead of having to update the expected string manually whenever the template is changed.
There was a problem hiding this comment.
For this single integration test, I think it's probably more fiddly to work out how to run the test with "-update" to generate the expected output than to just replace the text.
But I have now used it for the 40-ish unit tests in libnetwork/internal/resolvconf.
|
|
||
| content, err := os.ReadFile(path) | ||
| assert.NilError(t, err) | ||
| // fmt.Print(string(content)) |
There was a problem hiding this comment.
| // fmt.Print(string(content)) |
|
|
||
| content, err := os.ReadFile(path) | ||
| assert.NilError(t, err) | ||
| // fmt.Print(string(content)) |
There was a problem hiding this comment.
| // fmt.Print(string(content)) |
libnetwork/libnetwork_linux_test.go
Outdated
| t.Helper() | ||
| actual = stripCommentsRE.ReplaceAllLiteral(actual, []byte{}) | ||
| if !bytes.Equal(actual, expected) { | ||
| t.Fatalf("Got:\n%s\nExpected:\n%s", actual, expected) |
There was a problem hiding this comment.
Even if you're avoiding gotest.tools in this file for a reason, you could still call https://pkg.go.dev/github.com/google/go-cmp/cmp#Diff directly to be nice to whoever has to diagnose test failures.
|
|
||
| ctx := setupTest(t) | ||
|
|
||
| // Replace the test host's resolv.conf with one that only contains a loopback address. |
There was a problem hiding this comment.
Yikes! Things could go very wrong if an unsuspecting developer tried to run the integration suite outside of a container. If no better way is implemented, could you add a sanity check to skip this test if it's not run from inside a container?
_, err := os.Stat("/.dockerenv")
skip.If(t, err == nil, "outside a container")There was a problem hiding this comment.
Yes, it's grim - but, not my idea! It was already happening in a few tests ...
moby/libnetwork/libnetwork_linux_test.go
Lines 1799 to 1809 in 810ef4d
moby/libnetwork/libnetwork_linux_test.go
Lines 1876 to 1886 in 810ef4d
moby/libnetwork/libnetwork_linux_test.go
Lines 1954 to 1964 in 810ef4d
So, I'll do the env-var thing, and fix those too.
There was a problem hiding this comment.
Eek, that's really scary! Thank you!
There was a problem hiding this comment.
I deleted test "TestResolvConf" - as I described in now-deleted comments, it wasn't clear what it was trying to test. And, whatever it was trying to test is covered elsewhere.
I tidied up the other two by turning them into a table-driven test ... they didn't need to overwrite the host's resolv.conf or use the new env-var, they could just use libnetwork.OptionOriginResolvConfPath.
3faac62 to
fad457a
Compare
c5fd29e to
cdadd99
Compare
corhere
left a comment
There was a problem hiding this comment.
resolvconf_test.go has tests which depend on global state set up by other tests. We can't have interdependent unit tests.
| // defaultPath. | ||
| func Path() string { | ||
| detectSystemdResolvConfOnce.Do(func() { | ||
| // Allow tests to point at their own resolv.conf file. |
There was a problem hiding this comment.
resolvconf_test.go has tests which depend on global state set up by other tests. We can't have interdependent unit tests.
I don't think so - nothing's sharing the instance of the daemon that's started by the test, which does this...
d := daemon.New(t, daemon.WithEnvVars("DOCKER_TEST_RESOLV_CONF_PATH="+etchostsPath))
d.StartWithBusybox(ctx, t, "--experimental", "--ip6tables")
defer d.Stop(t)
There was a problem hiding this comment.
Did you reply to the right comment? You quoted my review comment in a reply to this unrelated inline comment
There was a problem hiding this comment.
Oh... right - I didn't notice I'd accidentally left in that comment, so your "how" question looked related to the your comment about global state.
I thought defaultPath was the state you were referring to. But, it can't affect other tests, because the daemon's started with the env var set just for this test. Which probably means I missed the point, what global state were you looking at?
(Initially I tried to read the env-var here, to initialise pathAfterSystemdDetection to whatever it said. But, I'd forgotten the path is overridden at daemon level via libnetwork options. So, I put the env-var read there instead, and failed to delete that comment. I'll do that now.)
There was a problem hiding this comment.
what global state were you looking at?
golden.NormalizeCRLFToLF
(Initially I tried to read the env-var here, to initialise
pathAfterSystemdDetectionto whatever it said. But, I'd forgotten the path is overridden at daemon level via libnetwork options. So, I put the env-var read there instead, and failed to delete that comment. I'll do that now.)
Perfect! I was confused by the orphaned comment.
cb65a15 to
e7254f9
Compare
|
Some failures in CI; some of them look to be due to output formatting (empty lines?) |
|
oh! perhaps |
e7254f9 to
d8aedd1
Compare
Yes, seems to be, it's getting a bit irritating! The test works for me locally on Windows, but not here. The I added a There's no Fallbacks I can think of are to go back to not-using |
|
@robmry |
|
@robmry I just discovered this; would that do the job? moby/vendor/gotest.tools/v3/golden/golden.go Lines 32 to 51 in d2f12e6 |
|
🙈 I thought it came op recently, forgot it was here; sorry! |
Replace regex matching/replacement and re-reading of generated files with a simple parser, and struct to remember and manipulate the file content. Annotate the generated file with a header comment saying the file is generated, but can be modified, and a trailing comment describing how the file was generated and listing external nameservers. Always start with the host's resolv.conf file, whether generating config for host networking, or with/without an internal resolver - rather than editing a file previously generated for a different use-case. Resolves an issue where rewrites of the generated file resulted in default IPv6 nameservers being unnecessarily added to the config. Signed-off-by: Rob Murray <rob.murray@docker.com>
d8aedd1 to
beb97f7
Compare
Ohh - ha ... thank you, I couldn't see that at all! |
|
Sorry for necrobumping but I am confused about the intended behavior here. According to the OP it seems that dns entries in daemon.json or passed as options using --dns=.... should always be copied but I am observing that if there is a 127.0.0.53 entry in /etc/resolv.conf it completely ignores any other dns configuration. Is that intentional? |
It doesn't sound right ... but I can't reproduce what you describe. If you think there's a bug, please could you raise it as a new issue, so we can see which build you're using, the commands you're using, and the output you're seeing (including the contents of |
|
Thanks for the quick response, bug report created here: |
|
I had a lengthy discussion with chatGPT about this as well, linking for context (asking some stupid questions just to get it to understand the context, excuse the noise). |

- What I did
The process for
resolv.confgeneration was complicated and did a lot of things that weren't obvious at a glance:Sandbox.setupDNS():--dns,--dns-search,--dns-option) are specified:resolv.confas 'ext nameservers'.FilterResolvDNS()- remove localhost/loopback nameservers, then add default nameservers if none remain.Sandbox.updateDNS():resolv.confhas been modified (hash file mismatch).FilterResolvDNS()on the container's currentresolv.conf. This time, removing IPv6 addresses if the endpoint isn't IPv6-enabled.Sandbox.rebuildDNS():nameservers,options,searchfrom the container's currentresolv.conf.optionsif not already specified.nameserversfrom file container's file again, filtering for only-IPv6 this time.resolv.confcontaining the internal resolver, IPv6nameservers, modifiedoptions, andsearch.Sandbox.updateDNS(), possibly intentionally).Each step is implemented in package
resolvconfusing regular expressions to read and reread the file.The result was a bit unpredictable, and it was difficult to see how a file had been constructed.
#46968 (comment) is an issue with a host
resolv.confcontaining only a localhost resolver (systemd's127.0.0.53), and an IPv6 enabled custom network. In theSandbox.setupDNS()step default IPv4 and IPv6 nameservers were added, but the IPv6 servers (which were never required) were not removed inSandbox.rebuildDNS(). The workaround for that problem works because setting one of the override flags has an unexpectedly dramatic effect on the process.So, refactored that process, with the aim of making a lot of subtle behaviour more explicit.
Annotated the generated file with a trailing comment describing how the file was generated and listing external nameservers (and an header comment that says the file is generated). For example:
Changes in behaviour are:
resolv.confhas been edited, the file won't be modified to include the internal resolver.resolv.confmodified, then connected to a custom network. But, theresolv.confupdate was likely to have been for a reason, and this won't be an issue once the internal resolver's started for the default bridge network.resolv.confif the sandbox currently has any IPv6 endpoints (previously, because of the way generated config was rewritten, this depended only on the first endpoint to have joined)./etc/resolv.confonly contains loopback addresses, and a--dns-searchor--dns-optionis given, the loopback addresses end up in the container's config, where they won't work. The default nameservers are now used instead.resolv.conffile,;is now treated as a a line-comment marker, as well as#.- How I did it
Replace regex matching/replacement and re-reading of generated files with a simple parser, and struct to remember and manipulate the file content.
Always start with the host's resolv.conf file, whether generating config for host networking, or with/without an internal resolver - rather than editing a file previously generated file.
Maintained the
resolvconfinterface for by rewriting functions in terms of the new struct/methods. Added convenience functions that do the necessary manipulation for legacy networking and the internal resolver (simplifying the sandbox code).I've tried to preserve existing quirks, with TODO comments. For example:
optionsshould accumulate over multiple directives.resolv.confstill depends on the order of endpoints joining.resolv.confare lines with#or;in the first column. Currently the rest of a line is ignored wherever they appear (which means we could miss directives the host will obey, particularly inoptions).Explicitly parcelling up the legacy-network behaviour should make it easier/safer to remove.
- How to verify it
New unit tests, and integration/regression test for the closed ticket.
- Description for the changelog
Refactor 'resolv.conf' generation.