Skip to content

Refactor 'resolv.conf' generation.#47041

Merged
thaJeztah merged 1 commit intomoby:masterfrom
robmry:46968_refactor_resolvconf
Feb 29, 2024
Merged

Refactor 'resolv.conf' generation.#47041
thaJeztah merged 1 commit intomoby:masterfrom
robmry:46968_refactor_resolvconf

Conversation

@robmry
Copy link
Contributor

@robmry robmry commented Jan 8, 2024

- What I did

The process for resolv.conf generation was complicated and did a lot of things that weren't obvious at a glance:

  • On sandbox creation, Sandbox.setupDNS():
    • If host-networking, and no DNS overrides (--dns, --dns-search, --dns-option) are specified:
      • copy the host's resolv.conf.
    • Else if any overrides:
      • write a new file containing the given overrides, plus values regex'd from the file for the not-given options
      • save nameservers as 'ext nameservers', in case an internal resolver is started later.
    • Else:
      • Remember resolvers from the host's resolv.conf as 'ext nameservers'.
      • FilterResolvDNS() - remove localhost/loopback nameservers, then add default nameservers if none remain.
    • Write the container's file, and its hash.
  • When an endpoint joins the sandbox, Sandbox.updateDNS():
    • Do nothing if any overrides were specified.
    • Do nothing if the container's resolv.conf has been modified (hash file mismatch).
    • Run FilterResolvDNS() on the container's current resolv.conf. This time, removing IPv6 addresses if the endpoint isn't IPv6-enabled.
    • Write the container's file, and its hash.
  • When the internal resolver starts, Sandbox.rebuildDNS():
    • Unconditionally (whether or not the file has been modified)...
    • Read nameservers, options, search from the container's current resolv.conf.
    • Add "ndots:0" to the options if not already specified.
    • If no external resolvers have already been set up, use IPv4 nameservers read from the file.
    • Read nameservers from file container's file again, filtering for only-IPv6 this time.
    • Build a new resolv.conf containing the internal resolver, IPv6 nameservers, modified options, and search.
    • Write the file but not its hash (preventing further updates by Sandbox.updateDNS(), possibly intentionally).

Each step is implemented in package resolvconf using regular expressions to read and reread the file.

The result was a bit unpredictable, and it was difficult to see how a file had been constructed.

#46968 (comment) is an issue with a host resolv.conf containing only a localhost resolver (systemd's 127.0.0.53), and an IPv6 enabled custom network. In the Sandbox.setupDNS() step default IPv4 and IPv6 nameservers were added, but the IPv6 servers (which were never required) were not removed in Sandbox.rebuildDNS(). The workaround for that problem works because setting one of the override flags has an unexpectedly dramatic effect on the process.

So, refactored that process, with the aim of making a lot of subtle behaviour more explicit.

Annotated the generated file with a trailing comment describing how the file was generated and listing external nameservers (and an header comment that says the file is generated). For example:

# Generated by Docker Engine.
# This file can be edited; Docker Engine will not make further changes once it
# has been modified.

nameserver 127.0.0.11
options ndots:0

# Based on host file: '/etc/resolv.conf' (internal resolver)
# ExtServers: [host(127.0.0.53)]
# Overrides: []
# Option ndots from: internal

Changes in behaviour are:

  • Format of the container's resolv.conf file, in all cases.
  • If the internal resolver is started after the container's resolv.conf has been edited, the file won't be modified to include the internal resolver.
    • This could affect a container that was connected to the default bridge, had its resolv.conf modified, then connected to a custom network. But, the resolv.conf update was likely to have been for a reason, and this won't be an issue once the internal resolver's started for the default bridge network.
  • When the internal resolver is started, IPv6 addresses are left in the container's resolv.conf if the sandbox currently has any IPv6 endpoints (previously, because of the way generated config was rewritten, this depended only on the first endpoint to have joined).
  • For legacy networking with no internal resolver, if the host's /etc/resolv.conf only contains loopback addresses, and a --dns-search or --dns-option is given, the loopback addresses end up in the container's config, where they won't work. The default nameservers are now used instead.
  • In the resolv.conf file, ; is now treated as a a line-comment marker, as well as #.
  • Fixes Erroneously filtering out non-local IPv6 DNS servers when using user-defined networks #46968 ... rewrites of the generated file resulted in default IPv6 nameservers being unnecessarily added to the config.

- How I did it

Replace regex matching/replacement and re-reading of generated files with a simple parser, and struct to remember and manipulate the file content.

Always start with the host's resolv.conf file, whether generating config for host networking, or with/without an internal resolver - rather than editing a file previously generated file.

Maintained the resolvconf interface for by rewriting functions in terms of the new struct/methods. Added convenience functions that do the necessary manipulation for legacy networking and the internal resolver (simplifying the sandbox code).

I've tried to preserve existing quirks, with TODO comments. For example:

  • options should accumulate over multiple directives.
  • We should probably use IPv6 nameservers as upstream servers for the internal resolver, rather than leave them in the container's config.
  • The IPv6-ness of the container's resolv.conf still depends on the order of endpoints joining.
  • Only loopback addresses from the host's config are looked up in the host's namespace.
  • Comments in a resolv.conf are lines with # or ; in the first column. Currently the rest of a line is ignored wherever they appear (which means we could miss directives the host will obey, particularly in options).

Explicitly parcelling up the legacy-network behaviour should make it easier/safer to remove.

- How to verify it

New unit tests, and integration/regression test for the closed ticket.

- Description for the changelog

Refactor 'resolv.conf' generation.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch 2 times, most recently from e33623e to 1a0ce49 Compare January 8, 2024 16:35
@robmry robmry marked this pull request as ready for review January 8, 2024 17:42
@robmry robmry self-assigned this Jan 8, 2024
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a perfect opportunity to put the new ResolvConf and related code into an internal package, which would free us from any compatibility promises. We should do that!

s.Overrides = append(s.Overrides, "options")
}

templText := `{{if .Comments}}{{with .Md.HeaderComment}}{{.}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on moving the template to its own file and embedding it into the program?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done ... although I'm not really sure it's an improvement, now it's easier to miss changes to members of ResolvConf that are used by the template (because they're not searchable-for in the same file), and those don't show up until the template's used.

// WriteFile generates content and writes it to path. If hashPath is non-zero, it
// also writes a file containing a hash of the content, to enable UserModified()
// to determine whether the file has been modified.
func (rc *ResolvConf) WriteFile(path, hashPath string, perm os.FileMode) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from wrapping a call to rc.Generate(), this method is only tangentially related to the ResolvConf type.

Suggested change
func (rc *ResolvConf) WriteFile(path, hashPath string, perm os.FileMode) error {
func WriteFile(content []byte, path, hashPath string, perm os.FileMode) error {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it in the package, wrapping Generate(), for a couple of reasons ...

The old exported package functions use it - so, putting it outside the package would have meant duplicating the generate and write.

But, mostly, sandbox_dns_unix code depends on not-writing the hash file after it's been set up for the internal nameserver ... after that happens, it's never updated. (If it was, it'd break if the container was later connected to the default network.) That wasn't at all obvious, so I wanted to make it explicit by using a write method that requires a missing hash-file path - rather than just a missing update.

Perhaps the mistake was exporting Generate(). For now, I've un-exported it - but that can change if we want to make the package internal and remove the old methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old exported package functions use it - so, putting it outside the package would have meant duplicating the generate and write.

So? What's the big deal writing a half-dozen lines of similar code in two places?

contents, err := rc.Generate(true)
if err != nil { return err }
if err := somepkg.WriteFile(contents, path, hashPath, 0o644); err != nil {
    return err
}

sandbox_dns_unix code depends on

This isn't sandbox_dns_unix code. It is a distinct exported package. Code that is not sandbox_dns_unix may have no need for writing a digest of a resolv.conf file to disk. It may not even want to write the generated resolv.conf to disk at all! Maybe it wants to write it into a tarball it's writing, or return it in an RPC response.

I would be much less picky about the shape of the API if this code becomes exclusively an implementation detail of sandbox_dns_unix such that we would be free to make breaking changes at any time to type ResolvConf, its methods, or related functions.

rc.options = fields[1:]
default:
// Copy anything that's not a recognised directive.
rc.other += line + "\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterative string concatenation is O(n^2). Why not make other a []string and concatenate when rendering the template?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if len(sb.extDNS) == 0 {
sb.setExternalResolvers(currRC, resolvconf.IPv4, false)
// Work out whether ndots has been set from host config or overrides.
_, sb.ndotsSet = rc.Option("ndots:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is equivalent to ndots:1, so should we treat an explicit ndots:1 the same as not setting the option at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any value in the host's resolv.conf acts as an override for our ndots:0. So, treating ndots:1.

sb.ndotsSet is used in the resolver here ...

moby/libnetwork/resolver.go

Lines 425 to 429 in 43ffb1e

// If the user sets ndots > 0 explicitly and the query is
// in the root domain don't forward it out. We will return
// failure and let the client retry with the search domain
// attached.
if (queryType == dns.TypeA || queryType == dns.TypeAAAA) && r.backend.NdotsSet() &&

... which looks like a mistake.

The comment says If the user sets ndots > 0 explicitly but the test is r.backend.NdotsSet(), which is true for ndots:0.

It's another thing for a follow-up fix.

// fmt.Print(string(content))
assert.Check(t, is.DeepEqual(string(content), tc.expContent))
assert.Check(t, is.DeepEqual(rc.ExtNameServers(), tc.expExtServers,
cmp.Comparer(func(a, b ExtDNSEntry) bool { return a == b })))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't have to supply a custom Comparer unless you need to customize the definition of equality to be something other than == (a.k.a. (reflect.Value).Equal)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it ...

    resolvconf_unix_test.go:1143: assertion failed: cannot handle unexported field at {[]resolvconf.ExtDNSEntry}[0].Addr.addr:
        	"net/netip".Addr
        consider using a custom Comparer; if you control the implementation of type, you can also consider using an Exporter, AllowUnexported, or cmpopts.IgnoreUnexported

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my bad, of course DeepEqual can't use direct == comparison as it wouldn't be able to make a field-by-field diff without recursively comparing fields. Setting a custom comparer for ExtDNSEntry would also prevent it from doing a field-by-field diff of its fields. The documented right answer is cmpopts.EquateComparable(netip.Addr{})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh - that's handy, thank you ... it's new in go-cmp 0.6.0 - even more handily, that's in master as of last week. So, I'll rebase and do that.

@robmry
Copy link
Contributor Author

robmry commented Jan 24, 2024

Thank you for the review @corhere.

We have a perfect opportunity to put the new ResolvConf and related code into an internal package, which would free us from any compatibility promises. We should do that!

Are you thinking we should remove the old resolvconf package? At the moment, I've preserved its exported functions because I thought we were stuck with them. If I move their new implementation into an internal package, they won't be usable.

So, we could ditch those functions, it seems unlikely anyone's using the package as it was. But, I'm not sure what the opportunity is... nothing's really changed, we could have made it internal at any point?

@robmry robmry force-pushed the 46968_refactor_resolvconf branch from 1a0ce49 to 0042ecc Compare January 24, 2024 16:08
@corhere
Copy link
Contributor

corhere commented Jan 24, 2024

Are you thinking we should remove the old resolvconf package? At the moment, I've preserved its exported functions because I thought we were stuck with them. If I move their new implementation into an internal package, they won't be usable.

I would not mind if we removed the old resolvconf package, but I was assuming the premise that we were stuck with them. We can move the new code (type ResolvConf struct and friends) into an internal package, import the internal package into the existing resolvconf package, and implement the existing resolvconf API (FilterResolvDNS, GetNameservers, etc.) in terms of the internal package.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch 2 times, most recently from a7ed9a3 to 20842f7 Compare January 24, 2024 20:27
@robmry
Copy link
Contributor Author

robmry commented Jan 24, 2024

I would not mind if we removed the old resolvconf package, but I was assuming the premise that we were stuck with them. We can move the new code (type ResolvConf struct and friends) into an internal package, import the internal package into the existing resolvconf package, and implement the existing resolvconf API (FilterResolvDNS, GetNameservers, etc.) in terms of the internal package.

Got it... I've moved the new stuff to libnetwork/internal/resolvconf.

I'm not sure if using the same name is a bad idea, but it seems like the right name so I kept it. Happy to change it though.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch from 20842f7 to 5c875fd Compare January 24, 2024 20:50
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package name libnetwork/internal/resolvconf is exactly what I would have suggested 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping that we'd get syntax highlighting if this was moved into its own file... but Go templates is not one of the supported syntaxes in https://github.com/github-linguist/linguist so we won't get highlighting on GitHub regardless of file name.

You made a good point about locality. I phrased my review comment as a question because I genuinely was unsure whether it was the right pick and wanted to start a conversation. I support you if you want to move the template back inline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, yes - I've put it back inline.

Comment on lines +142 to +144
// SetHeader stores a comment to be included at the top of the generated
// resolv.conf file. No formatting or checking is done on the string, it must
// include a '#' (or a ';') in the first column of each line.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SetHeader stores a comment to be included at the top of the generated
// resolv.conf file. No formatting or checking is done on the string, it must
// include a '#' (or a ';') in the first column of each line.
// SetHeader sets the content to be included verbatim at the top of the
// generated resolv.conf file. No formatting or checking is done on the
// string. It must be valid resolv.conf syntax. (Comments must have '#'
// or ';' in the first column of each line).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// OverrideNameServers replaces the current set of nameservers. Invalid
// addresses are silently discarded (on the assumption that addresses
// have already been validated, higher up the stack).
func (rc *ResolvConf) OverrideNameServers(nameServers []string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (rc *ResolvConf) OverrideNameServers(nameServers []string) {
func (rc *ResolvConf) OverrideNameServers(nameServers []netip.Addr) {

I'd really like to refactor stringly-typed IP addresses out of libnetwork. I figure we can accomplish this iteratively by writing new code to accept only pre-parsed addresses and parsing at the boundary between existing code and new code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good plan, done.

}
rc.nameServers = filtered
if len(rc.nameServers) == 0 {
log.G(context.TODO()).Info("No non-localhost DNS nameservers are left in resolv.conf. Using default external servers")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the daemon log necessary, given that a comment will be injected into the rendered resolv.conf with essentially the same message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message is from the original resolvconf, and it'll still appear for FilterResolvDNS - which doesn't include comments in its output. So, I think it's best to leave it in.

if errors.Is(err, fs.ErrNotExist) {
return false, nil
}
return false, errors.Wrapf(err, "Failed to read hash file %s", rcHashPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the capital letters.

}
}

func a2s(addrs []netip.Addr) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate the standard library does not have a generic slices.Map function. (Yet. See https://go.dev/issue/61898) So I whipped up #47209 for you.

var a2s = sliceutil.Mapper(netip.Addr.String)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Done.


content, err := os.ReadFile(path)
assert.NilError(t, err)
// fmt.Print(string(content))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's useful; don't comment it out!

Suggested change
// fmt.Print(string(content))
t.Logf("%s", content)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was useful for grabbing the expected output when writing the test - but it'll show up in the DeepEqual output if the test fails ... done anyway.

Comment on lines +2 to +4
//
// Its functions are implemented in terms of 'libnetwork/internal/resolvconf', which is specialised
// for use by libnetwork's use-cases.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//
// Its functions are implemented in terms of 'libnetwork/internal/resolvconf', which is specialised
// for use by libnetwork's use-cases.

What are you doing? The package doc comment is externally visible! Documenting implementation details makes them part of the API contract!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right. Removed.

defaultPath = "/etc/resolv.conf"
// alternatePath is a path different from defaultPath, that may be used to resolve DNS. See Path().
alternatePath = "/run/systemd/resolve/resolv.conf"
newrc "github.com/docker/docker/libnetwork/internal/resolvconf"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid renaming imports except to avoid a name collision. There is nothing else named resolvconf in this scope.

Suggested change
newrc "github.com/docker/docker/libnetwork/internal/resolvconf"
"github.com/docker/docker/libnetwork/internal/resolvconf"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought referring to (other) resolvconf in resolvconf might be confusing. But, done.


// with IPv6 enabled, and no non-localhost servers, Google defaults (both IPv4+IPv6) should be added
ns0 = "\nnameserver 8.8.8.8\nnameserver 8.8.4.4\nnameserver 2001:4860:4860::8888\nnameserver 2001:4860:4860::8844"
ns0 = "nameserver 8.8.8.8\nnameserver 8.8.4.4\nnameserver 2001:4860:4860::8888\nnameserver 2001:4860:4860::8844\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could apply strings.TrimSpace to result.Content to make the tests not be a change-detector for differences in leading and trailing whitespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - rejigged that test as a list of testcases.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch from 5c875fd to b225411 Compare January 25, 2024 11:59
@thaJeztah thaJeztah added status/2-code-review kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. area/networking Networking area/networking/dns Networking labels Jan 25, 2024
@robmry robmry force-pushed the 46968_refactor_resolvconf branch from b225411 to 210fe47 Compare January 25, 2024 12:44
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review; I probably missed things

@robmry robmry force-pushed the 46968_refactor_resolvconf branch 2 times, most recently from fa05a0d to 3faac62 Compare January 30, 2024 18:30
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully reviewed! Overwriting the host's resolv.conf notwithstanding, everything else looks good to me.

Comment on lines +27 to +28
// Replace the test host's resolv.conf with one that only contains a loopback address.
// TODO(robmry) - there must be a better way!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have resolvconf.Path() return the value of a magic environment variable if set and start a test daemon up with that env var set? Or (this might be too clever to be practical) add an env var to set a "sysroot" path which gets prepended to all hardcoded host-absolute paths which would need to be overridden in tests. I stole this idea from GCC and clang; it's part of their strategy for supporting cross-compilation without the hassle of chroots.

Obviously, the environment variable would be for integration-testing purposes only. We have prior art for using environment variables this way.

// This is a temporary environment variables used in CI to allow pushing
// manifest v2 schema 1 images to test-registries used for testing *pulling*
// these images.
if os.Getenv("DOCKER_ALLOW_SCHEMA1_PUSH_DONOTUSE") == "" {

Another possible approach would be to unshare a mount namespace, bind-mount over /etc/resolv.conf and run a test daemon in that namespace. But the environment variable override idea is easier to implement and reason about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Force: true,
})

assert.Check(t, is.Equal(result.Stdout.String(), `# Generated by Docker Engine.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a good place to use https://pkg.go.dev/gotest.tools/v3@v3.5.1/golden instead of having to update the expected string manually whenever the template is changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this single integration test, I think it's probably more fiddly to work out how to run the test with "-update" to generate the expected output than to just replace the text.

But I have now used it for the 40-ish unit tests in libnetwork/internal/resolvconf.


content, err := os.ReadFile(path)
assert.NilError(t, err)
// fmt.Print(string(content))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// fmt.Print(string(content))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


content, err := os.ReadFile(path)
assert.NilError(t, err)
// fmt.Print(string(content))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// fmt.Print(string(content))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

t.Helper()
actual = stripCommentsRE.ReplaceAllLiteral(actual, []byte{})
if !bytes.Equal(actual, expected) {
t.Fatalf("Got:\n%s\nExpected:\n%s", actual, expected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if you're avoiding gotest.tools in this file for a reason, you could still call https://pkg.go.dev/github.com/google/go-cmp/cmp#Diff directly to be nice to whoever has to diagnose test failures.


ctx := setupTest(t)

// Replace the test host's resolv.conf with one that only contains a loopback address.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! Things could go very wrong if an unsuspecting developer tried to run the integration suite outside of a container. If no better way is implemented, could you add a sanity check to skip this test if it's not run from inside a container?

_, err := os.Stat("/.dockerenv")
skip.If(t, err == nil, "outside a container")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's grim - but, not my idea! It was already happening in a few tests ...

// take a copy of resolv.conf for restoring after test completes
resolvConfSystem, err := os.ReadFile("/etc/resolv.conf")
if err != nil {
t.Fatal(err)
}
// cleanup
defer func() {
if err := os.WriteFile("/etc/resolv.conf", resolvConfSystem, 0o644); err != nil {
t.Fatal(err)
}
}()

// take a copy of resolv.conf for restoring after test completes
resolvConfSystem, err := os.ReadFile("/etc/resolv.conf")
if err != nil {
t.Fatal(err)
}
// cleanup
defer func() {
if err := os.WriteFile("/etc/resolv.conf", resolvConfSystem, 0o644); err != nil {
t.Fatal(err)
}
}()

// take a copy of resolv.conf for restoring after test completes
resolvConfSystem, err := os.ReadFile("/etc/resolv.conf")
if err != nil {
t.Fatal(err)
}
// cleanup
defer func() {
if err := os.WriteFile("/etc/resolv.conf", resolvConfSystem, 0o644); err != nil {
t.Fatal(err)
}
}()

So, I'll do the env-var thing, and fix those too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eek, that's really scary! Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted test "TestResolvConf" - as I described in now-deleted comments, it wasn't clear what it was trying to test. And, whatever it was trying to test is covered elsewhere.

I tidied up the other two by turning them into a table-driven test ... they didn't need to overwrite the host's resolv.conf or use the new env-var, they could just use libnetwork.OptionOriginResolvConfPath.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch from 3faac62 to fad457a Compare February 6, 2024 10:21
@robmry robmry force-pushed the 46968_refactor_resolvconf branch 2 times, most recently from c5fd29e to cdadd99 Compare February 6, 2024 12:59
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolvconf_test.go has tests which depend on global state set up by other tests. We can't have interdependent unit tests.

// defaultPath.
func Path() string {
detectSystemdResolvConfOnce.Do(func() {
// Allow tests to point at their own resolv.conf file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How? defaultPath is const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolvconf_test.go has tests which depend on global state set up by other tests. We can't have interdependent unit tests.

I don't think so - nothing's sharing the instance of the daemon that's started by the test, which does this...

	d := daemon.New(t, daemon.WithEnvVars("DOCKER_TEST_RESOLV_CONF_PATH="+etchostsPath))
	d.StartWithBusybox(ctx, t, "--experimental", "--ip6tables")
	defer d.Stop(t)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you reply to the right comment? You quoted my review comment in a reply to this unrelated inline comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... right - I didn't notice I'd accidentally left in that comment, so your "how" question looked related to the your comment about global state.

I thought defaultPath was the state you were referring to. But, it can't affect other tests, because the daemon's started with the env var set just for this test. Which probably means I missed the point, what global state were you looking at?

(Initially I tried to read the env-var here, to initialise pathAfterSystemdDetection to whatever it said. But, I'd forgotten the path is overridden at daemon level via libnetwork options. So, I put the env-var read there instead, and failed to delete that comment. I'll do that now.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what global state were you looking at?

golden.NormalizeCRLFToLF

(Initially I tried to read the env-var here, to initialise pathAfterSystemdDetection to whatever it said. But, I'd forgotten the path is overridden at daemon level via libnetwork options. So, I put the env-var read there instead, and failed to delete that comment. I'll do that now.)

Perfect! I was confused by the orphaned comment.

@robmry robmry force-pushed the 46968_refactor_resolvconf branch 2 times, most recently from cb65a15 to e7254f9 Compare February 6, 2024 19:57
Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🎉

@thaJeztah
Copy link
Member

Some failures in CI; some of them look to be due to output formatting (empty lines?)

=== FAIL: github.com/docker/docker/libnetwork/internal/resolvconf TestRCTransformForIntNS (0.08s)

=== FAIL: github.com/docker/docker/libnetwork/internal/resolvconf TestRCInvalidNS (0.00s)
    resolvconf_test.go:541: assertion failed: 
        --- expected
        +++ actual
        @@ -1,5 +1,5 @@
        -↵
        -#·Based·on·host·file:·''↵
        -#·Invalid·nameservers:·[1.2.3.4.5]↵
        -#·Overrides:·[]↵
        +
        +#·Based·on·host·file:·''
        +#·Invalid·nameservers:·[1.2.3.4.5]
        +#·Overrides:·[]
         
        
        
        You can run 'go test . -update' to automatically update testdata\TestRCInvalidNS.golden to the new expected value.'

@thaJeztah
Copy link
Member

oh! perhaps LF vs CRLF?

@robmry robmry force-pushed the 46968_refactor_resolvconf branch from e7254f9 to d8aedd1 Compare February 6, 2024 21:30
@robmry
Copy link
Contributor Author

robmry commented Feb 6, 2024

oh! perhaps LF vs CRLF?

Yes, seems to be, it's getting a bit irritating! The test works for me locally on Windows, but not here.

The symbols in the diff output are \r characters, so they're in the expected output ... but they weren't in the original .golden files. I guess they were added by the git-checkout on Windows.

I added a .gitattributes to try to sort that out, but it didn't help - perhaps because the files were already checked out on the test runner by then. So, with that file in place, I tried adding in the carriage-returns an arbitrary whitespace change to get the files re-checked-out but hopefully not modified because of the .gitattributes. That didn't work.

There's no .git/info/attributes, which would override the .gitattributes. So, I'm not sure what's happening. (The change I just made removes the \r chars again, because they shouldn't have been there anyway - but not expecting the test to pass.)

Fallbacks I can think of are to go back to not-using golden files, or skipIf Windows. But neither of those are good plans. Maybe I'll have a better idea by tomorrow.

@corhere
Copy link
Contributor

corhere commented Feb 6, 2024

@robmry .gitattributes works better when it's not spelled with three consecutive "t"s.
Screenshot 2024-02-06 at 4 54 17 PM

@thaJeztah
Copy link
Member

@robmry I just discovered this; would that do the job?

// NormalizeCRLFToLF enables end-of-line normalization for actual values passed
// to Assert and String, as well as the values saved to golden files with
// -update.
//
// Defaults to true. If you use the core.autocrlf=true git setting on windows
// you will need to set this to false.
//
// The value may be set to false by setting GOTESTTOOLS_GOLDEN_NormalizeCRLFToLF=false
// in the environment before running tests.
//
// The default value may change in a future major release.
//
// This does not affect the contents of the golden files themselves. And depending on the
// git settings on your system (or in github action platform default like windows), the
// golden files may contain CRLF line endings. You can avoid this by setting the
// .gitattributes file in your repo to use LF line endings for all files, or just the golden
// files, by adding the following line to your .gitattributes file:
//
// * text=auto eol=lf
var NormalizeCRLFToLF = os.Getenv("GOTESTTOOLS_GOLDEN_NormalizeCRLFToLF") != "false"

@corhere
Copy link
Contributor

corhere commented Feb 6, 2024

@thaJeztah
Copy link
Member

🙈 I thought it came op recently, forgot it was here; sorry!

Replace regex matching/replacement and re-reading of generated files
with a simple parser, and struct to remember and manipulate the file
content.

Annotate the generated file with a header comment saying the file is
generated, but can be modified, and a trailing comment describing how
the file was generated and listing external nameservers.

Always start with the host's resolv.conf file, whether generating config
for host networking, or with/without an internal resolver - rather than
editing a file previously generated for a different use-case.

Resolves an issue where rewrites of the generated file resulted in
default IPv6 nameservers being unnecessarily added to the config.

Signed-off-by: Rob Murray <rob.murray@docker.com>
@robmry robmry force-pushed the 46968_refactor_resolvconf branch from d8aedd1 to beb97f7 Compare February 6, 2024 22:26
@robmry
Copy link
Contributor Author

robmry commented Feb 6, 2024

@robmry .gitattributes works better when it's not spelled with three consecutive "t"s.

Ohh - ha ... thank you, I couldn't see that at all!

Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah added this to the 26.0.0 milestone Feb 28, 2024
@thaJeztah thaJeztah merged commit 6c3b352 into moby:master Feb 29, 2024
@vvoland
Copy link
Contributor

vvoland commented Feb 29, 2024

New files from this PR need the //go:build directive to prevent go downgrade:

#47213

@robmry

@cellisten
Copy link

Sorry for necrobumping but I am confused about the intended behavior here. According to the OP it seems that dns entries in daemon.json or passed as options using --dns=.... should always be copied but I am observing that if there is a 127.0.0.53 entry in /etc/resolv.conf it completely ignores any other dns configuration. Is that intentional?

@robmry
Copy link
Contributor Author

robmry commented Apr 30, 2025

Sorry for necrobumping but I am confused about the intended behavior here. According to the OP it seems that dns entries in daemon.json or passed as options using --dns=.... should always be copied but I am observing that if there is a 127.0.0.53 entry in /etc/resolv.conf it completely ignores any other dns configuration. Is that intentional?

It doesn't sound right ... but I can't reproduce what you describe. If you think there's a bug, please could you raise it as a new issue, so we can see which build you're using, the commands you're using, and the output you're seeing (including the contents of /etc/resolv.conf on the host and in a container)?

@cellisten
Copy link

Thanks for the quick response, bug report created here:
#49903

@cellisten
Copy link

I had a lengthy discussion with chatGPT about this as well, linking for context (asking some stupid questions just to get it to understand the context, excuse the noise).
One of the things that threw me off was that docker info doesn't seem to show nameservers at all.
https://chatgpt.com/share/68127b8c-348c-8011-a966-a8a662eb8c7b

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/networking/dns Networking area/networking Networking kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Erroneously filtering out non-local IPv6 DNS servers when using user-defined networks

6 participants