Various testutil improvements by thaJeztah · Pull Request #40062 · moby/moby

thaJeztah · 2019-10-09T14:15:25Z

See individual commits for a description of the changes

thaJeztah · 2019-10-09T14:15:44Z

ping @tiborvass @SamWhited @kolyshkin ptal

hack/make/.integration-daemon-stop

thaJeztah · 2019-10-09T14:36:21Z

testutil/daemon/daemon.go

Looks like currently, d.log is always the default noplogger, so nothing gets logged (I didn't find any occurrences of WithTestLogger() used anywhere)

I wonder if we should/could have these logs printed only in the case that the test fails. I seem to recall that was the default behaviour of Go tests (see golang/go#21461), but it looks like having -v (verbose) prints each test, including logs and things printed on stdout/stderr (and for the junit.xml files to be created, I think we need -v)

Dropped this commit, because this would also unmount the daemon root if live-restore is enabled, probably causing this failure:

=== FAIL: amd64.integration-cli TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore (2.90s)
--- FAIL: TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore (2.90s)
    daemon.go:26: Creating a new daemon at: "/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore"
    docker_cli_daemon_test.go:2723: assertion failed: expression is false: err == nil: Output: unable to find user test: no matching entries in passwd file

thaJeztah · 2019-10-09T16:24:41Z

Whoops; made a typo; fixed (also found some more small changes)

testutil/daemon/daemon.go:95: printf: Wrapf format % is missing verb at end of string (govet)
  		return nil, errors.Wrapf(err, "failed to create daemon socket root %", SockRoot)

kolyshkin · 2019-10-09T18:37:36Z

testutil/daemon/daemon_unix.go

While at it, we can switch to using mount.Unmount() which does all the same things.

Yet better, use RecursiveUnmount() to further simplify things here.

I now recall I had a PR that made changes here; #36511, but for some reason it kept failing.

Let me try using the RecursiveUnmount() that sounds like a clean approach (if it works)

I take that back. Neither filepath.Walk() nor RecursiveUnmount() should be used here, as per #36511 (comment)

I.e. the approach in #36511 is the correct one; I would take that one here.

Let me rebase the other one, to keep this one simpler (and in case it's still problematic). I can rebase it on top of this one

testutil/daemon/daemon.go

kolyshkin · 2019-10-09T18:44:02Z

testutil/daemon/daemon.go

Would be good to note in the commit message that this fixes bogus error from removing docker.pid that was never created.

If this is just to fix the error when the pid file doesn't exist, maybe check `os.IsNotExist` so we don't ignore anything else? Something like:

    if d.pidFile != "" {
        if err := os.Remove(d.pidFile); !os.IsNotExist(err) {
            return err
        }
    }
    return nil

@SamWhited I would (1) not care about the error returned from os.Remove() at all, and (2) not return any error here, as this will fail the test.

I.e. the current approach looks fine to me.

testutil/daemon/daemon.go

kolyshkin

left a few nitpicks and one important comment about double d.cmd.Wait()

SamWhited · 2019-10-10T12:42:01Z

integration-cli/docker_cli_daemon_test.go

I know this was already in there, but it seems rather fragile. Maybe we could file an issue to fix this later, or if it's already returning a specific error type we could check if it's the appropriate type?

Perhaps we could have it check the daemon logs to check if it's the actual error we're expecting (although that one also changed between versions);

19.03

docker run -it --rm --privileged -v /var/lib/docker docker:19.03-dind dockerd --bridge=nosuchbridge --bip=1.1.1.1
# ....
# failed to start daemon: You specified -b & --bip, mutually exclusive options. Please specify only one

17.06

docker run -it --rm --privileged -v /var/lib/docker docker:17.06-dind dockerd --bridge=nosuchbridge --bip=1.1.1.1
# ....
# Error starting daemon: You specified -b & --bip, mutually exclusive options. Please specify only one

oh, actually, that's the error we're looking for below 🤦‍♂

testutil/daemon/daemon.go

SamWhited · 2019-10-10T12:47:21Z

testutil/daemon/daemon.go

If this is just to fix the error when the pid file doesn't exist, maybe check `os.IsNotExist` so we don't ignore anything else? Something like:

    if d.pidFile != "" {
        if err := os.Remove(d.pidFile); !os.IsNotExist(err) {
            return err
        }
    }
    return nil

SamWhited · 2019-10-10T12:48:56Z

testutil/daemon/daemon.go

Should we also log the error here in case a pid file still exists that needs to be cleaned up?

@SamWhited I initially had that (see #40062 (comment)); question is though; is it important to fail (or log) the failure to remove the pidfile (if the test otherwise completed successfully)?

It could potentially let us know that something is wrong with our tests. Personally, I'd say it's certainly not worth failing tests over, but it is worth logging; I defer to your judgement.

kolyshkin

LGTM

thaJeztah · 2019-10-10T20:35:29Z

Argh; Windows failed, but for some reason, Windows RS5 never prints the actual failure


[2019-10-10T18:20:14.730Z] ERROR: make.ps1 failed:
[2019-10-10T18:20:14.730Z] Unit tests failed
[2019-10-10T18:20:14.730Z] At C:\gopath\src\github.com\docker\docker\hack\make.ps1:324 char:32
[2019-10-10T18:20:14.730Z] +     if ($LASTEXITCODE -ne 0) { Throw "Unit tests failed" }
[2019-10-10T18:20:14.730Z] +                                ~~~~~~~~~~~~~~~~~~~~~~~~~
[2019-10-10T18:20:14.730Z]

It could be a compile error, and if I remember correctly, Windows RS1 does print the actual failure (using the exact same script 🤷‍♂); I'm gonna trigger a rebuild with RS1 enabled to see if I can find what's wrong.

Probably also open a ticket for it, so that someone with more Windows knowledge could help fixing that

thaJeztah · 2019-10-10T22:23:50Z

There you go; Windows RS1 correctly shows the compile error, whereas RS5 just hides it (and both report all unit tests to have "passed", but possibly the problem occurs after that);

testutil\daemon\daemon.go:221:25: cannot use d (type *Daemon) as type string in argument to cleanupNetworkNamespace

Before:

DONE 2 tests in 12.272s
---> Making bundle: .integration-daemon-stop (in bundles/test-integration)
umount: bundles/test-integration/root: mountpoint not found

After:

DONE 2 tests in 14.650s
---> Making bundle: .integration-daemon-stop (in bundles/test-integration)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

… cleanup

test-daemons remove their docker.pid when stopped, so the `.integration-daemon-stop` script did not find the mounts for those daemons, and therefore was not unmounting them.

As a result, cleaning up the bundles directory on consecutive runs of the tests would fail;

rm: cannot remove 'bundles/test-integration/TestDockerSwarmSuite/TestSwarmInit/d1f188f3f5472/root': Device or resource busy

This patch unmounts the root directory of the daemon as part of the cleanup step.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

This makes it easier to debug issues with tests that start multiple daemons. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

This patch stores the location of the pidfile, so that we can use the same path that was set to create it. If no pidfile was created, we'll not try to remove it.

We're now also ignoring errors when removing the pidfile, as they should not fail the test (especially if no pidfile was created in the first place, as that could potentially hide the actual failure).

This may help with "failures" such as the one below:

```
FAIL: check_test.go:347: DockerSwarmSuite.TearDownTest

check_test.go:352:
    d.Stop(c)
/go/src/github.com/docker/docker/internal/test/daemon/daemon.go:414:
    t.Fatalf("Error while stopping the daemon %s : %v", d.id, err)
... Error: Error while stopping the daemon d1512c423813a : remove /go/src/github.com/docker/docker/bundles/test-integration/DockerSwarmSuite.TestServiceLogs/d1512c423813a/docker.pid: no such file or directory
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

If the daemon was stopped successfully in one of the retry-loops, the function would return early;

```go
for {
	select {
	case err := <-d.Wait:
		// ---> the function returns here, both on "success" and on "fail"
		return err
	case <-time.After(20 * time.Second):
	...
```

In that case, the pidfile would not be cleaned up. This patch changes the function to clean up the pidfile in a defer, so that it will always be removed after successfully stopping the daemon.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

`daemon.StartWithLogFile()` already creates a goroutine that calls `d.cmd.Wait()` and sends its return value to the channel `d.Wait`.

This code called `d.cmd.Wait()` one more time and returned the error, which may produce an error _because_ it's called a second time, and potentially cause an incorrect test result.

(thanks to Kir Kolyshkin for spotting this)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

thaJeztah · 2019-10-11T00:38:04Z

all green now

@cpuguy83 @tiborvass PTAL

tiborvass

LGTM but we should probably also add more t.Helper() in testutil/daemon (when daemons are starting/stopping the file lines are completely useless)

cpuguy83

LGTM

thaJeztah · 2019-10-11T22:57:03Z

but we should probably also add more t.Helper() in testutil/daemon (when daemons are starting/stopping the file lines are completely useless)

agreed; I didn't go looking for other ones yet; this one just stood out while testing my changes
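As a rough sketch of what that looks like in a logging helper: calling `t.Helper()` marks the function as a helper, so `go test` attributes log and failure lines to the caller instead of to the helper's own file and line. The `logT` interface, the `logDaemon` helper, the `fakeT` recorder and the `[id]` prefix format are assumptions for illustration, and the daemon ID is borrowed from the log excerpt in the commit message above.

```go
package main

import "fmt"

// logT is the small subset of testing.TB used here, so the sketch can run
// outside "go test"; *testing.T satisfies it.
type logT interface {
	Helper()
	Logf(format string, args ...interface{})
}

// logDaemon is a hypothetical helper that prefixes every message with the
// daemon ID and marks itself as a test helper.
func logDaemon(t logT, id, format string, args ...interface{}) {
	t.Helper()
	t.Logf("[%s] "+format, append([]interface{}{id}, args...)...)
}

// fakeT records calls so the example can run as a plain program.
type fakeT struct {
	helperCalls int
	lines       []string
}

func (f *fakeT) Helper() { f.helperCalls++ }

func (f *fakeT) Logf(format string, args ...interface{}) {
	f.lines = append(f.lines, fmt.Sprintf(format, args...))
}

func main() {
	t := &fakeT{}
	logDaemon(t, "d1512c423813a", "starting daemon with args %v", []string{"--debug"})
	fmt.Println(t.lines[0])
	fmt.Println("Helper() calls:", t.helperCalls)
}
```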

thaJeztah added status/2-code-review area/testing process/cherry-pick labels Oct 9, 2019

thaJeztah requested a review from tianon as a code owner October 9, 2019 14:15

thaJeztah force-pushed the testutil_improvements branch from 2010345 to ab15be9 Compare October 9, 2019 16:23

thaJeztah force-pushed the testutil_improvements branch from ab15be9 to bfdeb6c Compare October 9, 2019 18:14

kolyshkin reviewed Oct 9, 2019

kolyshkin requested changes Oct 9, 2019

thaJeztah force-pushed the testutil_improvements branch 2 times, most recently from 6940646 to 6d82e2a Compare October 9, 2019 20:24

SamWhited reviewed Oct 10, 2019

thaJeztah added the kind/refactor PR's that refactor, or clean-up code label Oct 10, 2019

kolyshkin approved these changes Oct 10, 2019

thaJeztah force-pushed the testutil_improvements branch from 6d82e2a to 083519d Compare October 10, 2019 18:02

SamWhited approved these changes Oct 10, 2019

thaJeztah mentioned this pull request Oct 10, 2019

CI: Windows RS5 fails on compile errors, but doesn't print the error #40069

Closed

thaJeztah added 4 commits October 11, 2019 00:34

testutil/daemon: prefix all logs with daemon-id

2b3957d

This makes it easier to debug issues with tests that start multiple daemons. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

testutil/daemon: wrap errors

22662ca

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

thaJeztah added 4 commits October 11, 2019 00:38

testutil/daemon: print all arguments when failing to start daemon

f684232

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

thaJeztah force-pushed the testutil_improvements branch from 083519d to 293c1a2 Compare October 10, 2019 22:38

tiborvass approved these changes Oct 11, 2019

cpuguy83 approved these changes Oct 11, 2019

cpuguy83 merged commit 28b6457 into moby:master Oct 11, 2019

thaJeztah deleted the testutil_improvements branch October 11, 2019 22:55

thaJeztah added this to the 20.03.0 milestone Apr 2, 2020

thaJeztah removed the process/cherry-pick label Feb 18, 2022

5 participants