Skip to content

Various testutil improvements#40062

Merged
cpuguy83 merged 8 commits intomoby:masterfrom
thaJeztah:testutil_improvements
Oct 11, 2019
Merged

Various testutil improvements#40062
cpuguy83 merged 8 commits intomoby:masterfrom
thaJeztah:testutil_improvements

Conversation

@thaJeztah
Copy link
Member

See individual commits for a description of the changes

@thaJeztah
Copy link
Member Author

ping @tiborvass @SamWhited @kolyshkin ptal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like currently, d.log is always the default noplogger, so nothing gets logged (I didn't find any occurrences of WithTestLogger() used anywhere)

I wonder if we should/could have these logs printed only in the case that the test fails. I seem to recall that was the default behaviour of Go tests (see golang/go#21461), but it looks like having -v (verbose) prints each test, including logs and things printed on stdout/stderr (and for the junit.xml files to be created, I think we need -v)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped this commit, because this would also unmount the daemon root if live-restore is enabled, probably causing this failure:

 === FAIL: amd64.integration-cli TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore (2.90s)
     --- FAIL: TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore (2.90s)
         daemon.go:26: Creating a new daemon at: "/go/src/github.com/docker/docker/bundles/test-integration/TestDockerDaemonSuite/TestExecWithUserAfterLiveRestore"
         docker_cli_daemon_test.go:2723: assertion failed: expression is false: err == nil: Output: unable to find user test: no matching entries in passwd file

@thaJeztah thaJeztah force-pushed the testutil_improvements branch from 2010345 to ab15be9 Compare October 9, 2019 16:23
@thaJeztah
Copy link
Member Author

Whoops; made a typo; fixed (also found some more small changes)

testutil/daemon/daemon.go:95: printf: Wrapf format % is missing verb at end of string� (govet)
  		return nil, errors.Wrapf(err, "failed to create daemon socket root %", SockRoot)

@thaJeztah thaJeztah force-pushed the testutil_improvements branch from ab15be9 to bfdeb6c Compare October 9, 2019 18:14
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While at it, we can switch to using mount.Unmount() which does all the same things.

Yet better, use RecursiveUnmount() to further simplify things here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now recall I had a PR that made changes here; #36511, but for some reason it kept failing.

Let me try using the RecursiveUnmount() that sounds like a clean approach (if it works)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take that back. Neither filepath.Walk() nor RecursiveUnmount() should be used here, as per #36511 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. the approach in #36511 is the correct one; I would take that one here )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me rebase the other one, to keep this one simpler (and in case it's still problematic). I can rebase it on top of this one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to note in the commit message that this fixes bogus error from removing docker.pid that was never created.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just to fix the error when the pid file doesn't exist, maybe check `os.IsNotExist so we don't ignore anything else? Something like:

if d.pidFile != "" {
  if err := os.Remove(d.pidFile); !os.IsNotExist(err) {
    return err
  }
}
return nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SamWhited I would

  • not care about error returned from os.Remove() at all
  • not return any error here as this will fail the test

I.e. current approach looks fine for me.

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few nitpicks and one important comment about double d.cmd.Wait()

@thaJeztah thaJeztah force-pushed the testutil_improvements branch 2 times, most recently from 6940646 to 6d82e2a Compare October 9, 2019 20:24
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was already in there, but it seems rather fragile. Maybe we could file an issue to fix this later, or if it's already returning a specific error type we could check if it's the appropriate type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could have it check the daemon logs to check if it's the actual error we're expecting (although that one also changed between versions);

19.03

docker run -it --rm --privileged -v /var/lib/docker docker:19.03-dind dockerd --bridge=nosuchbridge --bip=1.1.1.1
# ....
# failed to start daemon: You specified -b & --bip, mutually exclusive options. Please specify only one

17.06

docker run -it --rm --privileged -v /var/lib/docker docker:17.06-dind dockerd --bridge=nosuchbridge --bip=1.1.1.1
# ....
# Error starting daemon: You specified -b & --bip, mutually exclusive options. Please specify only one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, actually, that's the error we're looking for below 🤦‍♂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just to fix the error when the pid file doesn't exist, maybe check `os.IsNotExist so we don't ignore anything else? Something like:

if d.pidFile != "" {
  if err := os.Remove(d.pidFile); !os.IsNotExist(err) {
    return err
  }
}
return nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also log the error here in case a pid file still exists that needs to be cleaned up?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SamWhited I initially had that (see #40062 (comment)); question is though; is it important to fail (or log) the failure to remove the pidfile (if the test otherwise completed successfully)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could potentially let us know that something is wrong with our tests. I'd say that it's certainly not worth failing tests over, but is worth logging personally, but I defer to your judgement.

@thaJeztah thaJeztah added the kind/refactor PR's that refactor, or clean-up code label Oct 10, 2019
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah force-pushed the testutil_improvements branch from 6d82e2a to 083519d Compare October 10, 2019 18:02
@thaJeztah
Copy link
Member Author

Argh; Windows failed, but for some reason, Windows RS5 never prints the actual failure


[2019-10-10T18:20:14.730Z] ERROR: make.ps1 failed:
[2019-10-10T18:20:14.730Z] Unit tests failed
[2019-10-10T18:20:14.730Z] At C:\gopath\src\github.com\docker\docker\hack\make.ps1:324 char:32
[2019-10-10T18:20:14.730Z] +     if ($LASTEXITCODE -ne 0) { Throw "Unit tests failed" }
[2019-10-10T18:20:14.730Z] +                                ~~~~~~~~~~~~~~~~~~~~~~~~~
[2019-10-10T18:20:14.730Z] 

It could be a compile error, and if I remember correctly, Windows RS1 does print the actual failure (using the exact same script 🤷‍♂); I'm gonna trigger a rebuild with RS1 enabled to see if I can find what's wrong.

Probably also open a ticket for it, so that someone with more Windows knowledge could help fixing that

@thaJeztah
Copy link
Member Author

There you go; Windows RS1 correctly shows the compile error, whereas RS5 just hides it (and both report all unit tests to have "passed", but possibly the problem occurs after that);

testutil\daemon\daemon.go:221:25: cannot use d (type *Daemon) as type string in argument to cleanupNetworkNamespace

Before:

    DONE 2 tests in 12.272s
    ---> Making bundle: .integration-daemon-stop (in bundles/test-integration)
    umount: bundles/test-integration/root: mountpoint not found

After:

    DONE 2 tests in 14.650s
    ---> Making bundle: .integration-daemon-stop (in bundles/test-integration)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
… cleanup

test-daemons remove their docker.pid when stopped, so the `.integration-daemon-stop`
script did not find the mounts for those daemons, and therefore was not unmounting
them.

As a result, cleaning up the bundles directory on consecutive runs of the tests would fail;

    rm: cannot remove 'bundles/test-integration/TestDockerSwarmSuite/TestSwarmInit/d1f188f3f5472/root': Device or resource busy

This patch unmounts the root directory of the daemon as part of the cleanup step.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes it easier to debug issues with tests that
start multiple daemons.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This patch stores the location of the pidfile, so that we can use the
same path that was set to create it. If no pidfile was created, we'll
not try to remove it.

We're now also ignoring errors when removing the pidfile, as they should
not fail the test (especialy if no pidfile was created in the first place,
as that could potentially hide the actual failure).

This may help with "failures" such as the one below:

```
FAIL: check_test.go:347: DockerSwarmSuite.TearDownTest

check_test.go:352:
    d.Stop(c)
/go/src/github.com/docker/docker/internal/test/daemon/daemon.go:414:
    t.Fatalf("Error while stopping the daemon %s : %v", d.id, err)
... Error: Error while stopping the daemon d1512c423813a : remove /go/src/github.com/docker/docker/bundles/test-integration/DockerSwarmSuite.TestServiceLogs/d1512c423813a/docker.pid: no such file or directory
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If the daemon was stopped successfully in one of the retry-loops,
the function would return early;

```go
for {
	select {
	case err := <-d.Wait:
---> the function returns here, both on "success" and on "fail"
		return err
	case <-time.After(20 * time.Second):
...
```

In that case, the pidfile would not be cleaned up. This patch changes
the function to clean-up the pidfile in a defer, so that it will
always be removed after succesfully stopping the daemon.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
`daemon.StartWithLogFile()` already creates a goroutine that
calls `d.cmd.Waits()` and sends its return to the channel, `d.Wait`.

This code called `d.cmd.Wait()` one more time, and returns the
error, which may produce an error _because_ it's called a second
time, and potentially cause an incorrect test-result.

(thanks to Kir Kolyshkin for spotting this)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the testutil_improvements branch from 083519d to 293c1a2 Compare October 10, 2019 22:38
@thaJeztah
Copy link
Member Author

all green now

@cpuguy83 @tiborvass PTAL

Copy link
Contributor

@tiborvass tiborvass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but we should probably also add more t.Helper() in testutil/daemon (when daemons are starting/stopping the file lines are completely useless)

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cpuguy83 cpuguy83 merged commit 28b6457 into moby:master Oct 11, 2019
@thaJeztah thaJeztah deleted the testutil_improvements branch October 11, 2019 22:55
@thaJeztah
Copy link
Member Author

but we should probably also add more t.Helper() in testutil/daemon (when daemons are starting/stopping the file lines are completely useless)

agreed; I didn't go looking for other ones yes; this one just stood out while testing my changes

@thaJeztah thaJeztah added this to the 20.03.0 milestone Apr 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/testing kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants