
integration-cli: Replace sleeps with polling in swarm lock/unlock tests #33541

Merged
thaJeztah merged 1 commit into moby:master from aaronlehmann:lock-unlock-tests-poll on Jul 7, 2017

Conversation

@aaronlehmann

This will hopefully make the tests more robust by replacing a fixed 3s
sleep with a polling loop that looks at whether the key PEM file is
encrypted or not.

cc @cyli

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@aaronlehmann aaronlehmann added the status/failing-ci label (indicates that the PR in its current state fails the test suite) on Jun 6, 2017
@aaronlehmann
Author

A test failed in experimental. @cyli, do you have any idea what I might have missed?

@cyli
Contributor

cyli commented Jun 6, 2017

@aaronlehmann I'm trying to track that down. In the logs it looks like there were 10 seconds between when the daemon was restarted and when leadership election finished and the node was caught up via snapshot by the leader, so it should have succeeded (defaultReconciliationTimeout is 30 seconds).

@cyli
Contributor

cyli commented Jun 7, 2017

Inconclusive update so far: I don't think it's an issue with the test. When I replicate this failure with a bunch of extra logging enabled, the daemon does get a snapshot update from one of the managers when it starts up. However, on the cluster watch started when the manager first comes up, the manager does not seem to spot the unlock key change in that cluster object for some reason. Still trying to track down why.

@cpuguy83
Member

Any news here?

@cyli
Contributor

cyli commented Jun 20, 2017

Apologies, I hadn't had a chance to replicate again. The VM I was running this on went a couple of days without failing the test, until my machine rebooted. :| Will try again.

@cyli
Contributor

cyli commented Jun 21, 2017

@aaronlehmann has pointed out that when restoring a snapshot, the raft store in swarm deletes all existing clusters and creates new clusters from the snapshot, so we may not get an update event. Am working on a fix.
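The snapshot-restore behaviour described above can be illustrated with a toy watch stream. This is not swarmkit's actual watch API, just a sketch of the failure mode: a restore replaces the cluster object via a delete followed by a create, so a watcher that refreshes its state only on update events never picks up the new unlock key. The event types and `lastSeenKey` helper here are invented for the illustration.

```go
package main

import "fmt"

// EventKind models the kinds of events a store watch can emit.
type EventKind int

const (
	Create EventKind = iota
	Update
	Delete
)

// Event is a toy watch event carrying the cluster's unlock key.
type Event struct {
	Kind      EventKind
	UnlockKey string
}

// lastSeenKey replays a watch stream and returns the unlock key the
// watcher ends up with. handleCreates controls whether Create events
// also refresh the key; handling them is analogous to the fix in
// moby/swarmkit#2281.
func lastSeenKey(stream []Event, handleCreates bool) string {
	key := ""
	for _, ev := range stream {
		switch {
		case ev.Kind == Update:
			key = ev.UnlockKey
		case ev.Kind == Create && handleCreates:
			key = ev.UnlockKey
		}
	}
	return key
}

func main() {
	// Snapshot restore: the old cluster object is deleted and a new
	// one carrying the rotated key is created. No Update is emitted.
	restore := []Event{{Delete, ""}, {Create, "rotated-key"}}
	fmt.Printf("update-only watcher:  %q\n", lastSeenKey(restore, false))
	fmt.Printf("create-aware watcher: %q\n", lastSeenKey(restore, true))
}
```

The update-only watcher ends the replay with an empty key, which mirrors why the restarted manager missed the unlock key change.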

@thaJeztah
Member

@cyli I see moby/swarmkit#2281 was merged; should we bump the swarmkit version and close this one, or are the changes in this PR still needed?

@cyli
Contributor

cyli commented Jun 23, 2017

@thaJeztah This is still a good change, I think; polling is a better idea than a straight 3-second sleep.

@LK4D4
Contributor

LK4D4 commented Jul 6, 2017

Code looks ok. @cyli is this okay to merge?

@cyli
Contributor

cyli commented Jul 6, 2017

Apologies, I thought I had already +1'ed it. LGTM; I wasn't sure whether we needed to rebase or rerun the tests before merging.

@aaronlehmann
Author

Added the rebuild/* label but it doesn't seem to have done anything.

@dnephin dnephin removed the status/failing-ci label (indicates that the PR in its current state fails the test suite) on Jul 6, 2017
@dnephin
Member

dnephin commented Jul 6, 2017

It used to be that rebuild/* didn't work when the failing-ci label was set. I fixed that in master, but I'm not sure we've redeployed since then.

It seems we haven't; removing failing-ci fixed it.

@thaJeztah
Member

Windows seems to be very flaky recently: https://jenkins.dockerproject.org/job/Docker-PRs-WoW-RS1/15467/console

19:44:34 ----------------------------------------------------------------------
19:44:34 FAIL: check_test.go:97: DockerSuite.TearDownTest
19:44:34 
19:44:34 check_test.go:98:
19:44:34     testEnv.Clean(c, dockerBinary)
19:44:34 environment/clean.go:67:
19:44:34     t.Fatalf("error removing containers %v : %v (%s)", containers, result.Error, result.Combined())
19:44:34 ... Error: error removing containers [b53ad5b22938] : exit status 1 (Error response from daemon: Could not kill running container b53ad5b22938e29bdcb5a58dfca5676a5ce0ab16f209d1dd73599b5fff1851f3, cannot remove - Cannot kill container b53ad5b22938e29bdcb5a58dfca5676a5ce0ab16f209d1dd73599b5fff1851f3: invalid container: b53ad5b22938e29bdcb5a58dfca5676a5ce0ab16f209d1dd73599b5fff1851f3
19:44:34 )
19:44:34 
19:44:34 
19:44:34 ----------------------------------------------------------------------
19:44:34 PANIC: docker_cli_events_test.go:679: DockerSuite.TestEventsFilterImageInContainerAction
19:44:34 
19:44:34 ... Panic: Fixture has panicked (see related PANIC)
19:44:34 

@aaronlehmann
Author

The Windows rebuild failed immediately:

21:07:02 re-exec error: exit status 1: output: BackupWrite \\?\D:\CI\CI-1700e4c94\daemon\windowsfilter\f45a55918c7e7e83e5a1b321efcabc7415774d9634c8efa0fb91385ba2b471e7\Files\Windows\Speech\Engines\TTS\en-US\MSTTSLocEnUS.dat: There is not enough space on the disk.

@thaJeztah
Member

Trying again, but it looks like some nodes are out of space 😞

ping @jhowardmsft @johnstep

@lowenna
Member

lowenna commented Jul 7, 2017

@thaJeztah yeah a few nodes have been cleared down. I'll do a purge of the others in the morning.

@thaJeztah
Member

All green now. @dnephin @vdemeester LGTY?


@vdemeester vdemeester left a comment


LGTM 🐸


@dnephin dnephin left a comment


LGTM

@thaJeztah thaJeztah merged commit ced0b6c into moby:master Jul 7, 2017

9 participants