Fix ShrinkIndexIT by original-brownbear · Pull Request #44214 · elastic/elasticsearch

original-brownbear · 2019-07-11T10:14:24Z

Move this test suite to cluster scope. Currently, `testShrinkThenSplitWithFailedNode` stops a random node, which sometimes turns out to be the only shared master node; the cluster reset then fails because no shared master node survived.
Closes [CI] Various tests in ShrinkIndexIT fail with "expected at least one master-eligible node left" #44164

@DaveCTurner can you take a look since you added that assertion in reset in 034c765? :)
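To make the failure mode concrete, here is a minimal toy model of it. This is a sketch, not the real `InternalTestCluster`: the class names `SharedClusterResetSketch`, `ToyCluster`, and `Node` are hypothetical, and the only detail taken from this PR is the assertion message quoted in #44164.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: not the real InternalTestCluster. All names here are hypothetical. */
public class SharedClusterResetSketch {

    /** A node is just a name plus a flag saying whether it is master-eligible. */
    static final class Node {
        final String name;
        final boolean masterEligible;

        Node(String name, boolean masterEligible) {
            this.name = name;
            this.masterEligible = masterEligible;
        }
    }

    static final class ToyCluster {
        private final List<Node> nodes = new ArrayList<>();

        void start(String name, boolean masterEligible) {
            nodes.add(new Node(name, masterEligible));
        }

        /** Stops the node at the given index, standing in for a random choice. */
        void stop(int index) {
            nodes.remove(index);
        }

        /** Stand-in for the reset check: a master-eligible node must have survived. */
        void reset() {
            boolean anyMasterEligible = nodes.stream().anyMatch(n -> n.masterEligible);
            if (!anyMasterEligible) {
                throw new AssertionError("expected at least one master-eligible node left");
            }
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster();
        cluster.start("shared-master", true);
        cluster.start("data-1", false);

        // The "random" pick happens to be the only master-eligible node.
        cluster.stop(0);

        try {
            // The check only runs at reset time, not when the node is stopped.
            cluster.reset();
        } catch (AssertionError e) {
            System.out.println("reset failed: " + e.getMessage());
        }
    }
}
```

In this model the error surfaces at reset time rather than at the `stop` call, so the failure is reported away from the code that actually removed the node.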


elasticmachine · 2019-07-11T10:14:25Z

Pinging @elastic/es-distributed

DaveCTurner

Good catch.

I note that the CI issue was complaining about multiple tests failing - is this because the `reset()` is associated with the test after the one that broke the cluster? If so, can we catch this in the right test in future by asserting in `stopRandomDataNode` that it isn't stopping the last-remaining shared master-eligible node (if `autoManageMasterNodes == true` at least)?

Also can we fix this while keeping a suite-wide cluster by choosing a node other than the unique shared master-eligible node in the offending test?


original-brownbear · 2019-07-11T11:51:01Z

@DaveCTurner

> I note that the CI issue was complaining about multiple tests failing - is this because the `reset()` is associated with the test after the one that broke the cluster?

Yes.

> If so, can we catch this in the right test in future by asserting in `stopRandomDataNode` that it isn't stopping the last-remaining shared master-eligible node (if `autoManageMasterNodes == true` at least)?

Sweet idea, done in 4a59fc0, that does indeed fail a lot nicer :)

> Also can we fix this while keeping a suite-wide cluster by choosing a node other than the unique shared master-eligible node in the offending test?

Sure, also done in 4a59fc0. I just fired up a new data-only node for this. As far as I can see, that doesn't change the test's behavior and makes things super safe without complicating things?
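The two changes discussed above (an assertion when stopping a node, and stopping a dedicated data-only node instead of a random one) can be sketched roughly as follows. This is an illustrative toy, not the code from 4a59fc0: `StopNodeGuardSketch`, `ToyCluster`, `Node`, the `shared` flag, and the assertion message are all assumptions made for the example; only the `autoManageMasterNodes` name comes from the review comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: not the real InternalTestCluster. All names here are hypothetical. */
public class StopNodeGuardSketch {

    static final class Node {
        final String name;
        final boolean masterEligible;
        final boolean shared; // true if the node belongs to the suite-wide cluster

        Node(String name, boolean masterEligible, boolean shared) {
            this.name = name;
            this.masterEligible = masterEligible;
            this.shared = shared;
        }
    }

    static final class ToyCluster {
        private final List<Node> nodes = new ArrayList<>();
        private final boolean autoManageMasterNodes;

        ToyCluster(boolean autoManageMasterNodes) {
            this.autoManageMasterNodes = autoManageMasterNodes;
        }

        Node start(String name, boolean masterEligible, boolean shared) {
            Node node = new Node(name, masterEligible, shared);
            nodes.add(node);
            return node;
        }

        /** Fails fast instead of letting a later reset discover the problem. */
        void stop(Node node) {
            if (autoManageMasterNodes && node.shared && node.masterEligible) {
                long otherSharedMasters = nodes.stream()
                    .filter(n -> n != node && n.shared && n.masterEligible)
                    .count();
                if (otherSharedMasters == 0) {
                    throw new AssertionError(
                        "refusing to stop the last shared master-eligible node: " + node.name);
                }
            }
            nodes.remove(node);
        }

        int size() {
            return nodes.size();
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(true);
        Node sharedMaster = cluster.start("shared-master", true, true);

        // Start an extra data-only node for the test and stop that one.
        Node extra = cluster.start("extra-data-only", false, false);
        cluster.stop(extra);
        System.out.println("nodes left: " + cluster.size());

        try {
            cluster.stop(sharedMaster); // trips the guard in the offending test itself
        } catch (AssertionError e) {
            System.out.println("guard tripped: " + e.getMessage());
        }
    }
}
```

The point of the guard in this sketch is locality: the error is raised by the `stop` call that would break the shared cluster, so it is attributed to the test that made the call.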

DaveCTurner

LGTM thanks for the second round @original-brownbear.

original-brownbear · 2019-07-11T13:49:36Z

thanks @DaveCTurner !


* The assertion added in #44214 is needlessly tripped by tests that run a dedicated test cluster per test. This breaks existing tests like the one in #44245.
* Closes #44245

Fix ShrinkIndexIT

e286978

* Move this test suite to cluster scope. Currently, `testShrinkThenSplitWithFailedNode` stops a random node, which sometimes turns out to be the only shared master node; the cluster reset then fails because no shared master node survived.
* Closes #44164

original-brownbear added the >test, :Distributed/Distributed, v8.0.0, and v7.4.0 labels Jul 11, 2019

original-brownbear requested a review from DaveCTurner July 11, 2019 10:14

DaveCTurner reviewed Jul 11, 2019


original-brownbear added 2 commits July 11, 2019 13:03

Merge remote-tracking branch 'elastic/master' into 44164

6cf8fc8

CR: Keep shared cluster, ensure we don't stop only shared master elig…

4a59fc0

…., stop data only node

original-brownbear requested a review from DaveCTurner July 11, 2019 11:51

DaveCTurner approved these changes Jul 11, 2019


original-brownbear merged commit a052067 into elastic:master Jul 11, 2019

original-brownbear deleted the 44164 branch July 11, 2019 13:49

original-brownbear mentioned this pull request Jul 12, 2019

Fix InternalTestCluster StopRandomNode Assertion #44258

Merged

original-brownbear mentioned this pull request Jul 12, 2019

Fix InternalTestCluster StopRandomNode Assertion (#44258) #44265

Merged

jkakavas mentioned this pull request Jul 23, 2019

[CI] ShrinkIndexIT testShrinkThenSplitWithFailedNode failure #44736

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

original-brownbear merged 3 commits into elastic:master from original-brownbear:44164