Skip to content

Fix Cluster Stabilization in SnapshotResiliencyTests#55159

Merged
original-brownbear merged 1 commit intoelastic:masterfrom
original-brownbear:55103
Apr 14, 2020
Merged

Fix Cluster Stabilization in SnapshotResiliencyTests#55159
original-brownbear merged 1 commit intoelastic:masterfrom
original-brownbear:55103

Conversation

@original-brownbear
Copy link
Copy Markdown
Contributor

Just like in AbstractCoordinatorTestCase we can't just assume the cluster
is stable once all the cluster states align since stray follower/leader check
tasks could still hit us after a disconnect, causing future test operations to fail.
=> fixed by running all tasks in the possible time span of running into these
checks before validating that cluster states align on all nodes to prevent this
like we do in the coordinator tests.

Closes #55103

Just like in `AbstractCoordinatorTestCase` we can't just assume the cluster
is stable once all the cluster states align since stray follower/leader check
tasks could still hit us after a disconnect, causing future test operations to fail.
=> fixed by running all tasks in the possible time span of running into these
checks before validating that cluster states align on all nodes to prevent this
like we do in the coordinator tests.

Closes #55103
@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.8.0 labels Apr 14, 2020
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

Copy link
Copy Markdown
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Copy Markdown
Contributor Author

Thanks Yannick!

@original-brownbear original-brownbear merged commit 88f5149 into elastic:master Apr 14, 2020
@original-brownbear original-brownbear deleted the 55103 branch April 14, 2020 15:27
original-brownbear added a commit that referenced this pull request Apr 14, 2020
Just like in `AbstractCoordinatorTestCase` we can't just assume the cluster
is stable once all the cluster states align since stray follower/leader check
tasks could still hit us after a disconnect, causing future test operations to fail.
=> fixed by running all tasks in the possible time span of running into these
checks before validating that cluster states align on all nodes to prevent this
like we do in the coordinator tests.

Closes #55103
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test Issues or PRs that are addressing/adding tests v7.8.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] SnapshotResiliencyTests.testSnapshotDeleteWithMasterFailover failure

4 participants