Fix Snapshot Completion Listener Lost on Master Failover #54286
Merged
original-brownbear merged 4 commits into elastic:master from original-brownbear:master-failover-after-finalization-bug on Mar 27, 2020
Conversation
If master fails over before the snapshot is removed from the cluster state (CS), or we run into any other exception while removing it, we must still resolve all the completion listeners for the snapshot.
Collaborator
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)
…ter-finalization-bug
ywelsch
approved these changes
Mar 27, 2020
Contributor
ywelsch
left a comment
LGTM. Ideally, in the future, we will support master failovers on these listeners (as we currently communicate the snapshot as failed even though it might be completed successfully by another master?)
Contributor
Author
Thanks Yannick! Hmpf, BwC tests are busted because of #54313.
Already doing that in the concurrent snapshots branch via cluster state observers :)
…ter-finalization-bug
Contributor
Author
Jenkins test this (seems Jenkins completely locked up mid-way)
This was referenced Mar 27, 2020
original-brownbear
added a commit
that referenced
this pull request
Mar 27, 2020
…4330) * Fix Snapshot Completion Listener Lost on Master Failover If master fails over before (or we run into any other exception) when removing the snapshot from the CS we must still resolve all the completion listeners for the snapshot.
original-brownbear
added a commit
that referenced
this pull request
Mar 27, 2020
…4332) * Fix Snapshot Completion Listener Lost on Master Failover If master fails over before (or we run into any other exception) when removing the snapshot from the CS we must still resolve all the completion listeners for the snapshot.
If master fails over before the snapshot is removed from the cluster state (CS),
or we run into any other exception while removing it, we must still resolve all
the completion listeners for the snapshot.
Without this change, master failing over during a CS update in snapshot finalization leaks the snapshot listeners in the
SnapshotsService listeners map and causes snapshot and snapshot delete requests to never be answered on the transport layer. It is also a (small) memory leak.
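The pattern described above can be illustrated with a minimal sketch. This is not the actual SnapshotsService code: the class `SnapshotListenerSketch`, the `CompletionListener` and `ClusterStateUpdate` interfaces, and all method names are hypothetical stand-ins, and the cluster state update is reduced to a callback that may throw. The only point it shows is that the listeners are removed from the map and resolved whether the update succeeds or fails.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; names and structure are assumptions, not the real SnapshotsService.
public class SnapshotListenerSketch {

    /** Simplified completion listener (assumed shape, for illustration only). */
    public interface CompletionListener {
        void onResponse(String snapshot);
        void onFailure(Exception e);
    }

    /** Stand-in for the cluster state update that removes the snapshot entry. */
    public interface ClusterStateUpdate {
        void run() throws Exception;
    }

    // Listeners waiting for a snapshot to complete, keyed by snapshot name.
    private final Map<String, List<CompletionListener>> listeners = new ConcurrentHashMap<>();

    public void addListener(String snapshot, CompletionListener listener) {
        listeners.computeIfAbsent(snapshot, k -> new CopyOnWriteArrayList<>()).add(listener);
    }

    /**
     * Runs the removal and resolves every listener for the snapshot afterwards,
     * on success and on failure (for example, when this node is no longer master).
     */
    public void removeSnapshotFromClusterState(String snapshot, ClusterStateUpdate update) {
        Exception failure = null;
        try {
            update.run();
        } catch (Exception e) {
            failure = e;
        }
        // Always drop the map entry so that nothing is leaked.
        List<CompletionListener> toNotify = listeners.remove(snapshot);
        if (toNotify == null) {
            return;
        }
        for (CompletionListener listener : toNotify) {
            if (failure == null) {
                listener.onResponse(snapshot);
            } else {
                listener.onFailure(failure);
            }
        }
    }

    /** Number of snapshots that still have unresolved listeners. */
    public int pendingSnapshots() {
        return listeners.size();
    }
}
```

In this sketch, a caller that registers a listener and then hits a failing update gets `onFailure` called and `pendingSnapshots()` returns 0 afterwards, so the request is answered and the map does not grow.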