Do not perform cleanup if Manifest write fails with dirty exception#40519
Merged
andrershov merged 2 commits intoelastic:masterfrom Apr 1, 2019
Merged
Do not perform cleanup if Manifest write fails with dirty exception#40519andrershov merged 2 commits intoelastic:masterfrom
andrershov merged 2 commits intoelastic:masterfrom
Conversation
Collaborator
|
Pinging @elastic/es-distributed |
ywelsch
approved these changes
Mar 27, 2019
Contributor
ywelsch
left a comment
There was a problem hiding this comment.
Also unmute testAtomicityWithFailures?
andrershov
pushed a commit
that referenced
this pull request
Apr 1, 2019
…40519) Currently, if Manifest write is unsuccessful (i.e. WriteStateException is thrown) we perform cleanup of newly created metadata files. However, this is wrong. Consider the following sequence (caught by CI here #39077): - cluster global data is written **successful** - the associated manifest write **fails** (during the fsync, ie files have been written) - deleting (revert) the manifest files, **fails**, metadata is therefore persisted - deleting (revert) the cluster global data is **successful** In this case, when trying to load metadata (after node restart because of dirty WriteStateException), the following exception will happen ``` java.io.IOException: failed to find global metadata [generation: 0] ``` because the manifest file is referencing missing global metadata file. This commit checks if thrown WriteStateException is dirty and if its we don't perform any cleanup, because new Manifest file might be created, but its deletion has failed. In the future, we might add more fine-grained check - perform the clean up if WriteStateException is dirty, but Manifest deletion is successful. Closes #39077 (cherry picked from commit 1fac569)
andrershov
pushed a commit
that referenced
this pull request
Apr 1, 2019
…40519) Currently, if Manifest write is unsuccessful (i.e. WriteStateException is thrown) we perform cleanup of newly created metadata files. However, this is wrong. Consider the following sequence (caught by CI here #39077): - cluster global data is written **successful** - the associated manifest write **fails** (during the fsync, ie files have been written) - deleting (revert) the manifest files, **fails**, metadata is therefore persisted - deleting (revert) the cluster global data is **successful** In this case, when trying to load metadata (after node restart because of dirty WriteStateException), the following exception will happen ``` java.io.IOException: failed to find global metadata [generation: 0] ``` because the manifest file is referencing missing global metadata file. This commit checks if thrown WriteStateException is dirty and if its we don't perform any cleanup, because new Manifest file might be created, but its deletion has failed. In the future, we might add more fine-grained check - perform the clean up if WriteStateException is dirty, but Manifest deletion is successful. Closes #39077 (cherry picked from commit 1fac569)
jasontedor
added a commit
to jasontedor/elasticsearch
that referenced
this pull request
Apr 1, 2019
* elastic/7.0: [TEST] Mute WebhookHttpsIntegrationTests.testHttps [DOCS] Add 'time value' links to several monitor settings (elastic#40633) (elastic#40687) Do not perform cleanup if Manifest write fails with dirty exception (elastic#40519) Remove mention of soft deletes from getting started (elastic#40668) Fix bug in detecting use of bundled JDK on macOS Reindex conflicts clarification (docs) (elastic#40442) SQL: [Tests] Enable integration tests for fixed issues (elastic#40664) Add information about the default sort mode (elastic#40657) SQL: [Docs] Fix example for CURDATE SQL: [Docs] Fix doc errors regarding CURRENT_DATE. (elastic#40649) Clarify using time_zone and date math in range query (elastic#40655) Add notice for bundled jdk (elastic#40576) disable kerberos test until kerberos fixture is working again [DOCS] Use "source" instead of "inline" in ML docs (elastic#40635) Unmute and fix testSubParserArray (elastic#40626) Geo Point parse error fix (elastic#40447) Increase suite timeout to 30 minutes for docs tests (elastic#40521) Fix repository-hdfs when no docker and unnecesary fixture Avoid building hdfs-fixure use an image that works instead
gurkankaymak
pushed a commit
to gurkankaymak/elasticsearch
that referenced
this pull request
May 27, 2019
…lastic#40519) Currently, if Manifest write is unsuccessful (i.e. WriteStateException is thrown) we perform cleanup of newly created metadata files. However, this is wrong. Consider the following sequence (caught by CI here elastic#39077): - cluster global data is written **successful** - the associated manifest write **fails** (during the fsync, ie files have been written) - deleting (revert) the manifest files, **fails**, metadata is therefore persisted - deleting (revert) the cluster global data is **successful** In this case, when trying to load metadata (after node restart because of dirty WriteStateException), the following exception will happen ``` java.io.IOException: failed to find global metadata [generation: 0] ``` because the manifest file is referencing missing global metadata file. This commit checks if thrown WriteStateException is dirty and if its we don't perform any cleanup, because new Manifest file might be created, but its deletion has failed. In the future, we might add more fine-grained check - perform the clean up if WriteStateException is dirty, but Manifest deletion is successful. Closes elastic#39077
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Currently if Manifest write is unsuccesful (i.e. WriteStateException is thrown) we perform cleanup of newly created metadata files. However, this is wrong.
Consider the following sequence (catched by CI here #39077):
In this case, when trying to load metadata (after node restart because of dirty WriteStateException), the following exception will happen
because manifest file is referencing missing global metadata file.
This commit checks if thrown WriteStateException is dirty and if its we don't perform any cleanup, because new Manifest file might be created, but its deletion has failed.
In the future, we might add more fine grained check - perform the cleanup if WriteStateException is dirty, but Manifest deletion is succesful.
Closes #39077