Obey lock order if working with store to get metadata snapshots #24787
Merged
s1monw merged 2 commits into elastic:master on May 19, 2017
Today when we get a metadata snapshot from the index shard, we ensure that if no engine is started on the shard, we lock the index writer before we fetch the store metadata. Yet if we concurrently recover that shard, recovery finalization might fail since it can't acquire the IW lock on the directory. This is mainly due to the wrong order of acquiring the IW lock and the metadata lock. Fetching store metadata without a started engine should block on the metadata lock in Store.java, but since IndexShard locks the writer first, we get into a failed recovery dance, especially in tests. In production this is less of an issue since we rarely, if ever, get into this situation. Closes elastic#24481
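To illustrate the ordering described above, here is a minimal sketch, not the actual Elasticsearch code. It assumes the metadata lock behaves like a blocking lock and the IW lock behaves like a fail-fast lock that is either obtained immediately or not at all. The class `LockOrderSketch` and its methods `withBothLocks` and `countFailures` are hypothetical names made up for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a store; not the Elasticsearch Store/IndexShard API.
final class LockOrderSketch {
    // Assumption: blocking lock, playing the role of the metadata lock.
    private final ReentrantLock metadataLock = new ReentrantLock();
    // Assumption: fail-fast lock, playing the role of the IW lock on the directory.
    private final ReentrantLock writerLock = new ReentrantLock();

    /** Runs the action while holding both locks, always in the order metadata lock, then writer lock. */
    boolean withBothLocks(Runnable action) {
        metadataLock.lock(); // blocks until the metadata lock is free
        try {
            if (!writerLock.tryLock()) { // never blocks; reports failure instead
                return false;
            }
            try {
                action.run();
                return true;
            } finally {
                writerLock.unlock();
            }
        } finally {
            metadataLock.unlock();
        }
    }

    /** Hammers one store from several threads and counts failed writer-lock attempts. */
    static int countFailures(int threads, int iterations) {
        LockOrderSketch store = new LockOrderSketch();
        AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    if (!store.withBothLocks(() -> { })) {
                        failures.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return failures.get();
    }
}
```

Because every caller in this sketch takes the blocking lock before attempting the fail-fast one, the fail-fast attempt is only ever made by the thread that already holds the blocking lock, so concurrent callers wait instead of failing. If one caller tried the writer lock first, another caller holding both locks could make that attempt fail, which is the kind of failure the description attributes to the wrong lock order.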
s1monw added a commit that referenced this pull request on May 19, 2017
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request on Dec 12, 2017
Today when we get a metadata snapshot directly from a store directory, we acquire the metadata lock and then the IW lock. However, we create a CheckIndex in IndexShard without acquiring the metadata lock first. This causes recovery to fail because the IW lock can still be held by `snapshotStoreMetadata`. This commit makes sure to create the CheckIndex under the metadata lock. Closes elastic#24481 Relates elastic#24787
dnhatn added a commit that referenced this pull request on Dec 20, 2017
Today when we get a metadata snapshot directly from a store directory, we acquire the metadata lock and then the IndexWriter lock. However, we create a CheckIndex in IndexShard without acquiring the metadata lock first. This causes recovery to fail because the IndexWriter lock can still be held by the method snapshotStoreMetadata. This commit makes sure to create the CheckIndex under the metadata lock. Closes #24481 Closes #27731 Relates #24787