Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines#37426
Merged
tlrx merged 5 commits into elastic:master from Jan 22, 2019
Conversation
Collaborator
|
Pinging @elastic/es-distributed |
ywelsch
reviewed
Jan 16, 2019
server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java
Outdated
server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java
Outdated
ywelsch
reviewed
Jan 17, 2019
server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java
Outdated
Member
Author
|
@elasticmachine run gradle build tests 2 |
1 similar comment
Member
Author
|
@elasticmachine run gradle build tests 2 |
6a6b959 to
08bf83e
Compare
Member
Author
|
@elasticmachine run gradle build tests 2 |
1 similar comment
Member
Author
|
@elasticmachine run gradle build tests 2 |
Member
Author
|
@elasticmachine run gradle build tests 1 |
08bf83e to
e28c9af
Compare
Member
Author
|
@elasticmachine run gradle build tests 2 |
2 similar comments
Member
Author
|
@elasticmachine run gradle build tests 2 |
Member
Author
|
@elasticmachine run gradle build tests 2 |
e28c9af to
b24b5a0
Compare
b24b5a0 to
67a486e
Compare
Member
Author
|
Thanks @ywelsch |
jasontedor
added a commit
to jasontedor/elasticsearch
that referenced
this pull request
Jan 22, 2019
* elastic/master: (43 commits) Remove remaining occurances of "include_type_name=true" in docs (elastic#37646) SQL: Return Intervals in SQL format for CLI (elastic#37602) Publish to masters first (elastic#37673) Un-assign persistent tasks as nodes exit the cluster (elastic#37656) Fail start of non-data node if node has data (elastic#37347) Use cancel instead of timeout for aborting publications (elastic#37670) Follow stats api should return a 404 when requesting stats for a non existing index (elastic#37220) Remove deprecated FieldNamesFieldMapper.Builder#index (elastic#37305) Document that date math is locale independent Bootstrap a Zen2 cluster once quorum is discovered (elastic#37463) Upgrade to lucene-8.0.0-snapshot-83f9835. (elastic#37668) Mute failing test Fix java time formatters that round up (elastic#37604) Removes awaits fix as the fix is in. (elastic#37676) Mute failing test Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines (elastic#37426) Mute failing discovery disruption tests Add note about how the body is referenced (elastic#33935) Define constants for REST requests endpoints in tests (elastic#37610) Make prepare engine step of recovery source non-blocking (elastic#37573) ...
tlrx
added a commit
that referenced
this pull request
Jan 29, 2019
This commit changes the TransportVerifyShardBeforeCloseAction so that it issues a forced flush, forcing the translog and the Lucene commit to contain the same max seq number and global checkpoint in the case the Translog contains operations that were not written in the IndexWriter (like a Delete that touches a non existing doc). This way the assertion added in #37426 won't trip. Related to #33888
tlrx
added a commit
that referenced
this pull request
Jan 29, 2019
This commit changes the TransportVerifyShardBeforeCloseAction so that it issues a forced flush, forcing the translog and the Lucene commit to contain the same max seq number and global checkpoint in the case the Translog contains operations that were not written in the IndexWriter (like a Delete that touches a non existing doc). This way the assertion added in #37426 won't trip. Related to #33888
tlrx
added a commit
to tlrx/elasticsearch
that referenced
this pull request
Feb 11, 2019
… ReadOnlyEngines (elastic#37426) Since version 6.7.0 the Close Index API guarantees that all translog operations have been correctly flushed before the index is closed. If the index is reopened as a Frozen index (which uses a ReadOnlyEngine) we can verify that the maximum sequence number from the last Lucene commit is indeed equal to the last known global checkpoint and refuse to open the read-only engine if it's not the case. In this PR the check is only done for indices created on or after 6.7.0 as they are guaranteed to be closed using the new Close Index API. Related elastic#33888
tlrx
added a commit
that referenced
this pull request
Feb 11, 2019
…8727) The Close Index API has been refactored in 6.7.0 and it now performs pre-closing sanity checks on shards before an index is closed: the maximum sequence number must be equal to the global checkpoint. While this is a strong requirement for regular shards, we identified the need to relax this check in the case of CCR following shards. The following shards are not in charge of managing the max sequence number or global checkpoint, which are pulled from a leader shard. They also fetch and process batches of operations from the leader in an unordered way, potentially leaving gaps in the history of ops. If the following shard lags a lot, it's possible that the global checkpoint and max seq number never get in sync, preventing the following shard from being closed and a new PUT Follow action from being issued on this shard (which is our recommended way to resume/restart a CCR following). This commit allows each Engine implementation to define the specific verification it must perform before closing the index. In order to allow following/frozen/closed shards to be closed whatever the max seq number or global checkpoint are, the FollowingEngine and ReadOnlyEngine do not perform any check before the index is closed. Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com> This commit also contains #37426. Related #33888
Member
Author
|
This has been backported to |
Since version 6.7.0, the Close Index API guarantees that all translog operations have been correctly flushed before the index is closed. If the index is reopened as a Frozen index (which uses a ReadOnlyEngine), we can verify that the maximum sequence number from the last Lucene commit is indeed equal to the last known global checkpoint, and refuse to open the read-only engine if that is not the case. In this PR, the check is only done for indices created on or after 6.7.0, as they are guaranteed to be closed using the new Close Index API.
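
For readers who want the shape of the check without opening the diff, here is a minimal standalone sketch of the logic described above. It is not the code from this PR: the class name, the method name `ensureMaxSeqNoEqualsGlobalCheckpoint`, and the integer version id `V_6_7_0` are hypothetical stand-ins, and the sequence number and global checkpoint are passed in as plain `long` values rather than read from a Lucene commit or the translog.

```java
// Illustrative sketch only; all names below are assumptions, not Elasticsearch APIs.
final class ReadOnlyEngineSeqNoCheck {

    // Hypothetical numeric id standing in for version 6.7.0.
    static final int V_6_7_0 = 60700;

    /**
     * Refuses to "open" the engine when the max sequence number of the last
     * commit differs from the last known global checkpoint. Indices created
     * before 6.7.0 are skipped, since they may have been closed without the
     * guarantees of the new Close Index API.
     */
    static void ensureMaxSeqNoEqualsGlobalCheckpoint(int indexCreatedVersion,
                                                     long maxSeqNo,
                                                     long globalCheckpoint) {
        if (indexCreatedVersion < V_6_7_0) {
            return; // older index: no guarantee, so no check
        }
        if (maxSeqNo != globalCheckpoint) {
            throw new IllegalStateException(
                "Maximum sequence number [" + maxSeqNo
                    + "] from last commit does not match global checkpoint ["
                    + globalCheckpoint + "]");
        }
    }

    public static void main(String[] args) {
        // In sync on a 6.7.0+ index: passes silently.
        ensureMaxSeqNoEqualsGlobalCheckpoint(V_6_7_0, 42L, 42L);

        // Out of sync on a pre-6.7.0 index: the check is skipped.
        ensureMaxSeqNoEqualsGlobalCheckpoint(60600, 42L, 7L);

        // Out of sync on a 6.7.0+ index: the engine must not be opened.
        try {
            ensureMaxSeqNoEqualsGlobalCheckpoint(V_6_7_0, 42L, 7L);
        } catch (IllegalStateException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

Gating on the index creation version keeps older indices openable, at the cost of not verifying them at all; that trade-off follows directly from the description above.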