Retrying replication requests on replica doesn't call onRetry#21189
Merged
bleskes merged 3 commits intoelastic:masterfrom Oct 31, 2016
Merged
Retrying replication requests on replica doesn't call onRetry#21189bleskes merged 3 commits intoelastic:masterfrom
onRetry#21189bleskes merged 3 commits intoelastic:masterfrom
Conversation
Contributor
|
bummer that we missed that one... glad we have it fixed now. |
s1monw
reviewed
Oct 30, 2016
| throw new AssertionError("doc [" + index.type() + "][" + index.id() + "] exists in version map (version " + versionValue + ")"); | ||
| } | ||
| try (final Searcher searcher = acquireSearcher("assert doc doesn't exist")) { | ||
| final long docsWithId = searcher.reader().totalTermFreq(index.uid()); |
Contributor
There was a problem hiding this comment.
this is tricky TTF will not respect deletes, I think you need to fetch the doc though. It's not much overhead compared to this.
s1monw
reviewed
Oct 30, 2016
| (TransportReplicationAction.ConcreteShardRequest<Request>) capturedRequest.request; | ||
| assertThat(concreteShardRequest.getRequest(), equalTo(request)); | ||
| assertThat(concreteShardRequest.getRequest().isRetrySet.get(), equalTo(true)); | ||
| assertThat(concreteShardRequest.getTargetAllocationID(), |
Contributor
|
LGTM except of the assert |
bleskes
commented
Oct 30, 2016
| throw new AssertionError("doc [" + index.type() + "][" + index.id() + "] exists in version map (version " + versionValue + ")"); | ||
| } | ||
| try (final Searcher searcher = acquireSearcher("assert doc doesn't exist")) { | ||
| final long docsWithId = searcher.searcher().count(new TermQuery(index.uid())); |
Contributor
There was a problem hiding this comment.
yeah this looks better but I think we can still have a doc in the index when we haven't refreshed AND the doc is deleted? I think we should not check the searcher if versionValue.delete() == true?
Contributor
|
LGTM |
Contributor
|
test this please |
Contributor
Author
|
thx @s1monw . I'll give it a few hours on CI before back porting. |
Contributor
|
++ thanks @bleskes |
bleskes
added a commit
that referenced
this pull request
Nov 1, 2016
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append only use case. This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates. This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/ Relates #20211
bleskes
added a commit
that referenced
this pull request
Nov 1, 2016
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append only use case. This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates. This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/ Relates #20211
Contributor
Author
|
This is now pushed to 5.0.1 & 5.1.0 as well |
bleskes
added a commit
that referenced
this pull request
Nov 15, 2016
When processing a mapping updates, the master current creates an `IndexService` and uses its mapper service to do the hard work. However, if the master is also a data node and it already has an instance of `IndexService`, we currently reuse the the `MapperService` of that instance. Sadly, since mapping updates are change the in memory objects, this means that a mapping change that can rejected later on during cluster state publishing will leave a side effect on the index in question, bypassing the cluster state safety mechanism. This commit removes this optimization and replaces the `IndexService` creation with a direct creation of a `MapperService`. Also, this fixes an issue multiple from multiple shards for the same field caused unneeded cluster state publishing as the current code always created a new cluster state. This were discovered while researching #21189
bleskes
added a commit
that referenced
this pull request
Nov 16, 2016
When processing a mapping updates, the master current creates an `IndexService` and uses its mapper service to do the hard work. However, if the master is also a data node and it already has an instance of `IndexService`, we currently reuse the the `MapperService` of that instance. Sadly, since mapping updates are change the in memory objects, this means that a mapping change that can rejected later on during cluster state publishing will leave a side effect on the index in question, bypassing the cluster state safety mechanism. This commit removes this optimization and replaces the `IndexService` creation with a direct creation of a `MapperService`. Also, this fixes an issue multiple from multiple shards for the same field caused unneeded cluster state publishing as the current code always created a new cluster state. This were discovered while researching #21189
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call
ReplicationRequest#onRetry, causing duplicates in the append only use case.This PR fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates.
This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/
The test also surfaces an issue with mapping updates on the master (they are potentially performed on a live index :( ) but this will be fixed in another PR.
Relates #20211