Disallow multiple concurrent recovery attempts for same target shard #25428

Merged
ywelsch merged 1 commit into elastic:master from ywelsch:enhance/prevent-duplicate-recoveries on Jun 28, 2017

Contributor

ywelsch commented Jun 27, 2017

The primary shard uses the GlobalCheckPointTracker to track local checkpoint information of recovering and started replicas in order to calculate the global checkpoint. As the tracker is updated through recoveries as well, it is easier to reason about the tracker if we can ensure that there are no concurrent recovery attempts for the same target shard (which can happen in case of network disconnects).

ywelsch added the :Distributed/Recovery, >enhancement, and v6.0.0 labels Jun 27, 2017
ywelsch requested a review from jasontedor June 27, 2017 13:44
Contributor Author

ywelsch commented Jun 27, 2017

CLAChecker T_T

Member

jasontedor left a comment

LGTM.

if (onNewRecoveryException != null) {
    throw onNewRecoveryException;
}
for (RecoverySourceHandler existingHandler : recoveryHandlers) {
Member

😐

Contributor Author

The size of this list is bounded by the setting cluster.routing.allocation.node_concurrent_outgoing_recoveries, which defaults to 2. Note that this setting applies to the sum of all outgoing recoveries from this node, whereas the list here is specific to each shard, so it can never hold more entries than that limit. So this should not be a bottleneck ;)
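
For readers following the thread, here is a minimal, self-contained sketch of the kind of check being discussed: a per-shard registry that scans its handlers and rejects a new recovery when one with the same target is already registered. It is not the code from this pull request. The names `ShardRecoveryRegistry`, `Handler`, and `addNewRecovery`, the use of a plain string as the target's allocation id, and the choice of `IllegalStateException` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical per-shard registry of outgoing recoveries.
 * All names here are illustrative and are not the Elasticsearch API.
 */
final class ShardRecoveryRegistry {

    /** Stand-in for a recovery handler, identified by the allocation id of its target shard copy. */
    static final class Handler {
        final String targetAllocationId;

        Handler(String targetAllocationId) {
            this.targetAllocationId = targetAllocationId;
        }
    }

    // Expected to stay very small, so a linear scan is cheap.
    private final List<Handler> handlers = new ArrayList<>();

    /** Registers a recovery, rejecting it if one for the same target is still registered. */
    synchronized Handler addNewRecovery(String targetAllocationId) {
        for (Handler existing : handlers) {
            if (existing.targetAllocationId.equals(targetAllocationId)) {
                // Assumption: the caller treats this as "retry later".
                throw new IllegalStateException(
                    "recovery for target [" + targetAllocationId + "] is already registered");
            }
        }
        Handler handler = new Handler(targetAllocationId);
        handlers.add(handler);
        return handler;
    }

    /** Removes a finished or cancelled recovery so that the same target may recover again. */
    synchronized boolean remove(Handler handler) {
        return handlers.remove(handler);
    }

    synchronized int size() {
        return handlers.size();
    }
}
```

In this sketch a second attempt for the same target succeeds only after the earlier handler has been removed, which is what lets a retry go through after a network disconnect once the stale attempt has been cleaned up.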

ywelsch merged commit 5d1e67c into elastic:master Jun 28, 2017
Contributor Author

ywelsch commented Jun 28, 2017

Thanks @jasontedor

jasontedor added a commit to ywelsch/elasticsearch that referenced this pull request Jun 28, 2017
* master:
  Do not swallow exception when relocating
  Docs: Fix typo for request cache (elastic#25444)
  Remove implicit 32-bit support
  [DOCS] reworded to prevent code span rendering glitch (elastic#25442)
  Disallow multiple concurrent recovery attempts for same target shard (elastic#25428)
  Update global checkpoint when increasing primary term on replica (elastic#25422)
  Add backwards compatibility indices for 5.4.3
  Add version 5.4.3 after release
  Update MSI installer images (elastic#25414)
  Add missing newline at end of SetsTests.java
  Rename handoff primary context transport handler
  correct expected thrown exception in mappingMetaData to ElasticsearchParseException (elastic#25410)
  test: Make many percolator integration tests real integration tests
  [DOCS] Update docs to use shared attribute file (elastic#25403)
  Add Javadocs and tests for set difference methods
  Tests: Add parsing test for AggregationsTests (elastic#25396)
  test: get upgrade status for all indices
  Mute SignificantTermsAggregatorTests#testSignificance()
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 28, 2017
…cal-checkpoint

* enhance/single-updateshardstate-method:
  Some cleanup
  Do not swallow exception when relocating
  Docs: Fix typo for request cache (elastic#25444)
  Remove implicit 32-bit support
  [DOCS] reworded to prevent code span rendering glitch (elastic#25442)
  Disallow multiple concurrent recovery attempts for same target shard (elastic#25428)
  Update global checkpoint when increasing primary term on replica (elastic#25422)
  Add backwards compatibility indices for 5.4.3
  Add version 5.4.3 after release
  Update MSI installer images (elastic#25414)
  Add missing newline at end of SetsTests.java
  fix test
  Rename handoff primary context transport handler
  Provide single IndexShard method to update state on incoming cluster state
Labels

:Distributed/Recovery (Anything around constructing a new shard, either from a local or a remote source.)
>enhancement
v6.0.0-beta1

3 participants