Re-fetch shard info of primary when new node joins#47035
Re-fetch shard info of primary when new node joins#47035dnhatn merged 11 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-distributed |
This was fixed in #47016. |
henningandersen
left a comment
There was a problem hiding this comment.
LGTM, but I am not sure my review counts on this one 😃...
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Show resolved
Hide resolved
original-brownbear
left a comment
There was a problem hiding this comment.
Just some drive-by comments
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/gateway/AsyncShardFetchTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Show resolved
Hide resolved
original-brownbear
left a comment
There was a problem hiding this comment.
LGTM :) Thanks Nhat!
Might be best for David or Yannick to look over this as well though. I understand what's going on here just fine now, but might miss some implication of this change.
|
@DaveCTurner @ywelsch This PR blocks the work in #46959. It would be great if one of you can take a look. Thank you! |
DaveCTurner
left a comment
There was a problem hiding this comment.
Looks good, I suggested a comment and a few small changes.
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
…r.java Co-Authored-By: David Turner <david.turner@elastic.co>
…r.java Co-Authored-By: David Turner <david.turner@elastic.co>
|
Test failures are fixed in #47196. |
|
@henningandersen @original-brownbear @DaveCTurner Thank you for reviewing. |
Today, we don't clear the shard info of the primary shard when a new node joins; then we might risk of making replica allocation decisions based on the stale information of the primary. The serious problem is that we can cancel the current recovery which is more advanced than the copy on the new node due to the old info we have from the primary. With this change, we ensure the shard info from the primary is not older than any node when allocating replicas. Relates #46959 This work was done by Henning in #42518. Co-authored-by: Henning Andersen <henning.andersen@elastic.co>
Today, we don't clear the shard info of the primary shard when a new node joins; then we might risk of making replica allocation decisions based on the stale information of the primary. The serious problem is that we can cancel the current recovery which is more advanced than the copy on the new node due to the old info we have from the primary.
With this change, we ensure the shard info from the primary is not older than any node when allocating replicas.
Relates #46959
This work was done by @henningandersen in #42518.
Co-authored-by: Henning Andersen henning.andersen@elastic.co