Put a fake allocation id on allocate stale primary command#34140
Conversation
… after recovery is done Relates to elastic#33432
|
Pinging @elastic/es-distributed |
…fore recovery is done (historyUUID is adjusted):
…cation_allocation_id
…efore recovery is done (historyUUID is adjusted)
bleskes
left a comment
There was a problem hiding this comment.
I left some comments. I also think we need some unit tests. Maybe here: AllocationCommandsTests
| addAllocationId(startedShard); | ||
| if (startedShard.primary() | ||
| // started shard has to have null recoverySource; have to pick up recoverySource from its initializing state | ||
| && (initializingShard.recoverySource() instanceof RecoverySource.ExistingStoreRecoverySource |
There was a problem hiding this comment.
Why are snapshot and restore relevant here? I hope we can make this more explicit, depending on ExistingStoreRecoverySource#FORCE_STALE_PRIMARY_INSTANCE in some form or fashion
| indexMetaDataBuilder.putInSyncAllocationIds(shardId.id(), Collections.emptySet()); | ||
| } else { | ||
| assert recoverySource instanceof RecoverySource.ExistingStoreRecoverySource | ||
| || recoverySource instanceof RecoverySource.SnapshotRecoverySource |
There was a problem hiding this comment.
snapshot/restore behaves like allocating a stale primary ( confirmed by @ywelsch )
Enforced check and extended comment at fbdf6d7
RecoverySource.ExistingStoreRecoverySource was introduced due to misleading by BalanceConfigurationTests test failure (fixed at d3df30c):
it used oversimplified allocator and it leads to inconsistent behaviour allocation decision (and after that inconsistent cluster state). On removing nodes some shards could be completely unassigned, in such cases PrimaryShardAllocator makes decision to to allocate that shard as NO_VALID_SHARD_COPY, while used in test NoopGatewayAllocator ignores that.
| shardRouting.recoverySource().getType() == RecoverySource.Type.EXISTING_STORE && | ||
| inSyncAllocationIds.contains(shardRouting.allocationId().getId()) == false) | ||
| inSyncAllocationIds.contains(shardRouting.allocationId().getId()) == false && | ||
| inSyncAllocationIds.contains(RecoverySource.ExistingStoreRecoverySource.FORCED_ALLOCATION_ID) == false) |
There was a problem hiding this comment.
I think we want to go for hard equality here - the inSync set should be exactly the FORCE_ALLOCATION_ID
| && (initializingShard.recoverySource() instanceof RecoverySource.ExistingStoreRecoverySource | ||
| || initializingShard.recoverySource() instanceof RecoverySource.SnapshotRecoverySource)) { | ||
| Updates updates = changes(startedShard.shardId()); | ||
| updates.removedAllocationIds.add(RecoverySource.ExistingStoreRecoverySource.FORCED_ALLOCATION_ID); |
There was a problem hiding this comment.
can we make sure this is the only one?
There was a problem hiding this comment.
in e331b1f I added assert check in another place where we have old and new allocation ids
…g fake allocation id
…NoopGatewayAllocator that generated inconsistent cluster state
…y FORCE_STALE_PRIMARY_INSTANCE
…cation_allocation_id
|
extended AllocationCommandsTests with AllocateStalePrimaryAllocationCommand in 8b697ef |
…cation_allocation_id
bleskes
left a comment
There was a problem hiding this comment.
@vladimirdolzhenko sorry for the delay. I left some comments
server/src/main/java/org/elasticsearch/cluster/routing/IndexRoutingTable.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/RecoverySource.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/IndexMetaDataUpdater.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/IndexMetaDataUpdater.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/IndexMetaDataUpdater.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/allocation/AllocationCommandsTests.java
Show resolved
Hide resolved
...er/src/test/java/org/elasticsearch/cluster/routing/allocation/BalanceConfigurationTests.java
Show resolved
Hide resolved
…) equal to startedShard.allocationId()
…the fake allocation id
…cation_allocation_id
|
run gradle build tests |
bleskes
left a comment
There was a problem hiding this comment.
Thx @vladimirdolzhenko, I left some more small comments
server/src/main/java/org/elasticsearch/cluster/routing/allocation/IndexMetaDataUpdater.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/IndexMetaDataUpdater.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
Outdated
Show resolved
Hide resolved
… node name instead of node id
|
@bleskes I addressed your comments, could you please have another look ? Thanks |
…cation_allocation_id
bleskes
left a comment
There was a problem hiding this comment.
LGTM. Thanks for all the iterations.
|
Thanks @bleskes for the review and comments. |
* master: (24 commits) Replicate index settings to followers (elastic#35089) Rename RealmConfig.globalSettings() to settings() (elastic#35330) [TEST] Cleanup FileUserPasswdStoreTests (elastic#35329) Scripting: Add back lookup vars in score script (elastic#34833) watcher: Fix integration tests to ensure correct start/stop of Watcher (elastic#35271) Remove ALL shard check in CheckShrinkReadyStep (elastic#35346) Use soft-deleted docs to resolve strategy for engine operation (elastic#35230) [ILM] Check shard and relocation status in AllocationRoutedStep (elastic#35316) Ignore date ranges containing 'now' when pre-processing a percolator query (elastic#35160) Add a frozen engine implementation (elastic#34357) Put a fake allocation id on allocate stale primary command (elastic#34140) [CCR] Enforce auto follow pattern name restrictions (elastic#35197) [ILM] rolling upgrade tests (elastic#35328) [ML] Add Missing data checking class (elastic#35310) Apply `ignore_throttled` also to concrete indices (elastic#35335) Make version field names more meaningful (elastic#35334) [CCR] Added HLRC support for pause follow API (elastic#35216) [Docs] Improve Convert Processor description (elastic#35280) [Painless] Removes extraneous compile method (elastic#35323) [CCR] Fail with a better error if leader index is red (elastic#35298) ...
…4140) removes fake allocation id after recovery is done Relates to elastic#33432
Put a fake allocation id on allocate stale primary command; remove it after recovery is done
Spin-off of #33432 (comment)
Eliminates case when the node after AllocateStalePrimary (or after recovery from a snapshot) crashes between shard initialization and shard recovery and allocation id is already adjusted.
Relates to #33432