Add an option to return early from an allocate call#134786
Add an option to return early from an allocate call#134786elasticsearchmachine merged 14 commits intoelastic:mainfrom
Conversation
Instead of running simulation all the way to balance or no possible movement, this PR adds an option to finish early based on the number of relocating shards. Resolves: ES-12862
|
Note to reviewers: This is a draft PR to ensure that I am on the right track and align with other relevant changes. Your feedback is appreciated! 🙏 |
DiannaHohensee
left a comment
There was a problem hiding this comment.
Left some feedback/questions.
I'm primarily concerned with the ability to configure the number of relocations before returning. The purpose of this change is to avoid ever encountering THROTTLE answers that can negatively interact with NOT_PREFERRED, resulting in allocation of a shard to NOT_PREFERRED because all the other nodes returned THROTTLE (not the decision we want to make). If the number of relocations is configurable, then we can run into throttling after 2 relocations from the ThrottlingAllocationDecider and/or ConcurrentRebalanceAllocationDecider -- maybe I'm missing other THROTTLE cases.
Additionally, maybe we can sprinkle some assertions around the balancer code that we never encounter a THROTTLE answer? It would be bad if we did 😬
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
| if (allocation.hasTargetRelocatingShards()) { | ||
| // The shard may have been relocated on the ModelNode, but not on the cluster (RoutingNodes) if | ||
| // it is throttled. The `hasTargetRelocatingShards` check does not account for this case. | ||
| // This might be a non-issue since our intention is to have no Throttle with early returns. |
There was a problem hiding this comment.
The supposed contract of tryRelocateShard, according to the method comment, is that true means the move was executed "on the cluster" 🤔 Pre-existing outdated comment, seems like.
How about we update this code
to assert a non-throttling answer and always do the routingNodes relocate call? With your complete code changes for this task, I would never expect the balancer to run into THROTTLE again -- we're trying to avoid THROTTLE situations, which can lead to overriding not-preferred answers.There was a problem hiding this comment.
Yeah added such an assertion in the caller, see dbd0fed
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/RoutingAllocation.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/routing/allocation/RoutingAllocation.java
Outdated
Show resolved
Hide resolved
...main/java/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceComputer.java
Outdated
Show resolved
Hide resolved
This reverts commit 4aad2a3.
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
|
The PR is now updated based on the latest discussion, specifically:
Your comments are appreciated. Thank you! |
...ain/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
Outdated
Show resolved
Hide resolved
| assert allocation.isSimulating() == false || balancerSettings.completeEarlyOnShardAssignmentChange() | ||
| : "inconsistent states: isSimulating [" | ||
| + allocation.isSimulating() | ||
| + "] vs completeEarlyOnShardAssignmentChange [" | ||
| + balancerSettings.completeEarlyOnShardAssignmentChange() | ||
| + "]"; |
There was a problem hiding this comment.
Based on the production code flows, the BalancedShardsAllocator should only see simulating RoutingAllocation when DesiredBalanceAllocator is in use. Otherwise it should always see a non-simulating RoutingAllocation. Therefore, we could potentially just use allocation.isSimulating() and avoid the need for balancerSettings.completeEarlyOnShardAssignmentChange(). But I decided to keep it for now because:
- They are conceptually separate things
- Instead of ensuring them being identical at all times, we only really need to ensure simulating
RoutingAllocationcan only be handled by a balancer withcompleteEarlyOnShardAssignmentChange() == true, i.e. what the assertion checks. - It's easy to remove
completeEarlyOnShardAssignmentChangein future when we are sure there is no any exception (production or test). For now, it feels safer with it.
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
|
@nicktindall I added a test in bd00382. It would be great if you could give the PR another look. Thanks a lot! |
…#134786)" (elastic#135476) This reverts commit 31f1810.
This PR reapply #134786 with a bug fix which previously caused many test failures. Resolves: #135406 Resolves: #135407 Resolves: #135150 Resolves: #135151 Resolves: #135194 Resolves: #135248 Resolves: #135249 Resolves: #135408 Resolves: #135473 Resolves: #135474 The following is the original commit message Instead of running simulation all the way to balance or no possible movement, this PR adds an option to finish early based on the number of relocating shards. Note this early return mechanism is enabled only when desired balance is in use, i.e. it affects simulation only. Resolves: ES-12862
Instead of running simulation all the way to balance or no possible movement, this PR adds an option to finish early based on the number of relocating shards. Note this early return mechanism is enabled only when desired balance is in use, i.e. it affects simulation only.
Resolves: ES-12862