Recalculate balance using up-to date weights when THROTTLE#88385
Recalculate balance using up-to date weights when THROTTLE#88385idegtiarenko merged 8 commits intoelastic:masterfrom
Conversation
Add a test that checks a corner situation when currently after moving some shards around we end up with exactly the same balance as before.
|
Hi @idegtiarenko, I've created a changelog YAML for you. |
|
Pinging @elastic/es-distributed (Team:Distributed) |
| addIndex(metadataBuilder, routingTableBuilder, "index-3", 10, Map.of("node-0", 3, "node-1", 3, "node-2", 4)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-4", 10, Map.of("node-0", 4, "node-1", 3, "node-2", 3)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-5", 10, Map.of("node-0", 3, "node-1", 4, "node-2", 3)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-6", 10, Map.of("node-0", 3, "node-1", 3, "node-2", 4)); |
There was a problem hiding this comment.
I am checking if it is possible to reproduce it with a simpler setup
| assert decision.type() == Type.THROTTLE; | ||
| minNode.addShard(shard.relocate(minNode.getNodeId(), shardSize)); | ||
| return false; | ||
| return true; |
There was a problem hiding this comment.
To add a bit of context:
return false is used when exiting this method without moving a shard and as a result not updating the node weights.
This means that previously the weights here were not updated if a move was simulated. This could lead to rebalancing using outdated weights if two or more moves in a row for a single index are simulated.
There was a problem hiding this comment.
👍
The two branches of the if (decision.type() == Type.YES) test are now almost the same. Does it work to consolidate them?
There was a problem hiding this comment.
I think there's more to consolidate. How about this?
diff --git a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
index 97bdbd7367a..66c1c7f2001 100644
--- a/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
+++ b/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
@@ -1073,18 +1073,21 @@ public class BalancedShardsAllocator implements ShardsAllocator {
maxNode.removeShard(shard);
long shardSize = allocation.clusterInfo().getShardSize(shard, ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE);
- if (decision.type() == Type.YES) {
- /* only allocate on the cluster if we are not throttled */
- logger.debug("Relocate [{}] from [{}] to [{}]", shard, maxNode.getNodeId(), minNode.getNodeId());
- minNode.addShard(routingNodes.relocateShard(shard, minNode.getNodeId(), shardSize, allocation.changes()).v1());
- return true;
- } else {
- /* allocate on the model even if throttled */
- logger.debug("Simulate relocation of [{}] from [{}] to [{}]", shard, maxNode.getNodeId(), minNode.getNodeId());
- assert decision.type() == Type.THROTTLE;
- minNode.addShard(shard.relocate(minNode.getNodeId(), shardSize));
- return false;
- }
+ assert decision.type() == Type.YES || decision.type() == Type.THROTTLE : decision.type();
+ logger.debug(
+ "decision [{}]: relocate [{}] from [{}] to [{}]",
+ decision.type(),
+ shard,
+ maxNode.getNodeId(),
+ minNode.getNodeId()
+ );
+ minNode.addShard(
+ decision.type() == Type.YES
+ /* only allocate on the cluster if we are not throttled */
+ ? routingNodes.relocateShard(shard, minNode.getNodeId(), shardSize, allocation.changes()).v1()
+ : shard.relocate(minNode.getNodeId(), shardSize)
+ );
+ return true;
}
}
logger.trace("No shards of [{}] can relocate from [{}] to [{}]", idx, maxNode.getNodeId(), minNode.getNodeId());| addIndex(metadataBuilder, routingTableBuilder, "index-0", Map.of("node-0", 4, "node-1", 4, "node-2", 2)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-1", Map.of("node-0", 4, "node-1", 4, "node-2", 2)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-2", Map.of("node-0", 3, "node-1", 3, "node-2", 4)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-3", Map.of("node-0", 3, "node-1", 3, "node-2", 4)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-4", Map.of("node-0", 4, "node-1", 3, "node-2", 3)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-5", Map.of("node-0", 3, "node-1", 4, "node-2", 3)); | ||
| addIndex(metadataBuilder, routingTableBuilder, "index-6", Map.of("node-0", 3, "node-1", 3, "node-2", 4)); |
There was a problem hiding this comment.
I think this might be quite dependent on the ordering of these indices in a map. Should we randomise their names and the order in which they're added?
| assert decision.type() == Type.THROTTLE; | ||
| minNode.addShard(shard.relocate(minNode.getNodeId(), shardSize)); | ||
| return false; | ||
| return true; |
There was a problem hiding this comment.
👍
The two branches of the if (decision.type() == Type.YES) test are now almost the same. Does it work to consolidate them?
| @@ -0,0 +1,5 @@ | |||
| pr: 88385 | |||
| summary: Test balance improves after rebalancing | |||
There was a problem hiding this comment.
Can we improve the summary here? It doesn't tell me much about the bug or the fix.
Fix a bug when a shard could be re-balanced using outdated node weights in case 2 or more shards movements in a row for a single index were simulated due to THROTTLE.
Closes: #88384