Save some RoutingNodes Instantiations#79941
Save some RoutingNodes Instantiations#79941original-brownbear merged 10 commits intoelastic:masterfrom original-brownbear:avoid-instantiating-temporary-routing-nodes
Conversation
This commit introduces the ability to create a mutable copy of `RoutingNodes` instead of having to re-compute the data structures from scratch which is much faster and allows for some reuse of immutable parts of a routing nodes instance. Also, this sets up further improvements that avoid some more redundant routing node instantiations in `RoutingAllocation`.
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
@elasticmachine update branch |
henningandersen
left a comment
There was a problem hiding this comment.
I left a few smaller comments. My main question is about the benefits of this and whether we are actually saving any significant amounts of instantiation by doing this. Do you have some data on that?
One option I was wondering about is whether the normal RoutingNodes constructor can simply be tuned enough to have equivalent performance, but it might not be viable. Did you examine that direction?
| return true; | ||
| } | ||
| final RoutingNodes expected = new RoutingNodes(routingTable, nodes); | ||
| assert routingNodes.equals(expected) |
There was a problem hiding this comment.
equals does not look like it compares all fields, for instance on ShardRouting. Also, it would be even nicer to verify that all the ShardRouting instances are the same?
But then again, that really does not change with this PR, so feel free to disregard this.
| assert invariant(); | ||
| } | ||
|
|
||
| public RoutingNode copy() { |
There was a problem hiding this comment.
Can this be package private instead?
Also, perhaps rename to mutableCopy?
There was a problem hiding this comment.
Sure, though these instances are all mutable to begin with, so maybe leave the name? :)
| this.ignoredPrimaries = ignoredPrimaries; | ||
| } | ||
|
|
||
| public UnassignedShards copyTo(RoutingNodes newNodes) { |
There was a problem hiding this comment.
Perhaps rename to copyFor, at least to me To signals that it writes into the argument.
There was a problem hiding this comment.
++ renamed
| if (version == 0 && indicesRouting.isEmpty()) { | ||
| table = EMPTY_ROUTING_TABLE; |
There was a problem hiding this comment.
I am not sure I understand when this helps us, can you elaborate and/or comment?
There was a problem hiding this comment.
Hmm this just leaked in here from the other branch. It doesn't really give a big performance boost, it's more of a cleanup and I suppose saves a couple of cycles around this code in tests :) I can back it out if it's too noisy?
There was a problem hiding this comment.
reverted :)
| RoutingAllocation routingAllocation = new RoutingAllocation( | ||
| null, | ||
| new RoutingNodes(clusterState), | ||
| new RoutingNodes(clusterState.routingTable(), clusterState.nodes()), |
There was a problem hiding this comment.
Would clusterState.getRoutingNodes() not work here too (and in the other tests using the constructor)?
There was a problem hiding this comment.
Yes, literally the only reason I didn't use it was because that would instantiate that field on the ClusterState and thus change the behavior of the test. I can't see how that would be relevant here in practice, but I didn't want to make any changes to this in this test (could have just called that method before as well).
| final RoutingAllocation routingAllocation = new RoutingAllocation( | ||
| new AllocationDeciders(List.of(decider)), | ||
| new RoutingNodes(clusterState), | ||
| new RoutingNodes(clusterState.routingTable(), clusterState.nodes()), |
There was a problem hiding this comment.
Can we use clusterState.getRoutingNodes() instead?
There was a problem hiding this comment.
Same answer as above, probably fine to make that change but I didn't want to change test behavior here.
| public RoutingNodes(RoutingTable routingTable, DiscoveryNodes discoveryNodes) { | ||
| this(routingTable, discoveryNodes, true); | ||
| } | ||
|
|
||
| public RoutingNodes(ClusterState clusterState, boolean readOnly) { | ||
| public RoutingNodes(RoutingTable routingTable, DiscoveryNodes discoveryNodes, boolean readOnly) { |
There was a problem hiding this comment.
I wonder if we should make these private and add factory methods with names signaling the read-only-ness as well as that they are only to be called by ClusterState.
There was a problem hiding this comment.
Sure moved to two separate factory methods here, looks much cleaner indeed.
…g-temporary-routing-nodes
Yes, this is one of the most expensive steps in re-route today (and becomes the most expensive step in it once other fixes in #77466 to index settings have made it in). This makes the part where we create mutable routing nodes more than 50% cheaper in many cases which is a 10% saving on the full operation even today (and a larger one relatively speaking once the other fixes hit).
I have and #77466 contains a number of improvements on that front as well, there's limits on how far optimizations can go here though without fundamentally changing things. |
…uting-nodes' into avoid-instantiating-temporary-routing-nodes
DaveCTurner
left a comment
There was a problem hiding this comment.
LGTM.
We should probably have different classes for mutable and immutable RoutingNodes, given how many of the methods don't make sense if readOnly is set.
| if (version == 0 && indicesRouting.isEmpty()) { | ||
| table = EMPTY_ROUTING_TABLE; |
| this.relocatingShards = routingNodes.relocatingShards; | ||
| this.activeShardCount = routingNodes.activeShardCount; | ||
| this.totalShardCount = routingNodes.totalShardCount; | ||
| this.attributeValuesByAttribute = new HashMap<>(routingNodes.attributeValuesByAttribute.size()); |
There was a problem hiding this comment.
This is a lazy cache of some immutable properties of the immutable set of nodes so it's never meaningfully mutated, so really we could just re-use the same map, or else instantiate it as an empty map. Also if it's only used in AwarenessAllocationDecider it should be empty on immutable instances anyway. But maybe it's nicer to explicitly copy it here anyway just for clarity.
Maybe we should move this to DiscoveryNodes so it doesn't get invalidated on every cluster state update too. Probably not a big deal, just thinking out loud here.
| public RoutingNodes mutableCopy() { | ||
| // we should not call this on mutable instances, it's still expensive to create the copy and callers should instead mutate a single | ||
| // instance | ||
| assert readOnly : "tried to create a mutable copy from a mutable instance"; |
There was a problem hiding this comment.
Could we assert this in the (private) constructor instead, just in case we inadvertently add another caller?
|
Jenkins run elasticsearch-ci/part-1 (known failure) |
|
Thanks David & Henning! |
This commit introduces the ability to create a mutable copy of
RoutingNodesinstead of having to re-compute the data structures from scratch which is much
faster and allows for some reuse of immutable parts of a routing nodes instance.
Also, this sets up further improvements that avoid some more redundant routing node
instantiations in
RoutingAllocation.relates #77466