Don't assign persistent tasks to nodes shutting down#72260
Don't assign persistent tasks to nodes shutting down#72260dakrone merged 5 commits intoelastic:masterfrom
Conversation
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of nodes (candidates) that are not currently shutting down. It does not yet cancel tasks that may already be running on the nodes that are shut down, that will be added in a subsequent request. Relates to elastic#70338
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
|
Pinging @elastic/ml-core for some persistent tasks love |
henningandersen
left a comment
There was a problem hiding this comment.
LGTM, left a few minor comments to consider.
server/src/main/java/org/elasticsearch/cluster/metadata/NodesShutdownMetadata.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/persistent/PersistentTasksClusterService.java
Outdated
Show resolved
Hide resolved
.../plugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportStartDatafeedAction.java
Show resolved
Hide resolved
| assertBusy(() -> assertNotNull("expected to have candidate nodes chosen for task", candidates.get())); | ||
| // Check that the node that is not shut down is the only candidate | ||
| assertThat(candidates.get().stream().map(DiscoveryNode::getId).collect(Collectors.toSet()), contains(candidateNode)); | ||
| assertThat(candidates.get().stream().map(DiscoveryNode::getId).collect(Collectors.toSet()), not(contains(shutdownNode))); |
There was a problem hiding this comment.
Maybe also verify that the candidateNode was the chosen one by putting the nodeId from the task into taskCompleted in nodeOperation below?
There was a problem hiding this comment.
We don't actually have access to the nodeId from nodeOperation below, only the parent task id (which is just "cluster" in this case)
server/src/main/java/org/elasticsearch/cluster/metadata/NodesShutdownMetadata.java
Show resolved
Hide resolved
benwtrent
left a comment
There was a problem hiding this comment.
ML + transform stuff looks good! Excited for this API :D
| // If we already know that we can't find an ml node because all ml nodes are running at capacity or | ||
| // simply because there are no ml nodes in the cluster then we fail quickly here: | ||
| PersistentTasksCustomMetadata.Assignment assignment = getAssignment(params, clusterState); | ||
| PersistentTasksCustomMetadata.Assignment assignment = getAssignment(params, clusterState.nodes().getAllNodes(), clusterState); |
There was a problem hiding this comment.
This is tricky, it is possible that this validation passes as all the possible assigning nodes are shutting down, but we don't catch that.
The ML team might remove that last validation (awaiting_lazy_assignment) as now it is unreliable.
There was a problem hiding this comment.
I don't think the way it is now is any worse than it was before:
- Possible new scenario: validation passes because get assignment thinks an ML node is available when it's actually shutting down, hence the job opens but cannot be assigned
- Old scenario this replaces: validation passes because get assignment finds an ML node is available that is about to be shut down (but nobody apart from the operator knows), job gets assigned, then shortly afterwards the node shuts down and the job cannot be assigned
In both scenarios we end up with a job that is open but cannot be assigned because the cluster doesn't have room for it. And in both scenarios the solution is to add another ML node to the cluster.
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of nodes (candidates) that are not currently shutting down. It does not yet cancel tasks that may already be running on the nodes that are shut down, that will be added in a subsequent request. Relates to elastic#70338
…72426) * Don't assign persistent tasks to nodes shutting down (#72260) This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of nodes (candidates) that are not currently shutting down. It does not yet cancel tasks that may already be running on the nodes that are shut down, that will be added in a subsequent request. Relates to #70338 * Fix transport client usage in test
This commit changes the
PersistentTasksClusterServiceto limit nodes for a task to a subset ofnodes (candidates) that are not currently shutting down.
It does not yet cancel tasks that may already be running on the nodes that are shut down, that will
be added in a subsequent request.
Relates to #70338