Don't assign persistent tasks to nodes shutting down #72260

Merged
dakrone merged 5 commits into elastic:master from dakrone:persistent-task-shutdown on Apr 28, 2021

Conversation

@dakrone (Member) commented on Apr 26, 2021:

This commit changes the `PersistentTasksClusterService` to limit the nodes eligible for a task to the subset of candidate nodes that are not currently shutting down.

It does not yet cancel tasks that may already be running on nodes that are shutting down; that will be added in a subsequent pull request.

Relates to #70338
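
A minimal sketch of the core idea, using hypothetical stand-in types (`Node`, `CandidateFilterSketch`) rather than the actual Elasticsearch classes: the assignment logic only picks from nodes that are not registered for shutdown.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for DiscoveryNode; the real class carries far more state.
record Node(String id) {}

final class CandidateFilterSketch {
    // Return only the nodes not registered for shutdown. In the real service,
    // the shutting-down IDs would come from the cluster state's node-shutdown metadata.
    static List<Node> nonShutdownCandidates(List<Node> allNodes, Set<String> shuttingDownIds) {
        return allNodes.stream()
                .filter(node -> !shuttingDownIds.contains(node.id()))
                .collect(Collectors.toList());
    }
}
```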

@dakrone added the v8.0.0, :Core/Infra/Node Lifecycle, and v7.14.0 labels on Apr 26, 2021
@elasticmachine added the Team:Core/Infra label on Apr 26, 2021
@elasticmachine (Collaborator) commented:

Pinging @elastic/es-core-infra (Team:Core/Infra)

@davidkyle (Member) commented:

Pinging @elastic/ml-core for some persistent tasks love

@henningandersen (Contributor) left a comment:

LGTM, left a few minor comments to consider.

assertBusy(() -> assertNotNull("expected to have candidate nodes chosen for task", candidates.get()));
// Check that the node that is not shut down is the only candidate
assertThat(candidates.get().stream().map(DiscoveryNode::getId).collect(Collectors.toSet()), contains(candidateNode));
assertThat(candidates.get().stream().map(DiscoveryNode::getId).collect(Collectors.toSet()), not(contains(shutdownNode)));
A Contributor commented inline:

Maybe also verify that the `candidateNode` was the chosen one by putting the `nodeId` from the task into `taskCompleted` in `nodeOperation` below?

@dakrone (Member, Author) replied:

We don't actually have access to the `nodeId` from `nodeOperation` below, only the parent task id (which is just "cluster" in this case).
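
For context on the test snippet above: a test along these lines would presumably capture the candidate list handed to the task executor, for example in an `AtomicReference`, so that the assertions can inspect it. A hedged sketch of that capture pattern, with hypothetical names:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical capture pattern: the executor records the candidate node IDs it
// was offered so that test assertions can inspect them later via candidates.get().
class CapturingAssignerSketch {
    static final AtomicReference<List<String>> candidates = new AtomicReference<>();

    // Called by the (hypothetical) assignment logic with the filtered candidates;
    // returns the chosen node ID, or null when no candidate is available.
    static String assign(List<String> candidateNodeIds) {
        candidates.set(candidateNodeIds);
        return candidateNodeIds.isEmpty() ? null : candidateNodeIds.get(0);
    }
}
```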

@AthenaEryma (Contributor) left a comment:

LGTM as well!

@benwtrent (Member) left a comment:

ML + transform stuff looks good! Excited for this API :D

// If we already know that we can't find an ml node because all ml nodes are running at capacity or
// simply because there are no ml nodes in the cluster then we fail quickly here:
- PersistentTasksCustomMetadata.Assignment assignment = getAssignment(params, clusterState);
+ PersistentTasksCustomMetadata.Assignment assignment = getAssignment(params, clusterState.nodes().getAllNodes(), clusterState);
A Member commented inline:

This is tricky: it is possible for this validation to pass even though all of the possible assigning nodes are shutting down, and we don't catch that.

The ML team might remove that last validation (`awaiting_lazy_assignment`) now that it is unreliable.

A reply:

I don't think the way it is now is any worse than it was before:

  • Possible new scenario: validation passes because `getAssignment` thinks an ML node is available when it's actually shutting down, so the job opens but cannot be assigned.
  • Old scenario this replaces: validation passes because `getAssignment` finds an available ML node that is about to be shut down (but nobody apart from the operator knows), the job gets assigned, then shortly afterwards the node shuts down and the job cannot be assigned.

In both scenarios we end up with a job that is open but cannot be assigned because the cluster doesn't have room for it. And in both scenarios the solution is to add another ML node to the cluster.
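
Stepping back to the call site quoted above: it suggests the executor hook now receives the candidate nodes explicitly instead of deriving them from the cluster state alone. A simplified sketch of that shape, using stub types in place of the real Elasticsearch classes:

```java
import java.util.Collection;

// Stub types standing in for DiscoveryNode, ClusterState, and Assignment.
record NodeStub(String id) {}
record ClusterStateStub(Collection<NodeStub> allNodes) {}
record AssignmentStub(String executorNodeId, String explanation) {}

// Sketch of the hook implied by the call site: implementations choose from an
// explicit collection of candidate nodes (which the cluster service can
// pre-filter to exclude shutting-down nodes) alongside the full cluster state.
interface PersistentTaskExecutorSketch<P> {
    AssignmentStub getAssignment(P params, Collection<NodeStub> candidateNodes, ClusterStateStub clusterState);
}
```

Passing the candidates in explicitly would keep the shutdown filtering in one place (the cluster service) rather than reimplemented in every executor.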

@dakrone merged commit 0f50800 into elastic:master on Apr 28, 2021
@dakrone deleted the persistent-task-shutdown branch on April 28, 2021 at 20:00
dakrone added a commit to dakrone/elasticsearch that referenced this pull request on Apr 28, 2021, with the same description (relates to elastic#70338).
dakrone added a commit that referenced this pull request on Apr 28, 2021:

…72426)

* Don't assign persistent tasks to nodes shutting down (#72260)

* Fix transport client usage in test

Labels

:Core/Infra/Node Lifecycle, >enhancement, Team:Core/Infra, v7.14.0, v8.0.0-alpha1
