Provide a ClusterStateTaskExecutor implementation for simple batch execution by pxsalehi · Pull Request #90343 · elastic/elasticsearch

pxsalehi · 2022-09-26T10:14:29Z

pxsalehi · 2022-09-26T10:54:06Z

@elasticmachine run elasticsearch-ci/bwc

pxsalehi · 2022-09-27T12:36:04Z

I have also adapted five of the existing custom batch executors (those that were simply iterating over the tasks) to use the new class. There are about 24 implementations of ClusterStateTaskExecutor left. I will go through them again; maybe I can update one or two more. The rest seem to fall into two categories:

Some have low-level logic of their own. Not all of it looks complicated, and some could potentially be rewritten more simply, but I am reluctant to touch those since I am not entirely sure what the implications would be. Some do validation or keep track of the previous tasks in the batch, e.g. AutoCreateAction.
Some would have been easy to change, but they pass values to their taskContext.success() or have custom logic there that relies on values calculated during task execution, e.g. FinalizeBlocksExecutor and AddBlocksExecutor in MetadataIndexStateService.

I am not sure whether we'd need to also try to cover those cases somehow.

elasticsearchmachine · 2022-09-27T12:52:30Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner

Looks great. I left a few suggestions/comments.

DaveCTurner · 2022-09-27T12:56:32Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+     * series of executions, each taking an input cluster state and producing a new cluster state that serves as the
+     * input of the next task in the batch.
+     */
+    abstract class DefaultBatchExecutor<T extends ClusterStateTaskListener> implements ClusterStateTaskExecutor<T> {


Naming (and/or marketing) suggestion 😄

Suggested change

abstract class DefaultBatchExecutor<T extends ClusterStateTaskListener> implements ClusterStateTaskExecutor<T> {

abstract class SimpleBatchedExecutor<T extends ClusterStateTaskListener> implements ClusterStateTaskExecutor<T> {

Also I think this could reasonably be a top-level class.

DaveCTurner · 2022-09-27T12:56:53Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+         * @param curState The cluster state on which the task should be executed.
+         * @return The resulting cluster state after executing this task.
+         */
+        public abstract ClusterState executeTask(TaskContext<T> taskContext, ClusterState curState);


I don't think any implementations need access to the whole TaskContext, just the task itself.

DaveCTurner · 2022-09-27T12:57:40Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+         * @param curState The cluster state on which the task should be executed.
+         * @return The resulting cluster state after executing this task.
+         */
+        public abstract ClusterState executeTask(TaskContext<T> taskContext, ClusterState curState);


Also naming suggestion, no need to come up with a more specific name when there's only one of them:

Suggested change

public abstract ClusterState executeTask(TaskContext<T> taskContext, ClusterState curState);

public abstract ClusterState executeTask(TaskContext<T> taskContext, ClusterState clusterState);

DaveCTurner · 2022-09-27T13:00:38Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+         */
+        public ClusterState afterBatchExecution(
+            BatchExecutionContext<T> batchExecutionContext,
+            ClusterState initState,


I feel bad about giving folks access to the initial state here. I think that could lead to some unexpected behaviour. Really the only question should be whether there were any changes. WDYT about having two methods, one called if initState == curState and the other called if something changed? Either that or a boolean parameter.

I think the flag would be good since then the afterBatchExecution logic would be all in one place. If we take away the initState, DeleteDesiredNodesExecutor wouldn't fit into this. I guess it is a special "low-level" executor anyway.

In DeleteDesiredNodesExecutor we have that initState == curState so that should still be fine. I'm ok with passing in curState, it's only initState that seems potentially problematic.

so instead of

initialState().copyAndUpdateMetadata(metadata -> metadata.removeCustom(DesiredNodesMetadata.TYPE))

I can just use

clusterState.copyAndUpdateMetadata(metadata -> metadata.removeCustom(DesiredNodesMetadata.TYPE))

and they'd be the same?

Right.

Although that does raise a good point that BatchExecutionContext gives access to the initial state. And the tasks. I think we don't want that either, we should always run this method within a dropHeadersContext() and then it should only need the state.
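To make the "boolean flag instead of initState" idea from this thread concrete, here is a minimal, self-contained sketch. It does not use the real Elasticsearch types: SketchState and FlagBatchSketch are hypothetical stand-ins, and tasks are modelled as plain UnaryOperator functions purely for illustration.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a cluster state; not the Elasticsearch ClusterState class.
record SketchState(int version) {}

// Hypothetical executor skeleton illustrating the flag-based hook discussed above.
abstract class FlagBatchSketch {

    // Runs every task against the state produced by the previous task.
    final SketchState execute(SketchState initialState, List<UnaryOperator<SketchState>> tasks) {
        SketchState state = initialState;
        for (UnaryOperator<SketchState> task : tasks) {
            state = task.apply(state);
        }
        // Identity comparison: if every task returned its input instance, nothing changed.
        return afterBatchExecution(state, state != initialState);
    }

    // The hook sees only the latest state and a "changed" flag, never the initial state.
    SketchState afterBatchExecution(SketchState state, boolean stateChanged) {
        return state;
    }
}
```

With this shape, an executor that wants to do extra work only when something changed can branch on the flag, and an executor whose tasks never modify the state can simply operate on the state it is handed.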

DaveCTurner · 2022-09-27T13:01:31Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+        public ClusterState afterBatchExecution(
+            BatchExecutionContext<T> batchExecutionContext,
+            ClusterState initState,
+            ClusterState curState


Naming suggestion as above:

Suggested change

ClusterState curState

ClusterState clusterState

(at the least we should call it currentState in full, no need to make it harder for non-English speakers for the sake of those 4 chars)

DaveCTurner · 2022-09-27T13:02:35Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+         * @param taskContext The task that failed with an exception.
+         * @param e The Exception thrown by the task execution.
+         */
+        public void taskFailed(TaskContext<T> taskContext, Exception e) {


I expect this not to have any overrides, let's just hard-code it for now.

DaveCTurner · 2022-09-27T13:03:48Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+         * @param taskContext The task that successfully finished execution.
+         */
+        public void taskSucceeded(TaskContext<T> taskContext) {
+            taskContext.success(() -> {});


I expect this will almost always be overridden, and it would trappily lose a listener notification not to do so. Let's make this abstract and make implementations be explicit if they want a no-op.

DaveCTurner · 2022-09-27T13:06:58Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+            for (final var taskContext : batchExecutionContext.taskContexts()) {
+                try (var ignored = taskContext.captureResponseHeaders()) {
+                    curState = executeTask(taskContext, curState);
+                    taskSucceeded(taskContext);


Rather than asking implementations to manipulate the TaskContext themselves, I think they will almost always just want to call a method on the underlying task. Can we do this instead?

Suggested change

taskSucceeded(taskContext);

taskContext.success(() -> taskSucceeded(taskContext.getTask()));

What about when the task is itself a ClusterStateAckListener? In that case we'd have to ask the executor to implement an empty taskSucceeded and in the batch execution inspect the type of the task? Or taskSucceeded would optionally return the task? Both seem not much better. Or maybe there is a simpler way to deal with that? Or do we explicitly not want to accommodate that case?

I think we probably want a whole different executor for ack-sensitive tasks, with different callbacks. Typically these things complete their listener in onAllNodesAcked() rather than when the publication completes.

So I guess I'll just drop it from this PR.

👍 sounds good to me

DaveCTurner · 2022-09-27T13:10:32Z

server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java

+     * series of executions, each taking an input cluster state and producing a new cluster state that serves as the
+     * input of the next task in the batch.
+     */
+    abstract class DefaultBatchExecutor<T extends ClusterStateTaskListener> implements ClusterStateTaskExecutor<T> {


Could we make void clusterStatePublished(ClusterState) a final method and allow an override void clusterStatePublished() which doesn't expose the published state? Implementations shouldn't be using the published state I think.
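A rough sketch of that suggestion, assuming simplified stand-in types (PublishedState, PublishAwareExecutor and HiddenStateExecutorSketch are hypothetical names, not Elasticsearch classes): the state-carrying callback is made final and forwards to a no-argument hook, so subclasses cannot reach the published state.

```java
// Hypothetical stand-in for the published cluster state.
record PublishedState(long version) {}

// Hypothetical stand-in for the executor interface that receives the published state.
interface PublishAwareExecutor {
    default void clusterStatePublished(PublishedState newState) {}
}

abstract class HiddenStateExecutorSketch implements PublishAwareExecutor {

    // Final: subclasses cannot override the variant that exposes the published state.
    @Override
    public final void clusterStatePublished(PublishedState newState) {
        clusterStatePublished();
    }

    // Overridable hook that is told a publication happened, without seeing the state.
    public void clusterStatePublished() {}
}
```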

…erStateExecutor

DaveCTurner

Great, I like it. One tiny comment/suggestion.

...er/src/main/java/org/elasticsearch/reservedstate/service/ReservedStateErrorTaskExecutor.java

idegtiarenko · 2022-09-28T09:19:43Z

server/src/main/java/org/elasticsearch/cluster/SimpleBatchedExecutor.java

+    @Override
+    public final ClusterState execute(BatchExecutionContext<T> batchExecutionContext) throws Exception {
+        var initState = batchExecutionContext.initialState();
+        var clusterState = initState;
+        for (final var taskContext : batchExecutionContext.taskContexts()) {
+            try (var ignored = taskContext.captureResponseHeaders()) {
+                var task = taskContext.getTask();
+                clusterState = executeTask(task, clusterState);
+                taskContext.success(() -> taskSucceeded(task));
+            } catch (Exception e) {
+                taskContext.onFailure(e);
+            }
+        }
+        try (var ignored = batchExecutionContext.dropHeadersContext()) {
+            return afterBatchExecution(clusterState, clusterState != initState);
+        }
+    }
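The loop in the hunk above can be illustrated outside Elasticsearch with a small stand-alone sketch. Everything here is an assumption made for illustration: BatchState, BatchTask and BatchLoopSketch are hypothetical names, and listener notification is reduced to recording a string per task. The point it shows is that a task that throws leaves the running state untouched, and later tasks still run against the last good state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a cluster state.
record BatchState(int value) {}

// Hypothetical task type: returns a new state or throws.
interface BatchTask {
    BatchState apply(BatchState state) throws Exception;
}

final class BatchLoopSketch {
    // One entry per task, in submission order.
    final List<String> outcomes = new ArrayList<>();

    BatchState execute(BatchState initialState, List<BatchTask> tasks) {
        BatchState state = initialState;
        for (BatchTask task : tasks) {
            try {
                // The assignment only happens if the task completes without throwing.
                state = task.apply(state);
                outcomes.add("success");
            } catch (Exception e) {
                // A failed task is reported individually; the batch carries on.
                outcomes.add("failure: " + e.getMessage());
            }
        }
        return state;
    }
}
```

Note that this sketch records success immediately, whereas the real hunk defers taskSucceeded by passing it to taskContext.success(...).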


idegtiarenko · 2022-09-28T09:28:59Z

server/src/main/java/org/elasticsearch/cluster/SimpleBatchedExecutor.java

+     *
+     * @param task The task that successfully finished execution.
+     */
+    public abstract void taskSucceeded(T task);


I wonder if it is worth adding a new interface for tasks:

interface ClusterStateUpdateTask extends ClusterStateTaskListener {
    ClusterState execute(ClusterState clusterState);
    void onSuccess();
}

I believe this way it would be possible to have a single non-abstract executor implementation for most of the cases, as we already have dedicated tasks for most things.

var executor1 = new BatchExecutor(); // no need for custom after batch execution
var executor2 = new BatchExecutor((clusterState, modified) -> { /* custom after batch execution */ });

Yep that seems like a logical followup too. Except:

we should allow these tasks to pass a result from execute to onSuccess (see my other comment)

I'd rather customise afterBatchExecution by overriding a method than passing in a lambda - I find it to be clearer what the code actually does this way.

the name ClusterStateUpdateTask is already widely used for unbatched tasks, we should call it something else

I'll look into that a bit more. For now I didn't want to force anything on the tasks, just to save the boilerplate code that gets repeated whenever a basic executor is needed (which we ask everyone who submits a cluster state update task to provide).

DaveCTurner · 2022-09-28T09:29:16Z

Some would have been easy to change, but they pass around some values to their taskContext.success()

Btw I think it would be worth doing this in a follow-up. It's a pretty useful pattern, and if there's no obvious way to do it then I bet we'll get folks adding a mutable field to the task object to hold the result which is a bit yucky.

I think I'd do this by adding to the executor another type parameter TaskResult, having execute return a Tuple<ClusterState,TaskResult>, and passing the result for each task to taskSucceeded.
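One way the extra type parameter could look, as a hedged sketch rather than the design that eventually landed: ResultState, StateAndResult and ResultBatchSketch are hypothetical names, StateAndResult stands in for the Tuple mentioned above, and taskSucceeded is called immediately here instead of being deferred until publication.

```java
import java.util.List;

// Hypothetical stand-in for a cluster state.
record ResultState(int counter) {}

// Hypothetical pair type standing in for Tuple<ClusterState, TaskResult>.
record StateAndResult<R>(ResultState state, R result) {}

// T is the task type, R the per-task result type.
abstract class ResultBatchSketch<T, R> {

    // Executes one task and returns both the new state and a result for that task.
    abstract StateAndResult<R> executeTask(T task, ResultState state) throws Exception;

    // Receives the result computed by executeTask for the same task.
    abstract void taskSucceeded(T task, R result);

    void taskFailed(T task, Exception e) {}

    final ResultState execute(ResultState initialState, List<T> tasks) {
        ResultState state = initialState;
        for (T task : tasks) {
            StateAndResult<R> outcome;
            try {
                outcome = executeTask(task, state);
            } catch (Exception e) {
                taskFailed(task, e);
                continue; // keep the last good state and move on
            }
            state = outcome.state();
            taskSucceeded(task, outcome.result());
        }
        return state;
    }
}
```

Passing the result as a parameter means the task object itself does not need a mutable field to carry the value from execution to the success callback.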

I think we probably want a whole different executor for ack-sensitive tasks, with different callbacks.

I also think this is worth doing in a follow-up.

pxsalehi · 2022-09-28T09:46:38Z

Thanks David, and Ievgen. I'll follow up with more PRs.

…ecution (elastic#90343) This change provides a basic implementation for batch executors that simply need to execute the tasks in the batch iteratively, producing a cluster state after each task. This allows executing the tasks in the batch as a series of executions, each taking an input cluster state and producing a new cluster state that serves as the input of the next task in the batch.

elasticsearchmachine added the v8.6.0 label Sep 26, 2022

Reduce boilerplate code and adapt some executors

f671fb5

pxsalehi force-pushed the ps260922-batchClusterStateExecutor branch from 5765bb0 to f671fb5 Compare September 26, 2022 15:49

pxsalehi added 4 commits September 26, 2022 18:27

Add a few more methods

9d5de30

Merge remote-tracking branch 'upstream/main' into ps260922-batchClust…

22f026c

…erStateExecutor

Adapt ReservedState task executors

cba3042

Clean up and comment

45af020

pxsalehi changed the title ~~[WIP] Make it easier to write a batchable cluster state update~~ Provide a ClusterStateTaskExecutor implementation for simple batch execution Sep 27, 2022

pxsalehi marked this pull request as ready for review September 27, 2022 12:36

elasticsearchmachine added the needs:triage Requires assignment of a team area label label Sep 27, 2022

pxsalehi added 2 commits September 27, 2022 14:38

Typo

34e4043

Small corrections

288b587

pxsalehi requested a review from DaveCTurner September 27, 2022 12:40

pxsalehi added >non-issue :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. and removed needs:triage Requires assignment of a team area label labels Sep 27, 2022

elasticsearchmachine added the Team:Distributed Meta label for distributed team. label Sep 27, 2022

DaveCTurner reviewed Sep 27, 2022

pxsalehi added 5 commits September 27, 2022 18:32

address review comments

7bd3922

Merge remote-tracking branch 'upstream/main' into ps260922-batchClust…

ba2dc3f

…erStateExecutor

Do not pass the batch to afterBatchExecution

4226009

Cleanup

873d396

Change two more executors

81e2ec8

DaveCTurner approved these changes Sep 27, 2022

...er/src/main/java/org/elasticsearch/reservedstate/service/ReservedStateErrorTaskExecutor.java (outdated)

Update server/src/main/java/org/elasticsearch/reservedstate/service/R…

90ca3cc

…eservedStateErrorTaskExecutor.java Co-authored-by: David Turner <david.turner@elastic.co>

idegtiarenko reviewed Sep 28, 2022

pxsalehi merged commit 0958649 into elastic:main Sep 28, 2022

This was referenced Sep 28, 2022

Add a batched executor for tasks with results #90459

Merged

Add SimpleBatchedAckListenerTaskExecutor #90521

Merged
