Skip to content

Batch open-indices cluster state updates#83760

Merged
joegallo merged 13 commits intoelastic:masterfrom
joegallo:batch-open-indices
Feb 10, 2022
Merged

Batch open-indices cluster state updates#83760
joegallo merged 13 commits intoelastic:masterfrom
joegallo:batch-open-indices

Conversation

@joegallo
Copy link
Copy Markdown
Contributor

@joegallo joegallo commented Feb 9, 2022

Related to #81627 and #83432

The crux of this is that it processes the whole batch all at once -- every task succeeds or fails. If we want to do better than that, then we'll need to be more clever about shardLimitValidator.validateShardLimit(currentState, indices). At present, it takes a cluster state, and we want to avoid creating new intermediate cluster states inside this batching executor.

Here's a scenario where the difference here matters. Let's imagine three requests to open indices: 3 indices for the first request, 2 for the second, and 1 for the third (all have 1 primary, 0 replicas). The shardLimitValidator thinks we have space for four more shards. Prior to this PR, the first and third requests would have succeeded, while the second would have failed. With this batching, we could have all of them fail (if they are executed as a single batch).

If we want to avoid that, we could run internal batching, but then we're generating intermediate cluster states just to pass them off to the shardLimitValidator and then we might as well go back to using AckedClusterStateUpdateTask. Or we could rewrite the shardLimitValidator to accept something cheaper that we could build/have throughout.

Yet another approach (my personal favorite) would be to invoke the shardLimitValidator multiple times against the various subsets of possible tasks to execute (e.g. if the current cluster state accepts the shards for the first task's indices (yes), will it accept the first and second tasks' (no), how about the first and third (yes)).

These aren't used from tests or other code, so they might as well be
private.
Mostly a bunch of ceremony around moving the innermost openIndices()
call to a custom ClusterStateTaskExecutor. For now this does the
simplest thing possible and 'just' pulls all the indices from all the
requests and gloms them together (well, with de-duplication) into one
big openIndices call.
@joegallo joegallo added :Data Management/Indices APIs DO NOT USE. Use ":Distributed/Indices APIs" or ":StorageEngine/Templates" instead. v8.2.0 labels Feb 9, 2022
Copy link
Copy Markdown
Member

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left only small comments.

@DaveCTurner
Copy link
Copy Markdown
Member

FWIW I'm undecided about the ShardLimitValidator question but I do lean towards the simple fail-everything solution proposed here. The limits are supposed to be unreasonably high and in the context of ILM we close-and-open fairly fast so it's hard to hit a limit on open without already having breached it before closing. And some opens are going to have to fail if you hit the limits.

IMO it's a bit of a bug that an automatic close-and-reopen goes through this state where shards don't count towards the limit for a brief period and can therefore fail to reopen like this. We know we're reopening them soon, we should keep their spot in the cluster reserved and fail other things instead.

Copy link
Copy Markdown
Contributor

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great Joe just one addition to David's points.

FWIW I'm undecided about the ShardLimitValidator question but I do lean towards the simple fail-everything solution proposed here.

++ good enough for now IMO

@joegallo joegallo marked this pull request as ready for review February 10, 2022 14:27
@elasticmachine elasticmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Feb 10, 2022
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Copy Markdown
Member

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Data Management/Indices APIs DO NOT USE. Use ":Distributed/Indices APIs" or ":StorageEngine/Templates" instead. >enhancement Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v8.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants