Skip to content

Disk indicator refactoring#90366

Merged
gmarouli merged 9 commits intoelastic:mainfrom
gmarouli:disk-indicator-refactoring
Sep 28, 2022
Merged

Disk indicator refactoring#90366
gmarouli merged 9 commits intoelastic:mainfrom
gmarouli:disk-indicator-refactoring

Conversation

@gmarouli
Copy link
Copy Markdown
Contributor

@gmarouli gmarouli commented Sep 26, 2022

During this refactoring we introduce the DiskHealthAnalyzer which will determine the health state, the symptom, the impacts and the diagnoses.

The benefits of this class is that the constructor is using the input information to fill in data structures that answer the questions we need to determine the symptom, the impacts and the diagnosis. These features are quite complex to calculate so the simpler the code there the best it is.

We also restructured the tests to say in the test name the expected status and then the premises, blocked indices and/or status of the nodes. We introduce a comment that explains what the tests simulates and what's expected.

Finally, we change the retrieval of the indices per node to use the RoutingNodes instead of the RoutingTable directly for efficiency.

Closes #90212

@gmarouli gmarouli added :Distributed/Health Issues for the health report API v8.5.1 >refactoring labels Sep 26, 2022
@gmarouli gmarouli requested a review from andreidan September 26, 2022 15:39
@gmarouli gmarouli marked this pull request as ready for review September 26, 2022 15:39
@elasticsearchmachine elasticsearchmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Sep 26, 2022
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Copy Markdown
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refactoring this indicator Mary.
This is heading in the right direction 🚀

I left a few suggestions and questions

.filter(routing -> indices.contains(routing.index().getName()))
.map(ShardRouting::currentNodeId)
.collect(Collectors.toSet());
DiskHealthAnalyzer diskHealthAnalyzer = new DiskHealthAnalyzer(diskHealthInfoMap, blockedIndices, clusterState);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the DiskHealthAnalyzer already has the cluster state, could it look and compute the blocked indices?

We now compute them in the indicator but not use them anywhere else except passing them to DiskHealthAnalyzer.

What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, my reasoning was that the interface shows the information needed to calculate it, but we can document why we need the cluster state and leave it at that ;) .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case I will move the details too.

}

public Index[] copyIndices() {
return shardsByIndex.keySet().toArray(Index.EMPTY_ARRAY);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone do you know of a different/more efficient way of getting all the indices on a node?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone I merged this PR to move forward but please if you know about a better way to get all the indices per node, I will fix it in a new PR.

@gmarouli gmarouli requested a review from andreidan September 28, 2022 11:41
Copy link
Copy Markdown
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on this Mary. LGTM 🚀

Left a suggestion

@gmarouli gmarouli added the auto-backport Automatically create backport pull requests when merged label Sep 28, 2022
@gmarouli gmarouli merged commit 30f24d4 into elastic:main Sep 28, 2022
@gmarouli gmarouli deleted the disk-indicator-refactoring branch September 28, 2022 13:13
gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Sep 28, 2022
Introduced the DiskHealthAnalyzer which determines the health state, the symptom, the impacts, the diagnoses and the details. We also changed the retrieval of the indices per node to use the RoutingNodes instead of the RoutingTable directly for efficiency.

Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💚 Backport successful

Status Branch Result
8.5

gmarouli added a commit that referenced this pull request Sep 28, 2022
Introduced the DiskHealthAnalyzer which determines the health state, the symptom, the impacts, the diagnoses and the details. We also changed the retrieval of the indices per node to use the RoutingNodes instead of the RoutingTable directly for efficiency.

Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
@csoulios csoulios removed the v8.5.1 label Nov 1, 2022
@csoulios csoulios added the v8.5.0 label Nov 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged :Distributed/Health Issues for the health report API >refactoring Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v8.5.0 v8.6.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor Disk health indicator

4 participants