Speed up MappingStats Computation on Coordinating Node by original-brownbear · Pull Request #82830 · elastic/elasticsearch

original-brownbear · 2022-01-19T23:00:24Z

We can exploit the mapping deduplication logic to save deserializing the
same mapping repeatedly here. This should fix extremely long running
computations when the cache needs to be refreshed for these stats
in the common case of many duplicate mappings in a cluster.
Also, removed some confusing and needless set creation in the constructor here.

We could go even further here probably and merge the logic for analysis and mapping stats parsing into one, but it doesn't matter much. With this fix the time to get a response in a 10k indices cluster with many repeated but very large (Beats) mappings (as you would expect them to be in the real world) goes from ~10s down to sub-second in the uncached case.
This removes a very long running task from the management pool and also ensures we don't burn endless CPU responding to monitoring in clusters that go through frequent metadata updates and thus won't see all that much benefit from the caching in TransportClusterStatsAction.
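A minimal sketch of the deduplication idea described above, not the code in this PR. It assumes each index points at a mapping object that carries a content hash; `Mapping`, `sha256`, `count` and `parseFieldTypes` are hypothetical names used only for illustration. Each distinct mapping is parsed once and its contribution is weighted by the number of indices that share it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class DedupedFieldTypeCount {
    // Hypothetical stand-in for a deduplicated mapping; indices with identical
    // mappings are assumed to carry the same hash.
    record Mapping(String sha256, String source) {}

    // Counts field types across all indices while parsing each distinct mapping only once.
    static Map<String, Integer> count(List<Mapping> mappingPerIndex,
                                      Function<Mapping, List<String>> parseFieldTypes) {
        // First pass: record how many indices use each distinct mapping.
        Map<String, Integer> usage = new HashMap<>();
        Map<String, Mapping> distinct = new HashMap<>();
        for (Mapping m : mappingPerIndex) {
            usage.merge(m.sha256(), 1, Integer::sum);
            distinct.putIfAbsent(m.sha256(), m);
        }
        // Second pass: the expensive parse runs once per distinct mapping,
        // and every field type it yields is counted once per index using it.
        Map<String, Integer> counts = new HashMap<>();
        for (Mapping m : distinct.values()) {
            int indices = usage.get(m.sha256());
            for (String type : parseFieldTypes.apply(m)) {
                counts.merge(type, indices, Integer::sum);
            }
        }
        return counts;
    }
}
```

With 10k indices sharing a handful of mappings, the parse function in this sketch runs a handful of times instead of 10k times.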

relates #77466

elasticmachine · 2022-01-19T23:00:27Z

Pinging @elastic/es-search (Team:Search)

elasticmachine · 2022-01-19T23:00:28Z

Pinging @elastic/es-data-management (Team:Data Management)

henningandersen

Thanks @original-brownbear, this is a nice optimization. I think our testing may be a bit thin here though; perhaps you can add some randomized unit tests for cases of shared and non-shared mappings for both analysis-stats and mapping-stats (or maybe I missed it, happy to be pointed to it instead).

henningandersen · 2022-01-22T15:59:03Z

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/AnalysisStats.java

I would find it more intuitive to use an IdentityHashMap or just a hash-map (since hash-code/equals compare on the hash anyway). Is there a reason to use an explicit hash-value as key here?

Right ... IdentityHashMap sounds nice and simplifies the loop a little as well saving another round of lookup :)
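For reference, a minimal sketch of counting usages by object identity with `java.util.IdentityHashMap`. It assumes that deduplicated mappings are shared as the same instance across indices; the class and method names are hypothetical and not taken from the PR.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class IdentityUsageCount {
    // Counts how often each instance occurs, comparing keys with == rather than equals().
    static <T> Map<T, Integer> countByIdentity(List<T> instances) {
        Map<T, Integer> usage = new IdentityHashMap<>();
        for (T instance : instances) {
            // merge() inserts or increments without a separate containsKey/get step.
            usage.merge(instance, 1, Integer::sum);
        }
        return usage;
    }
}
```

Because keys are compared by reference, two equal but separately allocated objects are counted separately, so this only deduplicates when instances really are shared.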

henningandersen · 2022-01-22T16:00:56Z

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/AnalysisStats.java

I think we should also do ensureNotCancelled.run() here to ensure we can cancel here too.

++ adding it back
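A minimal sketch of the pattern under discussion, assuming `ensureNotCancelled` is a `Runnable` that throws once the task has been cancelled. The surrounding class and the work done per item are hypothetical.

```java
import java.util.List;

final class CancellableLoop {
    // Sums item lengths, checking for cancellation before each unit of work.
    static int sumLengths(List<String> items, Runnable ensureNotCancelled) {
        int total = 0;
        for (String item : items) {
            // Assumed to throw if the task was cancelled, which aborts the loop early.
            ensureNotCancelled.run();
            total += item.length();
        }
        return total;
    }
}
```

Checking once per iteration keeps a long-running loop responsive to cancellation at the cost of one cheap call per item.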

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/MappingStats.java

henningandersen · 2022-01-22T16:26:40Z

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/MappingStats.java

I think we can add a multiplier param to FieldScriptStats.update too to avoid the loop?

++ that's cuter :)
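A minimal sketch of the multiplier idea, not the actual `FieldScriptStats` class: the fields and the `update` signature here are hypothetical. Additive counters are scaled by the number of indices sharing the mapping, while a maximum is unaffected by repetition.

```java
final class ScriptStatsSketch {
    int count;        // number of scripts seen
    long totalLines;  // sum of lines over all scripts
    int maxLines;     // largest script seen

    // Records one script as if it had been seen `multiplier` times.
    void update(int lines, int multiplier) {
        count += multiplier;
        totalLines += (long) lines * multiplier;
        // Seeing the same value repeatedly cannot change the maximum.
        maxLines = Math.max(maxLines, lines);
    }
}
```

Calling `update(lines, n)` once leaves the object in the same state as calling `update(lines, 1)` in a loop n times.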

server/src/test/java/org/elasticsearch/action/admin/cluster/stats/MappingStatsTests.java

original-brownbear · 2022-01-22T23:03:54Z

Thanks for taking a look Henning!

perhaps you can add a bit of randomized unittests for cases of shared and non-shared mappings for both analysis-stats and mapping-stats

Right, it was quite thin indeed. We had a pretty extensive test case for the mapping stats that I reused for a non-shared test (not the most beautiful solution, but I figured it was a reasonable cost-return tradeoff).
For analyzer stats, testing was quite thin, but I extended the one test case I found to cover both the non-shared and the shared case, and it seems like that should logically cover all the things I could've broken.

henningandersen

LGTM, thanks for the extra work on the tests.

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/MappingStats.java

henningandersen · 2022-01-25T19:34:24Z

server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java

I wonder if we should just expose the size of the map, since that is all we need? Small point, but would be nice to keep this internal to this class.

Hmm we could here but I have another PR inbound that needs this map. I left it as is for now, hope that's ok.

server/src/test/java/org/elasticsearch/action/admin/cluster/stats/AnalysisStatsTests.java

server/src/test/java/org/elasticsearch/action/admin/cluster/stats/MappingStatsTests.java

elasticsearchmachine · 2022-01-25T21:05:29Z

Hi @original-brownbear, I've created a changelog YAML for you.

original-brownbear · 2022-01-26T10:44:09Z

@elasticmachine update branch

elasticsearchmachine · 2022-01-26T10:51:49Z

Hi @original-brownbear, I've created a changelog YAML for you.

original-brownbear · 2022-01-26T11:08:38Z

@elasticmachine update branch (sorry some changelog madness here)

We can exploit the mapping deduplication logic to save deserializing the same mapping repeatedly here. This should fix extremely long running computations when the cache needs to be refreshed for these stats in the common case of many duplicate mappings in a cluster. In a follow-up we can probably do the same for `AnalysisStats` as well.

elasticsearchmachine · 2022-01-26T11:09:59Z

Hi @original-brownbear, I've created a changelog YAML for you.

original-brownbear · 2022-01-26T12:42:25Z

Jenkins run elasticsearch-ci/part-2

original-brownbear · 2022-01-26T13:36:07Z

Thanks Henning!

original-brownbear added the >enhancement, :Search Foundations/Mapping, :Core/Infra/Stats, and v8.1.0 labels Jan 19, 2022

elasticmachine added the Team:Search and Team:Data Management labels Jan 19, 2022

original-brownbear mentioned this pull request Jan 20, 2022

Fix Large Shard Count Scalability Issues #77466

Open

97 tasks

original-brownbear requested review from DaveCTurner and henningandersen January 21, 2022 12:59

henningandersen reviewed Jan 22, 2022

original-brownbear requested a review from henningandersen January 22, 2022 23:03

henningandersen approved these changes Jan 25, 2022

original-brownbear removed the :Search Foundations/Mapping label Jan 26, 2022

elasticmachine removed the Team:Search label Jan 26, 2022

original-brownbear added 2 commits January 26, 2022 12:09

Update docs/changelog/82830.yaml

d1c451d

original-brownbear merged commit f2cb910 into elastic:master Jan 26, 2022

original-brownbear deleted the speed-up-mapping-stats branch January 26, 2022 13:36

joegallo mentioned this pull request Mar 24, 2022

Pushing back on index stats requests can cause ILM rollover-ready checks to pile up #85333

Open

DaveCTurner mentioned this pull request Aug 12, 2022

Cluster Stats API Slows down Considerably for Larger Clusters #79563

Open

4 participants
