More Compact Serialization of Metadata #82608
Merged
original-brownbear merged 1 commit into elastic:master from original-brownbear:efficient-serialization-metadata-over-wire on Jan 14, 2022
Conversation
Serialize the map of hashes to mappings and then look the mappings up from that map, instead of serializing them over and over for each index, to make full cluster state transport messages much smaller in the common case of many duplicate mappings.
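The scheme can be sketched outside of Elasticsearch as follows. Everything here is illustrative: the class name, the string-based "wire format", and the use of String#hashCode as the mapping hash are stand-ins, not the actual StreamOutput-based serialization in the PR. The point is only the shape of the idea: each distinct mapping body is written once into a hash-keyed table, and each index then refers to its mapping by hash.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of hash-keyed mapping deduplication (hypothetical types,
// not Elasticsearch's actual StreamOutput/StreamInput serialization).
public class MappingDedupSketch {

    // Writes (1) the table of distinct hash -> mapping entries, then
    // (2) one hash reference per index instead of the full mapping body.
    static String serialize(Map<String, String> mappingByIndex) {
        Map<Integer, String> mappingsByHash = new LinkedHashMap<>();
        StringBuilder perIndex = new StringBuilder();
        for (Map.Entry<String, String> e : mappingByIndex.entrySet()) {
            int hash = e.getValue().hashCode(); // stand-in for the mapping hash
            mappingsByHash.putIfAbsent(hash, e.getValue());
            perIndex.append(e.getKey()).append('=').append(hash).append(';');
        }
        return mappingsByHash + "|" + perIndex;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new LinkedHashMap<>();
        String shared = "{\"properties\":{\"f\":{\"type\":\"keyword\"}}}";
        mappings.put("index-1", shared);
        mappings.put("index-2", shared); // duplicate mapping, common in practice
        String wire = serialize(mappings);
        // The shared mapping body appears exactly once in the output.
        System.out.println(wire.indexOf("keyword") == wire.lastIndexOf("keyword")); // prints "true"
    }
}
```

With many indices sharing a handful of mapping templates, the per-index cost collapses to a hash reference, which is what shrinks the full cluster state messages.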
Collaborator
Pinging @elastic/es-distributed (Team:Distributed)
idegtiarenko approved these changes on Jan 14, 2022
arteam reviewed on Jan 14, 2022
if (in.getVersion().onOrAfter(MAPPINGS_AS_HASH_VERSION)) {
    final int mappings = in.readVInt();
    if (mappings > 0) {
        final Map<String, MappingMetadata> mappingMetadataMap = new HashMap<>(mappings);
Contributor
The HashMap constructor accepts the initial capacity, not the expected number of elements. It needs to be sized somewhat higher than mappings; otherwise the map will be resized/rehashed while it is filled.
See https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Maps.java#L273
Contributor
Author
True, though I guess it might be worthwhile to have a general fix for this. We seem to always pre-size with capacity == element count in deserialization. Technically, we probably could move to accounting for the load factor, but I wouldn't expect too much from it (especially when the key's hashCode is essentially free).
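For reference, the sizing the review points at can be sketched like this. The capacityFor helper below is hypothetical, mirroring the formula in the linked Guava Maps.java: HashMap(int) sets the hash table capacity, and the map rehashes once its size exceeds capacity × load factor (0.75 by default), so the capacity has to be padded for the load factor if resizes are to be avoided.

```java
// Sketch of the capacity-vs-expected-size distinction. capacityFor is a
// hypothetical helper mirroring Guava's Maps.newHashMapWithExpectedSize:
// it picks a capacity whose resize threshold (capacity * 0.75) covers
// `expected` insertions, so the map never rehashes while being filled.
public class HashMapSizing {

    static int capacityFor(int expected) {
        if (expected < 3) {
            return expected + 1; // tiny maps: capacity of expected + 1 suffices
        }
        // Dividing by the default load factor makes the threshold >= expected.
        return (int) ((float) expected / 0.75f + 1.0f);
    }

    public static void main(String[] args) {
        // new HashMap<>(16) gets a 16-bucket table with a resize threshold of
        // 12, so inserting 16 entries forces a rehash part-way through.
        // capacityFor(16) == 22 rounds up to a 32-bucket table (threshold 24),
        // which holds all 16 entries without resizing.
        System.out.println(capacityFor(16)); // prints "22"
    }
}
```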
Contributor
Author
Thanks Ievgen!
This should make the master-node impact of requests for the full cluster state (or at least the state including mappings) quite a bit cheaper in terms of memory, CPU, and network usage. It also saves a lot of buffers on the coordinating/sending node, as well as the CPU spent deduplicating mappings.
relates #77466