Use static empty store files metadata by DaveCTurner · Pull Request #84034 · elastic/elasticsearch

DaveCTurner · 2022-02-16T13:48:18Z

In a large cluster we expect most nodes not to have a copy of most
shards, but today during replica shard allocation we create a new (and
nontrivial) object for each node that has no copy of a shard. With this
commit we check at deserialization time whether the response is empty
and, if so, avoid the unnecessary instantiation.

Relates #77466

In a large cluster we expect most nodes not to have a copy of most shards, but today during replica shard allocation we create a new (and nontrivial) object for each node that has no copy of a shard. With this commit we check at deserialization time whether the response is empty and, if so, avoid the unnecessary instantiation. Relates elastic#77466

elasticmachine · 2022-02-16T13:48:22Z

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine · 2022-02-16T13:48:42Z

Hi @DaveCTurner, I've created a changelog YAML for you.

idegtiarenko · 2022-02-16T16:00:05Z

server/src/main/java/org/elasticsearch/index/store/Store.java

-                commitUserData.put(in.readString(), in.readString());
+        public static MetadataSnapshot readFrom(StreamInput in) throws IOException {
+            final int metadataSize = in.readVInt();
+            final Map<String, StoreFileMetadata> metadata = metadataSize == 0 ? emptyMap() : new HashMap<>();


NIT: we might use org.elasticsearch.common.util.Maps#newMapWithExpectedSize to avoid resizing the map

Also may be it is worth generalizing this pattern of reading with something like:
Map<K, V> readMapFromList(Writeable.Reader<V> valueReader, Function<V, K> keyCreator) throws IOException that would internally handle map creation, initialization and sizing?

I think this is not the only place where we create map from list

++ see 488a05b

Looks like we are not properly sizing the map produced by StreamInput#readMap and StreamInput#readOrderedMap either. I'm not touching that in this PR, nor the map-from-list thing, but seems reasonable I think.

#84045 is a pr for pre-sizing maps in readMap. We could discuss if there any concerns around that topic.

I could also open a followup pr on map-from-list

…eCTurner/elasticsearch into 2022-02-16-reuse-empty-file-metadata

* upstream/master: (167 commits) Mute FrozenSearchableSnapshotsIntegTests#testCreateAndRestorePartialSearchableSnapshot Mute LdapSessionFactoryTests#testSslTrustIsReloaded Fix spotless violation from last commit Mute GeoGridTilerTestCase#testGeoGridSetValuesBoundingBoxes_UnboundedGeoShapeCellValues Small formatting clean up (elastic#84144) Always re-run Feature migrations which have encountered errors (elastic#83918) [DOCS] Clarify `orientation` usage for WKT and GeoJSON polygons (elastic#84025) Group field caps response by index mapping hash (elastic#83494) Shrink join queries in slow log (elastic#83914) TSDB: Reject the nested object fields that are configured time_series_dimension (elastic#83920) [DOCS] Remove note about partial response from Bulk API docs (elastic#84053) Allow regular data streams to be migrated to tsdb data streams. (elastic#83843) [DOCS] Fix `ignore_unavailable` parameter definition (elastic#84071) Make Metadata extend AbstractCollection (elastic#83791) Add API specs for OpenID Connect APIs Revert "Clean up for superuser role name references (elastic#83627)" (elastic#84096) Update Lucene analysis base url (elastic#84094) Avoid null threadContext in ResultDeduplicator (elastic#84093) Use static empty store files metadata (elastic#84034) Preserve context in snapshotDeletionListeners (elastic#84089) ... # Conflicts: # x-pack/plugin/rollup/build.gradle

In a large cluster we expect most nodes not to have a copy of most shards, but today during replica shard allocation we create a new (and nontrivial) object for each node that has no copy of a shard. With this commit we check at deserialization time whether the response is empty and, if so, avoid the unnecessary instantiation. Relates elastic#77466

`TransportNodesListShardStoreMetadata$StoreFilesMetadata` and `Store$MetadataSnapshot` are both morally-speaking records, and `LoadedMetadata` is really the same as `MetadataSnapshot`. This commit turns them into real records, gets rid of the unnecessary extra class, and renames some of the accessors. Spotted while working on elastic#84034

`TransportNodesListShardStoreMetadata$StoreFilesMetadata` and `Store$MetadataSnapshot` are both morally-speaking records, and `LoadedMetadata` is really the same as `MetadataSnapshot`. This commit turns them into real records, gets rid of the unnecessary extra class, and renames some of the accessors. Spotted while working on #84034

DaveCTurner added >enhancement :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.2.0 labels Feb 16, 2022

DaveCTurner requested a review from original-brownbear February 16, 2022 13:48

elasticmachine added the Team:Distributed Meta label for distributed team. label Feb 16, 2022

Update docs/changelog/84034.yaml

256215e

This was referenced Feb 16, 2022

Remove redundant response of empty result in AsyncShardFetch to avoid OOM issue #84010

Closed

Fix Large Shard Count Scalability Issues #77466

Open

idegtiarenko reviewed Feb 16, 2022

View reviewed changes

DaveCTurner added 2 commits February 16, 2022 16:08

newMapWithExpectedSize

488a05b

Merge branch '2022-02-16-reuse-empty-file-metadata' of github.com:Dav…

9577208

…eCTurner/elasticsearch into 2022-02-16-reuse-empty-file-metadata

idegtiarenko approved these changes Feb 16, 2022

View reviewed changes

DaveCTurner merged commit 1b7b2a1 into elastic:master Feb 17, 2022

DaveCTurner deleted the 2022-02-16-reuse-empty-file-metadata branch February 17, 2022 10:24

DaveCTurner mentioned this pull request Feb 24, 2022

Simplify Store$MetadataSnapshot and friends #84357

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use static empty store files metadata#84034

Use static empty store files metadata#84034
DaveCTurner merged 4 commits intoelastic:masterfrom
DaveCTurner:2022-02-16-reuse-empty-file-metadata

DaveCTurner commented Feb 16, 2022

Uh oh!

elasticmachine commented Feb 16, 2022

Uh oh!

elasticsearchmachine commented Feb 16, 2022

Uh oh!

idegtiarenko Feb 16, 2022

Uh oh!

idegtiarenko Feb 16, 2022 •

edited

Loading

Uh oh!

DaveCTurner Feb 16, 2022

Uh oh!

idegtiarenko Feb 16, 2022 •

edited

Loading

Uh oh!

idegtiarenko Feb 16, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

DaveCTurner commented Feb 16, 2022

Uh oh!

elasticmachine commented Feb 16, 2022

Uh oh!

elasticsearchmachine commented Feb 16, 2022

Uh oh!

idegtiarenko Feb 16, 2022

Choose a reason for hiding this comment

Uh oh!

idegtiarenko Feb 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DaveCTurner Feb 16, 2022

Choose a reason for hiding this comment

Uh oh!

idegtiarenko Feb 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

idegtiarenko Feb 16, 2022

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

idegtiarenko Feb 16, 2022 •

edited

Loading

idegtiarenko Feb 16, 2022 •

edited

Loading