Use static empty store files metadata#84034
Conversation
In a large cluster we expect most nodes not to have a copy of most shards, but today during replica shard allocation we create a new (and nontrivial) object for each node that has no copy of a shard. With this commit we check at deserialization time whether the response is empty and, if so, avoid the unnecessary instantiation. Relates elastic#77466
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
Hi @DaveCTurner, I've created a changelog YAML for you. |
| commitUserData.put(in.readString(), in.readString()); | ||
| public static MetadataSnapshot readFrom(StreamInput in) throws IOException { | ||
| final int metadataSize = in.readVInt(); | ||
| final Map<String, StoreFileMetadata> metadata = metadataSize == 0 ? emptyMap() : new HashMap<>(); |
There was a problem hiding this comment.
NIT: we might use org.elasticsearch.common.util.Maps#newMapWithExpectedSize to avoid resizing the map
There was a problem hiding this comment.
Also may be it is worth generalizing this pattern of reading with something like:
Map<K, V> readMapFromList(Writeable.Reader<V> valueReader, Function<V, K> keyCreator) throws IOException that would internally handle map creation, initialization and sizing?
I think this is not the only place where we create map from list
There was a problem hiding this comment.
++ see 488a05b
Looks like we are not properly sizing the map produced by StreamInput#readMap and StreamInput#readOrderedMap either. I'm not touching that in this PR, nor the map-from-list thing, but seems reasonable I think.
There was a problem hiding this comment.
#84045 is a pr for pre-sizing maps in readMap. We could discuss if there any concerns around that topic.
There was a problem hiding this comment.
I could also open a followup pr on map-from-list
…eCTurner/elasticsearch into 2022-02-16-reuse-empty-file-metadata
* upstream/master: (167 commits) Mute FrozenSearchableSnapshotsIntegTests#testCreateAndRestorePartialSearchableSnapshot Mute LdapSessionFactoryTests#testSslTrustIsReloaded Fix spotless violation from last commit Mute GeoGridTilerTestCase#testGeoGridSetValuesBoundingBoxes_UnboundedGeoShapeCellValues Small formatting clean up (elastic#84144) Always re-run Feature migrations which have encountered errors (elastic#83918) [DOCS] Clarify `orientation` usage for WKT and GeoJSON polygons (elastic#84025) Group field caps response by index mapping hash (elastic#83494) Shrink join queries in slow log (elastic#83914) TSDB: Reject the nested object fields that are configured time_series_dimension (elastic#83920) [DOCS] Remove note about partial response from Bulk API docs (elastic#84053) Allow regular data streams to be migrated to tsdb data streams. (elastic#83843) [DOCS] Fix `ignore_unavailable` parameter definition (elastic#84071) Make Metadata extend AbstractCollection (elastic#83791) Add API specs for OpenID Connect APIs Revert "Clean up for superuser role name references (elastic#83627)" (elastic#84096) Update Lucene analysis base url (elastic#84094) Avoid null threadContext in ResultDeduplicator (elastic#84093) Use static empty store files metadata (elastic#84034) Preserve context in snapshotDeletionListeners (elastic#84089) ... # Conflicts: # x-pack/plugin/rollup/build.gradle
In a large cluster we expect most nodes not to have a copy of most shards, but today during replica shard allocation we create a new (and nontrivial) object for each node that has no copy of a shard. With this commit we check at deserialization time whether the response is empty and, if so, avoid the unnecessary instantiation. Relates elastic#77466
`TransportNodesListShardStoreMetadata$StoreFilesMetadata` and `Store$MetadataSnapshot` are both morally-speaking records, and `LoadedMetadata` is really the same as `MetadataSnapshot`. This commit turns them into real records, gets rid of the unnecessary extra class, and renames some of the accessors. Spotted while working on elastic#84034
`TransportNodesListShardStoreMetadata$StoreFilesMetadata` and `Store$MetadataSnapshot` are both morally-speaking records, and `LoadedMetadata` is really the same as `MetadataSnapshot`. This commit turns them into real records, gets rid of the unnecessary extra class, and renames some of the accessors. Spotted while working on #84034
In a large cluster we expect most nodes not to have a copy of most
shards, but today during replica shard allocation we create a new (and
nontrivial) object for each node that has no copy of a shard. With this
commit we check at deserialization time whether the response is empty
and, if so, avoid the unnecessary instantiation.
Relates #77466