Updating index settings on a large number of indices can take many minutes and the resulting cluster state update might fail to cleanly publish without warning because of the time it takes to persist to disk.
The disk part is obvious, we write one Lucene document per index, resulting in a massive write to Lucene.
[2022-05-25T13:06:48,324][WARN ][o.e.g.PersistedClusterStateService] [elasticsearch-5] writing full cluster state took [25208ms] which is above the warn threshold of [10s]; wrote global metadata and metadata for [50007] indices
The bigger part of why a large setting update is slow though is the index metadata validation that is run for each index. This validation deserialises + reserializes the mapping for every index that got updated, which for large mappings combined with a large number of updates indices can take many minutes.
100.0% [cpu=98.6%, other=1.4%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[elasticsearch-5][masterService#updateTask][T#1]'
10/10 snapshots sharing following 26 elements
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.ObjectMapper$Builder.buildMappers(ObjectMapper.java:150)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:171)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:64)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.ObjectMapper$Builder.buildMappers(ObjectMapper.java:150)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.RootObjectMapper$Builder.build(RootObjectMapper.java:110)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.MappingParser.parse(MappingParser.java:99)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.MappingParser.parse(MappingParser.java:94)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.parseMapping(MapperService.java:370)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:347)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:337)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.indices.IndicesService.verifyIndexMetadata(IndicesService.java:810)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.metadata.MetadataUpdateSettingsService$1.execute(MetadataUpdateSettingsService.java:247)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.metadata.MetadataUpdateSettingsService.lambda$new$0(MetadataUpdateSettingsService.java:79)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.metadata.MetadataUpdateSettingsService$$Lambda$3405/0x00000008014c4400.execute(Unknown Source)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:908)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:878)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:248)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:156)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:110)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:148)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:709)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:260)
app/org.elasticsearch.server@8.3.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:223)
java.base@18.0.1.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base@18.0.1.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base@18.0.1.1/java.lang.Thread.run(Thread.java:833)
We should find a way to skip unnecessary mapping validation when nothing about the mappings has changed.
relates #77466
Updating index settings on a large number of indices can take many minutes and the resulting cluster state update might fail to cleanly publish without warning because of the time it takes to persist to disk.
The disk part is obvious, we write one Lucene document per index, resulting in a massive write to Lucene.
The bigger part of why a large setting update is slow though is the index metadata validation that is run for each index. This validation deserialises + reserializes the mapping for every index that got updated, which for large mappings combined with a large number of updates indices can take many minutes.
We should find a way to skip unnecessary mapping validation when nothing about the mappings has changed.
relates #77466