Stop unnecessary retries of shard-started tasks

When a data node finishes recovering a shard, it asks the master to move the shard to state `STARTED`. Today we repeat this request every time we receive a cluster state in which the shard has not yet been marked as started:

https://github.com/elastic/elasticsearch/blob/b6fbf5a1548c1924e67c360af2b6dd8ec51508ce/server/src/main/java/org/elasticsearch/indices/cluster/IndicesClusterStateService.java#L601-L619

This behaviour means that if the master is busy processing (potentially thousands of) other `URGENT` tasks then we'll submit the same task repeatedly (potentially thousands of times). The behaviour dates back a long time but is no longer necessary: we can trust that the master will process our original request first, or else we will be notified that it failed. We should stop sending these unnecessary retries.

Relates https://github.com/elastic/elasticsearch/issues/77466

	if (shardRouting.initializing() && (state == IndexShardState.STARTED || state == IndexShardState.POST_RECOVERY)) {
	    // the master thinks we are initializing, but we are already started or on POST_RECOVERY and waiting
	    // for master to confirm a shard started message (either master failover, or a cluster event before
	    // we managed to tell the master we started), mark us as started
	    if (logger.isTraceEnabled()) {
	        logger.trace("{} master marked shard as initializing, but shard has state [{}], resending shard started to {}",
	            shardRouting.shardId(), state, nodes.getMasterNode());
	    }
	    if (nodes.getMasterNode() != null) {
	        shardStateAction.shardStarted(
	            shardRouting,
	            primaryTerm,
	            "master " + nodes.getMasterNode() + " marked shard as initializing, but shard state is [" + state +
	                "], mark shard as started",
	            shard.getTimestampRange(),
	            SHARD_STATE_ACTION_LISTENER,
	            clusterState);
	    }
	}
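
One way the repeated submissions could be suppressed is to remember which requests are still in flight and skip any resend until the outstanding one completes or fails. The following is only a minimal sketch of that idea. `InFlightDeduplicator`, `sendOnce`, and the string key are hypothetical names invented for illustration; they are not part of Elasticsearch and not necessarily how the fix should be implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical helper (not an Elasticsearch class): allows at most one in-flight request per key. */
final class InFlightDeduplicator<K> {
    private final Set<K> inFlight = ConcurrentHashMap.newKeySet();

    /**
     * Invokes {@code sender} only if no request for {@code key} is outstanding.
     * The sender is handed a callback that it must run once the request has
     * completed or failed, which makes the key eligible for sending again.
     *
     * @return true if a request was sent, false if it was suppressed as a duplicate
     */
    boolean sendOnce(K key, Consumer<Runnable> sender) {
        if (!inFlight.add(key)) {
            return false; // an identical request is already pending
        }
        try {
            sender.accept(() -> inFlight.remove(key));
        } catch (RuntimeException e) {
            inFlight.remove(key); // sending failed immediately, so allow a later retry
            throw e;
        }
        return true;
    }

    public static void main(String[] args) {
        InFlightDeduplicator<String> dedup = new InFlightDeduplicator<>();
        List<Runnable> pendingCompletions = new ArrayList<>();

        // Simulate receiving many cluster states while the master is still busy.
        int sent = 0;
        for (int i = 0; i < 1000; i++) {
            if (dedup.sendOnce("[index][0]", pendingCompletions::add)) {
                sent++;
            }
        }
        System.out.println("requests sent while pending: " + sent);

        // Simulate the master responding (or the request failing).
        pendingCompletions.get(0).run();
        System.out.println("sent again after completion: " + dedup.sendOnce("[index][0]", pendingCompletions::add));
    }
}
```

In this sketch the key is just the shard identifier. Since the comment in the existing code mentions master failover, a real key would presumably also need to identify the master (or the request contents), so that a newly elected master still receives a fresh request. That is an assumption rather than something stated in this issue.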

Stop unnecessary retries of shard-started tasks #81628
