Write shard state metadata as soon as shard is created / initializing #16625
@bleskes I'm not sure what effect removing this has. The issue that made me remove it is that the shard state metadata was written when the shard was created, then removed again if the shard was a recovery target, and never updated afterwards, because from the point of view of IndexShard.persistMetadata() the shard state metadata had not changed. By writing shard state metadata directly, we now know that it is up to date before we do recovery (hence no need to delete shard state?).
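To make the sequence described above concrete, here is a minimal sketch of how a "write only on change" check combined with a delete can leave the disk without a state file. Everything in it is hypothetical: the class `ShardStateSketch` and its methods are illustrative stand-ins and are not the actual `IndexShard` code.

```java
import java.util.Objects;

// Hypothetical, simplified model of the behaviour described in the comment above.
// None of these names exist in Elasticsearch.
final class ShardStateSketch {
    private String onDisk;        // allocation id stored in the state file, null if there is no file
    private String lastPersisted; // what the shard believes it last wrote

    // Writes only when the in-memory view changed ("no change, no write").
    void persistIfChanged(String allocationId) {
        if (!Objects.equals(allocationId, lastPersisted)) {
            onDisk = allocationId;
            lastPersisted = allocationId;
        }
    }

    // Deleting the file for a recovery target does not reset the in-memory view.
    void deleteOnDisk() {
        onDisk = null;
    }

    // Unconditional write, so the file is known to be up to date afterwards.
    void writeDirectly(String allocationId) {
        onDisk = allocationId;
        lastPersisted = allocationId;
    }

    String onDisk() {
        return onDisk;
    }
}
```

In this model, calling `persistIfChanged("alloc-1")`, then `deleteOnDisk()`, then `persistIfChanged("alloc-1")` again leaves no state on disk, while `writeDirectly("alloc-1")` restores it.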
Force-pushed: 9b0988e to 8b252b7
@bleskes ping
If I understand this correctly, this part is relevant where we assigned a primary after a cluster upgrade and the shard initialized (and wrote a new state file), but we never got around to activating it before crashing again. If that's correct, can you add this to the comment?
Change looks good to me. Left some suggestions and questions re testing.
Force-pushed: fe713f0 to ef3f69e
Pushed another commit addressing review comments. Also found a copy-paste bug in a test.
LGTM. Thanks @ywelsch
As we rely on active allocation ids persisted in the cluster state to select the primary shard copy, we can write shard state metadata on the allocated node as soon as the node learns that it has been assigned the shard. This also ensures that, in case of primary relocation, the shard state metadata with the correct allocation id has already been written on the relocation target by the time the master node marks the target as started. Before this change, shard state metadata was only written once the node knew that the shard was marked as started. If a failure occurred between the master marking the shard as started and the node receiving and processing this event, the relation between the shard copy on disk and the cluster state could get lost. In that case the shard had to be allocated manually using the reroute command allocate_stale_primary. Closes elastic#16625
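The failure window described in this commit message can be illustrated with a small model. This is only a sketch under simplifying assumptions: `EarlyWriteSketch`, `Policy`, and both methods are hypothetical names, the node is assumed to start with no state file for the shard, and "selecting a primary" is reduced to checking whether the on-disk allocation id is in the active set from the cluster state.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the failure window; not Elasticsearch code.
final class EarlyWriteSketch {
    enum Policy { WRITE_ON_STARTED, WRITE_ON_INITIALIZING }

    // Returns what is on disk if the node crashes after the master marked the
    // shard as started but before the node processed that event.
    static Optional<String> onDiskAfterCrash(Policy policy, String allocationId) {
        Optional<String> onDisk = Optional.empty();
        // Step 1: the node learns it has been assigned the shard (initializing).
        if (policy == Policy.WRITE_ON_INITIALIZING) {
            onDisk = Optional.of(allocationId);
        }
        // Step 2: the master marks the shard as started, so the allocation id
        // becomes part of the active allocation ids in the cluster state.
        // Step 3: the node crashes before processing the started event, so
        // WRITE_ON_STARTED never gets a chance to write anything.
        return onDisk;
    }

    // A copy can be chosen as primary only if its on-disk allocation id is active.
    static boolean canSelectAsPrimary(Optional<String> onDisk, Set<String> activeAllocationIds) {
        return onDisk.isPresent() && activeAllocationIds.contains(onDisk.get());
    }
}
```

With the active set containing `"alloc-1"`, the `WRITE_ON_STARTED` policy yields a copy that cannot be matched to the cluster state after the crash, whereas `WRITE_ON_INITIALIZING` yields one that can.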
Force-pushed: ef3f69e to d76161d
…e-metadata Write shard state metadata as soon as shard is created / initializing
Relates to #14739