Skip global checkpoint sync for closed indices#41874
Conversation
|
Pinging @elastic/es-distributed |
henningandersen
left a comment
There was a problem hiding this comment.
Would it be an option to skip resync for closed indices? I think we rely on either having a perfect commit to use or doing a file based recovery anyway.
|
|
||
| @Override | ||
| protected boolean allowClosedIndices() { | ||
| return true; // it's okay to sync global checkpoints on closed indices |
There was a problem hiding this comment.
This looks dangerous, since ReadOnlyEngine.syncTransLog() does nothing. So we will accept the global checkpoint on replica, but it will not be persisted.
Looks like RecoveryTarget.finalizeRecovery also relies on that method, so maybe we should implement it. I think this means that the global checkpoint is not persisted, but the replica is marked in sync. If this replica then later becomes primary after for instance a full restart, it could have an old global checkpoint and would fail the shard? Probably a different issue/PR (if I am right).
|
@henningandersen Thanks for looking.
Yes, we can skip as resync is a noop for closed indices. I explored this option then prefer other option to avoid adding one more exception for closed indices. I can make this change again - it's pretty contained.
We can add an assertion to make sure that the persisted global checkpoint equals to the in-memory value. |
|
@henningandersen I discussed this with Tanguy and Yannick during GAH. We need to keep resync; otherwise, replicas won't see the new primary term. I updated this PR to skip global checkpoint sync for closed indices. Can you please take another look? Thank you! |
|
Thanks @henningandersen @ywelsch. |
The verifying-before-close step ensures the global checkpoints on all shard copies are in sync; thus, we don' t need to sync global checkpoints for closed indices. Relate #33888
The verifying-before-close step ensures the global checkpoints on all shard copies are in sync; thus, we don' t need to sync global checkpoints for closed indices. Relate elastic#33888
The verifying-before-close step ensures the global checkpoints on all shard copies are in sync; thus, we don' t need to sync global checkpoints for closed indices.
Relate #33888 (comment)