Advance PRRLs to match GCP of tracked shards#43751
Conversation
This commit adjusts the behaviour of the retention lease sync to first renew any peer-recovery retention leases where either: - the corresponding shard's global checkpoint has advanced, or - the lease is older than half of its expiry time Relates elastic#41536
|
Pinging @elastic/es-distributed |
|
The "half the expiry time" heuristic is up for debate. We need some kind of margin to avoid renewing these leases on each sync (which would undo #42299), but we could choose a different number or, indeed, make it configurable. |
ywelsch
left a comment
There was a problem hiding this comment.
I've left one question. Looking good o.w.
| assert indexSettings.getIndexVersionCreated().before(Version.V_8_0_0) : indexSettings.getIndexVersionCreated(); | ||
| } | ||
| } else { | ||
| if (retentionLease.retainingSequenceNumber() <= checkpointState.globalCheckpoint |
There was a problem hiding this comment.
should we only renew if the shard is active? A non-completed recovery attempt should not qualify for a renewal
There was a problem hiding this comment.
We could but this would mean that when the recovery completes it would be an active shard with a very old lease until the next sync a few minutes later, and a failure in those few minutes could result in its lease expiring. I see no harm in renewing leases for recovering shards too.
There was a problem hiding this comment.
I hadn't thought about that situation. Yes, that would be bad. Perhaps we could delay it until we have checkpointState.tracked though (as we are only able to communicate to the shard itself in that case).
There was a problem hiding this comment.
It feels a little unnatural to restrict this even to tracked shards. We establish the lease during peer recovery before initiating tracking, so I think it makes sense to maintain it without regard to the tracking status of the shard.
However I have pushed 400794e to renew all the leases at once - if we're going to push out a new set of leases we may as well advance them all at the same time. This means that assigning a shard could cost one extra retention lease sync, to bring it up to date, and then the renewals will carry on with the usual schedule.
…-prrls-during-sync
…-prrls-during-sync
…-prrls-during-sync
|
LGTM2 |
This commit adjusts the behaviour of the retention lease sync to first renew any peer-recovery retention leases where either: - the corresponding shard's global checkpoint has advanced, or - the lease is older than half of its expiry time Relates #41536
This commit adjusts the behaviour of the retention lease sync to first renew
any peer-recovery retention leases where either:
the corresponding shard's global checkpoint has advanced, or
the lease is older than half of its expiry time
Relates #41536