[7.x] Add support for peer recoveries using snapshots after primary failovers #79137

Merged
fcofdez merged 2 commits into elastic:7.x from fcofdez:recover-seqno-snapshot-7.x
Oct 14, 2021

@fcofdez fcofdez commented Oct 14, 2021

This commit adds support for peer recoveries using snapshots after
a primary failover when the snapshot has the same logical contents
as the source shard but different physical files. It compares the
seq no information stored in the snapshot against the seq nos of the
current source node's shard copy to decide whether the snapshot can
be used to recover the shard. Since the underlying index files
differ from the source index files, error handling differs from the
case where the files are shared. Here, if an error occurs while
snapshot files are being recovered, we have to cancel the ongoing
downloads, wait until all in-flight operations complete, remove
the recovered files and start from scratch using a fallback
recovery plan that uses the files from the source node.
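
For readers who want the control flow at a glance, here is a minimal
Python sketch of the decision and the fallback described above. It is
not the code in this change: `SeqNoInfo`, `RecoveryTarget`,
`can_use_snapshot` and `recover` are hypothetical names, and treating
"same logical contents" as equality of max seq no and local checkpoint
is an assumption made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class SeqNoInfo:
    """Hypothetical summary of the seq no state of a shard copy or snapshot."""
    max_seq_no: int
    local_checkpoint: int


def can_use_snapshot(source: SeqNoInfo, snapshot: Optional[SeqNoInfo]) -> bool:
    # Assumption: equal seq no info means the snapshot holds the same
    # operations as the source, even though the physical files differ.
    return snapshot is not None and snapshot == source


@dataclass
class RecoveryTarget:
    """Hypothetical stand-in for the recovering shard; records cleanup steps."""
    files: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)

    def cancel_downloads(self) -> None:
        self.steps.append("cancel")

    def await_in_flight(self) -> None:
        self.steps.append("await")

    def delete_recovered_files(self) -> None:
        self.files.clear()
        self.steps.append("delete")


def recover(
    target: RecoveryTarget,
    source: SeqNoInfo,
    snapshot: Optional[SeqNoInfo],
    restore_from_snapshot: Callable[[RecoveryTarget], None],
    copy_from_source: Callable[[RecoveryTarget], None],
) -> str:
    """Try the snapshot first; on failure clean up and use the source files."""
    if can_use_snapshot(source, snapshot):
        try:
            restore_from_snapshot(target)
            return "snapshot"
        except OSError:
            # Snapshot files cannot be mixed with source files, so undo
            # everything before falling back: cancel, wait, then delete.
            target.cancel_downloads()
            target.await_in_flight()
            target.delete_recovered_files()
    # Fallback plan: recover using the files from the source node.
    copy_from_source(target)
    return "source"
```

The cleanup order in the sketch mirrors the description: files are
deleted only after in-flight operations have finished, so nothing can
write into the directory while it is being cleared.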

Relates #73496
Backport of #77420

@fcofdez fcofdez merged commit 115d681 into elastic:7.x Oct 14, 2021