Allow engine to recover from translog upto a seqno by dnhatn · Pull Request #33032 · elastic/elasticsearch

dnhatn · 2018-08-21T17:37:58Z

This change allows an engine to recover from its local translog up to
a given seqno. The extended API can be used in these use cases:

1. When a replica starts following a new primary, it resets its index to
   the safe commit, then replays its local translog up to the current
   global checkpoint (see Reset replica engine before primary-replica resync #32867).
2. When a replica starts a peer-recovery, it can initialize the
   start_sequence_number to the persisted global checkpoint instead of the
   local checkpoint of the safe commit. The replica will then replay its
   local translog up to that global checkpoint before accepting remote
   translog from the primary. This change will increase the chance of
   operation-based recovery. I will implement this in a follow-up.
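
To make the "replay up to a seqno" idea concrete, here is a minimal sketch, not the engine code from this PR. `ReplayOp` and `RecoverUpToSketch` are hypothetical names, and the bound is treated as inclusive, as requested in the review below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a translog operation; only the sequence number matters here.
final class ReplayOp {
    final long seqNo;
    ReplayOp(long seqNo) { this.seqNo = seqNo; }
}

final class RecoverUpToSketch {
    // Replays every operation whose seqNo is <= recoverUpToSeqNo (inclusive bound)
    // and returns the applied sequence numbers in translog order.
    static List<Long> replay(List<ReplayOp> localTranslog, long recoverUpToSeqNo) {
        List<Long> applied = new ArrayList<>();
        for (ReplayOp op : localTranslog) {
            if (op.seqNo <= recoverUpToSeqNo) {
                applied.add(op.seqNo); // a real engine would index or delete here
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // Operations may appear out of order in the translog.
        List<ReplayOp> translog = Arrays.asList(
            new ReplayOp(0), new ReplayOp(1), new ReplayOp(3), new ReplayOp(2), new ReplayOp(4));
        long globalCheckpoint = 2; // assumed value for illustration
        System.out.println(replay(translog, globalCheckpoint)); // prints [0, 1, 2]
    }
}
```

Passing `Long.MAX_VALUE` as the bound replays everything, which corresponds to the previous unbounded behaviour.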

Relates #32867

/cc @bleskes

elasticmachine · 2018-08-21T17:37:59Z

Pinging @elastic/es-distributed

ywelsch

I've left a few comments

ywelsch · 2018-08-21T20:33:57Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java


    /**
-     * Performs recovery from the transaction log.
+     * Performs recovery from the transaction log up to {@code recoverUpToSeqNo}.


can you add that it's an inclusive bound? (i.e. up to XYZ included)

ywelsch · 2018-08-21T20:39:48Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

    }

-    public Snapshot newSnapshotFromGen(long minGeneration) throws IOException {
+    public Snapshot newSnapshotFromGen(TranslogGeneration fromGeneration, long upToSeqNo) throws IOException {


why did you change this to take a TranslogGeneration with the uuid instead of just the long minGeneration? It's not using that uuid anywhere here AFAICS.

If both parameters are longs, this method might be read as taking either a range of translog generations or a range of sequence numbers. I changed the first parameter to TranslogGeneration to avoid this ambiguity. I will revert this change if you don't like it.

I think it's good though. Less likely to be misused.

ok, makes sense.
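
The ambiguity discussed above can be illustrated with a small sketch. `GenerationRef` is a hypothetical stand-in for `TranslogGeneration`, and both methods below are illustrative only, not the actual `Translog` API.

```java
// Hypothetical stand-in for TranslogGeneration: a translog UUID plus a file generation.
final class GenerationRef {
    final String translogUUID;
    final long fileGeneration;
    GenerationRef(String translogUUID, long fileGeneration) {
        this.translogUUID = translogUUID;
        this.fileGeneration = fileGeneration;
    }
}

final class SnapshotSignatureSketch {
    // Two longs: nothing stops a caller from swapping the arguments.
    static String ambiguous(long minGeneration, long upToSeqNo) {
        return "gen>=" + minGeneration + ", seqNo<=" + upToSeqNo;
    }

    // Distinct types: swapping the arguments no longer compiles.
    static String typed(GenerationRef fromGeneration, long upToSeqNo) {
        return "gen>=" + fromGeneration.fileGeneration + ", seqNo<=" + upToSeqNo;
    }

    public static void main(String[] args) {
        // Arguments swapped by mistake; the compiler accepts it.
        System.out.println(ambiguous(100L, 3L));
        // The wrapper type makes the intent explicit at the call site.
        System.out.println(typed(new GenerationRef("uuid", 3L), 100L));
    }
}
```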

ywelsch · 2018-08-21T20:46:15Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

+            return new Snapshot() {
+                int skippedOps = 0;
+                @Override
+                public int totalOperations() {


also override overriddenOperations and delegate to snapshot.overriddenOperations?

good catch!

ywelsch · 2018-08-21T21:13:37Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

+            if (upToSeqNo == Long.MAX_VALUE) {
+                return snapshot;
+            }
+            return new Snapshot() {


I think we should have a proper (top-level) class for this, supporting both min and max. min would be useful for newSnapshotFromMinSeqNo (see e.g. PrimaryReplicaResyncer, which still has to filter based on min = startingSeqNo, all of which could be accomplished through the Snapshot), and max would be useful for this one here (where it might also have a min).

We might even make the interface of this newSnapshot method purely sequence-number-based, where you can specify the range of operations to recover instead of the translog generation. That last part is not something I would change right away, but maybe something to look into later.

I first added a filter(Predicate<Operation>) method to Snapshot for this purpose, but it felt too broad, so I went with the filter class that you suggested.
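
A rough sketch of what such a filter class could look like, under simplifying assumptions: operations are represented only by their sequence numbers, and `OpSnapshot`, `ListSnapshot`, and `SeqNoRangeSnapshot` are hypothetical names rather than the classes added in this PR. It supports both a min and a max bound (inclusive), counts filtered operations as skipped, and delegates the other counters to the wrapped snapshot.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified snapshot interface; next() returns null when exhausted.
interface OpSnapshot {
    int totalOperations();
    int skippedOperations();
    int overriddenOperations();
    Long next();
}

// Trivial in-memory snapshot over a list of sequence numbers.
final class ListSnapshot implements OpSnapshot {
    private final Iterator<Long> it;
    private final int total;
    ListSnapshot(List<Long> seqNos) { this.it = seqNos.iterator(); this.total = seqNos.size(); }
    @Override public int totalOperations() { return total; }
    @Override public int skippedOperations() { return 0; }
    @Override public int overriddenOperations() { return 0; }
    @Override public Long next() { return it.hasNext() ? it.next() : null; }
}

// Wraps another snapshot and only returns operations with fromSeqNo <= seqNo <= toSeqNo.
final class SeqNoRangeSnapshot implements OpSnapshot {
    private final OpSnapshot delegate;
    private final long fromSeqNo;
    private final long toSeqNo;
    private int filteredOps = 0;

    SeqNoRangeSnapshot(OpSnapshot delegate, long fromSeqNo, long toSeqNo) {
        if (fromSeqNo > toSeqNo) {
            throw new IllegalArgumentException("fromSeqNo must not exceed toSeqNo");
        }
        this.delegate = delegate;
        this.fromSeqNo = fromSeqNo;
        this.toSeqNo = toSeqNo;
    }

    @Override public int totalOperations() { return delegate.totalOperations(); }
    // Operations dropped by this filter are reported as skipped.
    @Override public int skippedOperations() { return filteredOps + delegate.skippedOperations(); }
    @Override public int overriddenOperations() { return delegate.overriddenOperations(); }

    @Override public Long next() {
        Long seqNo;
        while ((seqNo = delegate.next()) != null) {
            if (seqNo >= fromSeqNo && seqNo <= toSeqNo) {
                return seqNo;
            }
            filteredOps++;
        }
        return null;
    }
}
```

For example, wrapping a snapshot over `Arrays.asList(0L, 1L, 5L, 2L, 3L, 7L)` with bounds 1 and 3 yields 1, 2, 3 and reports three skipped operations once it is exhausted.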

dnhatn · 2018-08-22T00:31:41Z

@ywelsch I've addressed your comments. Would you please have another look? Thank you!

s1monw

LGTM

ywelsch

LGTM

dnhatn · 2018-08-22T11:57:19Z

Thanks @ywelsch and @s1monw.

* 6.x: Allow engine to recover from translog upto a seqno (#33032) TEST: Skip assertSeqNos for closed shards (#33130) TEST: resync operation on replica should acquire shard permit (#33103) Add proxy support to RemoteClusterConnection (#33062) Build: Line up IDE detection logic Security index expands to a single replica (#33131) Suppress more tests HLRC: request/response homogeneity and JavaDoc improvements (#33133) [Rollup] Move toAggCap() methods out of rollup config objects (#32583) Muted all these tests due to #33128 Fix race condition in scheduler engine test

dnhatn added the >enhancement, v7.0.0, :Distributed/Engine, and v6.5.0 labels Aug 21, 2018

dnhatn requested review from s1monw and ywelsch August 21, 2018 17:37

dnhatn requested a review from jasontedor August 21, 2018 17:37

dnhatn mentioned this pull request Aug 21, 2018

Reset replica engine before primary-replica resync #32867

Closed

ywelsch suggested changes Aug 21, 2018

feedback

4ee1ebc

dnhatn requested a review from ywelsch August 22, 2018 00:31

s1monw approved these changes Aug 22, 2018

ywelsch approved these changes Aug 22, 2018

dnhatn merged commit 262d3c0 into elastic:master Aug 22, 2018

dnhatn deleted the recover-upto-seqno branch August 22, 2018 11:57

dnhatn added the backport pending label Aug 22, 2018

dnhatn removed the backport pending label Aug 26, 2018

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
