Use sequence numbers to identify out of order delivery in replicas & recovery#24060
Use sequence numbers to identify out of order delivery in replicas & recovery#24060bleskes merged 7 commits intoelastic:masterfrom
Conversation
s1monw
left a comment
There was a problem hiding this comment.
LGTM I left some minors that I would love to be addressed but no need for another round on my end.
| import static org.elasticsearch.common.lucene.uid.Versions.NOT_FOUND; | ||
|
|
||
| /** Utility class to resolve the Lucene doc ID, version, seqNo and primaryTerms for a given uid. */ | ||
| public class VersionsAndSeqNoResolver { |
There was a problem hiding this comment.
make this class final? and does it need to be public?
| * </ul> | ||
| */ | ||
| public static DocIdAndVersion loadDocIdAndVersion(IndexReader reader, Term term) throws IOException { | ||
| assert term.field().equals(UidFieldMapper.NAME); |
There was a problem hiding this comment.
oh please give us a message here what field it is... it will save some CPU cycles if shit hits the fan
| * </ul> | ||
| */ | ||
| public static DocIdAndSeqNo loadDocIdAndSeqNo(IndexReader reader, Term term) throws IOException { | ||
| assert term.field().equals(UidFieldMapper.NAME); |
| if (versionValue != null) { | ||
| if (op.seqNo() > versionValue.getSeqNo() || | ||
| (op.seqNo() == versionValue.getSeqNo() && op.primaryTerm() > versionValue.getTerm())) | ||
| status = OpVsLuceneDocStatus.OP_NEWER; |
There was a problem hiding this comment.
in general I find it odd that we have OP_STALE_OR_EQUAL instead of OP_NEWER_OR_EQUAL which is more natural then you have only one state where indexing would be fatal but I can see that this is an optimization to not index the doc again?
There was a problem hiding this comment.
yes indeed. In the case of equality it should be identical and we can skip tokenization etc.
| // a delete state and return false for the created flag in favor of code simplicity | ||
| final OpVsLuceneDocStatus opVsLucene = compareOpToLuceneDocBasedOnVersions(index); | ||
| final OpVsLuceneDocStatus opVsLucene; | ||
| if (index.seqNo() != SequenceNumbersService.UNASSIGNED_SEQ_NO) { |
There was a problem hiding this comment.
please leave a comment here when this unassigned seq no can occur... and when it can go away
There was a problem hiding this comment.
I left an extra comment on the assertion in the else clause:
// This can happen if the primary is still on an old node and send traffic without seq# or we recover from translog
// created by an old version.
assert config().getIndexSettings().getIndexVersionCreated().before(Version.V_6_0_0_alpha1_UNRELEASED) :
"index is newly created but op has no sequence numbers. op: " + index;
|
|
||
| VersionValue(long version) { | ||
| /** the seq number of the operation that last changed the associated uuid */ | ||
| private final long seqNo; |
There was a problem hiding this comment.
we can also just use the members no need to getters... I like it when it's struct like for internal things like this. WDYT?
There was a problem hiding this comment.
works for me. Less fluff is good.
|
@s1monw all your feedback is addressed. Thx. |
jasontedor
left a comment
There was a problem hiding this comment.
This PR was very easy to read; I'm glad that it was factored out of the first PR for this reason. I left some very minor comments. It LGTM and I don't need to do another review.
| * finds. */ | ||
|
|
||
| final class PerThreadIDAndVersionLookup { | ||
| final class PerThreadIDAndVersionSeqNoLookup { |
There was a problem hiding this comment.
The "and" is in an awkward place now, maybe PerThreadIDVersionAndSeqNoLookup?
| } | ||
|
|
||
| /** Return null if id is not found. */ | ||
| public DocIdAndSeqNo lookupSequenceNo(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException { |
There was a problem hiding this comment.
I think that this method can be package private? And maybe the other lookup methods in this class too?
There was a problem hiding this comment.
agreed on all accounts.
| CloseableThreadLocal<PerThreadIDAndVersionSeqNoLookup> other = lookupStates.putIfAbsent(key, ctl); | ||
| if (other == null) { | ||
| // Our CTL won, we must remove it when the | ||
| // core is closed: |
There was a problem hiding this comment.
I think that you can put this comment all on one line.
There was a problem hiding this comment.
I know it was like that before, but we are here now. 😄
| reader.addCoreClosedListener(removeLookupState); | ||
| } else { | ||
| // Another thread beat us to it: just use | ||
| // their CTL: |
There was a problem hiding this comment.
I think that you can put this comment all on one line.
There was a problem hiding this comment.
I know it was like that before, but we are here now. 😄
| } | ||
| } | ||
|
|
||
| /** returns 0 if the primary term is not found */ |
There was a problem hiding this comment.
Can you add link to IndexMetaData#primaryTerm as it has a note about the primary term on an operational shard always being strictly positive?
| LUCENE_DOC_NOT_FOUND | ||
| } | ||
|
|
||
| private OpVsLuceneDocStatus compareOpToLuceneDocBasedOnSeqNo(final Operation op) throws IOException { |
There was a problem hiding this comment.
It's so nice to finally see this arriving.
| } | ||
|
|
||
| private OpVsLuceneDocStatus compareOpToLuceneDocBasedOnSeqNo(final Operation op) throws IOException { | ||
| assert op.seqNo() != SequenceNumbersService.UNASSIGNED_SEQ_NO : "resolving ops based seq# but no seqNo is found"; |
| } | ||
|
|
||
| /** returns 0 if the primary term is not found */ | ||
| public long lookUpPrimaryTerm(int docID) throws IOException { |
| public final long version; | ||
| public final LeafReaderContext context; | ||
|
|
||
| public DocIdAndVersion(int docId, long version, LeafReaderContext context) { |
There was a problem hiding this comment.
I think that this constructor can be package private?
| public final long seqNo; | ||
| public final LeafReaderContext context; | ||
|
|
||
| public DocIdAndSeqNo(int docId, long seqNo, LeafReaderContext context) { |
There was a problem hiding this comment.
I think that this constructor can be package private?
|
Thx @jasontedor @s1monw |
* master: Use sequence numbers to identify out of order delivery in replicas & recovery (elastic#24060) Remove customization of ES_USER and ES_GROUP
Internal indexing requests in Elasticsearch may be processed out of order and repeatedly. This is important during recovery and due to concurrency in replicating requests between primary and replicas. As such, a replica/recovering shard needs to be able to identify that an incoming request contains information that is old and thus need not be processed. The current logic is based on external version. This is sadly not sufficient. This PR moves the logic to rely on sequences numbers and primary terms which give the semantics we need.
Relates to #10708