Fail engine if hit document failure on replicas #43523
dnhatn merged 14 commits into elastic:master from
Conversation
We should not generate Noops for failed indexing operations on replicas or followers.
Pinging @elastic/es-distributed
ywelsch left a comment
This is a tricky PR. We want to make sure we're not recording an operation as failed in the translog when we fail to add it to Lucene on a replica. Instead, we let the failure bubble up to the primary so that it can fail the replica. We could also consider this as a fatal failure, and directly fail the shard once indexing into Lucene fails.
The case we also need to consider is when we replay from the translog to Lucene on recovery from store. Should we then also fail the primary if we fail to replay the operation? This could mean that the primary is unrecoverable, e.g. because of some incompatibility introduced during an upgrade. If we're lenient there, however, it brings the risk of primary and replica going out of sync (if we let the replica locally recover up to global checkpoint). Perhaps we could allow a way for the shard to be recovered with a force command, which changes the history uuid. I think we need a more comprehensive plan here.
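The decision being debated above can be captured as a small predicate. The following is a minimal, self-contained sketch; the names, the `Origin` values, and the choice to flag only replica-origin operations (matching the author's later "operations on replicas only" update in this thread) are assumptions modeled on the discussion, not the actual Elasticsearch code:

```java
// Toy sketch (not the real Elasticsearch implementation): decide whether a
// document-level indexing failure may be absorbed as a NoOp on this shard,
// or must instead be treated as tragic and fail the engine.
enum Origin { PRIMARY, REPLICA, PEER_RECOVERY, LOCAL_TRANSLOG_RECOVERY }

final class DocumentFailurePolicy {
    // On a replica, recording the failure as a NoOp would let primary and
    // replica histories diverge, so the failure is treated as tragic.
    // Whether translog-replay origins should also be included was left
    // open in the discussion, so they are out of scope here.
    static boolean treatDocumentFailureAsTragicError(Origin origin) {
        return origin == Origin.REPLICA;
    }
}
```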
@ywelsch I've updated this PR to proceed with operations on replicas only. Can you please take a look? Thank you!
```diff
     return new IndexResult(plan.versionForIndexing, index.primaryTerm(), index.seqNo(), plan.currentNotFoundOrDeleted);
 } catch (Exception ex) {
-    if (indexWriter.getTragicException() == null) {
+    if (treatDocumentFailureAsTragicError(index) == false && indexWriter.getTragicException() == null) {
```
Should we treat AlreadyClosedException specially here as well (same as when we index a deletion or noop tombstone)?
We should not have special treatment for AlreadyClosedException here. If the engine was failed and closed by another thread, it's perfectly fine to bubble up the AlreadyClosedException. In fact, we should bubble it up so we can detect situations where the engine is in a buggy state.
However, we probably should call maybeFailEngine instead of failEngine if the exception is an AlreadyClosedException, to avoid an unnecessary warning log if the engine was failed already.
I think we should not try to wrap an AlreadyClosedException into an IndexResult, as we might then write it to the translog during closing.
```diff
 try {
-    maybeFailEngine("index", e);
+    if (treatDocumentFailureAsTragicError(index)) {
+        failEngine("index", e);
```
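Putting the two diff hunks and the review feedback together, the intended behaviour can be modeled by a self-contained toy class. The names mirror the snippets above but everything here is an illustrative assumption, not the actual Elasticsearch code; note the failure reason includes the document id, as requested in this review:

```java
// Toy model of the catch-block behaviour under review: a document failure
// on a replica fails the whole engine (with the document id in the reason
// string), while on a primary it is absorbed and reported per-document.
// Not the real Elasticsearch implementation.
final class EngineSketch {
    enum Origin { PRIMARY, REPLICA }

    boolean failed = false;
    String failureReason = null;

    boolean treatDocumentFailureAsTragicError(Origin origin) {
        return origin == Origin.REPLICA;
    }

    void failEngine(String reason, Exception cause) {
        failed = true;
        failureReason = reason;
    }

    /** Returns true if the failure was absorbed as a per-document result. */
    boolean onDocumentFailure(Origin origin, String docId, Exception e) {
        if (treatDocumentFailureAsTragicError(origin)) {
            // Include the document id so the failure is easier to diagnose.
            failEngine("index id[" + docId + "] origin[" + origin + "]", e);
            return false;
        }
        return true; // primary: record the failure in the IndexResult instead
    }
}
```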
Can we add more info about the document to the "reason" string?
I meant some info about the document itself, e.g. the id of the document (this could help in figuring out why the given failure happened).
Thanks @ywelsch.
Backport of elastic/elasticsearch#43523 (cherry picked from commit 9929cb2)
Conflicts:
- blackbox/docs/appendices/release-notes/unreleased.rst
- es/es-server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
Backport of elastic/elasticsearch#43523 (cherry picked from commit 9929cb2)
An indexing operation on a replica should never fail after it was successfully indexed on the primary. Hence, we should fail the engine if we hit any failure (document-level or tragic) while processing an indexing operation on a replica.
Relates #43228
Closes #40435 (see #40435 (comment)).