Expose translog stats in ReadOnlyEngine#43752
Conversation
|
Pinging @elastic/es-distributed |
|
@elasticmachine run elasticsearch-ci/docbldesx |
dnhatn
left a comment
There was a problem hiding this comment.
LGTM. I left two minor comments.
| --- | ||
| "Translog stats on closed indices": | ||
| - skip: | ||
| version: " - 7.2.99" |
There was a problem hiding this comment.
I think we should skip this test until 7.9.99 then adjust to this value after backporting.
There was a problem hiding this comment.
weird that this did not fail the tests? Perhaps BWC is disabled?
There was a problem hiding this comment.
Pfff... I should have seen this. And yes, bwc tests were disabled at that time. I merged master and bwc are enabled again.
| reader = wrapReader(reader, readerWrapperFunction); | ||
| searcherManager = new SearcherManager(reader, searcherFactory); | ||
| this.docsStats = docsStats(lastCommittedSegmentInfos); | ||
| this.translogStats = translogStats != null ? translogStats : translogStats(config, lastCommittedSegmentInfos); |
There was a problem hiding this comment.
Can we assert that if translogStats is null then obtainLock is true? This ensures that we don't open two translog instances at the same time.
There was a problem hiding this comment.
++ especially as opening the translog here writes a new checkpoint file.
There was a problem hiding this comment.
This is a very good suggestion, thanks
| --- | ||
| "Translog stats on closed indices": | ||
| - skip: | ||
| version: " - 7.2.99" |
There was a problem hiding this comment.
weird that this did not fail the tests? Perhaps BWC is disabled?
| - do: | ||
| cluster.health: | ||
| expand_wildcards: all | ||
| wait_for_no_initializing_shards: true |
There was a problem hiding this comment.
this is only needed for the backport.
Alternatively you could use wait_for_active_shards = 1 on the indices.close action
server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java
Outdated
Show resolved
Hide resolved
| reader = wrapReader(reader, readerWrapperFunction); | ||
| searcherManager = new SearcherManager(reader, searcherFactory); | ||
| this.docsStats = docsStats(lastCommittedSegmentInfos); | ||
| this.translogStats = translogStats != null ? translogStats : translogStats(config, lastCommittedSegmentInfos); |
There was a problem hiding this comment.
++ especially as opening the translog here writes a new checkpoint file.
ywelsch
left a comment
There was a problem hiding this comment.
LGTM. Don't forget to add the wait_for_active_shards condition on the 7.x backport.
The test IndexShardIT.testIndexCanChangeCustomDataPath() fails on 7.x and 7.3 because the translog cannot be recovered. While I can't reproduce the issue, I think it has been introduced in #43752 which changed ReadOnlyEngine so that it opens the translog in its constructor in order to load the translog stats. This opening writes a new checkpoint file, but because 7.x/7.3 does not wait for shards to be started after being closed, the test immediately starts to copy shard files to a new directory and possibly does not copy all the required translog files. By waiting for the shards to be started after being closed, we ensure that the shards (and engines) have been correctly initialized and that the translog checkpoint file is not currently being written. closes #43964
The test IndexShardIT.testIndexCanChangeCustomDataPath() fails on 7.x and 7.3 because the translog cannot be recovered. While I can't reproduce the issue, I think it has been introduced in #43752 which changed ReadOnlyEngine so that it opens the translog in its constructor in order to load the translog stats. This opening writes a new checkpoint file, but because 7.x/7.3 does not wait for shards to be started after being closed, the test immediately starts to copy shard files to a new directory and possibly does not copy all the required translog files. By waiting for the shards to be started after being closed, we ensure that the shards (and engines) have been correctly initialized and that the translog checkpoint file is not currently being written. closes #43964
This pull request changes
ReadOnlyEngineso that it loads stats from the translog when the engine is created. It also adds tests to check that translog stats are correctly exposed in Stats API for closed and frozen indices.