Set Lucene version upon index creation.#36038
Conversation
It is important that all shards of a given index have the same `indexCreatedVersionMajor` to Lucene, or eg. merging those shards is going to be considered illegal. At the moment, we use the latest Lucene version when creating a shard, which could cause shards to have different created versions eg. in case of forced allocation. This commit makes sure to reuse the appropriate Lucene version in order to avoid such issues. Closes elastic#33826
|
Pinging @elastic/es-distributed |
| IndexWriter writer = new IndexWriter(store.directory(), new IndexWriterConfig(null) | ||
| .setSoftDeletesField(Lucene.SOFT_DELETES_FIELD) | ||
| .setOpenMode(IndexWriterConfig.OpenMode.CREATE) | ||
| .setIndexCreatedVersionMajor(version.luceneVersion.major) |
There was a problem hiding this comment.
this is not the version with which the index was created, but the version of the master node in the cluster that took the snapshot :)
I think you need to access targetShard.indexSettings().getIndexMetaData().getCreationVersion() here
| } | ||
| } else { | ||
| store.createEmpty(); | ||
| store.createEmpty(indexShard.indexSettings().getIndexVersionCreated().luceneVersion); |
There was a problem hiding this comment.
can you also adapt the logic in addIndices in this class so that it uses the indexmetadata created version when shrinking/splitting? We then have it uniformly everywhere
| List<Version> versions = DeclaredVersionsHolder.DECLARED_VERSIONS; | ||
| Version tmp = new Version(id, org.apache.lucene.util.Version.LATEST); | ||
| int index = Collections.binarySearch(versions, tmp); | ||
| if (index < 0) { |
There was a problem hiding this comment.
do we ever expect to find a match in the above collection if we end up in the default case here? Could this be assert index < 0?
There was a problem hiding this comment.
No indeed, I added an assertion.
|
Thanks @ywelsch, I pushed a new commit. |
| // clean target directory (if previous recovery attempt failed) and create a fresh segment file with the proper lucene version | ||
| Lucene.cleanLuceneIndex(target); | ||
| assert sources.length > 0; | ||
| final int luceneIndexCreatedVersionMajor = Lucene.readSegmentInfos(sources[0]).getIndexCreatedVersionMajor(); |
There was a problem hiding this comment.
maybe for uniformity pass the lucene version as parameter to the addIndices method and use indexShard.indexSettings().getIndexVersionCreated().luceneVersion, just as in all the other places?
There was a problem hiding this comment.
I felt like reusing the version of one index was safer, as the Lucene version is sometimes guessed via logic in o.e.Version. I don't feel strongly about it though so if you'd rather like that we use the version that comes from index settings, I'm fine with it.
There was a problem hiding this comment.
ok, thanks for explaining your reasoning here. Let's keep as is, then.
It is important that all shards of a given index have the same
indexCreatedVersionMajorto Lucene, or eg. merging those shards is going tobe considered illegal. At the moment, we use the latest Lucene version when
creating a shard, which could cause shards to have different created versions
eg. in case of forced allocation. This commit makes sure to reuse the
appropriate Lucene version in order to avoid such issues.
Closes #33826