Upgrade to lucene-8.0.0-snapshot-31d7dfe6b1#35224
Conversation
|
|
||
| public class LowerCaseTokenizerFactory extends AbstractTokenizerFactory implements MultiTermAwareComponent { | ||
| @Deprecated | ||
| public class XLowerCaseTokenizerFactory extends AbstractTokenizerFactory { |
| @Override | ||
| public Object getMultiTermComponent() { | ||
| return this; | ||
| return new XLowerCaseTokenizer(); |
There was a problem hiding this comment.
Can we use a LetterTokenizer followed by a LowerCaseFilter like explained in the deprecation javadocs ?
I don't think we need to keep an XLowerCaseTokenizer.
There was a problem hiding this comment.
I'd hoped to do that, but unfortunately the contract of create demands that we return a Tokenizer, and I don't think there's an easy way of wrapping a Tokenizer + TokenFilter combination here?
There was a problem hiding this comment.
Can we at least restrict the usage to old indices (created before 7.0) in order to be able to remove it in 8 ?
There was a problem hiding this comment.
As discussed with @romseygeek offline we'll handle the deprecation and removal in a follow up pr.
| tokenizers.put("classic", ClassicTokenizerFactory.class); | ||
| tokenizers.put("letter", LetterTokenizerFactory.class); | ||
| tokenizers.put("lowercase", LowerCaseTokenizerFactory.class); | ||
| // tokenizers.put("lowercase", XLowerCaseTokenizerFactory.class); |
There was a problem hiding this comment.
The tests here are explicitly checking that we can load lucene analysis classes. LowercaseTokenizer isn't there any more, so this needs to be removed - the commenting out was just to get tests to pass.
| .put("edgengram", MovedToAnalysisCommon.class) | ||
| .put("keyword", MovedToAnalysisCommon.class) | ||
| .put("letter", MovedToAnalysisCommon.class) | ||
| .put("lowercase", MovedToAnalysisCommon.class) |
There was a problem hiding this comment.
We still need this ? The tokenizer is deprecated, not yet removed ?
There was a problem hiding this comment.
Again, this is checking for lucene classes that no longer exist.
jimczi
left a comment
There was a problem hiding this comment.
@nknize I pushed a fix for the failing tests and added a norelease comment regarding the LowercaseTokenizer deprecation/removal. We'll handle the deprecation/removal in a follow up to not block this pr. I am +1 to merge as is if the CI passes
|
Thanks @jimczi I'm not familiar with the error just thrown: |
jimczi
left a comment
There was a problem hiding this comment.
I fixed the failing tests, let's see if the CI passes now.
|
Looks like a new failure. Reproduces for me.
|
|
looks like all tests passed but CentOS packaging is angry. Isn't that usually a flaky one anyway? |
…-agg * master: (528 commits) Register Azure max_retries setting (elastic#35286) add version 6.4.4 [Docs] Add painless context details for bucket_script (elastic#35142) Upgrade jline to 3.8.2 (elastic#35288) SQL: new SQL CLI logo (elastic#35261) Logger: Merge ESLoggerFactory into Loggers (elastic#35146) Docs: Add section about range query for range type (elastic#35222) [ILM] change remove-policy-from-index http method from DELETE to POST (elastic#35268) [CCR] Forgot missing return statement, SQL: Fix null handling for AND and OR in SELECT (elastic#35277) [TEST] Mute ChangePolicyForIndexIT#testChangePolicyForIndex Serialize ignore_throttled also to 6.6 after backport Check for java 11 in buildSrc (elastic#35260) [TEST] increase await timeout in RemoteClusterConnectionTests Add missing up-to-date configuration (elastic#35255) Adapt Lucene BWC version SQL: Introduce Coalesce function (elastic#35253) Upgrade to lucene-8.0.0-snapshot-31d7dfe6b1 (elastic#35224) Fix failing ICU tests (elastic#35207) Prevent throttled indices to be searched through wildcards by default (elastic#34354) ...
This PR upgrades the master branch to lucene-8.0.0-snapshot-31d7dfe6b1. There are several changes that need to be reviewed:
MultiFields.getFieldsand its impact onTermVectorsServiceis the main issue that needs reviewed /cc @romseygeekOther changes include:
TokenStreamComponentsas a final class