Require [articles] setting in elision filter#43083
Require [articles] setting in elision filter#43083romseygeek merged 8 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-search |
|
If it requires setting, shouldn't elision filter provider be wrapped in Also - why are those settings required for (I do not have any elision specific settings in my index and everything works fine there) |
It should, thanks for catching!
Are you sure you're actually using the elision filter during indexing? If I set things up without any settings, then I get NPEs during indexing: |
|
Aha, but there's a prebuilt elision filter that by default uses French articles, so if you don't define any filters at all but still add |
Yes, that's how it's used in my index. Sorry for imprecise description. |
|
I opened #43568 as a follow-up to deal with pre-configured filters in the Analyze API |
cbuescher
left a comment
There was a problem hiding this comment.
@romseygeek change LGTM, but can you briefly clarify how this will be handled when backporting to 7.3? The way I currently understand your comments here is that there an elision filter without an explicite definition in the settings will still use the built-in "french" one? Just want to make sure we're not breaking existing mappings, even if they might be silently behave buggy, on a minor when backporting.
That's correct. The only change we're making here is to capture a mis-configured filter early - currently, if you set up an elision filter in an explicit mapping but don't give it an articles set, then it will throw a NullPointerException when it is first used. Now, you'll get an error when you create the mapping. |
jimczi
left a comment
There was a problem hiding this comment.
Thanks @romseygeek , I left a comment
|
|
||
| ElisionTokenFilterFactory(IndexSettings indexSettings, Environment env, String name, Settings settings) { | ||
| super(indexSettings, name, settings); | ||
| if (settings.get("articles") == null) { |
There was a problem hiding this comment.
It's hidden but it's also possible to set articles_path to reference a file so we should also check this alternative.
I am not sure that a lot of users are aware of this and I doubt that it is very useful but we should handle it for bwc.
There was a problem hiding this comment.
articles_path isn't documented, I'll add something to the tokenfilter docs about it as well.
jimczi
left a comment
There was a problem hiding this comment.
Thanks for updating, I left a minor comment but feel free to ignore.
|
|
||
| ElisionTokenFilterFactory(IndexSettings indexSettings, Environment env, String name, Settings settings) { | ||
| super(indexSettings, name, settings); | ||
| if (settings.get("articles") == null && settings.get("articles_path") == null) { |
There was a problem hiding this comment.
Can we check that parseArticles doesn't return a null value instead ?
We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002
We should throw an exception at construction time if a list of
articles is not provided, otherwise we can get random NPEs during
indexing.
Fixes #43002