Fix kuromoji default stoptags #26600
The order was reversed, as the expected value was given for the actual value and vice versa. This led to a confusing assertion error message:

```
FAILURE 0.04s J1 | KuromojiAnalysisTests.testPartOfSpeechFilter <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: expected different term at index 1
   > Expected: "が"
   >      but: was "おいしい"
```

when the string "が" was actually not expected.
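To illustrate why the argument order matters, here is a minimal sketch in plain Java. `describeMismatch` is a hypothetical stand-in for a test-framework assertion (it is not the helper used in `KuromojiAnalysisTests`); it only mimics how the first argument is reported as "Expected" and the second as "was".

```java
import java.util.Objects;

class AssertionOrderSketch {
    // Hypothetical stand-in for an assertion helper: the first value is
    // reported as "Expected", the second as "but: was".
    // Returns null when both values are equal (no failure).
    static String describeMismatch(String expected, String actual) {
        if (Objects.equals(expected, actual)) {
            return null;
        }
        return "Expected: \"" + expected + "\" but: was \"" + actual + "\"";
    }

    public static void main(String[] args) {
        String expectedTerm = "おいしい"; // the term the test wants at index 1
        String actualTerm = "が";         // the term the analyzer actually produced

        // Arguments swapped: the message claims "が" was expected.
        System.out.println(describeMismatch(actualTerm, expectedTerm));

        // Arguments in the intended order: the message matches the intent.
        System.out.println(describeMismatch(expectedTerm, actualTerm));
    }
}
```

With the arguments swapped, the failure message points at the wrong value, which is the confusion described above.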
* add new test which checks that part-of-speech tokens are removed when using the kuromoji_part_of_speech filter
* initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the `stoptags` are not given in the config
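The fallback described in the second bullet can be sketched roughly as follows. This is not the code from `KuromojiPartOfSpeechFilterFactory`: the class name, `resolveStopTags`, and `DEFAULT_STOP_TAGS` are hypothetical, and the two tag names are sample values for illustration only (the real defaults come from the Lucene kuromoji analyzer). A `null` list models a filter definition without a `stoptags` setting.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopTagsFallbackSketch {
    // Sample defaults for illustration only (assumption, not the real default list).
    static final Set<String> DEFAULT_STOP_TAGS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("助詞-格助詞-一般", "助動詞")));

    // Use the configured stop-tags when present, otherwise fall back to the defaults.
    // configured == null means "stoptags" was not given in the config.
    static Set<String> resolveStopTags(List<String> configured) {
        Set<String> stopTags = new HashSet<>();
        if (configured != null) {
            stopTags.addAll(configured);
        } else {
            stopTags.addAll(DEFAULT_STOP_TAGS);
        }
        return stopTags;
    }

    public static void main(String[] args) {
        // No "stoptags" configured: the defaults apply.
        System.out.println(resolveStopTags(null));
        // Explicit "stoptags": only the configured tags apply.
        System.out.println(resolveStopTags(Arrays.asList("助動詞")));
    }
}
```

Without the `else` branch the set stays empty when nothing is configured, so no part-of-speech tokens are filtered out.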
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
@elasticmachine test this please
@johtani looks like you might be the right person to look at this. Please un-assign yourself if not.
johtani left a comment
LGTM
@cbuescher This PR is a bug fix, so we can cherry-pick it to 6.x, right?
@johtani if it is a bug fix and doesn't change existing behaviour, I'd even pick it to 6.0. Looks low risk to me, do you agree?
@cbuescher I agree with you.
@avdv thanks a lot for this fix, I will merge this to master and the current 6.x branches.
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the `stoptags` are not given in the config. Also adding a test which checks that part-of-speech tokens are removed when using the kuromoji_part_of_speech filter.
Thanks, @cbuescher and @johtani! Any chance of getting this into 5.6.x, too?
@avdv you are right, since this is a bugfix, I think we should also merge to the 5.6 branch. Will do so. Not sure, though, which of the next minor releases it is going to end up in.
* master:
  fix testSniffNodes to use the new error message
  Add check for invalid index in WildcardExpressionResolver (elastic#26409)
  Docs: Use single-node discovery.type for dev example
  Filter unsupported relation for range query builder (elastic#26620)
  Fix kuromoji default stoptags (elastic#26600)
  [Docs] Add description for missing fields in Reindex/Update/Delete By Query (elastic#26618)
  [Docs] Update ingest.asciidoc (elastic#26599)
  Better message text for ResponseException
  [DOCS] Remove edit link from ML node
  enable bwc testing
  fix StartRecoveryRequestTests.testSerialization
  Add bad_request to the rest-api-spec catch params (elastic#26539)
  Introduce a History UUID as a requirement for ops based recovery (elastic#26577)
  Add missing catch arguments to the rest api spec (elastic#26536)
Fixes #26519