Add script_filter tokenfilter #33431
romseygeek merged 6 commits into elastic:master from romseygeek:scripted-stop-filter
Pinging @elastic/es-search-aggs
  public boolean isKeyword() {
-     return isKeyword;
+     return keywordAtt.isKeyword();
rjernst left a comment
Can we use consistent terminology in the class names and token filter name? I think we should use "predicate" terminology everywhere? This will leave the namespace open for other types of scripted token filters in the future.
--------------------------------------------------
// CONSOLE

<1> This will skip tokens that are 5 characters long or less
This description would make more sense with positive logic, since the predicate is positive based. So something like:
<1> This will emit tokens that are more than 5 characters long
@rjernst so call it
How about
rjernst left a comment
Thanks @romseygeek. The new naming LGTM.
- [[analysis-scriptfilter-tokenfilter]]
- === Scripted Filtering Token Filter
+ [[analysis-predicatefilter-tokenfilter]]
+ === Predicate Token Script Filter
Token Script Filter -> Token Filter Script
This allows users to filter tokens out of a TokenStream using Painless scripts, instead of having to write specialised Java code and package it up into a plugin. The commit also refactors the AnalysisPredicateScript.Token class so that it wraps an AttributeSource and exposes it read-only.
* master: Preserve cluster settings on full restart tests (elastic#33590) Use IndexWriter.getFlushingBytes() rather than tracking it ourselves (elastic#33582) Fix upgrading of list settings (elastic#33589) Add read-only Engine (elastic#33563) HLRC: Add ML get categories API (elastic#33465) SQL: Adds MONTHNAME, DAYNAME and QUARTER functions (elastic#33411) Add predicate_token_filter (elastic#33431) Fix Replace function. Adds more tests to all string functions. (elastic#33478) [ML] Rename input_fields to column_names in file structure (elastic#33568)
* master: (91 commits) Preserve cluster settings on full restart tests (elastic#33590) Use IndexWriter.getFlushingBytes() rather than tracking it ourselves (elastic#33582) Fix upgrading of list settings (elastic#33589) Add read-only Engine (elastic#33563) HLRC: Add ML get categories API (elastic#33465) SQL: Adds MONTHNAME, DAYNAME and QUARTER functions (elastic#33411) Add predicate_token_filter (elastic#33431) Fix Replace function. Adds more tests to all string functions. (elastic#33478) [ML] Rename input_fields to column_names in file structure (elastic#33568) Add full cluster restart base class (elastic#33577) Validate list values for settings (elastic#33503) Copy and validatie soft-deletes setting on resize (elastic#33517) Test: Fix package name SQL: Fix result column names for arithmetic functions (elastic#33500) Upgrade to latest Lucene snapshot (elastic#33505) Enable not wiping cluster settings after REST test (elastic#33575) MINOR: Remove Dead Code in SearchScript (elastic#33569) [Test] Remove duplicate method in TestShardRouting (elastic#32815) mute test on windows Update beats template to include apm-server metrics (elastic#33286) ...
* support predicate_token_filter elastic/elasticsearch#33431 * add new file * fix failing unit tests
* support predicate_token_filter elastic/elasticsearch#33431 * add new file * fix failing unit tests (cherry picked from commit 6d5340b)
This will allow users to filter out terms using scripted predicates, rather than having to write Java code and wire things up via analysis plugins.
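For illustration, here is a minimal sketch of index settings that use the filter under its final name, `predicate_token_filter`. The names `my_analyzer` and `my_script_filter` are hypothetical. The script expression `token.getTerm().length() > 5` is an assumption based on the "more than 5 characters" wording suggested in the review, not a quote from the merged docs. The body would be sent when creating an index, for example with a `PUT` to an index name of your choosing.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["my_script_filter"]
        }
      },
      "filter": {
        "my_script_filter": {
          "type": "predicate_token_filter",
          "script": {
            "source": "token.getTerm().length() > 5"
          }
        }
      }
    }
  }
}
```

With settings along these lines, the analyzer should emit only tokens for which the predicate returns true and drop the rest. This matches the positive logic requested in the review.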