Move tokenizers to analysis common module #30538
Merged
martijnvg merged 4 commits into elastic:master on May 14, 2018
The following tokenizers were moved: classic, edge_ngram, letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and whitespace. The keyword tokenizer factory was left in the server module, because normalizers directly depend on it. This should be addressed in a follow-up change. Relates to elastic#23658
Collaborator
Pinging @elastic/es-search-aggs
nik9000 approved these changes on May 11, 2018
    // see #5120
    // Note: moved from org.elasticsearch.search.query.SearchQueryIT as it relies on this module now.
    public void testNGramCopyField() {
        CreateIndexRequestBuilder builder = prepareCreate("test").setSettings(Settings.builder()
Member
I wonder if we should move both of these to yaml tests. The second one feels like the kind of yaml tests that I was adding when I moved things if my memory serves me. It has been a while....
Member
And the first one feels fairly special around copy and ngram together? But I think it can be a yaml one too without too much trouble.
Member
Author
I will move the first one to a yaml test and you're right about the second test. It already exists :), so I will just remove that one.
…test) and moved test from java integtest to yaml test and also added yaml tests for provided tokenizers
Force-pushed from 707d15e to 3a3dec9
martijnvg added a commit that referenced this pull request on May 14, 2018
The following tokenizers were moved: classic, edge_ngram, letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and whitespace. Left the keyword tokenizer factory in the server module, because normalizers directly depend on it. This should be addressed in a follow-up change. Relates to #23658
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request on May 14, 2018
…them-all
* elastic/master:
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
dnhatn added a commit that referenced this pull request on May 14, 2018
* master: Default to one shard (#30539) Unmute IndexUpgradeIT tests Forbid expensive query parts in ranking evaluation (#30151) Docs: Update HighLevelRestClient migration docs (#30544) Clients: Switch to new performRequest (#30543) [TEST] Fix typo in MovAvgIT test Add missing dependencies on testClasses (#30527) [TEST] Mute ML test that needs updating to following ml-cpp changes Document woes between auto-expand-replicas and allocation filtering (#30531) Moved tokenizers to analysis common module (#30538) Adjust copy settings versions Mute ShrinkIndexIT suite SQL: SYS TABLES ordered according to *DBC specs (#30530) Deprecate not copy settings and explicitly disallow (#30404) [ML] Improve state persistence log message Build: Add mavenPlugin cluster configuration method (#30541) Re-enable FlushIT tests Bump Gradle heap to 2 GB (#30535) SQL: Use request flavored methods in tests (#30345) Suppress hdfsFixture if there are spaces in the path (#30302) Delete temporary blobs before creating index file (#30528) Watcher: Remove TriggerEngine.getJobCount() (#30395) [ML] Fix wire BWC for JobUpdate (#30512) Use simpler write-once semantics for FS repository (#30435) Derive max composite buffers from max content len Use simpler write-once semantics for HDFS repository (#30439) SQL: Improve correctness of SYS COLUMNS & TYPES (#30418) Mute two tests in FlushIT with @AwaitsFix. Fix incorrect template name in test case Build: Remove legacy bwc files from xpack (#30485) Mute UnicastZenPingTests#testSimplePings with @AwaitsFix. Security: cleanup code in file stores (#30348) Security: fix TokenMetaData equals and hashcode (#30347) Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT. Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix. SQL: Improve compatibility with MS query (#30516) SQL: Fix parsing of dates with milliseconds (#30419)
dnhatn added a commit that referenced this pull request on May 14, 2018
* 6.x: Unmute IndexUpgradeIT tests Forbid expensive query parts in ranking evaluation (#30151) Docs: Update HighLevelRestClient migration docs (#30544) Clients: Switch to new performRequest (#30543) [TEST] Fix typo in MovAvgIT test [TEST] Mute ML test that needs updating to following ml-cpp changes Moved tokenizers to analysis common module (#30538) Document woes between auto-expand-replicas and allocation filtering (#30531) [ML] Hide internal Job update options from the REST API (#30537) Deprecate not copy settings and explicitly disallow (#30404) Mute ShrinkIndexIT suite SQL: SYS TABLES ordered according to *DBC specs (#30530) [ML] Improve state persistence log message Build: Add mavenPlugin cluster configuration method (#30541) Re-enable FlushIT tests Bump Gradle heap to 2 GB (#30535) Bump Gradle heap to 1792m (#30484) SQL: Use request flavored methods in tests (#30345) Suppress hdfsFixture if there are spaces in the path (#30302) Delete temporary blobs before creating index file (#30528) Watcher: Remove TriggerEngine.getJobCount() (#30395) Use simpler write-once semantics for FS repository (#30435) Use simpler write-once semantics for HDFS repository (#30439) SQL: Improve correctness of SYS COLUMNS & TYPES (#30418) Mute two tests in FlushIT with @AwaitsFix. Fix incorrect template name in test case Build: Remove legacy bwc files from xpack (#30485) Security: Simplify security index listeners (#30466) Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix. Add proper longitude validation in geo_polygon_query (#30497) Mute UnicastZenPingTests#testSimplePings with @AwaitsFix. Security: cleanup code in file stores (#30348) Security: fix TokenMetaData equals and hashcode (#30347) Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT. Fix incorrect merged entry in changelog SQL: Improve compatibility with MS query (#30516) SQL: Fix parsing of dates with milliseconds (#30419)
martijnvg added a commit that referenced this pull request on May 15, 2018
* es/ccr: (37 commits) Default to one shard (#30539) Unmute IndexUpgradeIT tests Forbid expensive query parts in ranking evaluation (#30151) Docs: Update HighLevelRestClient migration docs (#30544) Clients: Switch to new performRequest (#30543) [TEST] Fix typo in MovAvgIT test Add missing dependencies on testClasses (#30527) [TEST] Mute ML test that needs updating to following ml-cpp changes Document woes between auto-expand-replicas and allocation filtering (#30531) Moved tokenizers to analysis common module (#30538) Adjust copy settings versions Mute ShrinkIndexIT suite SQL: SYS TABLES ordered according to *DBC specs (#30530) Deprecate not copy settings and explicitly disallow (#30404) [ML] Improve state persistence log message Build: Add mavenPlugin cluster configuration method (#30541) Re-enable FlushIT tests Bump Gradle heap to 2 GB (#30535) SQL: Use request flavored methods in tests (#30345) Suppress hdfsFixture if there are spaces in the path (#30302) ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request on May 15, 2018
* es/ccr: (37 commits) Default to one shard (elastic#30539) Unmute IndexUpgradeIT tests Forbid expensive query parts in ranking evaluation (elastic#30151) Docs: Update HighLevelRestClient migration docs (elastic#30544) Clients: Switch to new performRequest (elastic#30543) [TEST] Fix typo in MovAvgIT test Add missing dependencies on testClasses (elastic#30527) [TEST] Mute ML test that needs updating to following ml-cpp changes Document woes between auto-expand-replicas and allocation filtering (elastic#30531) Moved tokenizers to analysis common module (elastic#30538) Adjust copy settings versions Mute ShrinkIndexIT suite SQL: SYS TABLES ordered according to *DBC specs (elastic#30530) Deprecate not copy settings and explicitly disallow (elastic#30404) [ML] Improve state persistence log message Build: Add mavenPlugin cluster configuration method (elastic#30541) Re-enable FlushIT tests Bump Gradle heap to 2 GB (elastic#30535) SQL: Use request flavored methods in tests (elastic#30345) Suppress hdfsFixture if there are spaces in the path (elastic#30302) ...
The following tokenizers were moved: classic, edge_ngram,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.
The only tokenizer that I didn't move in this PR is keyword. This is because
the normalizer infrastructure directly depends on this tokenizer, so moving it
should be tackled in a separate PR. Also, quite a few tests use this tokenizer.
I plan to do this after this PR.
This PR is mainly mechanical: tests were either moved to the analysis-common module
or changed to use the standard tokenizer or the mock whitespace tokenizer.
Relates to #23658
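
To illustrate the split described above, here is a minimal, self-contained sketch of how tokenizers contributed by separate modules might be collected by name. It is a simplified model, not Elasticsearch's actual code: `TokenizerFactory`, `AnalysisModule`, `SERVER`, `ANALYSIS_COMMON` and `collect` are hypothetical stand-ins invented for this example, and `SERVER` models only the one tokenizer this PR says was left behind.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

public class TokenizerRegistrySketch {

    /** Hypothetical stand-in for a tokenizer factory; not an Elasticsearch type. */
    interface TokenizerFactory {
        String name();
    }

    /** Hypothetical stand-in for a module that contributes tokenizers by name. */
    interface AnalysisModule {
        Map<String, Supplier<TokenizerFactory>> tokenizers();
    }

    /** Builds a supplier whose factory simply reports its own name. */
    static Supplier<TokenizerFactory> factory(String name) {
        return () -> () -> name;
    }

    /** Models the server module: keyword stays here because normalizers depend on it. */
    static final AnalysisModule SERVER = () -> {
        Map<String, Supplier<TokenizerFactory>> m = new LinkedHashMap<>();
        m.put("keyword", factory("keyword"));
        return m;
    };

    /** Models the analysis-common module with the tokenizers listed in this PR. */
    static final AnalysisModule ANALYSIS_COMMON = () -> {
        Map<String, Supplier<TokenizerFactory>> m = new LinkedHashMap<>();
        for (String name : new String[] {"classic", "edge_ngram", "letter", "lowercase", "ngram",
                "path_hierarchy", "pattern", "thai", "uax_url_email", "whitespace"}) {
            m.put(name, factory(name));
        }
        return m;
    };

    /** Merges the tokenizers of all given modules, rejecting duplicate names. */
    static Map<String, Supplier<TokenizerFactory>> collect(AnalysisModule... modules) {
        Map<String, Supplier<TokenizerFactory>> all = new TreeMap<>();
        for (AnalysisModule module : modules) {
            module.tokenizers().forEach((name, supplier) -> {
                if (all.putIfAbsent(name, supplier) != null) {
                    throw new IllegalArgumentException("tokenizer [" + name + "] registered twice");
                }
            });
        }
        return all;
    }

    public static void main(String[] args) {
        // Without the common module, only the tokenizer left in the server is available.
        System.out.println(collect(SERVER).keySet());
        // With both modules loaded, the moved tokenizers are available as well.
        System.out.println(collect(SERVER, ANALYSIS_COMMON).keySet());
    }
}
```

In this model, a test that runs against the server alone cannot resolve a moved tokenizer such as ngram, which is why such tests either move along with the tokenizer or switch to one that is still available.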