Introduce ValuesSource Registry & Refactor Aggregation Type System#54281
Introduce ValuesSource Registry & Refactor Aggregation Type System#54281not-napoleon merged 65 commits intoelastic:masterfrom
Conversation
Conflicts: modules/parent-join/src/main/java/org/elasticsearch/join/aggregations/ChildrenAggregationBuilder.java modules/parent-join/src/main/java/org/elasticsearch/join/aggregations/ParentAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoGridAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/HistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/missing/MissingAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/range/IpRangeAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/sampler/DiversifiedAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/significant/SignificantTermsAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/RareTermsAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/TermsAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/GeoBoundsAggregationBuilder.java x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/transform/transforms/MockDeprecatedAggregationBuilder.java
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/metrics/HDRPercentileRanksAggregatorFactory.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/HDRPercentilesAggregatorFactory.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/PercentileRanksAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/PercentilesAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/TDigestPercentileRanksAggregatorFactory.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/TDigestPercentilesAggregatorFactory.java server/src/main/java/org/elasticsearch/search/aggregations/support/ValuesSourceConfig.java
Conflicts: server/src/test/java/org/elasticsearch/search/aggregations/metrics/MinAggregatorTests.java
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/composite/DateHistogramValuesSourceBuilder.java
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregatorFactory.java
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoGridAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoHashGridAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoHashGridAggregatorFactory.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoTileGridAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoTileGridAggregatorFactory.java
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregatorFactory.java
Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over)
Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/composite/CompositeValuesSourceBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregatorFactory.java
This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR.
…c#53037) This commit updates the geo*_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue elastic#42949.
This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue elastic#42949.
Conflicts: server/src/test/java/org/elasticsearch/search/aggregations/bucket/missing/MissingAggregatorTests.java x-pack/plugin/analytics/src/main/java/org/elasticsearch/xpack/analytics/topmetrics/TopMetricsAggregatorFactory.java
…3298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs
Conflicts: server/src/main/java/org/elasticsearch/search/SearchModule.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/geogrid/GeoGridAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/DateHistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/histogram/HistogramAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/range/DateRangeAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/range/IpRangeAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/bucket/range/RangeAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/GeoBoundsAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/metrics/GeoCentroidAggregationBuilder.java server/src/main/java/org/elasticsearch/search/aggregations/support/ValuesSource.java server/src/test/java/org/elasticsearch/search/aggregations/bucket/significant/SignificantTermsAggregatorTests.java server/src/test/java/org/elasticsearch/search/aggregations/metrics/MinAggregatorTests.java test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java x-pack/plugin/analytics/src/test/java/org/elasticsearch/xpack/analytics/boxplot/BoxplotAggregatorTests.java x-pack/plugin/analytics/src/test/java/org/elasticsearch/xpack/analytics/cumulativecardinality/CumulativeCardinalityAggregatorTests.java x-pack/plugin/analytics/src/test/java/org/elasticsearch/xpack/analytics/stringstats/StringStatsAggregatorTests.java
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
|
||
| @Override | ||
| protected ValuesSourceType defaultValueSourceType() { | ||
| // TODO: This should probably be DATE, but we're not failing tests with BYTES, so needs more tests? |
There was a problem hiding this comment.
I'm aware of this TODO, I think it's okay to merge to master as is and fix there.
| import org.elasticsearch.indices.breaker.CircuitBreakerService; | ||
| import org.elasticsearch.search.DocValueFormat; | ||
| import org.elasticsearch.search.MultiValueMode; | ||
| import org.elasticsearch.search.aggregations.support.CoreValuesSourceType; |
There was a problem hiding this comment.
I can't tell if github is being weird with the diff or if this is an actual change, but it seems these are listed below as well? Surprised precommit doesn't fail this
| doc.add(new SortedSetDocValuesField(fieldName, new BytesRef("a"))); | ||
| json = "{ \"" + fieldName + "\" : \"a\" }"; | ||
| } | ||
| } else if (vst.equals(CoreValuesSourceType.DATE)) { |
There was a problem hiding this comment.
I believe we lost the check for date_nanos (DateFieldMapper.DATE_NANOS_CONTENT_TYPE)) which I think will eventually break?
We can do a quick follow-on PR instead of changing this one imo.
polyfractal
left a comment
There was a problem hiding this comment.
Left two comments. Assuming the import issue isn't going to break the build, the testing thing we can deal with in a quick followup I think (so we don't have to respin this PR).
LGTM otherwise! 🎉
* Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (elastic#48638) * ValuesSourceRegistry Prototype (elastic#48758) * Remove generics from ValuesSource related classes (elastic#49606) * fix percentile aggregation tests (elastic#50712) * Basic thread safety for ValuesSourceRegistry (elastic#50340) * Remove target value type from ValuesSourceAggregationBuilder (elastic#49943) * Cleanup default values source type (elastic#50992) * CoreValuesSourceType no longer implements Writable (elastic#51276) * Remove genereics & hard coded ValuesSource references from Matrix Stats (elastic#51131) * Put values source types on fields (elastic#51503) * Remove VST Any (elastic#51539) * Rewire terms agg to use new VS registry (elastic#51182) Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over) * Wire Cardinality aggregation to work with the ValuesSourceRegistry (elastic#51337) * Wire Percentiles aggregator into new VS framework (elastic#51639) This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR. * Remove generics and target value type from MultiVSAB (elastic#51647) * fix checkstyle after merge (elastic#52008) * Plumb ValuesSourceRegistry through to QuerySearchContext (elastic#51710) * Convert RareTerms to new VS registry (elastic#52166) * Wire up Value Count (elastic#52225) * Wire up Max & Min aggregations (elastic#52219) * ValuesSource refactoring: Wire up Sum aggregation (elastic#52571) * ValuesSource refactoring: Wire up SigTerms aggregation (elastic#52590) * Soft immutability for VSConfig (elastic#52729) * Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (elastic#52734) Also fixes Percentiles which was incorrectly specified to only accept numeric, but in fact also accepts Boolean and Date (because those are numeric on master - thanks `testSupportedFieldTypes` for catching it!) * VS refactoring: Wire up stats aggregation (elastic#52891) * ValuesSource refactoring: Wire up string_stats aggregation (elastic#52875) * VS refactoring: Wire up median (MAD) aggregation (elastic#52945) * fix valuesourcetype issue with constant_keyword field (elastic#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java this commit implements `getValuesSourceType` for the ConstantKeyword field type. master was merged into feature/extensible-values-source introducing a new field type that was not implementing `getValuesSourceType`. * ValuesSource refactoring: Wire up Avg aggregation (elastic#52752) * Wire PercentileRanks aggregator into new VS framework (elastic#51693) * Add a VSConfig resolver for aggregations not using the registry (elastic#53038) * Vs refactor wire up ranges and date ranges (elastic#52918) * Wire up geo_bounds aggregation to ValuesSourceRegistry (elastic#53034) This commit updates the geo_bounds aggregation to depend on registering itself in the ValuesSourceRegistry relates elastic#42949. * VS refactoring: convert Boxplot to new registry (elastic#53132) * Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (elastic#53037) This commit updates the geo*_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue elastic#42949. * Wire-up geo_centroid agg to ValuesSourceRegistry (elastic#53040) This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue elastic#42949. * Fix type tests for Missing aggregation (elastic#53501) * ValuesSource Refactor: move histo VSType into XPack module (elastic#53298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs * Wire up DateHistogram to the ValuesSourceRegistry (elastic#53484) * Vs refactor parser cleanup (elastic#53198) Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com>
* Add ValuesSource Registry and associated logic (#54281) * Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638) * ValuesSourceRegistry Prototype (#48758) * Remove generics from ValuesSource related classes (#49606) * fix percentile aggregation tests (#50712) * Basic thread safety for ValuesSourceRegistry (#50340) * Remove target value type from ValuesSourceAggregationBuilder (#49943) * Cleanup default values source type (#50992) * CoreValuesSourceType no longer implements Writable (#51276) * Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131) * Put values source types on fields (#51503) * Remove VST Any (#51539) * Rewire terms agg to use new VS registry (#51182) Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over) * Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337) * Wire Percentiles aggregator into new VS framework (#51639) This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR. * Remove generics and target value type from MultiVSAB (#51647) * fix checkstyle after merge (#52008) * Plumb ValuesSourceRegistry through to QuerySearchContext (#51710) * Convert RareTerms to new VS registry (#52166) * Wire up Value Count (#52225) * Wire up Max & Min aggregations (#52219) * ValuesSource refactoring: Wire up Sum aggregation (#52571) * ValuesSource refactoring: Wire up SigTerms aggregation (#52590) * Soft immutability for VSConfig (#52729) * Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734) Also fixes Percentiles which was incorrectly specified to only accept numeric, but in fact also accepts Boolean and Date (because those are numeric on master - thanks `testSupportedFieldTypes` for catching it!) * VS refactoring: Wire up stats aggregation (#52891) * ValuesSource refactoring: Wire up string_stats aggregation (#52875) * VS refactoring: Wire up median (MAD) aggregation (#52945) * fix valuesourcetype issue with constant_keyword field (#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java this commit implements `getValuesSourceType` for the ConstantKeyword field type. master was merged into feature/extensible-values-source introducing a new field type that was not implementing `getValuesSourceType`. * ValuesSource refactoring: Wire up Avg aggregation (#52752) * Wire PercentileRanks aggregator into new VS framework (#51693) * Add a VSConfig resolver for aggregations not using the registry (#53038) * Vs refactor wire up ranges and date ranges (#52918) * Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034) This commit updates the geo_bounds aggregation to depend on registering itself in the ValuesSourceRegistry relates #42949. * VS refactoring: convert Boxplot to new registry (#53132) * Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037) This commit updates the geo*_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue #42949. * Wire-up geo_centroid agg to ValuesSourceRegistry (#53040) This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue #42949. * Fix type tests for Missing aggregation (#53501) * ValuesSource Refactor: move histo VSType into XPack module (#53298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs * Wire up DateHistogram to the ValuesSourceRegistry (#53484) * Vs refactor parser cleanup (#53198) Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com> * First batch of easy fixes * Remove List.of from ValuesSourceRegistry Note that we intend to have a follow up PR dealing with the mutability of the registry, so I didn't even try to address that here. * More compiler fixes * More compiler fixes * More compiler fixes * Precommit is happy and so am I * Add new Core VSTs to tests * Disabled supported type test on SigTerms until we can backport it's fix * fix checkstyle * Fix test failure from semantic merge issue * Fix some metaData->metadata replacements that got lost * Fix list of supported types for MinAggregator * Fix list of supported types for Avg * remove unused import Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com>
This PR will merge the work done on #42949 in the feature branch back into master.