Conversation
Adds a new rate aggregation that can calculate a document rate for buckets of a date_histogram. Closes elastic#60674
not-napoleon
left a comment
There was a problem hiding this comment.
Overall, LGTM. The nits are ignorable, but the comments about null values source/create unmapped, and about supporting DATE values should be addressed before merging. Thanks!
| MINUTES_OF_HOUR((byte) 7, ChronoField.MINUTE_OF_HOUR) { | ||
| final long unitMillis = ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis(); | ||
| MINUTES_OF_HOUR((byte) 7, "minute", ChronoField.MINUTE_OF_HOUR, true, | ||
| ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis()) { |
There was a problem hiding this comment.
Nit: Since we're touching this anyway, we could move to the new formatting standard (which would put each param on a new line here)
|
|
||
| public class InternalRate extends InternalNumericMetricsAggregation.SingleValue implements Rate { | ||
| final double sum; | ||
| final double divider; |
There was a problem hiding this comment.
Nit: I think the technical term is divisor not divider.
...n/analytics/src/main/java/org/elasticsearch/xpack/analytics/rate/RateAggregationBuilder.java
Show resolved
Hide resolved
| } | ||
|
|
||
| @Override | ||
| protected RateAggregatorFactory innerBuild(QueryShardContext queryShardContext, ValuesSourceConfig config, |
There was a problem hiding this comment.
Nit: Formatting. This is a new file, should just run the formatter on the whole thing.
| ) throws IOException { | ||
| super(name, context, parent, metadata); | ||
| // TODO: stop expecting nulls here | ||
| this.valuesSource = valuesSourceConfig.hasValues() ? (ValuesSource.Numeric) valuesSourceConfig.getValuesSource() : null; |
There was a problem hiding this comment.
We shouldn't be using this work-around in new aggregators. Instead of using a null here, RateAggregatorFactory#createUnmapped should return an aggregator that uses a NO_OP_COLLECTOR, and we should rely on valuesSource being not null in this aggregator.
|
|
||
| static void registerAggregators(ValuesSourceRegistry.Builder builder) { | ||
| builder.register(RateAggregationBuilder.REGISTRY_KEY, | ||
| List.of(CoreValuesSourceType.NUMERIC, CoreValuesSourceType.DATE, CoreValuesSourceType.BOOLEAN), |
There was a problem hiding this comment.
Is there a use case for running Rate over a Date field? I'm having a hard time imagining one. If there is a use case for it, let's leave a comment so we remember why we want it, and if not let's not support passing in Date fields. It's much easier to add support later if we find a use case than it is to remove support for a supported data type, even if it only generates nonsense results.
|
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
csoulios
left a comment
There was a problem hiding this comment.
LGTM!
As long as the bucket doc_count is retrieved using the BucketsAggregator#getDocCounts() it will seamlessly work with aggregate doc counts!
|
@elasticmachine update branch |
|
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.
Closes #60674