Save memory when histogram agg is not on top#57277
Conversation
This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
I'm going to add a test for the new debug information. |
| * Base class for functionality shared between aggregators for this | ||
| * {@code histogram} aggregation. | ||
| */ | ||
| public abstract class AbstractHistogramAggregator extends BucketsAggregator { |
There was a problem hiding this comment.
There was a TODO that the range and numeric version of the aggregator shared a ton of code. Now they share a superclass that provides all that code.
| ValuesSource valuesSource, DocValueFormat formatter, SearchContext context, | ||
| Aggregator parent, Map<String, Object> metadata) throws IOException { | ||
| ValuesSource.Range rangeValueSource = (ValuesSource.Range) valuesSource; | ||
| if (rangeValueSource.rangeType().isNumeric() == false) { |
There was a problem hiding this comment.
I moved this check into the ctor so I can use the ctor reference that we've been doing elsewhere.
| Aggregator parent, | ||
| boolean collectsFromSingleBucket, | ||
| Map<String, Object> metadata) throws IOException { | ||
| if (collectsFromSingleBucket == false) { |
There was a problem hiding this comment.
this is the important line!
| fieldType.setName("field"); | ||
| try (IndexReader reader = w.getReader()) { | ||
| IndexSearcher searcher = new IndexSearcher(reader); | ||
| InternalHistogram histogram = search(searcher, new MatchAllDocsQuery(), aggBuilder, fieldType); |
There was a problem hiding this comment.
We do this kind of thing all over the place so I figured I'd make a utility method for it.
| Releasables.close(releasables); | ||
| releasables.clear(); | ||
| } | ||
|
|
There was a problem hiding this comment.
nice touch. likely usable across many future tests
|
I'm pulling some performance numbers for this. I'll likely merge before they get done and update with them once they come in. I'm fairly confident in it though. |
|
About a 38% performance gain in the test that I ran: Before: After: |
This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
…7377) This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.
This saves some memory when the
histogramaggregation is not a toplevel aggregation by dropping
asMultiBucketAggregatorin favor ofnatively implementing multi-bucket storage in the aggregator. For the
most part this just uses the
LongKeyedBucketOrdsthat we built thefirst time we did this.