Terms aggregation should remap global ordinal buckets when a sub-aggregator is used to sort the terms #24941
colings86
left a comment
LGTM. I left one comment, but only for my own understanding; I don't think it affects whether you merge this as is.
Why is this needed? I don't see where it's used.
It is used in https://github.com/jimczi/elasticsearch/blob/5b79cbe87c5c28cafed068a3910e1298a9dd700d/core/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/GlobalOrdinalsStringTermsAggregator.java#L385 to remap the segment ordinals collected in this segment to global ordinals.
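To illustrate what "remapping segment ordinals to global ordinals" means, here is a minimal, self-contained sketch. It is not the code from `GlobalOrdinalsStringTermsAggregator`; the class `SegmentToGlobalOrdSketch` and the method `buildMappings` are hypothetical names, and the sketch assumes each segment exposes its terms as a sorted array whose index is the segment ordinal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: a simplified model, not the Elasticsearch or Lucene implementation. */
class SegmentToGlobalOrdSketch {

    /**
     * For each segment, returns an array where the index is a segment ordinal
     * and the value is the matching global ordinal.
     * Assumes every input array is sorted and free of duplicates.
     */
    static long[][] buildMappings(List<String[]> segmentTerms) {
        // The global dictionary is the sorted union of all segment dictionaries.
        TreeSet<String> union = new TreeSet<>();
        for (String[] terms : segmentTerms) {
            union.addAll(Arrays.asList(terms));
        }
        String[] global = union.toArray(new String[0]);

        long[][] mappings = new long[segmentTerms.size()][];
        for (int seg = 0; seg < segmentTerms.size(); seg++) {
            String[] terms = segmentTerms.get(seg);
            mappings[seg] = new long[terms.length];
            for (int ord = 0; ord < terms.length; ord++) {
                // The position of the term in the global dictionary is its global ordinal.
                mappings[seg][ord] = Arrays.binarySearch(global, terms[ord]);
            }
        }
        return mappings;
    }
}
```

With such a mapping, counts collected per segment ordinal can be folded into per-global-ordinal counts once the segment is finished, instead of resolving a global ordinal for every document.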
…egator is used to sort the terms `terms` aggregations at the root level use the `global_ordinals` execution hint by default. When all sub-aggregators can be run in `breadth_first` mode, the buckets collected for these sub-aggs are dense (remapped after the initial pruning). But if a sub-aggregator is not deferrable and needs to collect all buckets before pruning, we don't remap global ords and the aggregator needs to deal with sparse buckets. Most (if not all) aggregators expect dense buckets and use this information to allocate memory. This change forces the remap of the global ordinals, but only when there is at least one sub-aggregator that cannot be deferred. Relates elastic#24788
5b79cbe to
20d0ee9
…egator is used to sort the terms (#24941) `terms` aggregations at the root level use the `global_ordinals` execution hint by default. When all sub-aggregators can be run in `breadth_first` mode, the buckets collected for these sub-aggs are dense (remapped after the initial pruning). But if a sub-aggregator is not deferrable and needs to collect all buckets before pruning, we don't remap global ords and the aggregator needs to deal with sparse buckets. Most (if not all) aggregators expect dense buckets and use this information to allocate memory. This change forces the remap of the global ordinals, but only when there is at least one sub-aggregator that cannot be deferred. Relates #24788
Hey. Just to be sure, this bug is a performance issue, right?
Up?
`terms` aggregations at the root level use the `global_ordinals` execution hint by default.
When all sub-aggregators can be run in `breadth_first` mode, the buckets collected for these sub-aggs are dense (remapped after the initial pruning).
But if a sub-aggregator is not deferrable and needs to collect all buckets before pruning, we don't remap global ords and the aggregator needs to deal with sparse buckets.
Most (if not all) aggregators expect dense buckets and use this information to allocate memory.
This change forces the remap of the global ordinals, but only when there is at least one sub-aggregator that cannot be deferred.
Relates #24788
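The difference between sparse and dense buckets described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the change made in this PR: `DenseBucketRemapSketch`, `bucketFor`, `collect`, `sumFor`, and `allocatedSlots` are hypothetical names, and a plain `HashMap` stands in for whatever structure the aggregator actually uses. The sub-aggregator here keeps one `long` per bucket, so its memory grows with the highest bucket id it sees.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a simplified model, not the Elasticsearch implementation. */
class DenseBucketRemapSketch {

    // Maps each distinct global ordinal to the next free dense bucket id.
    private final Map<Long, Integer> globalOrdToBucket = new HashMap<>();

    // Per-bucket state of a hypothetical sub-aggregator, indexed by bucket id.
    private long[] sums = new long[1];

    /** Returns the dense bucket id for a global ordinal, assigning one on first sight. */
    int bucketFor(long globalOrd) {
        Integer existing = globalOrdToBucket.get(globalOrd);
        if (existing != null) {
            return existing;
        }
        int bucket = globalOrdToBucket.size();
        globalOrdToBucket.put(globalOrd, bucket);
        return bucket;
    }

    /** Adds a value to the bucket of the given global ordinal, growing the array if needed. */
    void collect(long globalOrd, long value) {
        int bucket = bucketFor(globalOrd);
        if (bucket >= sums.length) {
            sums = Arrays.copyOf(sums, Math.max(bucket + 1, sums.length * 2));
        }
        sums[bucket] += value;
    }

    /** Returns the accumulated sum for a global ordinal, or 0 if it was never collected. */
    long sumFor(long globalOrd) {
        Integer bucket = globalOrdToBucket.get(globalOrd);
        return bucket == null ? 0L : sums[bucket];
    }

    /** Number of per-bucket slots currently allocated. */
    int allocatedSlots() {
        return sums.length;
    }
}
```

In this model, collecting global ordinals 7 and 1,000,000 allocates two slots. Had the global ordinal been used directly as the bucket id, the array would need at least 1,000,001 slots for the same two terms.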