Skip to content

Merge remaining sig_terms into terms#57397

Merged
nik9000 merged 7 commits intoelastic:masterfrom
nik9000:terms_map_merge
Jun 4, 2020
Merged

Merge remaining sig_terms into terms#57397
nik9000 merged 7 commits intoelastic:masterfrom
nik9000:terms_map_merge

Conversation

@nik9000
Copy link
Copy Markdown
Member

@nik9000 nik9000 commented May 31, 2020

Merges the remaining implementation of significant_terms into terms
so that we can more easilly make them work properly without
asMultiBucketAggregator which should save memory and speed them up.

Relates #56487

nik9000 added 4 commits May 31, 2020 10:36
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easilly make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.

Relates elastic#56487
@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented May 31, 2020

@elasticsearchmachine run elasticsearch-ci/default-distro

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented May 31, 2020

run elasticsearch-ci/1

1 similar comment
@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 1, 2020

run elasticsearch-ci/1

@nik9000 nik9000 marked this pull request as ready for review June 1, 2020 12:40
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@nik9000 nik9000 requested a review from not-napoleon June 1, 2020 12:41
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 1, 2020
spare.globalOrd = globalOrd;
spare.bucketOrd = bucketOrd;
spare.docCount = docCount;
otherDocCount += docCount;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was causing inaccurate doc counts when shard_min_doc_count was more than 1.

BucketOrdsEnum ordsEnum = bucketOrds.ordsEnum(owningBucketOrds[ordIdx]);
Supplier<B> emptyBucketBuilder = emptyBucketBuilder(owningBucketOrds[ordIdx]);
while (ordsEnum.next()) {
long docCount = bucketDocCount(ordsEnum.ord());
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I flip the if statement here and moved the spare initialization down to right before I need it to line up with the way the map version works.

/**
* An aggregator of string values.
*/
public class StringTermsAggregator extends AbstractStringTermsAggregator {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really just renamed this class to MapStringTermsAggregator but a lot changed.

Copy link
Copy Markdown
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Much easier to see the relationship between these two aggs now.

@nik9000 nik9000 merged commit 69cd443 into elastic:master Jun 4, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 4, 2020
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easilly make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.

Relates elastic#56487
nik9000 added a commit that referenced this pull request Jun 4, 2020
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easilly make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.

Relates #56487
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 4, 2020
nik9000 added a commit that referenced this pull request Jun 4, 2020
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/Aggregations Aggregations >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.9.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants