Optimization in String Terms Aggregation query for Large Bucket Counts#18732
rishabhmaurya merged 15 commits into opensearch-project:main
I'm guessing we are reusing the same logic we used for numeric terms in #18702. LGTM.
Correct, the decision logic is the same.
opensearch-project#18732)
* Optimize String terms agg
* Updated the algorithm selection logic and cleanup
* Updated the algorithm selection logic and cleanup
* Updated bucket sorting at shard level for keyorder
* fixed bug in the condition
* Updated logic in topN selection depending on request size
* use priority queue method for significant terms
* Added some comments
* updated partiallyBuiltBucketComparator null check logic
* Added tests and updated GlobalOrdinalsStringTermsAggregator
* Fixed spotless checks
* Fixed issues in changelog

Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Co-authored-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Description
When the number of requested top-N buckets exceeds or is close to the maximum bucket ordinal, using a PriorityQueue for top-N selection becomes inefficient or redundant. So we made the following modifications:
Benchmark test results are here: #18704 (comment)
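To illustrate the motivation in the description, here is a minimal sketch of choosing a top-N strategy based on the request size. It is not the code from this PR: the class `TopNSelectionSketch`, the method `topN`, and the plain `long[]` of doc counts indexed by bucket ordinal are all hypothetical names and simplifications. The description also mentions the "close to" case, but it gives no threshold, so the sketch only handles the case where the requested size covers every bucket.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not the aggregator code changed in this PR.
public class TopNSelectionSketch {

    /**
     * Returns the ordinals of the top-n buckets by doc count, highest count first.
     * Ties are broken by the lower ordinal.
     */
    public static List<Integer> topN(long[] docCounts, int n) {
        // "Better" buckets sort first: higher doc count, then lower ordinal.
        Comparator<Integer> bestFirst = (a, b) -> {
            int byCount = Long.compare(docCounts[b], docCounts[a]);
            return byCount != 0 ? byCount : Integer.compare(a, b);
        };

        List<Integer> result = new ArrayList<>();
        int bucketCount = docCounts.length;
        if (n <= 0 || bucketCount == 0) {
            return result;
        }

        if (n >= bucketCount) {
            // Every bucket is selected, so a bounded heap would filter nothing.
            // Collect all ordinals and sort once.
            for (int ord = 0; ord < bucketCount; ord++) {
                result.add(ord);
            }
            result.sort(bestFirst);
            return result;
        }

        // Bounded heap of size n. With the reversed comparator the head is
        // the worst bucket currently kept, so it is the one to evict.
        PriorityQueue<Integer> heap = new PriorityQueue<>(n, bestFirst.reversed());
        for (int ord = 0; ord < bucketCount; ord++) {
            if (heap.size() < n) {
                heap.add(ord);
            } else if (bestFirst.compare(ord, heap.peek()) < 0) {
                heap.poll();
                heap.add(ord);
            }
        }
        result.addAll(heap);
        result.sort(bestFirst);
        return result;
    }
}
```

For doc counts `{5, 1, 9, 3}`, `topN(counts, 2)` takes the heap path and returns ordinals `[2, 0]`, while `topN(counts, 10)` skips the heap and returns all four ordinals as `[2, 0, 3, 1]`.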
Related Issues
Resolves #18704
Related #18650
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.