Optimize matrix_stats agg by using primitive arrays instead of maps#19989
Optimize matrix_stats agg by using primitive arrays instead of maps#19989jainankitk merged 12 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Signed-off-by: Peter Alfonsi <peter.alfonsi@gmail.com>
|
❌ Gradle check result for 7cd07d7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
|
❌ Gradle check result for 3dcc988: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
distribution:bwc:maintenance:buildBwcLinuxTar is failing for 2.19.4 but not 2.19.3. I think this isn't related to serialization BWC test failures or anything like that since I think those show up as test failures. Not sure why this is happening. |
|
❌ Gradle check result for cded794: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
|
❌ Gradle check result for 0fba995: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
|
❌ Gradle check result for 9593a6c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #19989 +/- ##
============================================
+ Coverage 73.24% 73.27% +0.03%
- Complexity 71519 71561 +42
============================================
Files 5789 5789
Lines 327121 327333 +212
Branches 47117 47159 +42
============================================
+ Hits 239591 239849 +258
+ Misses 68269 68216 -53
- Partials 19261 19268 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
jainankitk
left a comment
There was a problem hiding this comment.
Change looks good to me. And results in significant speedup!!
…pensearch-project#19989) Signed-off-by: Peter Alfonsi <petealft@amazon.com> Signed-off-by: Peter Alfonsi <peter.alfonsi@gmail.com> Co-authored-by: Peter Alfonsi <petealft@amazon.com>
…pensearch-project#19989) Signed-off-by: Peter Alfonsi <petealft@amazon.com> Signed-off-by: Peter Alfonsi <peter.alfonsi@gmail.com> Co-authored-by: Peter Alfonsi <petealft@amazon.com>
Description
As described in the initial issue, the use of Map<String, Double/Long> in the hot loop of matrix_stats is slow mostly because of un/boxing. Replacing it with double[] / long[] speeds up the aggregation by 80%, from 15.639 seconds to 3.093 seconds on nyc_taxis.
As discussed with @jainankitk , we unfortunately do have to keep the older map-based logic around for BWC/serialization reasons. I wasn't able to think of a good way to remove it - while it is true that we'll never add RunningStats that are for different aggregation requests, and therefore the fieldNames keys will actually always be the same in practice, we still do need a RunningStats instance constructed with the StreamInput ctor to be functional, even before we add it to the RunningStats from another segment/shard. So we have to keep the old code around as a fallback. Sad since it adds a lot of bloat + logic around switching between the two. However I still think the performance improvement makes the overall change worthwhile.
Note we still use map-based logic in the other objects like InternalMatrixStats/MatrixStatsResults. This should have ~no perf impact since it only happens at the end, not in the per-doc hot loop. Doing it this way reduces the complexity of having multiple cases for those objects too.
Related Issues
Resolves #19741
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.