Optimize SlidingTimeWindowMovingAverages sumBuckets #4936
Conversation
The `SlidingTimeWindowMovingAverages` `sumBuckets()` method can be optimized to use indexed list access and avoid allocations: it currently allocates a `LongAdder` as well as one or two streams per call. If the 1, 5, or 15 minute rates are read on hot paths, these are unnecessarily expensive allocations on top of less optimized computation. We can avoid the allocations entirely and accumulate the sum directly into a `long` via direct indexed list access. See upstream PR dropwizard/metrics#4936
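As a minimal sketch of the idea (not the actual PR diff; the helper names and the simple modulo wraparound are illustrative assumptions), the stream-based accumulation versus the allocation-free indexed accumulation might look like:

```java
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

final class SumBucketsSketch {

    private SumBucketsSketch() {}

    // Original style: allocates a LongAdder plus stream machinery on every call
    static long sumViaStream(List<LongAdder> buckets, int start, int count) {
        LongAdder result = new LongAdder();
        buckets.stream().skip(start).limit(count).forEach(bucket -> result.add(bucket.sum()));
        return result.sum();
    }

    // Optimized style: accumulate directly into a long via indexed access,
    // wrapping around the circular bucket list without allocating anything
    static long sumViaIndex(List<LongAdder> buckets, int start, int count) {
        long sum = 0L;
        for (int i = 0; i < count; i++) {
            sum += buckets.get((start + i) % buckets.size()).sum();
        }
        return sum;
    }
}
```

The indexed variant also handles the wraparound of the circular window in a single loop, which is why the original needs "one or two streams" (a second stream when the window spans the end of the list).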
```diff
  * time window (i.e. 15 minutes, see TIME_WINDOW_DURATION_MINUTES)
  */
-private ArrayList<LongAdder> buckets;
+private final List<LongAdder> buckets;
```
This is a fixed size and could just be an array.
```diff
  * time window (i.e. 15 minutes, see TIME_WINDOW_DURATION_MINUTES)
  */
-private ArrayList<LongAdder> buckets;
+private final LongAdder[] buckets;
```
A separate optimization would be to avoid the `LongAdder`s entirely and use an `AtomicLongArray`.
```diff
-private final LongAdder[] buckets;
+private final AtomicLongArray buckets;
```
Using `AtomicLongArray` cuts the optimized `getM1Rate` from ~60ns to ~45ns, but there is additional overhead on the concurrent update side, so I'm going to hold off on that change.
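The trade-off weighed here can be sketched as below (a hypothetical illustration, not the PR's code: with `AtomicLongArray`, every writer to a bucket CAS-contends on the same slot, whereas `LongAdder` stripes contended updates across internal cells):

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch of the AtomicLongArray-backed variant considered above.
// Reads avoid per-call allocation, but concurrent writers to one
// bucket all contend on a single array slot.
final class AtomicLongArrayBuckets {
    private final AtomicLongArray buckets;

    AtomicLongArrayBuckets(int size) {
        this.buckets = new AtomicLongArray(size);
    }

    // Update side: a single CAS loop per increment; under write
    // contention this is slower than LongAdder's striped cells
    void record(int bucketIndex) {
        buckets.getAndIncrement(bucketIndex);
    }

    // Read side: plain volatile reads summed into a long, no allocation
    long sum(int start, int count) {
        long sum = 0L;
        for (int i = 0; i < count; i++) {
            sum += buckets.get((start + i) % buckets.length());
        }
        return sum;
    }
}
```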
```
Benchmark                          (recordings)                                    (type)  Mode  Cnt     Score    Error  Units
MovingAverageBenchmarks.getM1Rate            10           SlidingTimeWindowMovingAverages  avgt   20  1366.241 ± 24.341  ns/op
MovingAverageBenchmarks.getM1Rate            10  OptimizedSlidingTimeWindowMovingAverages  avgt   20    43.180 ±  0.153  ns/op
MovingAverageBenchmarks.getM1Rate          1000           SlidingTimeWindowMovingAverages  avgt   20  1377.545 ± 65.159  ns/op
MovingAverageBenchmarks.getM1Rate          1000  OptimizedSlidingTimeWindowMovingAverages  avgt   20    43.967 ±  2.058  ns/op
```
This reverts commit 29b0664.
Thanks for opening this PR @schlosna! The reasoning about fewer allocations and more performant array indexing makes sense, and the benchmarks show a clear performance improvement. It does not seem to reduce code readability either. I also checked whether `LongAdder` does anything special for long overflow and didn't see anything, so I believe this change is behavior-equivalent. Requesting review from maintainer @joschi
The `SlidingTimeWindowMovingAverages` `sumBuckets()` method can be optimized to use indexed list access and avoid allocations: it currently allocates a `LongAdder` as well as one or two streams per call. If the 1, 5, or 15 minute rates are read on hot paths, these are unnecessarily expensive allocations on top of less optimized computation. We can avoid the allocations entirely and accumulate the sum directly into a `long` via direct indexed list access.

[MovingAverageBenchmarks](https://github.com/palantir/tritium/blob/davids/OptimizedSlidingTimeWindowMovingAverages/tritium-jmh/src/jmh/java/com/palantir/tritium/microbenchmarks/MovingAverageBenchmarks.java) demonstrates the difference between the implementations in both execution time (22x faster) and allocations:

```
Benchmark                                             (recordings)                                    (type)  Mode  Cnt     Score     Error  Units
MovingAverageBenchmarks.getM1Rate                               10           SlidingTimeWindowMovingAverages  avgt   20  1364.969 ±  12.040  ns/op
MovingAverageBenchmarks.getM1Rate:gc.alloc.rate.norm            10           SlidingTimeWindowMovingAverages  avgt   20   688.037 ±   0.001   B/op
MovingAverageBenchmarks.getM1Rate                               10  OptimizedSlidingTimeWindowMovingAverages  avgt   20    59.182 ±   0.343  ns/op
MovingAverageBenchmarks.getM1Rate:gc.alloc.rate.norm            10  OptimizedSlidingTimeWindowMovingAverages  avgt   20     0.002 ±   0.001   B/op
MovingAverageBenchmarks.getM1Rate                             1000           SlidingTimeWindowMovingAverages  avgt   20  1401.864 ± 107.134  ns/op
MovingAverageBenchmarks.getM1Rate:gc.alloc.rate.norm          1000           SlidingTimeWindowMovingAverages  avgt   20   688.038 ±   0.003   B/op
MovingAverageBenchmarks.getM1Rate                             1000  OptimizedSlidingTimeWindowMovingAverages  avgt   20    61.157 ±   1.393  ns/op
MovingAverageBenchmarks.getM1Rate:gc.alloc.rate.norm          1000  OptimizedSlidingTimeWindowMovingAverages  avgt   20     0.002 ±   0.001   B/op
```
Co-authored-by: David Schlosnagle <davids@palantir.com>