Speed up date_histogram by precomputing ranges (#61467) by nik9000 · Pull Request #62881 · elastic/elasticsearch

nik9000 · 2020-09-24T14:20:27Z

A few of us were talking about ways to speed up the date_histogram
using the index for the timestamp rather than the doc values. To do that
we'd have to pre-compute all of the "round down" points in the index. It
turns out that just precomputing those values speeds up rounding
fairly significantly:

Benchmark  (count)      (interval)                   (range)            (zone)  Mode  Cnt          Score         Error  Units
before    10000000  calendar month  2000-10-28 to 2000-10-31               UTC  avgt   10   96461080.982 ±  616373.011  ns/op
before    10000000  calendar month  2000-10-28 to 2000-10-31  America/New_York  avgt   10  130598950.850 ± 1249189.867  ns/op
after     10000000  calendar month  2000-10-28 to 2000-10-31               UTC  avgt   10   52311775.080 ±  107171.092  ns/op
after     10000000  calendar month  2000-10-28 to 2000-10-31  America/New_York  avgt   10   54800134.968 ±  373844.796  ns/op

That's a 46% speed up when there isn't a time zone and a 58% speed up
when there is.

This doesn't work for every time zone, specifically those that have two
midnights in a single day due to daylight savings time will produce wonky
results. So they don't get the optimization.

Second, this requires a few expensive computation up front to make the
transition array. And if the transition array is too large then we give
up and use the original mechanism, throwing away all of the work we did
to build the array. This seems appropriate for most usages of round,
but this change uses it for all usages of round. That seems ok for
now, but it might be worth investigating in a follow up.

I ran a macrobenchmark as well which showed an 11% preformance
improvement. BUT the benchmark wasn't tuned for my desktop so it
overwhelmed it and might have produced "funny" results. I think it is
pretty clear that this is an improvement, but know the measurement is
weird:

Benchmark  (count)      (interval)                   (range)            (zone)  Mode  Cnt          Score         Error  Units
before    10000000  calendar month  2000-10-28 to 2000-10-31               UTC  avgt   10   96461080.982 ±  616373.011  ns/op
before    10000000  calendar month  2000-10-28 to 2000-10-31  America/New_York  avgt   10  g± 1249189.867  ns/op
after     10000000  calendar month  2000-10-28 to 2000-10-31               UTC  avgt   10   52311775.080 ±  107171.092  ns/op
after     10000000  calendar month  2000-10-28 to 2000-10-31  America/New_York  avgt   10   54800134.968 ±  373844.796  ns/op

Before:
|               Min Throughput | hourly_agg |        0.11 |  ops/s |
|            Median Throughput | hourly_agg |        0.11 |  ops/s |
|               Max Throughput | hourly_agg |        0.11 |  ops/s |
|      50th percentile latency | hourly_agg |      650623 |     ms |
|      90th percentile latency | hourly_agg |      821478 |     ms |
|      99th percentile latency | hourly_agg |      859780 |     ms |
|     100th percentile latency | hourly_agg |      864030 |     ms |
| 50th percentile service time | hourly_agg |     9268.71 |     ms |
| 90th percentile service time | hourly_agg |        9380 |     ms |
| 99th percentile service time | hourly_agg |     9626.88 |     ms |
|100th percentile service time | hourly_agg |     9884.27 |     ms |
|                   error rate | hourly_agg |           0 |      % |

After:
|               Min Throughput | hourly_agg |        0.12 |  ops/s |
|            Median Throughput | hourly_agg |        0.12 |  ops/s |
|               Max Throughput | hourly_agg |        0.12 |  ops/s |
|      50th percentile latency | hourly_agg |      519254 |     ms |
|      90th percentile latency | hourly_agg |      653099 |     ms |
|      99th percentile latency | hourly_agg |      683276 |     ms |
|     100th percentile latency | hourly_agg |      686611 |     ms |
| 50th percentile service time | hourly_agg |     8371.41 |     ms |
| 90th percentile service time | hourly_agg |     8407.02 |     ms |
| 99th percentile service time | hourly_agg |     8536.64 |     ms |
|100th percentile service time | hourly_agg |     8538.54 |     ms |
|                   error rate | hourly_agg |           0 |      % |

A few of us were talking about ways to speed up the `date_histogram` using the index for the timestamp rather than the doc values. To do that we'd have to pre-compute all of the "round down" points in the index. It turns out that *just* precomputing those values speeds up rounding fairly significantly: ``` Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 130598950.850 ± 1249189.867 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op ``` That's a 46% speed up when there isn't a time zone and a 58% speed up when there is. This doesn't work for every time zone, specifically those that have two midnights in a single day due to daylight savings time will produce wonky results. So they don't get the optimization. Second, this requires a few expensive computation up front to make the transition array. And if the transition array is too large then we give up and use the original mechanism, throwing away all of the work we did to build the array. This seems appropriate for most usages of `round`, but this change uses it for *all* usages of `round`. That seems ok for now, but it might be worth investigating in a follow up. I ran a macrobenchmark as well which showed an 11% preformance improvement. *BUT* the benchmark wasn't tuned for my desktop so it overwhelmed it and might have produced "funny" results. I think it is pretty clear that this is an improvement, but know the measurement is weird: ``` Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 g± 1249189.867 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op Before: | Min Throughput | hourly_agg | 0.11 | ops/s | | Median Throughput | hourly_agg | 0.11 | ops/s | | Max Throughput | hourly_agg | 0.11 | ops/s | | 50th percentile latency | hourly_agg | 650623 | ms | | 90th percentile latency | hourly_agg | 821478 | ms | | 99th percentile latency | hourly_agg | 859780 | ms | | 100th percentile latency | hourly_agg | 864030 | ms | | 50th percentile service time | hourly_agg | 9268.71 | ms | | 90th percentile service time | hourly_agg | 9380 | ms | | 99th percentile service time | hourly_agg | 9626.88 | ms | |100th percentile service time | hourly_agg | 9884.27 | ms | | error rate | hourly_agg | 0 | % | After: | Min Throughput | hourly_agg | 0.12 | ops/s | | Median Throughput | hourly_agg | 0.12 | ops/s | | Max Throughput | hourly_agg | 0.12 | ops/s | | 50th percentile latency | hourly_agg | 519254 | ms | | 90th percentile latency | hourly_agg | 653099 | ms | | 99th percentile latency | hourly_agg | 683276 | ms | | 100th percentile latency | hourly_agg | 686611 | ms | | 50th percentile service time | hourly_agg | 8371.41 | ms | | 90th percentile service time | hourly_agg | 8407.02 | ms | | 99th percentile service time | hourly_agg | 8536.64 | ms | |100th percentile service time | hourly_agg | 8538.54 | ms | | error rate | hourly_agg | 0 | % | ```

nik9000 added backport v7.10.0 labels Sep 24, 2020

nik9000 merged commit 77757b2 into elastic:7.x Oct 5, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up date_histogram by precomputing ranges (#61467)#62881

Speed up date_histogram by precomputing ranges (#61467)#62881
nik9000 merged 1 commit intoelastic:7.xfrom
nik9000:binary_search_for_rounding_7_x

nik9000 commented Sep 24, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nik9000 commented Sep 24, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant