
Optimize PFCOUNT, PFMERGE command by SIMD acceleration #13558

Merged
ShooterIT merged 16 commits into redis:unstable from Nugine:hll-simd on Nov 8, 2024


@Nugine
Contributor

@Nugine Nugine commented Sep 18, 2024

This PR optimizes the performance of HyperLogLog commands (PFCOUNT, PFMERGE) by adding AVX2 fast paths.

Two AVX2 functions are added for conversion between the raw representation and the dense representation. They are 15 to 30 times faster than the scalar implementation. Note that the sparse representation is not accelerated.
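
For readers unfamiliar with the two representations, the following is a minimal scalar sketch of the conversion, assuming the default configuration (16384 registers of 6 bits each, packed starting from the least significant bit) and a raw form that stores one byte per register. The names `dense_get`, `dense_set`, `merge_dense_into_raw` and `raw_to_dense` are hypothetical and are not taken from the PR; it is this kind of register-by-register loop that the AVX2 functions are meant to replace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HLL_BITS 6
#define HLL_REGISTERS 16384
#define HLL_REGISTER_MAX ((1 << HLL_BITS) - 1)
#define HLL_DENSE_SIZE (HLL_REGISTERS * HLL_BITS / 8) /* 12288 bytes */

/* Read 6-bit register i from a dense buffer (hypothetical helper). */
uint8_t dense_get(const uint8_t *dense, size_t i) {
    assert(i < HLL_REGISTERS);
    size_t bit = i * HLL_BITS;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    unsigned v = (unsigned)dense[byte] >> shift;
    /* With shift > 2 the register spills into the next byte. */
    if (shift > 8 - HLL_BITS) v |= (unsigned)dense[byte + 1] << (8 - shift);
    return (uint8_t)(v & HLL_REGISTER_MAX);
}

/* Write 6-bit register i into a dense buffer (hypothetical helper). */
void dense_set(uint8_t *dense, size_t i, uint8_t val) {
    assert(i < HLL_REGISTERS && val <= HLL_REGISTER_MAX);
    size_t bit = i * HLL_BITS;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    dense[byte] = (uint8_t)((dense[byte] & ~(HLL_REGISTER_MAX << shift)) | (val << shift));
    if (shift > 8 - HLL_BITS) {
        unsigned hi_mask = HLL_REGISTER_MAX >> (8 - shift);
        dense[byte + 1] = (uint8_t)((dense[byte + 1] & ~hi_mask) | (val >> (8 - shift)));
    }
}

/* raw[i] = max(raw[i], register i of dense); raw holds one byte per register. */
void merge_dense_into_raw(uint8_t *raw, const uint8_t *dense) {
    for (size_t i = 0; i < HLL_REGISTERS; i++) {
        uint8_t r = dense_get(dense, i);
        if (r > raw[i]) raw[i] = r;
    }
}

/* Pack one-byte-per-register raw values back into the 6-bit dense layout. */
void raw_to_dense(uint8_t *dense, const uint8_t *raw) {
    for (size_t i = 0; i < HLL_REGISTERS; i++) dense_set(dense, i, raw[i]);
}
```

Merging several dense structures then amounts to calling `merge_dense_into_raw` once per input on a zeroed raw buffer and, for PFMERGE, packing the result with `raw_to_dense`.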

AVX2 fast paths are enabled when the CPU supports AVX2 (checked at runtime) and the HyperLogLog configuration is the default (HLL_REGISTERS == 16384 && HLL_BITS == 6).
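
The enabling condition can be sketched roughly as follows. This is an illustration only: it assumes GCC or Clang on x86-64 and uses the compiler builtin `__builtin_cpu_supports` for the runtime check, which may differ from how the PR performs detection. The names `hll_avx2_available` and `hll_merge_path` are hypothetical.

```c
#define HLL_BITS 6
#define HLL_REGISTERS 16384

/* Returns 1 when an AVX2 fast path could be taken, 0 otherwise.
 * Assumption: GCC/Clang on x86-64; other targets fall back to scalar. */
int hll_avx2_available(void) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* The fast path only makes sense for the default register layout. */
    if (HLL_REGISTERS == 16384 && HLL_BITS == 6)
        return __builtin_cpu_supports("avx2") ? 1 : 0;
#endif
    return 0;
}

/* Names the code path a caller would dispatch to (illustrative only). */
const char *hll_merge_path(void) {
    return hll_avx2_available() ? "avx2" : "scalar";
}
```

Because the check happens at runtime, one binary can run on CPUs with and without AVX2, and the scalar code remains the fallback.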

When merging 3 dense hll structures, the benchmark shows a 12x speedup compared to the scalar version.

pfcount key1 key2 key3
pfmerge keyall key1 key2 key3
======================================================================================================
Type             Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------
PFCOUNT-scalar    5570.09        35.89060        32.51100        65.27900        69.11900       299.17
PFCOUNT-avx2     72604.92         2.82072         2.73500         5.50300         7.13500      3899.68
------------------------------------------------------------------------------------------------------
PFMERGE-scalar    7879.13        25.52156        24.19100        46.33500        48.38300       492.45
PFMERGE-avx2    126448.64         1.58120         1.53500         3.08700         4.89500      7903.04
------------------------------------------------------------------------------------------------------

scalar: redis:unstable   9906daf5c9fdb836a5b3f04829c75701a4e90eb4
avx2:   Nugine:hll-simd  02e09f85ac07eace50ebdddd0fd70822f7b9152d 

CPU:    13th Gen Intel® Core™ i9-13900H × 20
Memory: 32.0 GiB
OS:     Ubuntu 22.04.5 LTS
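
As a reading aid, the throughput ratios can be derived directly from the Ops/sec column above. The helper below is a trivial sketch with a hypothetical name (`speedup`); the figures come from a single machine and may not carry over to other hardware.

```c
#include <assert.h>

/* Throughput ratio: how many times faster the optimized build is. */
double speedup(double baseline_ops, double optimized_ops) {
    assert(baseline_ops > 0.0);
    return optimized_ops / baseline_ops;
}
```

With the table's values, `speedup(5570.09, 72604.92)` is roughly 13.0 for PFCOUNT and `speedup(7879.13, 126448.64)` is roughly 16.0 for PFMERGE.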

Experiment repo: https://github.com/Nugine/redis-hyperloglog
Benchmark script: https://github.com/Nugine/redis-hyperloglog/blob/main/scripts/memtier.sh
Algorithm: https://github.com/Nugine/redis-hyperloglog/blob/main/cpp/bench.cpp

resolves #13551

Member

@ShooterIT ShooterIT left a comment


Thank you for your PR.
I think some CPU instruction sets may significantly improve the performance of Redis in some computing scenarios, perhaps also for CRC or hash algorithms.

Actually, I lack this knowledge, but I will learn it. Hi @filipecosta90, could you also take a look?

@ShooterIT added the action:run-benchmark label (Triggers the benchmark suite for this Pull Request) on Oct 24, 2024
@fcostaoliveira
Collaborator

fcostaoliveira commented Oct 24, 2024

CE Performance Automation : step 1 of 2 (build) DONE.

This comment was automatically generated given a benchmark was triggered.
Started building at 2024-11-08 06:59:38.746304 and took 64 seconds.
You can check each build/benchmark progress in grafana:

  • git hash: 986c4e4
  • git branch: Nugine:hll-simd
  • commit date and time: n/a
  • commit summary: n/a
  • test filters:
    • command priority lower limit: 0
    • command priority upper limit: 10000
    • test name regex: .*
    • command group regex: .*

You can check a comparison in detail via the grafana link

@fcostaoliveira
Collaborator

fcostaoliveira commented Oct 24, 2024

CE Performance Automation : step 2 of 2 (benchmark) FINISHED.

This comment was automatically generated given a benchmark was triggered.

Started benchmark suite at 2024-11-13 19:19:28.887445 and took 48079.879437 seconds to finish.
Status: [################################################################################] 100.0% completed.

In total will run 141 benchmarks.
- 0 pending.
- 141 completed:
- 141 successful.
- 0 failed.
You can check the status in detail via the grafana link

@fcostaoliveira
Collaborator

fcostaoliveira commented Oct 24, 2024

Automated performance analysis summary

This comment was automatically generated given there is performance data available.

Using platform named: intel64-ubuntu22.04-redis-icx1 to do the comparison.

In summary:

  • Detected a total of 264 stable tests between versions.
  • Detected a total of 8 highly unstable benchmarks.
  • Detected a total of 1 improvements above the improvement water line.
  • Detected a total of 4 regressions below the regression water line 10.0.
    • Median/Common-Case regression was -14.0% and ranged from [-15.0%,-12.0%].

You can check a comparison in detail via the grafana link

Comparison between unstable and Nugine:hll-simd.

Time Period from 5 months ago. (environment used: oss-standalone)

Unstable Table

| Test Case | Baseline redis/redis unstable (median obs. +- std.dev) | Comparison redis/redis Nugine:hll-simd (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| memtier_benchmark-10Kkeys-load-hash-50-fields-with-10000B-values | 3378 | 3241 +- 12.7% UNSTABLE (17 datapoints) | -4.0% | UNSTABLE (very high variance) potential REGRESSION |
| memtier_benchmark-1Mkeys-generic-ttl-pipeline-10 | 1155026 | 1138705 +- 10.2% UNSTABLE (15 datapoints) | -1.4% | UNSTABLE (very high variance) No Change |
| memtier_benchmark-1Mkeys-load-zset-listpack-with-100-elements-double-score | 3653 | 3643 +- 11.0% UNSTABLE (15 datapoints) | -0.3% | UNSTABLE (very high variance) No Change |
| memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10 | 534977 | 529753 +- 12.8% UNSTABLE (13 datapoints) | -1.0% | UNSTABLE (very high variance) No Change |
| memtier_benchmark-1Mkeys-string-mget-1KiB | 147465 | 140120 +- 13.9% UNSTABLE (13 datapoints) | -5.0% | UNSTABLE (very high variance) potential REGRESSION |
| memtier_benchmark-1key-zset-1K-elements-zrange-all-elements | 4853 | 4811 +- 10.1% UNSTABLE (13 datapoints) | -0.9% | UNSTABLE (very high variance) No Change |
| memtier_benchmark-2keys-set-10-100-elements-sunion | 45848 | 45730 +- 11.5% UNSTABLE (13 datapoints) | -0.3% | UNSTABLE (very high variance) No Change |
| memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-5200_conns | 140884 | 141667 +- 12.0% UNSTABLE (13 datapoints) | 0.6% | UNSTABLE (very high variance) No Change |

Unstable test regexp names: memtier_benchmark-10Kkeys-load-hash-50-fields-with-10000B-values|memtier_benchmark-1Mkeys-generic-ttl-pipeline-10|memtier_benchmark-1Mkeys-load-zset-listpack-with-100-elements-double-score|memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10|memtier_benchmark-1Mkeys-string-mget-1KiB|memtier_benchmark-1key-zset-1K-elements-zrange-all-elements|memtier_benchmark-2keys-set-10-100-elements-sunion|memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-5200_conns

Regressions Table

| Test Case | Baseline redis/redis unstable (median obs. +- std.dev) | Comparison redis/redis Nugine:hll-simd (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| memtier_benchmark-1key-list-10K-elements-linsert-lrem-integer | 8330 | 7077 +- 9.6% (13 datapoints) | -15.0% | REGRESSION |
| memtier_benchmark-1key-list-10K-elements-linsert-lrem-string | 10716 | 9268 +- 9.5% (13 datapoints) | -13.5% | REGRESSION |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | 8204 | 7010 +- 9.2% (15 datapoints) | -14.6% | REGRESSION |
| memtier_benchmark-1key-list-10K-elements-lpos-string | 9915 | 8721 +- 9.6% (13 datapoints) | -12.0% | REGRESSION |

Regressions test regexp names: memtier_benchmark-1key-list-10K-elements-linsert-lrem-integer|memtier_benchmark-1key-list-10K-elements-linsert-lrem-string|memtier_benchmark-1key-list-10K-elements-lpos-integer|memtier_benchmark-1key-list-10K-elements-lpos-string
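
The "% change" column and the median quoted in the summary can be reproduced from the baseline and comparison medians. The sketch below shows that arithmetic; the function names `pct_change` and `median` are hypothetical, and the exact rule the automation uses is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Relative change of the comparison run versus the baseline, in percent. */
double pct_change(double baseline, double comparison) {
    assert(baseline != 0.0);
    return (comparison - baseline) / baseline * 100.0;
}

/* Three-way comparison of doubles for qsort. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n values; sorts v in place. */
double median(double *v, size_t n) {
    assert(n > 0);
    qsort(v, n, sizeof v[0], cmp_double);
    return (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}
```

Applied to the four regression rows, this gives changes of about -15.0%, -13.5%, -14.6% and -12.0%, with a median of about -14.0%, which matches the figures reported in the summary.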

Improvements Table

| Test Case | Baseline redis/redis unstable (median obs. +- std.dev) | Comparison redis/redis Nugine:hll-simd (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| memtier_benchmark-1Mkeys-10B-expire-use-case | 182306 | 204919 +- 6.6% (15 datapoints) | 12.4% | IMPROVEMENT |

Improvements test regexp names: memtier_benchmark-1Mkeys-10B-expire-use-case

Full Results table:
Test Case Baseline redis/redis unstable (median obs. +- std.dev) Comparison redis/redis Nugine:hll-simd (median obs. +- std.dev) % change (higher-better) Note
latency-rate-limited-10000_qps-memtier_benchmark-100Kkeys-hash-hgetall-50-fields-100B-values 10001 10001 +- 0.3% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-100Kkeys-load-hash-50-fields-with-1000B-values 9996 9994 +- 0.3% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-100Kkeys-load-hash-50-fields-with-100B-values 9999 9998 +- 0.1% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-100Kkeys-load-hash-50-fields-with-10B-values 9991 9998 +- 0.0% (8 datapoints) 0.1% No Change
latency-rate-limited-10000_qps-memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values 10000 9998 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values-pipeline-10 10000 10000 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-10Mkeys-load-hash-5-fields-with-10B-values 10000 9998 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-10Mkeys-load-hash-5-fields-with-10B-values-pipeline-10 10001 10000 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-100B-expire-use-case 9999 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-10B-expire-use-case 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-1KiB-expire-use-case 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-4KiB-expire-use-case 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-bitmap-getbit-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-exists-pipeline-10 10000 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-expire-pipeline-10 10000 10000 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-expireat-pipeline-10 9983 10000 +- 0.2% (8 datapoints) 0.2% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-pexpire-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-scan-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-touch-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-generic-ttl-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-hexists 9984 9984 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-hget-hgetall-hkeys-hvals-with-100B-values 9996 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-hgetall-50-fields-10B-values 10001 10001 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-hincrby 10001 10000 +- 0.6% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-hmget-5-fields-with-100B-values-pipeline-10 10001 10000 +- 0.1% (10 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-hash-transactions-multi-exec-pipeline-20 10000 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-list-lpop-rpop-with-100B-values 10000 10000 +- 0.6% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-list-lpop-rpop-with-10B-values 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-list-lpop-rpop-with-1KiB-values 10000 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values 9998 9998 +- 0.3% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-hash-hmset-5-fields-with-1000B-values 9997 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-list-with-100B-values 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-list-with-10B-values 9999 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-list-with-1KiB-values 9999 9997 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-set-intset-with-100-elements 9999 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-set-intset-with-100-elements-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-stream-1-fields-with-100B-values 10000 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-stream-1-fields-with-100B-values-pipeline-10 10000 10000 +- 0.1% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-stream-5-fields-with-100B-values 9999 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-stream-5-fields-with-100B-values-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-100B-values 10000 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10 10001 10001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-10B-values 10000 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10 10001 10001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-1KiB-values 10000 9998 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-string-with-20KiB-values 9996 9999 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-zset-with-10-elements-double-score 10000 9998 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-load-zset-with-10-elements-int-score 9998 9998 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-append-1-100B 10000 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-append-1-100B-pipeline-10 10000 10001 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-decr 9985 9984 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-100B 10000 10000 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-100B-pipeline-10 10001 10001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-10B 10000 10000 +- 0.2% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-10B-pipeline-10 10001 10001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-1KiB 10000 10000 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-1KiB-pipeline-10 10001 10001 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-get-20KiB 10056 10000 +- 0.0% (8 datapoints) -0.6% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-incrby 9999 9999 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-incrby-pipeline-10 10001 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-incrbyfloat 9997 9998 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10 10001 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-mget-1KiB 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10 10001 10001 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-setrange-100B 10001 9998 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1Mkeys-string-setrange-100B-pipeline-10 10001 10001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-2-elements-geopos 9999 9997 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-2-elements-geosearch-fromlonlat-withcoord 9992 9999 +- 0.0% (8 datapoints) 0.1% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geodist 10000 9999 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geodist-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geohash 10000 9999 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geohash-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geopos 10000 10000 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geopos-pipeline-10 10000 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-bybox 10000 10000 +- 0.0% (7 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-hash-hscan-50-fields-10B-values 9999 9998 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-10-elements-lrange-all-elements 9999 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-10-elements-lrange-all-elements-pipeline-10 9995 10000 +- 0.2% (8 datapoints) 0.1% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-100-elements-lrange-all-elements 10025 9999 +- 0.1% (11 datapoints) -0.3% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-100-elements-lrange-all-elements-pipeline-10 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-10K-elements-lindex-integer 10000 9997 +- 0.2% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-10K-elements-lindex-string 9998 9998 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-1K-elements-lrange-all-elements 9997 9996 +- 0.3% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-list-1K-elements-lrange-all-elements-pipeline-10 10000 10000 +- 0.4% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-pfadd-4KB-values-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-10-elements-smembers 9999 9998 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-10-elements-smembers-pipeline-10 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-10-elements-smismember 10000 9998 +- 0.5% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-100-elements-sismember-is-a-member 9999 9999 +- 0.3% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-100-elements-sismember-not-a-member 9995 9999 +- 0.0% (10 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-100-elements-smembers 9998 9998 +- 0.0% (9 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-100-elements-smismember 9999 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-100-elements-sscan 9999 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-10M-elements-sismember-50pct-chance 10000 9998 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-1K-elements-smembers 9996 9996 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-1M-elements-sismember-50pct-chance 9998 9998 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-200K-elements-sadd-constant 9995 9999 +- 0.1% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-set-2M-elements-sadd-increasing 10000 10000 +- 0.0% (9 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zincrby-1M-elements-pipeline-1 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zrank-1M-elements-pipeline-1 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zrem-5M-elements-pipeline-1 10000 10000 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zrevrangebyscore-256K-elements-pipeline-1 10000 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zrevrank-1M-elements-pipeline-1 10000 10000 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-10-elements-zrange-all-elements 10000 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-10-elements-zrange-all-elements-long-scores 9999 9997 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-100-elements-zrange-all-elements 9998 9997 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements 9999 9998 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements-long-scores 10000 9999 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-100-elements-zscan 10000 9998 +- 0.2% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-1M-elements-zcard-pipeline-10 10000 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-1M-elements-zrevrange-5-elements 10001 10000 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-1key-zset-1M-elements-zscore-pipeline-10 10000 10000 +- 0.2% (11 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-lua-eval-hset-expire 10000 9996 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-lua-evalsha-hset-expire 9998 9999 +- 0.1% (9 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-set-10-100-elements-sdiff 9999 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-set-10-100-elements-sinter 10000 9999 +- 0.0% (9 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-set-10-100-elements-sunion 9998 9998 +- 0.6% (8 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-stream-5-entries-xread-all-entries 10000 9999 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-2keys-stream-5-entries-xread-all-entries-pipeline-10 10000 10000 +- 0.5% (8 datapoints) -0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-3Mkeys-load-string-with-512B-values 9754 9755 +- 0.1% (10 datapoints) 0.0% No Change
latency-rate-limited-10000_qps-memtier_benchmark-connection-hello 10000 9998 +- 0.1% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-10Kkeys-load-hash-50-fields-with-10000B-values 1001 1001 +- 0.2% (8 datapoints) 0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1Mkeys-load-zset-listpack-with-100-elements-double-score 1001 1001 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-100M-bits-bitmap-bitcount 1001 1001 +- 0.0% (9 datapoints) 0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-list-10K-elements-linsert-lrem-integer 1001 1001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-list-10K-elements-linsert-lrem-string 1001 1001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-list-10K-elements-lpos-integer 1001 1001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-list-10K-elements-lpos-string 1001 1001 +- 0.2% (8 datapoints) -0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-list-2K-elements-quicklist-lrange-all-elements-longs 1001 1001 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-1key-zset-1K-elements-zrange-all-elements 1006 1001 +- 0.0% (10 datapoints) -0.5% No Change
latency-rate-limited-1000_qps-memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunion 1001 1001 +- 0.0% (8 datapoints) 0.0% No Change
latency-rate-limited-1000_qps-memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunionstore 1001 1001 +- 0.0% (8 datapoints) -0.0% No Change
latency-rate-limited-100_qps-memtier_benchmark-1key-1Billion-bits-bitmap-bitcount 202 202 +- 0.3% (8 datapoints) -0.0% No Change
memtier_benchmark-100Kkeys-hash-hgetall-50-fields-100B-values 175198 172899 +- 3.7% (13 datapoints) -1.3% No Change
memtier_benchmark-100Kkeys-load-hash-50-fields-with-1000B-values 19280 18751 +- 7.8% (13 datapoints) -2.7% No Change
memtier_benchmark-100Kkeys-load-hash-50-fields-with-100B-values 46288 45542 +- 5.9% (15 datapoints) -1.6% No Change
memtier_benchmark-100Kkeys-load-hash-50-fields-with-10B-values 44987 43059 +- 8.8% (13 datapoints) -4.3% potential REGRESSION
memtier_benchmark-10Kkeys-load-hash-50-fields-with-10000B-values 3378 3241 +- 12.7% UNSTABLE (17 datapoints) -4.0% UNSTABLE (very high variance) potential REGRESSION
memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values 127254 124853 +- 2.7% (15 datapoints) -1.9% No Change
memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values-pipeline-10 384595 370298 +- 7.8% (13 datapoints) -3.7% potential REGRESSION
memtier_benchmark-10Mkeys-load-hash-5-fields-with-10B-values 146060 144096 +- 3.4% (13 datapoints) -1.3% No Change
memtier_benchmark-10Mkeys-load-hash-5-fields-with-10B-values-pipeline-10 482419 475787 +- 8.4% (13 datapoints) -1.4% No Change
memtier_benchmark-1Mkeys-100B-expire-use-case 207080 202490 +- 7.2% (13 datapoints) -2.2% No Change
memtier_benchmark-1Mkeys-10B-expire-use-case 182306 204919 +- 6.6% (15 datapoints) 12.4% IMPROVEMENT
memtier_benchmark-1Mkeys-1KiB-expire-use-case 203716 200192 +- 6.5% (15 datapoints) -1.7% No Change
memtier_benchmark-1Mkeys-4KiB-expire-use-case 196710 191767 +- 6.5% (15 datapoints) -2.5% No Change
memtier_benchmark-1Mkeys-bitmap-getbit-pipeline-10 1120887 1052005 +- 6.8% (13 datapoints) -6.1% potential REGRESSION
memtier_benchmark-1Mkeys-generic-exists-pipeline-10 1192018 1154857 +- 7.3% (13 datapoints) -3.1% potential REGRESSION
memtier_benchmark-1Mkeys-generic-expire-pipeline-10 1090974 1041581 +- 7.6% (13 datapoints) -4.5% potential REGRESSION
memtier_benchmark-1Mkeys-generic-expireat-pipeline-10 1061064 991959 +- 6.9% (13 datapoints) -6.5% potential REGRESSION
memtier_benchmark-1Mkeys-generic-pexpire-pipeline-10 1061092 1027749 +- 6.7% (13 datapoints) -3.1% potential REGRESSION
memtier_benchmark-1Mkeys-generic-scan-pipeline-10 605806 575290 +- 8.9% (15 datapoints) -5.0% potential REGRESSION
memtier_benchmark-1Mkeys-generic-touch-pipeline-10 1193587 1125499 +- 7.8% (13 datapoints) -5.7% potential REGRESSION
memtier_benchmark-1Mkeys-generic-ttl-pipeline-10 1155026 1138705 +- 10.2% UNSTABLE (15 datapoints) -1.4% UNSTABLE (very high variance) No Change
memtier_benchmark-1Mkeys-hash-hexists 193779 189145 +- 7.6% (13 datapoints) -2.4% No Change
memtier_benchmark-1Mkeys-hash-hget-hgetall-hkeys-hvals-with-100B-values 193472 190259 +- 3.2% (15 datapoints) -1.7% No Change
memtier_benchmark-1Mkeys-hash-hgetall-50-fields-10B-values 185406 179371 +- 3.4% (13 datapoints) -3.3% potential REGRESSION
memtier_benchmark-1Mkeys-hash-hincrby 205213 200241 +- 8.0% (13 datapoints) -2.4% No Change
memtier_benchmark-1Mkeys-hash-hmget-5-fields-with-100B-values-pipeline-10 853412 809468 +- 6.7% (13 datapoints) -5.1% potential REGRESSION
memtier_benchmark-1Mkeys-hash-transactions-multi-exec-pipeline-20 1293827 1297628 +- 9.0% (13 datapoints) 0.3% No Change
memtier_benchmark-1Mkeys-list-lpop-rpop-with-100B-values 190803 188646 +- 4.9% (13 datapoints) -1.1% No Change
memtier_benchmark-1Mkeys-list-lpop-rpop-with-10B-values 193133 190066 +- 3.5% (13 datapoints) -1.6% No Change
memtier_benchmark-1Mkeys-list-lpop-rpop-with-1KiB-values 193153 189357 +- 3.5% (15 datapoints) -2.0% No Change
memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values 112031 108137 +- 2.3% (13 datapoints) -3.5% potential REGRESSION
memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values-pipeline-10 178491 171105 +- 6.1% (13 datapoints) -4.1% potential REGRESSION
memtier_benchmark-1Mkeys-load-hash-hmset-5-fields-with-1000B-values 110136 107771 +- 2.6% (15 datapoints) -2.1% No Change
memtier_benchmark-1Mkeys-load-list-with-100B-values 156012 153397 +- 3.1% (13 datapoints) -1.7% No Change
memtier_benchmark-1Mkeys-load-list-with-10B-values 169898 167313 +- 2.2% (15 datapoints) -1.5% No Change
memtier_benchmark-1Mkeys-load-list-with-1KiB-values 120344 117647 +- 4.8% (13 datapoints) -2.2% No Change
memtier_benchmark-1Mkeys-load-set-intset-with-100-elements 79634 79940 +- 5.8% (13 datapoints) 0.4% No Change
memtier_benchmark-1Mkeys-load-set-intset-with-100-elements-pipeline-10 136706 137897 +- 8.9% (15 datapoints) 0.9% No Change
memtier_benchmark-1Mkeys-load-stream-1-fields-with-100B-values 135385 134374 +- 2.9% (13 datapoints) -0.7% No Change
memtier_benchmark-1Mkeys-load-stream-1-fields-with-100B-values-pipeline-10 405199 408744 +- 6.7% (13 datapoints) 0.9% No Change
memtier_benchmark-1Mkeys-load-stream-5-fields-with-100B-values 111859 110643 +- 3.2% (13 datapoints) -1.1% No Change
memtier_benchmark-1Mkeys-load-stream-5-fields-with-100B-values-pipeline-10 258520 257512 +- 6.5% (17 datapoints) -0.4% No Change
memtier_benchmark-1Mkeys-load-string-with-100B-values 175179 170887 +- 2.4% (15 datapoints) -2.4% No Change
memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10 720851 679070 +- 4.4% (15 datapoints) -5.8% potential REGRESSION
memtier_benchmark-1Mkeys-load-string-with-10B-values 187394 180100 +- 2.5% (15 datapoints) -3.9% potential REGRESSION
memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10 878492 825780 +- 5.4% (13 datapoints) -6.0% potential REGRESSION
memtier_benchmark-1Mkeys-load-string-with-1KiB-values 166621 161382 +- 2.3% (13 datapoints) -3.1% potential REGRESSION
memtier_benchmark-1Mkeys-load-string-with-20KiB-values 76245 77669 +- 7.5% (15 datapoints) 1.9% No Change
memtier_benchmark-1Mkeys-load-zset-listpack-with-100-elements-double-score 3653 3643 +- 11.0% UNSTABLE (15 datapoints) -0.3% UNSTABLE (very high variance) No Change
memtier_benchmark-1Mkeys-load-zset-with-10-elements-double-score 116128 114452 +- 4.7% (13 datapoints) -1.4% No Change
memtier_benchmark-1Mkeys-load-zset-with-10-elements-int-score 126796 124985 +- 3.7% (13 datapoints) -1.4% No Change
memtier_benchmark-1Mkeys-string-append-1-100B 183609 181750 +- 2.8% (13 datapoints) -1.0% No Change
memtier_benchmark-1Mkeys-string-append-1-100B-pipeline-10 856166 818235 +- 4.2% (13 datapoints) -4.4% potential REGRESSION
memtier_benchmark-1Mkeys-string-decr 191206 190671 +- 7.7% (13 datapoints) -0.3% No Change
memtier_benchmark-1Mkeys-string-get-100B 199363 196123 +- 5.0% (13 datapoints) -1.6% No Change
memtier_benchmark-1Mkeys-string-get-100B-pipeline-10 1133748 1063343 +- 4.8% (13 datapoints) -6.2% potential REGRESSION
memtier_benchmark-1Mkeys-string-get-10B 201296 196045 +- 5.2% (13 datapoints) -2.6% No Change
memtier_benchmark-1Mkeys-string-get-10B-pipeline-10 1125273 1088160 +- 6.1% (13 datapoints) -3.3% potential REGRESSION
memtier_benchmark-1Mkeys-string-get-1KiB 199656 195835 +- 5.0% (15 datapoints) -1.9% No Change
memtier_benchmark-1Mkeys-string-get-1KiB-pipeline-10 1087196 1025780 +- 5.3% (13 datapoints) -5.6% potential REGRESSION
memtier_benchmark-1Mkeys-string-incrby 192314 188762 +- 3.5% (15 datapoints) -1.8% No Change
memtier_benchmark-1Mkeys-string-incrby-pipeline-10 988661 963540 +- 4.7% (13 datapoints) -2.5% No Change
memtier_benchmark-1Mkeys-string-incrbyfloat 162939 160915 +- 3.5% (13 datapoints) -1.2% No Change
memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10 534977 529753 +- 12.8% UNSTABLE (13 datapoints) -1.0% UNSTABLE (very high variance) No Change
memtier_benchmark-1Mkeys-string-mget-1KiB 147465 140120 +- 13.9% UNSTABLE (13 datapoints) -5.0% UNSTABLE (very high variance) potential REGRESSION
memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10 778384 745532 +- 5.8% (13 datapoints) -4.2% potential REGRESSION
memtier_benchmark-1Mkeys-string-setrange-100B 186770 182700 +- 3.3% (13 datapoints) -2.2% No Change
memtier_benchmark-1Mkeys-string-setrange-100B-pipeline-10 895746 895375 +- 5.2% (15 datapoints) -0.0% No Change
memtier_benchmark-1key-100M-bits-bitmap-bitcount 26419 24374 +- 9.1% (13 datapoints) -7.7% potential REGRESSION
memtier_benchmark-1key-1Billion-bits-bitmap-bitcount 1654 1640 +- 4.7% (13 datapoints) -0.8% No Change
memtier_benchmark-1key-geo-2-elements-geopos 164972 164646 +- 3.6% (13 datapoints) -0.2% No Change
memtier_benchmark-1key-geo-2-elements-geosearch-fromlonlat-withcoord 110746 109630 +- 6.7% (13 datapoints) -1.0% No Change
memtier_benchmark-1key-geo-60M-elements-geodist 195285 190956 +- 4.0% (13 datapoints) -2.2% No Change
memtier_benchmark-1key-geo-60M-elements-geodist-pipeline-10 1114070 1063672 +- 6.5% (13 datapoints) -4.5% potential REGRESSION
memtier_benchmark-1key-geo-60M-elements-geohash 197219 193982 +- 4.3% (15 datapoints) -1.6% No Change
memtier_benchmark-1key-geo-60M-elements-geohash-pipeline-10 1157976 1089264 +- 4.0% (13 datapoints) -5.9% potential REGRESSION
memtier_benchmark-1key-geo-60M-elements-geopos 198481 193260 +- 5.9% (13 datapoints) -2.6% No Change
memtier_benchmark-1key-geo-60M-elements-geopos-pipeline-10 1153549 1123101 +- 4.9% (13 datapoints) -2.6% No Change
memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat 174084 169798 +- 7.9% (13 datapoints) -2.5% No Change
memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-bybox 173462 166562 +- 7.4% (15 datapoints) -4.0% potential REGRESSION
memtier_benchmark-1key-geo-60M-elements-geosearch-fromlonlat-pipeline-10 725278 713976 +- 9.4% (15 datapoints) -1.6% No Change
memtier_benchmark-1key-hash-hscan-50-fields-10B-values 116298 112762 +- 7.0% (13 datapoints) -3.0% potential REGRESSION
memtier_benchmark-1key-list-10-elements-lrange-all-elements 179582 173983 +- 3.0% (13 datapoints) -3.1% potential REGRESSION
memtier_benchmark-1key-list-10-elements-lrange-all-elements-pipeline-10 757646 727730 +- 9.6% (13 datapoints) -3.9% potential REGRESSION
memtier_benchmark-1key-list-100-elements-lrange-all-elements 115974 114739 +- 6.5% (15 datapoints) -1.1% No Change
memtier_benchmark-1key-list-100-elements-lrange-all-elements-pipeline-10 198832 194382 +- 9.8% (13 datapoints) -2.2% No Change
memtier_benchmark-1key-list-10K-elements-lindex-integer 161982 158662 +- 4.1% (15 datapoints) -2.0% No Change
memtier_benchmark-1key-list-10K-elements-lindex-string 143466 140088 +- 4.8% (13 datapoints) -2.4% No Change
memtier_benchmark-1key-list-10K-elements-linsert-lrem-integer 8330 7077 +- 9.6% (13 datapoints) -15.0% REGRESSION
memtier_benchmark-1key-list-10K-elements-linsert-lrem-string 10716 9268 +- 9.5% (13 datapoints) -13.5% REGRESSION
memtier_benchmark-1key-list-10K-elements-lpos-integer 8204 7010 +- 9.2% (15 datapoints) -14.6% REGRESSION
memtier_benchmark-1key-list-10K-elements-lpos-string 9915 8721 +- 9.6% (13 datapoints) -12.0% REGRESSION
memtier_benchmark-1key-list-1K-elements-lrange-all-elements 21809 21807 +- 8.8% (15 datapoints) -0.0% No Change
memtier_benchmark-1key-list-1K-elements-lrange-all-elements-pipeline-10 22378 21359 +- 8.7% (15 datapoints) -4.6% potential REGRESSION
memtier_benchmark-1key-list-2K-elements-quicklist-lrange-all-elements-longs 8714 8716 +- 9.7% (13 datapoints) 0.0% No Change
memtier_benchmark-1key-pfadd-4KB-values-pipeline-10 330443 322927 +- 8.5% (13 datapoints) -2.3% No Change
memtier_benchmark-1key-set-10-elements-smembers 182379 178025 +- 3.9% (13 datapoints) -2.4% No Change
memtier_benchmark-1key-set-10-elements-smembers-pipeline-10 790108 779344 +- 8.0% (13 datapoints) -1.4% No Change
memtier_benchmark-1key-set-10-elements-smismember 192882 184861 +- 3.3% (15 datapoints) -4.2% potential REGRESSION
memtier_benchmark-1key-set-100-elements-sismember-is-a-member 185423 183532 +- 3.7% (15 datapoints) -1.0% No Change
memtier_benchmark-1key-set-100-elements-sismember-not-a-member 180401 176112 +- 4.0% (13 datapoints) -2.4% No Change
memtier_benchmark-1key-set-100-elements-smembers 114594 111541 +- 6.6% (15 datapoints) -2.7% No Change
memtier_benchmark-1key-set-100-elements-smismember 170577 169067 +- 5.2% (15 datapoints) -0.9% No Change
memtier_benchmark-1key-set-100-elements-sscan 110888 109205 +- 6.7% (15 datapoints) -1.5% No Change
memtier_benchmark-1key-set-10M-elements-sismember-50pct-chance 182166 184957 +- 3.2% (13 datapoints) 1.5% No Change
memtier_benchmark-1key-set-1K-elements-smembers 19617 19699 +- 9.5% (13 datapoints) 0.4% No Change
memtier_benchmark-1key-set-1M-elements-sismember-50pct-chance 192794 187774 +- 3.6% (13 datapoints) -2.6% No Change
memtier_benchmark-1key-set-200K-elements-sadd-constant 191837 188048 +- 4.0% (13 datapoints) -2.0% No Change
memtier_benchmark-1key-set-2M-elements-sadd-increasing 189791 188337 +- 6.9% (15 datapoints) -0.8% No Change
memtier_benchmark-1key-zincrby-1M-elements-pipeline-1 50194 51836 +- 7.1% (15 datapoints) 3.3% potential IMPROVEMENT
memtier_benchmark-1key-zrank-1M-elements-pipeline-1 55854 55205 +- 7.6% (13 datapoints) -1.2% No Change
memtier_benchmark-1key-zrem-5M-elements-pipeline-1 56801 55511 +- 7.3% (13 datapoints) -2.3% No Change
memtier_benchmark-1key-zrevrangebyscore-256K-elements-pipeline-1 121267 118100 +- 6.7% (13 datapoints) -2.6% No Change
memtier_benchmark-1key-zrevrank-1M-elements-pipeline-1 55938 55065 +- 7.3% (15 datapoints) -1.6% No Change
memtier_benchmark-1key-zset-10-elements-zrange-all-elements 108242 106344 +- 5.8% (15 datapoints) -1.8% No Change
memtier_benchmark-1key-zset-10-elements-zrange-all-elements-long-scores 137662 133916 +- 4.9% (15 datapoints) -2.7% No Change
memtier_benchmark-1key-zset-100-elements-zrange-all-elements 29582 29740 +- 9.2% (13 datapoints) 0.5% No Change
memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements 29664 29824 +- 9.2% (13 datapoints) 0.5% No Change
memtier_benchmark-1key-zset-100-elements-zrangebyscore-all-elements-long-scores 65451 65035 +- 8.1% (13 datapoints) -0.6% No Change
memtier_benchmark-1key-zset-100-elements-zscan 80487 79021 +- 7.3% (15 datapoints) -1.8% No Change
memtier_benchmark-1key-zset-1K-elements-zrange-all-elements 4853 4811 +- 10.1% UNSTABLE (13 datapoints) -0.9% UNSTABLE (very high variance) No Change
memtier_benchmark-1key-zset-1M-elements-zcard-pipeline-10 1177857 1087658 +- 6.3% (15 datapoints) -7.7% potential REGRESSION
memtier_benchmark-1key-zset-1M-elements-zrevrange-5-elements 183353 178068 +- 3.8% (13 datapoints) -2.9% No Change
memtier_benchmark-1key-zset-1M-elements-zscore-pipeline-10 981837 937725 +- 5.7% (15 datapoints) -4.5% potential REGRESSION
memtier_benchmark-2keys-lua-eval-hset-expire 99575 98646 +- 6.1% (13 datapoints) -0.9% No Change
memtier_benchmark-2keys-lua-evalsha-hset-expire 114363 114535 +- 5.5% (13 datapoints) 0.2% No Change
memtier_benchmark-2keys-set-10-100-elements-sdiff 33416 32999 +- 9.8% (13 datapoints) -1.2% No Change
memtier_benchmark-2keys-set-10-100-elements-sinter 100864 98669 +- 6.3% (13 datapoints) -2.2% No Change
memtier_benchmark-2keys-set-10-100-elements-sunion 45848 45730 +- 11.5% UNSTABLE (13 datapoints) -0.3% UNSTABLE (very high variance) No Change
memtier_benchmark-2keys-stream-5-entries-xread-all-entries 92285 91280 +- 6.5% (15 datapoints) -1.1% No Change
memtier_benchmark-2keys-stream-5-entries-xread-all-entries-pipeline-10 158852 155881 +- 9.5% (13 datapoints) -1.9% No Change
memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunion 4446 4504 +- 8.6% (17 datapoints) 1.3% No Change
memtier_benchmark-2keys-zset-300-elements-skiplist-encoded-zunionstore 5297 5278 +- 9.9% (13 datapoints) -0.4% No Change
memtier_benchmark-3Mkeys-load-string-with-512B-values 171607 167482 +- 5.0% (13 datapoints) -2.4% No Change
memtier_benchmark-3Mkeys-string-get-with-1KiB-values-pipeline-10-2000_conns 161675 167969 +- 7.5% (13 datapoints) 3.9% potential IMPROVEMENT
memtier_benchmark-3Mkeys-string-get-with-1KiB-values-pipeline-10-400_conns 162771 159975 +- 5.1% (13 datapoints) -1.7% No Change
memtier_benchmark-3Mkeys-string-get-with-1KiB-values-pipeline-10-40_conns 163528 160897 +- 4.4% (13 datapoints) -1.6% No Change
memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-2000_conns 167799 178092 +- 7.3% (13 datapoints) 6.1% potential IMPROVEMENT
memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-400_conns 167257 163763 +- 2.3% (15 datapoints) -2.1% No Change
memtier_benchmark-3Mkeys-string-mixed-20-80-with-512B-values-pipeline-10-5200_conns 140884 141667 +- 12.0% UNSTABLE (13 datapoints) 0.6% UNSTABLE (very high variance) No Change
memtier_benchmark-connection-hello 182522 177169 +- 3.2% (13 datapoints) -2.9% No Change
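
For reference, the percentage in each row appears to be the relative change of the comparison value (second number) against the baseline (first number). A minimal C sketch of that arithmetic follows; `percent_change` is a hypothetical helper, and labels such as "potential REGRESSION" are not modelled because their thresholds are not stated here.

```c
#include <assert.h>

/* Relative change of `comparison` versus `baseline`, in percent.
 * Negative values mean the comparison value is lower than the baseline. */
static double percent_change(double baseline, double comparison) {
    assert(baseline > 0.0); /* a zero baseline has no meaningful ratio */
    return (comparison - baseline) / baseline * 100.0;
}
```

For the hmget row above, `percent_change(853412, 809468)` evaluates to roughly -5.1, which matches the value printed in the table.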

WARNING: There were 13 benchmarks with NO datapoints for both baseline and comparison.

NO DATAPOINTS test regexp names: m|m|m|m|m|m|m|m|m|r|r|r|r

Member

@ShooterIT ShooterIT left a comment


I read it carefully. Thank you for this impressive PR, which improves HLL performance a lot; it is exciting. The code looks very good to me. Maybe we just need to update the comments so that people can understand it easily.

A small optimization: just skip the first 8 bytes, as below.

build    : Type          Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
unstable : Pfcounts     20097.37         9.93275        11.00700        16.31900        18.94300      1079.45
current  : Pfcounts     79993.65         2.49851         2.46300         4.92700         5.82300      4296.53
cr       : Pfcounts     80441.25         2.48559         2.43100         4.86300         6.14300      4320.57

unstable : Pfmerges     15863.74        12.56612        14.84700        18.55900        20.99100       991.48
current  : Pfmerges    148759.06         1.34511         1.31900         2.63900         4.25500      9297.44
cr       : Pfmerges    152308.85         1.31122         1.28700         2.57500         4.25500      9519.30

Nugine and others added 3 commits October 28, 2024 13:59
Co-authored-by: Yuan Wang <wangyuancode@163.com>
@ShooterIT
Member

Currently this PR looks good to me. Please also have a look, @sundb @fcostaoliveira @YaacovHazan.

@sundb
Collaborator

sundb commented Oct 30, 2024

Full CI on arm64: https://github.com/redis/redis-extra-ci/actions/runs/11586127029
Full CI: https://github.com/sundb/redis/actions/runs/11586131243

ShooterIT previously approved these changes Nov 7, 2024
@ShooterIT ShooterIT requested a review from sundb November 8, 2024 02:40
sundb previously approved these changes Nov 8, 2024
Co-authored-by: debing.sun <debing.sun@redis.com>
@Nugine Nugine dismissed stale reviews from sundb and ShooterIT via 986c4e4 November 8, 2024 06:59
@ShooterIT ShooterIT merged commit fdeb976 into redis:unstable Nov 8, 2024
@ShooterIT
Member

@Nugine merged, thank you

@Nugine
Contributor Author

Nugine commented Nov 12, 2024

Hi @ShooterIT @sundb
I'd like to submit this performance improvement to valkey. But I'm not sure what to do with the co-authored parts considering the Redis licenses. https://github.com/orgs/valkey-io/discussions/1286
What do you think?

@sundb
Collaborator

sundb commented Nov 12, 2024

> Hi @ShooterIT @sundb
> I'd like to submit this performance improvement to valkey. But I'm not sure what to do with the co-authored parts considering the Redis licenses. https://github.com/orgs/valkey-io/discussions/1286
> What do you think?

@Nugine you have the right to apply this PR to valkey.

ShooterIT pushed a commit that referenced this pull request Dec 18, 2024
The bug was introduced in #13558 . 

When merging dense hll structures, `hllDenseCompress` writes to the wrong
location and the result will be zero. The unit tests didn't cover this
case.

This PR
+ fixes the bug
+ adds `PFDEBUG SIMD (ON|OFF)` for unit tests
+ adds a new TCL test to cover the cases

Synchronized from valkey-io/valkey#1293

---------

Signed-off-by: Xuyang Wang <xuyangwang@link.cuhk.edu.cn>
Co-authored-by: debing.sun <debing.sun@redis.com>
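
To illustrate what a dense merge has to produce, here is a scalar reference sketch. It assumes the default layout (16384 registers of 6 bits, packed LSB-first) and uses the hypothetical names `dense_get`, `dense_set`, and `merge_dense`, which are not the functions in hyperloglog.c. A test could compare an accelerated path against a reference of this kind, register by register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HLL_REGISTERS 16384
#define HLL_BITS 6
#define HLL_REGISTER_MAX ((1u << HLL_BITS) - 1)          /* 63 */
#define HLL_DENSE_BYTES (HLL_REGISTERS * HLL_BITS / 8)   /* 12288 */

/* Read register i from a dense buffer. */
static uint8_t dense_get(const uint8_t *d, size_t i) {
    size_t bit = i * HLL_BITS, byte = bit / 8, shift = bit % 8;
    unsigned v = d[byte] >> shift;
    /* With shift 4 or 6 the register spills into the next byte. */
    if (shift > 2) v |= (unsigned)d[byte + 1] << (8 - shift);
    return (uint8_t)(v & HLL_REGISTER_MAX);
}

/* Write register i in a dense buffer without touching its neighbours. */
static void dense_set(uint8_t *d, size_t i, uint8_t val) {
    size_t bit = i * HLL_BITS, byte = bit / 8, shift = bit % 8;
    unsigned v = val & HLL_REGISTER_MAX;
    d[byte] = (uint8_t)((d[byte] & ~(HLL_REGISTER_MAX << shift)) | (v << shift));
    if (shift > 2) {
        unsigned spill = (unsigned)shift - 2; /* bits landing in the next byte */
        d[byte + 1] = (uint8_t)((d[byte + 1] & ~((1u << spill) - 1)) |
                                (v >> (8 - shift)));
    }
}

/* Merge src into dst: each register becomes the maximum of the two. */
static void merge_dense(uint8_t *dst, const uint8_t *src) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < HLL_REGISTERS; i++) {
        uint8_t a = dense_get(dst, i), b = dense_get(src, i);
        if (b > a) dense_set(dst, i, b);
    }
}
```

Because the merged result is a per-register maximum, it can only be all zero when every input is all zero, which is the property the bug described above violated.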
@alonre24
Contributor

alonre24 commented Dec 18, 2024

Note that after merging this PR, building the code with clang 18 gives the following error:

In file included from /usr/lib/llvm-18/lib/clang/18/include/immintrin.h:26:
In file included from /usr/lib/llvm-18/lib/clang/18/include/xmmintrin.h:31:
/usr/lib/llvm-18/lib/clang/18/include/mm_malloc.h:35:12: error: 'malloc' is deprecated [-Werror,-Wdeprecated-declarations]
   35 |     return malloc(__size);
      |            ^
./server.h:3873:43: note: 'malloc' has been explicitly marked deprecated here
 3873 | void *malloc(size_t size) __attribute__ ((deprecated));

@sundb
Collaborator

sundb commented Dec 18, 2024

@alonre24 Have you seen it in GitHub Actions? This complaint is not caused by this PR; it appears because ubuntu-latest now resolves to 24.04 in some users' CI. I want to wait until this repository's Ubuntu image is upgraded before fixing it.

you can try locally with ubuntu 24.04:
CC=clang make

@alonre24
Contributor

@sundb not only in GitHub Actions; I reproduced it locally with Ubuntu 22. From what I see, it happens when using clang-18 (older clang versions are OK).
I believe that #include <immintrin.h> might be causing this, since it assumes the GCC compiler. Perhaps we can use the following logic instead:

#if defined(__GNUC__) /* assumed opening condition */
#include <x86intrin.h>
#elif defined(__clang__)
#include <xmmintrin.h>
#elif defined(_MSC_VER)
#include <intrin.h>
#include <stdexcept>
#endif

?

@ShooterIT
Member

cool, @alonre24 could you make a PR for this issue?

@sundb
Collaborator

sundb commented Dec 18, 2024

@alonre24 yeah, you're right.
I found that the complaint was caused by #include <mm_malloc.h> in <xmmintrin.h>, which makes the compiler think that we are using malloc and free, both of which are marked deprecated in server.h.

@sundb
Collaborator

sundb commented Dec 18, 2024

One alternative is to disable the deprecated attribute for clang.

#if defined(__GNUC__) && !defined(__clang__)
void *calloc(size_t count, size_t size) __attribute__ ((deprecated));
void free(void *ptr) __attribute__ ((deprecated));
void *malloc(size_t size) __attribute__ ((deprecated));
void *realloc(void *ptr, size_t size) __attribute__ ((deprecated));
#endif
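
The mechanism discussed here can be reproduced in isolation. The sketch below is only an illustration and not the fix adopted by the project: it marks `malloc` deprecated, as server.h does, and then uses a scoped diagnostic pragma so that a stand-in for a vendor inline helper (`vendor_alloc`, a hypothetical name) can still call it without a warning. It assumes a GCC-compatible compiler such as gcc or clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Project-wide rule, as in the thread: direct malloc() use is flagged. */
void *malloc(size_t size) __attribute__((deprecated));

/* Stand-in for a third-party inline helper that calls malloc(), similar in
 * spirit to the one pulled in through <xmmintrin.h>. The pragmas silence
 * the deprecation diagnostic for this region only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
static inline void *vendor_alloc(size_t size) {
    return malloc(size);
}
#pragma GCC diagnostic pop

/* Any malloc() call placed after the pop would be diagnosed again. */
```

The design point is that the suppression is local: the rest of the translation unit keeps the deprecation check, so accidental direct allocations are still reported.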

@sundb sundb added this to Redis 8.0 Aug 15, 2025
@sundb sundb removed this from Redis 8.2 Aug 15, 2025
@sundb sundb moved this from Todo to Done in Redis 8.0 Aug 15, 2025
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
This PR optimizes the performance of HyperLogLog commands (PFCOUNT,
PFMERGE) by adding AVX2 fast paths.

Two AVX2 functions are added for conversion between raw representation
and dense representation. They are 15 ~ 30 times faster than the scalar
implementation. Note that sparse representation is not accelerated.
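
As a rough picture of what the two conversions do, the scalar sketch below unpacks and packs four 6-bit registers per three dense bytes. It assumes the default configuration and LSB-first packing of the dense encoding; `dense_to_raw` and `raw_to_dense` are illustrative names, not the functions added by this PR, and the AVX2 code itself is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HLL_REGISTERS 16384                              /* default register count */
#define HLL_BITS 6                                       /* bits per register */
#define HLL_DENSE_BYTES (HLL_REGISTERS * HLL_BITS / 8)   /* 12288 */

/* Dense -> raw: expand every 3 packed bytes into 4 one-byte registers. */
static void dense_to_raw(const uint8_t *dense, uint8_t *raw) {
    assert(dense != NULL && raw != NULL);
    for (size_t i = 0; i < HLL_REGISTERS / 4; i++) {
        uint8_t b0 = dense[0], b1 = dense[1], b2 = dense[2];
        raw[0] = b0 & 63;
        raw[1] = ((b0 >> 6) | (b1 << 2)) & 63;
        raw[2] = ((b1 >> 4) | (b2 << 4)) & 63;
        raw[3] = (b2 >> 2) & 63;
        dense += 3;
        raw += 4;
    }
}

/* Raw -> dense: pack every 4 one-byte registers into 3 bytes. */
static void raw_to_dense(const uint8_t *raw, uint8_t *dense) {
    assert(dense != NULL && raw != NULL);
    for (size_t i = 0; i < HLL_REGISTERS / 4; i++) {
        uint8_t r0 = raw[0] & 63, r1 = raw[1] & 63;
        uint8_t r2 = raw[2] & 63, r3 = raw[3] & 63;
        dense[0] = (uint8_t)(r0 | (r1 << 6));
        dense[1] = (uint8_t)((r1 >> 2) | (r2 << 4));
        dense[2] = (uint8_t)((r2 >> 4) | (r3 << 2));
        raw += 4;
        dense += 3;
    }
}
```

Working in groups of four registers keeps every access inside the 12288-byte dense buffer, since 4 registers of 6 bits fill exactly 3 bytes.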

AVX2 fast paths are enabled when the CPU supports AVX2 (checked at
runtime) and the hyperloglog configuration is default (HLL_REGISTERS ==
16384 && HLL_BITS == 6).
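
The two conditions above can be sketched as a compile-time gate plus a runtime CPU check. This is a minimal illustration, assuming a GCC-compatible compiler that provides `__builtin_cpu_supports`; `HLL_CAN_TRY_AVX2`, `hll_avx2_available`, and `hll_select_path` are hypothetical names, not identifiers from the PR.

```c
#include <assert.h>

#define HLL_REGISTERS 16384
#define HLL_BITS 6

/* Compile-time gate: x86 target, a compiler with __builtin_cpu_supports,
 * and the default HyperLogLog layout. */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__) && \
    HLL_REGISTERS == 16384 && HLL_BITS == 6
#define HLL_CAN_TRY_AVX2 1
#else
#define HLL_CAN_TRY_AVX2 0
#endif

/* Runtime gate: returns 1 only if the running CPU reports AVX2. */
static int hll_avx2_available(void) {
#if HLL_CAN_TRY_AVX2
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Decide once and cache the answer; callers branch on the result. */
static const char *hll_select_path(void) {
    static int cached = -1;
    if (cached < 0) cached = hll_avx2_available();
    assert(cached == 0 || cached == 1);
    return cached ? "avx2" : "scalar";
}
```

On a build or machine where either gate fails, the sketch falls back to the scalar path, so the same binary stays usable on CPUs without AVX2.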

When merging 3 dense hll structures, the benchmark shows a 12x speedup
compared to the scalar version.

```
pfcount key1 key2 key3
pfmerge keyall key1 key2 key3
```

```
======================================================================================================
Type             Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------
PFCOUNT-scalar    5570.09        35.89060        32.51100        65.27900        69.11900       299.17
PFCOUNT-avx2     72604.92         2.82072         2.73500         5.50300         7.13500      3899.68
------------------------------------------------------------------------------------------------------
PFMERGE-scalar    7879.13        25.52156        24.19100        46.33500        48.38300       492.45
PFMERGE-avx2    126448.64         1.58120         1.53500         3.08700         4.89500      7903.04
------------------------------------------------------------------------------------------------------

scalar: redis:unstable   7f38c7b
avx2:   Nugine:hll-simd  02e09f8 

CPU:    13th Gen Intel® Core™ i9-13900H × 20
Memory: 32.0 GiB
OS:     Ubuntu 22.04.5 LTS
```

Experiment repo: https://github.com/Nugine/redis-hyperloglog
Benchmark script:
https://github.com/Nugine/redis-hyperloglog/blob/main/scripts/memtier.sh
Algorithm:
https://github.com/Nugine/redis-hyperloglog/blob/main/cpp/bench.cpp

resolves redis#13551

---------

Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025

Labels

action:run-benchmark Triggers the benchmark suite for this Pull Request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[NEW] Optimize HyperLogLog by SIMD acceleration

5 participants