Description
Since the #9166 fix due to #10020, we have substantially increased the number of times we call getClientType() for a single command. LRANGE is a good example because we call getClientType() for every reply element.
As we can see below, this is visible even on small reply sizes: on a 10-element reply we drop from 640K ops/sec on v6.2.6 to 626K ops/sec on unstable (about 2.2%), a drop we can directly connect to the increase in CPU cycles spent on getClientType() (~3%).
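For reference, the size of the regression can be derived from the two "Totals" Ops/sec figures reported in the benchmark output in this issue. The helper name `relative_drop` is only illustrative; the inputs are the measured values copied from the runs.

```python
def relative_drop(baseline: float, candidate: float) -> float:
    """Return the throughput regression as a percentage of the baseline."""
    return (baseline - candidate) / baseline * 100.0


# "Totals" Ops/sec values copied from the memtier_benchmark runs in this issue.
v6_2_6_ops = 640765.27
unstable_ops = 626801.72

drop_pct = relative_drop(v6_2_6_ops, unstable_ops)
print(f"throughput drop: {drop_pct:.2f}%")
```

This works out to a little over 2%, which is in the same range as the extra CPU share attributed to getClientType().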
@ShooterIT is there some logic or number of calls we can reduce to avoid this extra CPU consumption?
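To make the question concrete, here is a toy model of the call pattern, written in Python rather than C and not taken from the Redis source. `Client`, `get_type`, `reply_per_element`, and `reply_hoisted` are all hypothetical names. It only illustrates that a per-element check costs one lookup per reply element, while a check hoisted to once per command costs one lookup in total. Whether hoisting is safe in Redis depends on whether the client type can change while a command is building its reply, which this sketch simply assumes it cannot.

```python
class Client:
    """Hypothetical stand-in for a server-side client object (not Redis's struct)."""

    def __init__(self, flags):
        self.flags = set(flags)
        self.type_lookups = 0  # counts how often the type was classified

    def get_type(self):
        # Models a small but non-free classification step.
        self.type_lookups += 1
        return "replica" if "replica" in self.flags else "normal"


def reply_per_element(client, elements):
    """Classify the client once for every element appended to the reply."""
    out = []
    for element in elements:
        if client.get_type() == "normal":
            out.append(element)
    return out


def reply_hoisted(client, elements):
    """Classify the client once per command and reuse the result."""
    is_normal = client.get_type() == "normal"
    return [element for element in elements if is_normal]


elements = [f"item-{i}" for i in range(10)]

per_element_client = Client(flags=[])
hoisted_client = Client(flags=[])

reply_per_element(per_element_client, elements)
reply_hoisted(hoisted_client, elements)

print(per_element_client.type_lookups, hoisted_client.type_lookups)
```

Both functions build the same reply; only the number of type lookups differs, and it grows with the reply size in the per-element variant.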
PS: this was caught via https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-1key-list-10-elements-lrange-all-elements.yml
confirmation with profile data
unstable (8192625)

benchmark info
To populate the data:
"RPUSH" "list:10" "lysbgqqfqw" "mtccjerdon" "jekkafodvk" "nmgxcctxpn" "vyqqkuszzh" "pytrnqdhvs" "oguwnmniig" "gekntrykfh" "nhfnbxqgol" "cgoeihlnei"
To benchmark:
Quick check on an LRANGE command reply with a 10-element list:
v6.2.6
root@hpe10:~/redis# taskset -c 1-5 memtier_benchmark --command="LRANGE list:10 0 -1" --hide-histogram --test-time 60 --pipeline 10
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 60 secs] 0 threads: 38447210 ops, 641149 (avg: 640765) ops/sec, 133.91MB/sec (avg: 133.83MB/sec), 3.11 (avg: 3.11) msec latency
4 Threads
50 Connections per thread
60 Seconds
ALL STATS
==================================================================================================
Type            Ops/sec   Avg. Latency    p50 Latency    p99 Latency  p99.9 Latency         KB/sec
--------------------------------------------------------------------------------------------------
Lranges       640765.27        3.11013        3.34300        4.25500        4.54300      137038.67
Totals        640765.27        3.11013        3.34300        4.25500        4.54300      137038.67
unstable
root@hpe10:~/redis# taskset -c 1-5 memtier_benchmark --command="LRANGE list:10 0 -1" --hide-histogram --test-time 60 --pipeline 10
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 60 secs] 0 threads: 37609260 ops, 627934 (avg: 626798) ops/sec, 131.15MB/sec (avg: 130.91MB/sec), 3.17 (avg: 3.18) msec latency
4 Threads
50 Connections per thread
60 Seconds
ALL STATS
==================================================================================================
Type            Ops/sec   Avg. Latency    p50 Latency    p99 Latency  p99.9 Latency         KB/sec
--------------------------------------------------------------------------------------------------
Lranges       626801.72        3.17964        3.42300        4.35100        4.60700      134052.32
Totals        626801.72        3.17964        3.42300        4.35100        4.60700      134052.32
