Improve performance during rehashing by chzhoo · Pull Request #3073 · valkey-io/valkey

chzhoo · 2026-01-18T16:17:24Z

Description

As the hashtable grows, rehash is triggered. Rehash consumes a significant amount of CPU time, resulting in noticeable performance degradation. Much of this CPU time is spent on random memory access (when accessing hashtable entries). This CPU usage can be optimized through concurrent memory access.

//  The original rehash process:
For each bucket and its child buckets, all entries are processed as follows:
    1) Access the entry, which involves random memory access.
    2) Compute the hash value of the entry.
    3) Based on the hash value, insert the entries into the new hashtable.

//  The optimized rehash process:
The original steps are transformed into batch processing and phased execution, leveraging
the CPU's multi-issue capability to achieve concurrent memory access. The steps are as follows:
    1) Access multiple entries serially within a for loop. Since each entry access is independent,
       modern CPU architectures can perform concurrent memory access.
    2) Compute the hash values for all entries.
    3) Based on the hash values, insert the entries into the new hashtable.

Benchmark

Benchmark step

Write a batch of data into the server and record the QPS with the following commands.

// Write command
valkey-benchmark  -n 75000000 -r 75000000 --sequential --threads 2 set key:__rand_int__ __data__

// Record QPS
for((x=0;x<100000;x+=1));do valkey-cli info|grep ops;sleep 1;done

Benchmark result

The benchmark results are shown below, with the Y-axis representing the recorded QPS and the X-axis representing time (in seconds). The green line indicates the optimized performance, while the blue line represents the original performance.

Benchmark env

CPU: Intel(R) Xeon(R) 6983P-C * 8
OS: Ubuntu Server 24.04 LTS 64bit
Memory: 32GB
VM: Tencent Cloud | S9e.2XLARGE32
Server start command: valkey-server --save '' --appendonly no

Signed-off-by: chzhoo <czawyx@163.com>

codecov · 2026-01-18T16:35:49Z

Codecov Report

❌ Patch coverage is 98.36066% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.35%. Comparing base (4524c23) to head (87717aa).
⚠️ Report is 19 commits behind head on unstable.

Files with missing lines	Patch %	Lines
src/hashtable.c	98.36%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #3073      +/-   ##
============================================
+ Coverage     74.31%   74.35%   +0.03%     
============================================
  Files           129      129              
  Lines         71010    71037      +27     
============================================
+ Hits          52773    52821      +48     
+ Misses        18237    18216      -21

Files with missing lines	Coverage Δ
src/hashtable.c	`89.40% <98.36%> (+0.30%)`	⬆️

... and 24 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: chzhoo <czawyx@163.com>

zuiderkwast

This is very clever!

With this approach, the CPU (or the compiler?) can be smart enough to do concurrent memory access and we don't need to add any explicit __builtin_prefetch() (valkey_prefetch) at all?

Signed-off-by: chzhoo <czawyx@163.com>

chzhoo · 2026-01-19T16:41:10Z

With this approach, the CPU (or the compiler?) can be smart enough to do concurrent memory access and we don't need to add any explicit __builtin_prefetch() (valkey_prefetch) at all?

Yes, I have benchmarked the performance of manually calling __builtin_prefetch() against this approach, and they deliver identical performance improvement.

zuiderkwast

Looks good to me. Thank you!

I will merge it but first I will wait some days to let others have a chance to review too.

github-actions · 2026-01-20T08:57:32Z

Benchmark ran on this commit: 87717aa

Benchmark Comparison: unstable vs `8a8a090` (averaged) - rps metrics

Run Summary:

unstable: 80 total runs, 16 configurations (avg 5.00 runs per config)
8a8a090: 80 total runs, 16 configurations (avg 5.00 runs per config)

Statistical Notes:

CI99%: 99% Confidence Interval - range where the true population mean is likely to fall
PI99%: 99% Prediction Interval - range where a single future observation is likely to fall
CV: Coefficient of Variation - relative variability (σ/μ × 100%)

Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges.

Configuration:

architecture: aarch64
benchmark_mode: duration
clients: 1600
cluster_mode: False
data_size: 16
duration: 180
tls: False
valkey_benchmark_threads: 90
warmup: 30

Command	Metric	Pipeline	io_threads	unstable	`8a8a090`	Diff	% Change
GET	rps	1	1	227331.550 (n=5, σ=1679.731, CV=0.74%, CI99%=±1.521%, PI99%=±3.727%, CI[223872.961, 230790.139], PI[218859.771, 235803.329])	225675.060 (n=5, σ=2445.363, CV=1.08%, CI99%=±2.231%, PI99%=±5.465%, CI[220640.025, 230710.095], PI[213341.793, 238008.327])	-1656.490	-0.729%
GET	rps	1	9	1537483.500 (n=5, σ=11843.678, CV=0.77%, CI99%=±1.586%, PI99%=±3.885%, CI[1513097.203, 1561869.797], PI[1477749.516, 1597217.484])	1548560.374 (n=5, σ=13544.967, CV=0.87%, CI99%=±1.801%, PI99%=±4.411%, CI[1520671.100, 1576449.648], PI[1480245.884, 1616874.864])	11076.874	+0.720%
GET	rps	10	1	1188247.100 (n=5, σ=2880.378, CV=0.24%, CI99%=±0.499%, PI99%=±1.223%, CI[1182316.361, 1194177.839], PI[1173719.816, 1202774.384])	1192075.802 (n=5, σ=2149.847, CV=0.18%, CI99%=±0.371%, PI99%=±0.910%, CI[1187649.237, 1196502.367], PI[1181232.977, 1202918.627])	3828.702	+0.322%
GET	rps	10	9	2484702.500 (n=5, σ=19454.777, CV=0.78%, CI99%=±1.612%, PI99%=±3.949%, CI[2444644.844, 2524760.156], PI[2386581.684, 2582823.316])	2489419.900 (n=5, σ=21994.250, CV=0.88%, CI99%=±1.819%, PI99%=±4.456%, CI[2444133.435, 2534706.365], PI[2378491.168, 2600348.632])	4717.400	+0.190%
SET	rps	1	1	219381.952 (n=5, σ=2182.163, CV=0.99%, CI99%=±2.048%, PI99%=±5.017%, CI[214888.848, 223875.056], PI[208376.139, 230387.765])	219427.970 (n=5, σ=2600.889, CV=1.19%, CI99%=±2.441%, PI99%=±5.978%, CI[214072.703, 224783.237], PI[206310.298, 232545.642])	46.018	+0.021%
SET	rps	1	9	1458590.524 (n=5, σ=11579.609, CV=0.79%, CI99%=±1.635%, PI99%=±4.004%, CI[1434747.949, 1482433.099], PI[1400188.382, 1516992.666])	1476453.150 (n=5, σ=15442.674, CV=1.05%, CI99%=±2.154%, PI99%=±5.275%, CI[1444656.472, 1508249.828], PI[1398567.513, 1554338.787])	17862.626	+1.225%
SET	rps	10	1	1015219.774 (n=5, σ=6897.148, CV=0.68%, CI99%=±1.399%, PI99%=±3.426%, CI[1001018.450, 1029421.098], PI[980433.777, 1050005.771])	1011002.426 (n=5, σ=6741.589, CV=0.67%, CI99%=±1.373%, PI99%=±3.363%, CI[997121.401, 1024883.451], PI[977000.998, 1045003.854])	-4217.348	-0.415%
SET	rps	10	9	1863229.624 (n=5, σ=15821.845, CV=0.85%, CI99%=±1.748%, PI99%=±4.283%, CI[1830652.226, 1895807.022], PI[1783431.622, 1943027.626])	1861866.750 (n=5, σ=10942.478, CV=0.59%, CI99%=±1.210%, PI99%=±2.964%, CI[1839336.036, 1884397.464], PI[1806677.997, 1917055.503])	-1362.874	-0.073%

Configuration:

architecture: aarch64
benchmark_mode: duration
clients: 1600
cluster_mode: False
data_size: 96
duration: 180
tls: False
valkey_benchmark_threads: 90
warmup: 30

Command	Metric	Pipeline	io_threads	unstable	`8a8a090`	Diff	% Change
GET	rps	1	1	219633.368 (n=5, σ=558.156, CV=0.25%, CI99%=±0.523%, PI99%=±1.282%, CI[218484.117, 220782.619], PI[216818.288, 222448.448])	218949.982 (n=5, σ=1720.146, CV=0.79%, CI99%=±1.618%, PI99%=±3.962%, CI[215408.177, 222491.787], PI[210274.366, 227625.598])	-683.386	-0.311%
GET	rps	1	9	1475438.548 (n=5, σ=9322.927, CV=0.63%, CI99%=±1.301%, PI99%=±3.187%, CI[1456242.512, 1494634.584], PI[1428418.055, 1522459.041])	1474213.050 (n=5, σ=5102.297, CV=0.35%, CI99%=±0.713%, PI99%=±1.746%, CI[1463707.349, 1484718.751], PI[1448479.444, 1499946.656])	-1225.498	-0.083%
GET	rps	10	1	1125427.102 (n=5, σ=6092.618, CV=0.54%, CI99%=±1.115%, PI99%=±2.730%, CI[1112882.317, 1137971.887], PI[1094698.779, 1156155.425])	1117606.922 (n=5, σ=5580.038, CV=0.50%, CI99%=±1.028%, PI99%=±2.518%, CI[1106117.546, 1129096.298], PI[1089463.814, 1145750.030])	-7820.180	-0.695%
GET	rps	10	9	1991558.974 (n=5, σ=24397.528, CV=1.23%, CI99%=±2.522%, PI99%=±6.179%, CI[1941324.126, 2041793.822], PI[1868509.229, 2114608.719])	1990530.000 (n=5, σ=15990.396, CV=0.80%, CI99%=±1.654%, PI99%=±4.052%, CI[1957605.553, 2023454.447], PI[1909881.905, 2071178.095])	-1028.974	-0.052%
SET	rps	1	1	211375.438 (n=5, σ=1606.674, CV=0.76%, CI99%=±1.565%, PI99%=±3.834%, CI[208067.273, 214683.603], PI[203272.122, 219478.754])	209347.666 (n=5, σ=369.357, CV=0.18%, CI99%=±0.363%, PI99%=±0.890%, CI[208587.155, 210108.177], PI[207484.802, 211210.530])	-2027.772	-0.959%
SET	rps	1	9	1476752.000 (n=5, σ=8144.125, CV=0.55%, CI99%=±1.136%, PI99%=±2.781%, CI[1459983.134, 1493520.866], PI[1435676.835, 1517827.165])	1478347.126 (n=5, σ=11548.269, CV=0.78%, CI99%=±1.608%, PI99%=±3.940%, CI[1454569.080, 1502125.172], PI[1420103.046, 1536591.206])	1595.126	+0.108%
SET	rps	10	1	997713.824 (n=5, σ=5102.691, CV=0.51%, CI99%=±1.053%, PI99%=±2.579%, CI[987207.312, 1008220.336], PI[971978.231, 1023449.417])	993768.276 (n=5, σ=11079.534, CV=1.11%, CI99%=±2.296%, PI99%=±5.623%, CI[970955.362, 1016581.190], PI[937888.277, 1049648.275])	-3945.548	-0.395%
SET	rps	10	9	1783128.126 (n=5, σ=24470.219, CV=1.37%, CI99%=±2.826%, PI99%=±6.921%, CI[1732743.607, 1833512.645], PI[1659711.764, 1906544.488])	1746892.800 (n=5, σ=25555.703, CV=1.46%, CI99%=±3.012%, PI99%=±7.378%, CI[1694273.255, 1799512.345], PI[1618001.765, 1875783.835])	-36235.326	-2.032%

dvkashapov · 2026-01-20T09:39:26Z

I don't think this benchmark is representative as rehashing would mainly occur in warmup duration of write load and would go unnoticed in result.

roshkhatri · 2026-01-20T21:03:30Z

I don't think this benchmark is representative as rehashing would mainly occur in warmup duration of write load and would go unnoticed in result.

Do we want a label for benchmarks without warmups?

zuiderkwast · 2026-01-20T22:47:14Z

Do we want a label for benchmarks without warmups?

@roshkhatri It would help for this PR... but I feel that every optimization has some special scenario that we want to target so every optimization requires some specific benchmark to visualize the improvement. I guess whatever we add to the automated benchmarking, it might not be enough for the next optimization PR.

If we could chat with the github actions bot and give it some very specific instructions, like give it a specific valkey-benchmark command line to run, then it might covering more special scenarios. Maybe. I think it's fine to run special benchmarks manually though.

@chzhoo When you run the benchmark, do you see a difference in the latencies in the various percentiles? I would guess we have fewer requests with a high latency.

chzhoo · 2026-01-21T16:22:31Z

@chzhoo When you run the benchmark, do you see a difference in the latencies in the various percentiles? I would guess we have fewer requests with a high latency.

I re-ran the benchmarks. The new version shows latency improvements at the p95 and p99 percentiles, while p50 latency remains unchanged. The commands and detailed results are below.

// Server start command
valkey-server --save '' --appendonly no

// Benchmark command
valkey-benchmark  -n 75000000 -r 75000000 --sequential --threads 2 set key:__rand_int__ __data__

// Before optimization
====== set key:__rand_int__ __data__ ======
  75000000 requests completed in 401.04 seconds
  50 parallel clients
  45 bytes payload
  keep alive: 1
  host configuration "save":
  host configuration "appendonly": no
  multi-thread: yes
  threads: 2

Latency by percentile distribution:
0.000% <= 0.055 milliseconds (cumulative count 71)
50.000% <= 0.215 milliseconds (cumulative count 40922633)
75.000% <= 0.263 milliseconds (cumulative count 56555392)
87.500% <= 0.319 milliseconds (cumulative count 65903563)
93.750% <= 0.399 milliseconds (cumulative count 70397312)
96.875% <= 0.479 milliseconds (cumulative count 72881842)
98.438% <= 0.519 milliseconds (cumulative count 73863507)
99.219% <= 0.567 milliseconds (cumulative count 74452328)
99.609% <= 0.623 milliseconds (cumulative count 74732718)
99.805% <= 0.679 milliseconds (cumulative count 74860330)
99.902% <= 0.743 milliseconds (cumulative count 74927923)
99.951% <= 1.463 milliseconds (cumulative count 74963383)
99.976% <= 1.679 milliseconds (cumulative count 74982290)
99.988% <= 1.767 milliseconds (cumulative count 74990863)
99.994% <= 1.855 milliseconds (cumulative count 74995514)
99.997% <= 2.095 milliseconds (cumulative count 74997713)
99.998% <= 3.207 milliseconds (cumulative count 74998870)
99.999% <= 4.007 milliseconds (cumulative count 74999445)
100.000% <= 4.167 milliseconds (cumulative count 74999720)
100.000% <= 6.231 milliseconds (cumulative count 74999861)
100.000% <= 12.399 milliseconds (cumulative count 74999929)
100.000% <= 23.487 milliseconds (cumulative count 74999965)
100.000% <= 23.583 milliseconds (cumulative count 74999985)
100.000% <= 23.615 milliseconds (cumulative count 74999992)
100.000% <= 23.647 milliseconds (cumulative count 74999998)
100.000% <= 23.663 milliseconds (cumulative count 74999999)
100.000% <= 23.727 milliseconds (cumulative count 75000000)
100.000% <= 23.727 milliseconds (cumulative count 75000000)

Cumulative distribution of latencies:
1.302% <= 0.103 milliseconds (cumulative count 976737)
47.353% <= 0.207 milliseconds (cumulative count 35515010)
86.106% <= 0.303 milliseconds (cumulative count 64579615)
94.164% <= 0.407 milliseconds (cumulative count 70623085)
98.033% <= 0.503 milliseconds (cumulative count 73524966)
99.566% <= 0.607 milliseconds (cumulative count 74674580)
99.858% <= 0.703 milliseconds (cumulative count 74893337)
99.933% <= 0.807 milliseconds (cumulative count 74949875)
99.940% <= 0.903 milliseconds (cumulative count 74954650)
99.942% <= 1.007 milliseconds (cumulative count 74956149)
99.945% <= 1.103 milliseconds (cumulative count 74959100)
99.948% <= 1.207 milliseconds (cumulative count 74961313)
99.949% <= 1.303 milliseconds (cumulative count 74961666)
99.949% <= 1.407 milliseconds (cumulative count 74961944)
99.954% <= 1.503 milliseconds (cumulative count 74965572)
99.965% <= 1.607 milliseconds (cumulative count 74974115)
99.980% <= 1.703 milliseconds (cumulative count 74984978)
99.991% <= 1.807 milliseconds (cumulative count 74993477)
99.995% <= 1.903 milliseconds (cumulative count 74996586)
99.997% <= 2.007 milliseconds (cumulative count 74997458)
99.997% <= 2.103 milliseconds (cumulative count 74997746)
99.998% <= 3.103 milliseconds (cumulative count 74998598)
100.000% <= 4.103 milliseconds (cumulative count 74999649)
100.000% <= 5.103 milliseconds (cumulative count 74999850)
100.000% <= 7.103 milliseconds (cumulative count 74999900)
100.000% <= 13.103 milliseconds (cumulative count 74999950)
100.000% <= 24.111 milliseconds (cumulative count 75000000)

Summary:
  throughput summary: 187012.36 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.238     0.048     0.215     0.431     0.551    23.727

// After optimization
====== set key:__rand_int__ __data__ ======
  75000000 requests completed in 380.29 seconds
  50 parallel clients
  45 bytes payload
  keep alive: 1
  host configuration "save":
  host configuration "appendonly": no
  multi-thread: yes
  threads: 2

Latency by percentile distribution:
0.000% <= 0.055 milliseconds (cumulative count 187)
50.000% <= 0.215 milliseconds (cumulative count 40193388)
75.000% <= 0.263 milliseconds (cumulative count 58981653)
87.500% <= 0.287 milliseconds (cumulative count 66221892)
93.750% <= 0.351 milliseconds (cumulative count 70899319)
96.875% <= 0.375 milliseconds (cumulative count 72936000)
98.438% <= 0.399 milliseconds (cumulative count 73850462)
99.219% <= 0.447 milliseconds (cumulative count 74449557)
99.609% <= 0.487 milliseconds (cumulative count 74711787)
99.805% <= 0.535 milliseconds (cumulative count 74864717)
99.902% <= 0.583 milliseconds (cumulative count 74928538)
99.951% <= 0.695 milliseconds (cumulative count 74963970)
99.976% <= 1.479 milliseconds (cumulative count 74981951)
99.988% <= 1.583 milliseconds (cumulative count 74991117)
99.994% <= 1.871 milliseconds (cumulative count 74995436)
99.997% <= 2.303 milliseconds (cumulative count 74997743)
99.998% <= 2.663 milliseconds (cumulative count 74998865)
99.999% <= 3.007 milliseconds (cumulative count 74999428)
100.000% <= 5.239 milliseconds (cumulative count 74999715)
100.000% <= 8.255 milliseconds (cumulative count 74999861)
100.000% <= 11.615 milliseconds (cumulative count 74999930)
100.000% <= 22.591 milliseconds (cumulative count 74999965)
100.000% <= 22.671 milliseconds (cumulative count 74999984)
100.000% <= 22.703 milliseconds (cumulative count 74999995)
100.000% <= 22.719 milliseconds (cumulative count 74999996)
100.000% <= 22.735 milliseconds (cumulative count 74999998)
100.000% <= 22.751 milliseconds (cumulative count 74999999)
100.000% <= 22.799 milliseconds (cumulative count 75000000)
100.000% <= 22.799 milliseconds (cumulative count 75000000)

Cumulative distribution of latencies:
0.756% <= 0.103 milliseconds (cumulative count 567372)
45.669% <= 0.207 milliseconds (cumulative count 34251414)
90.677% <= 0.303 milliseconds (cumulative count 68007486)
98.712% <= 0.407 milliseconds (cumulative count 74033626)
99.713% <= 0.503 milliseconds (cumulative count 74784748)
99.926% <= 0.607 milliseconds (cumulative count 74944588)
99.953% <= 0.703 milliseconds (cumulative count 74964512)
99.955% <= 0.807 milliseconds (cumulative count 74966227)
99.955% <= 0.903 milliseconds (cumulative count 74966420)
99.955% <= 1.007 milliseconds (cumulative count 74966549)
99.956% <= 1.103 milliseconds (cumulative count 74966676)
99.956% <= 1.207 milliseconds (cumulative count 74966834)
99.956% <= 1.303 milliseconds (cumulative count 74967179)
99.968% <= 1.407 milliseconds (cumulative count 74975656)
99.980% <= 1.503 milliseconds (cumulative count 74984656)
99.989% <= 1.607 milliseconds (cumulative count 74992018)
99.992% <= 1.703 milliseconds (cumulative count 74994079)
99.993% <= 1.807 milliseconds (cumulative count 74995066)
99.994% <= 1.903 milliseconds (cumulative count 74995589)
99.995% <= 2.007 milliseconds (cumulative count 74996212)
99.996% <= 2.103 milliseconds (cumulative count 74996775)
99.999% <= 3.103 milliseconds (cumulative count 74999477)
100.000% <= 4.103 milliseconds (cumulative count 74999650)
100.000% <= 5.103 milliseconds (cumulative count 74999705)
100.000% <= 6.103 milliseconds (cumulative count 74999775)
100.000% <= 7.103 milliseconds (cumulative count 74999801)
100.000% <= 8.103 milliseconds (cumulative count 74999826)
100.000% <= 9.103 milliseconds (cumulative count 74999900)
100.000% <= 12.103 milliseconds (cumulative count 74999950)
100.000% <= 23.103 milliseconds (cumulative count 75000000)

Summary:
  throughput summary: 197220.50 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.231     0.048     0.215     0.359     0.423    22.799

roshkhatri · 2026-01-23T17:31:22Z

If we could chat with the github actions bot and give it some very specific instructions, like give it a specific valkey-benchmark command line to run, then it might covering more special scenarios. Maybe. I think it's fine to run special benchmarks manually though.

Yes I think I can create wf that you can interact with, mention the PR and give the configs that we want

zuiderkwast · 2026-01-23T18:15:34Z

For some bots, you can post a github comment with some commands it will act on. For example dependabot accepts commands in this way: https://docs.github.com/en/code-security/reference/supply-chain-security/dependabot-pull-request-comment-commands

As the hashtable grows, rehash is triggered. Rehash consumes a significant amount of CPU time, resulting in noticeable performance degradation. Much of this CPU time is spent on random memory access (when accessing hashtable entries). This CPU usage can be optimized through concurrent memory access. ``` // The original rehash process: For each bucket and its child buckets, all entries are processed as follows: 1) Access the entry, which involves random memory access. 2) Compute the hash value of the entry. 3) Based on the hash value, insert the entries into the new hashtable. // The optimized rehash process: The original steps are transformed into batch processing and phased execution, leveraging the CPU's multi-issue capability to achieve concurrent memory access. The steps are as follows: 1) Access multiple entries serially within a for loop. Since each entry access is independent, modern CPU architectures can perform concurrent memory access. 2) Compute the hash values for all entries. 3) Based on the hash values, insert the entries into the new hashtable. ``` --------- Signed-off-by: chzhoo <czawyx@163.com>

As the hashtable grows, rehash is triggered. Rehash consumes a significant amount of CPU time, resulting in noticeable performance degradation. Much of this CPU time is spent on random memory access (when accessing hashtable entries). This CPU usage can be optimized through concurrent memory access. ``` // The original rehash process: For each bucket and its child buckets, all entries are processed as follows: 1) Access the entry, which involves random memory access. 2) Compute the hash value of the entry. 3) Based on the hash value, insert the entries into the new hashtable. // The optimized rehash process: The original steps are transformed into batch processing and phased execution, leveraging the CPU's multi-issue capability to achieve concurrent memory access. The steps are as follows: 1) Access multiple entries serially within a for loop. Since each entry access is independent, modern CPU architectures can perform concurrent memory access. 2) Compute the hash values for all entries. 3) Based on the hash values, insert the entries into the new hashtable. ``` --------- Signed-off-by: chzhoo <czawyx@163.com> Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>

Reduce rehash cost time

12aed62

Signed-off-by: chzhoo <czawyx@163.com>

github-actions Bot assigned chzhoo Jan 18, 2026

zuiderkwast requested a review from rainsupreme January 18, 2026 18:27

dvkashapov reviewed Jan 19, 2026

View reviewed changes

Comment thread src/hashtable.c Outdated

optimize comments

3e341ac

Signed-off-by: chzhoo <czawyx@163.com>

chzhoo changed the title ~~Optimize the performance degradation during the rehash~~ Mitigate performance degradation during rehash Jan 19, 2026

zuiderkwast reviewed Jan 19, 2026

View reviewed changes

Comment thread src/hashtable.c Outdated

Comment thread src/hashtable.c Outdated

cleanup logic and optimize comments

87717aa

Signed-off-by: chzhoo <czawyx@163.com>

zuiderkwast approved these changes Jan 19, 2026

View reviewed changes

zuiderkwast added release-notes This issue should get a line item in the release notes to-be-merged Almost ready to merge run-benchmark labels Jan 19, 2026

zuiderkwast added this to Valkey 9.1 Jan 19, 2026

zuiderkwast moved this to In Progress in Valkey 9.1 Jan 19, 2026

github-actions Bot removed the run-benchmark label Jan 20, 2026

dvkashapov approved these changes Jan 20, 2026

View reviewed changes

zuiderkwast changed the title ~~Mitigate performance degradation during rehash~~ Improve performance during rehashing Jan 20, 2026

zuiderkwast merged commit d8315cf into valkey-io:unstable Jan 23, 2026
121 of 123 checks passed

github-project-automation Bot moved this from In Progress to Done in Valkey 9.1 Jan 23, 2026

zuiderkwast removed the to-be-merged Almost ready to merge label Jan 23, 2026

chzhoo deleted the opt_rehash branch January 24, 2026 04:32

Uh oh!

Conversation

chzhoo commented Jan 18, 2026

Description

Benchmark

Benchmark step

Benchmark result

Benchmark env

Uh oh!

codecov Bot commented Jan 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

chzhoo commented Jan 19, 2026

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jan 20, 2026

Benchmark Comparison: unstable vs 8a8a090 (averaged) - rps metrics

Uh oh!

dvkashapov commented Jan 20, 2026

Uh oh!

roshkhatri commented Jan 20, 2026

Uh oh!

zuiderkwast commented Jan 20, 2026

Uh oh!

chzhoo commented Jan 21, 2026

Uh oh!

Uh oh!

roshkhatri commented Jan 23, 2026

Uh oh!

zuiderkwast commented Jan 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov Bot commented Jan 18, 2026 •

edited

Loading

Benchmark Comparison: unstable vs `8a8a090` (averaged) - rps metrics