Replace dict with thin wrapper around hashtable by zuiderkwast · Pull Request #3366 · valkey-io/valkey

zuiderkwast · 2026-03-16T18:20:25Z

Replace the dict.c implementation with a header-only wrapper (dict.h) around the hashtable API. The dict types, iterators and API functions are now typedefs, macros and inline functions that delegate to hashtable. This unifies the hashtable implementations in the project and removes duplicated logic.

Changes to dict:

- Remove dict.c; dict.h is now the entire implementation
- dict, dictType and dictIterator are direct aliases for the hashtable counterparts.
- dictEntry is a struct allocated by dict wrapper functions to hold key and value. It doesn't have a next pointer anymore.
- Fix key duplication for dictTypes that had keyDup callback by calling sdsdup() at call sites in functions.c
- Remove unused functions, macros, includes and casts
- Move some dict defrag logic to defrag.c
- Remove obsolete dict unit tests (covered by test_hashtable.cpp)
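
To make the wrapper idea concrete, here is a minimal sketch of a dict-style API written as inline functions over a table type. Everything in it is hypothetical: `toyTable` stands in for the hashtable, and the `toy*` names are not the functions in dict.h. It only illustrates the shape described above (the dict type is an alias, and the entry is a small struct holding key and value with no next pointer).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the hashtable: a fixed-size open-addressing
 * table of opaque entry pointers. The real hashtable API is much richer. */
#define TOY_SLOTS 64
typedef struct toyTable {
    void *slots[TOY_SLOTS];
    size_t used;
} toyTable;

/* dict-style entry: only key and value, no next pointer. */
typedef struct toyDictEntry {
    const char *key;
    void *val;
} toyDictEntry;

/* The "dict" type is just an alias for the table type. */
typedef toyTable toyDict;

static inline size_t toyHash(const char *s) {
    size_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Find the slot holding `key`, or the empty slot where it would go.
 * Returns NULL only when the table is full and the key is absent. */
static inline void **toyFindSlot(toyTable *t, const char *key) {
    size_t i = toyHash(key) % TOY_SLOTS;
    for (size_t n = 0; n < TOY_SLOTS; n++, i = (i + 1) % TOY_SLOTS) {
        toyDictEntry *e = t->slots[i];
        if (e == NULL || strcmp(e->key, key) == 0) return &t->slots[i];
    }
    return NULL;
}

/* dictAdd-like wrapper: allocate an entry, then store it in the table.
 * Returns 1 if added, 0 if the key exists or the table is full.
 * Freeing entries is omitted for brevity. */
static inline int toyDictAdd(toyDict *d, const char *key, void *val) {
    assert(key != NULL);
    void **slot = toyFindSlot(d, key);
    if (slot == NULL || *slot != NULL) return 0;
    toyDictEntry *e = malloc(sizeof(*e));
    if (e == NULL) return 0;
    e->key = key;
    e->val = val;
    *slot = e;
    d->used++;
    return 1;
}

/* dictFetchValue-like wrapper: look up the entry and return its value. */
static inline void *toyDictFetchValue(toyDict *d, const char *key) {
    void **slot = toyFindSlot(d, key);
    return (slot && *slot) ? ((toyDictEntry *)*slot)->val : NULL;
}
```

A caller would zero-initialize a `toyDict`, add a key once, and get 0 back on a second add of the same key.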

Changes to hashtable:

- Change hashtable keyCompare convention to match dict: non-zero means keys are equal, so existing dict compare functions can be reused
- Add const to hashtableMemUsage parameter
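
As a small illustration of the compare convention above (non-zero means equal), here is a sketch of a C-string compare callback and a toy lookup that consumes it. `cstrKeyCompare` and `findKey` are hypothetical names used only for this example; note that the convention is the opposite of `strcmp()`, which returns 0 on equality.

```c
#include <string.h>

/* Convention described above: return non-zero when the keys are EQUAL. */
static int cstrKeyCompare(const void *key1, const void *key2) {
    return strcmp((const char *)key1, (const char *)key2) == 0;
}

/* Hypothetical lookup helper showing how a table would use the callback.
 * Returns the index of the matching key, or -1 if none matches. */
static int findKey(const char **keys, int n, const char *wanted,
                   int (*keyCompare)(const void *, const void *)) {
    for (int i = 0; i < n; i++) {
        if (keyCompare(keys[i], wanted)) return i; /* non-zero => match */
    }
    return -1;
}
```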

Changes to server implementation:

- Deduplicate common dict/hashtable callbacks in server.c
- Change configured hash-seed to only apply to data hashtables. In particular, it must not modify the hash seed for dicts already initialized during startup for reading configs and similar.

Changes to libvalkey:

- Let libvalkey use its own dict implementation.

Stop overriding libvalkey's dict with valkey's. Remove the DICT_INCLUDE_DIR mechanism from libvalkey's build system since it is no longer needed.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

Data hashtables (keys, sets, zsets, hashes) now use a configurable seed separate from the global hashtable seed. This allows the hash-seed config to control SCAN iteration order without affecting internal hashtables (commands, ACL, modules, etc.) that are populated before config loading. The configurable seed defaults to the random seed and is overridden after config loading if hash-seed is set.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
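
The two-seed arrangement described in this commit message can be sketched roughly as follows. This is a simplified illustration under assumptions: the globals and the functions `initSeeds` and `applyConfiguredSeed` are hypothetical, and copying the config string bytes into the seed is a stand-in for whatever the real seed derivation does.

```c
#include <stdint.h>
#include <string.h>

#define SEED_LEN 16

/* Hypothetical globals: one seed for internal tables, one for data tables. */
static uint8_t global_seed[SEED_LEN];
static uint8_t configurable_seed[SEED_LEN];

/* Called once at startup, before any config is read. In a real server the
 * bytes would come from a random source; here the caller supplies them. */
static void initSeeds(const uint8_t random_bytes[SEED_LEN]) {
    memcpy(global_seed, random_bytes, SEED_LEN);
    /* Default: data tables use the same random seed. */
    memcpy(configurable_seed, random_bytes, SEED_LEN);
}

/* Called after config loading. Only the data-table seed changes, so tables
 * populated earlier with global_seed keep hashing consistently. */
static void applyConfiguredSeed(const char *hash_seed) {
    if (hash_seed == NULL) return; /* keep the random default */
    memset(configurable_seed, 0, SEED_LEN);
    size_t len = strlen(hash_seed);
    if (len > SEED_LEN) len = SEED_LEN;
    /* Simplification: copy the string bytes directly into the seed. */
    memcpy(configurable_seed, hash_seed, len);
}
```

The point of the split is ordering: tables filled before config loading never see their seed change underneath them.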

codecov · 2026-03-16T18:42:52Z

Codecov Report

❌ Patch coverage is 90.09009% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.52%. Comparing base (8bb8d91) to head (dab6559).
⚠️ Report is 2 commits behind head on unstable.

Files with missing lines	Patch %	Lines
src/server.c	87.50%	7 Missing ⚠️
src/rdb.c	0.00%	5 Missing ⚠️
src/sentinel.c	0.00%	5 Missing ⚠️
src/valkey-cli.c	77.77%	4 Missing ⚠️
src/valkey-benchmark.c	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #3366      +/-   ##
============================================
- Coverage     76.82%   76.52%   -0.30%     
============================================
  Files           159      157       -2     
  Lines         79704    79025     -679     
============================================
- Hits          61232    60474     -758     
- Misses        18472    18551      +79

Files with missing lines	Coverage Δ
src/cluster_legacy.c	`88.17% <100.00%> (+0.13%)`	⬆️
src/config.c	`77.70% <ø> (ø)`
src/defrag.c	`81.12% <100.00%> (-0.81%)`	⬇️
src/dict.h	`100.00% <100.00%> (ø)`
src/eval.c	`91.50% <100.00%> (+0.07%)`	⬆️
src/expire.c	`98.12% <ø> (ø)`
src/functions.c	`96.64% <100.00%> (+0.07%)`	⬆️
src/fuzzer_command_generator.c	`76.82% <100.00%> (-0.04%)`	⬇️
src/hashtable.c	`97.82% <100.00%> (+0.39%)`	⬆️
src/latency.c	`83.33% <100.00%> (+0.04%)`	⬆️
... and 12 more

... and 17 files with indirect coverage changes


hpatro

Pretty pleasant refactoring.

rainsupreme

The code looks good to me! I'd like to see the benchmark results, otherwise looks good to go! 😁

dvkashapov

Awesome work! My only concern was always allocating a new entry before checking if the key exists in dictReplace() and freeing the new entry if the key already exists. But this function is not that frequently used, so the impact is minimal.
BTW, for the module dict API, should we mention somewhere that it's now another implementation?

zuiderkwast · 2026-03-18T17:26:57Z

> Awesome work! My only concern was always allocating a new entry before checking if the key exists in dictReplace() and freeing the new entry if the key already exists. But this function is not that frequently used, so the impact is minimal.

Exactly, that was my conclusion too, so it's fine.

> BTW, for the module dict API, should we mention somewhere that it's now another implementation?

It's a rax! Crazy... but it's not affected by this PR.
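
For readers following the dictReplace() point, the allocate-first pattern being discussed looks roughly like the sketch below. It is not the code from dict.h: `pairTable`, `tableAddOrFind` and `pairReplace` are hypothetical, and the table is a plain array to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

typedef struct pair {
    const char *key;
    int val;
} pair;

#define MAX_PAIRS 8
typedef struct pairTable {
    pair *items[MAX_PAIRS];
    int count;
} pairTable;

/* Hypothetical find-or-insert primitive. Returns 1 if `e` was inserted.
 * If the key already exists, stores the existing entry in *existing and
 * returns 0. Returns -1 when the table is full. */
static int tableAddOrFind(pairTable *t, pair *e, pair **existing) {
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->items[i]->key, e->key) == 0) {
            *existing = t->items[i];
            return 0;
        }
    }
    if (t->count == MAX_PAIRS) return -1;
    t->items[t->count++] = e;
    return 1;
}

/* Replace in the allocate-first style: the entry is allocated before we
 * know whether it is needed. Returns 1 if the key was added, 0 if an
 * existing value was overwritten, -1 on failure. */
static int pairReplace(pairTable *t, const char *key, int val) {
    pair *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->val = val;
    pair *existing = NULL;
    int added = tableAddOrFind(t, e, &existing);
    if (added == 1) return 1;
    free(e); /* the allocation turned out to be unnecessary */
    if (added == 0) {
        existing->val = val;
        return 0;
    }
    return -1;
}
```

The cost is one malloc/free pair on every overwrite, which only matters if the function sits on a hot path.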

sarthakaggarwal97

Minor comments! Looks pretty good!

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

github-actions · 2026-03-19T19:22:54Z

Benchmark ran on this commit: 49e86af

Benchmark Comparison: unstable vs `8ca73b3` (averaged) - rps metrics

Run Summary:

unstable: 80 total runs, 16 configurations (avg 5.00 runs per config)
8ca73b3: 80 total runs, 16 configurations (avg 5.00 runs per config)

Statistical Notes:

CI99%: 99% Confidence Interval - range where the true population mean is likely to fall
PI99%: 99% Prediction Interval - range where a single future observation is likely to fall
CV: Coefficient of Variation - relative variability (σ/μ × 100%)

Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges.
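
As a worked example of the statistics defined above, the sketch below computes CV, the CI99% margin and the PI99% margin for a sample of n = 5 runs. It assumes the standard Student's t formulas with 4 degrees of freedom (critical value about 4.604); the report does not state how the benchmark tooling computes them, so treat this as an assumption. Plugging in the first GET row (mean 229532.856, σ 2288.134) gives about 1.00%, 2.05% and 5.03%, in line with that row. The type and function names are hypothetical.

```c
#include <assert.h>

/* Two-sided 99% Student's t critical value for 4 degrees of freedom. */
#define T99_DF4 4.604
/* Precomputed for n = 5 so the example needs no math library. */
#define SQRT_N 2.2360679775          /* sqrt(5) */
#define SQRT_1_PLUS_INV_N 1.0954451150 /* sqrt(1 + 1/5) */

typedef struct runStats {
    double cv_pct; /* coefficient of variation: sigma / mean * 100 */
    double ci_pct; /* 99% confidence interval margin, as % of the mean */
    double pi_pct; /* 99% prediction interval margin, as % of the mean */
} runStats;

/* Summarize a sample of exactly five runs given its mean and standard
 * deviation. */
static runStats summarize5(double mean, double sigma) {
    assert(mean > 0.0);
    runStats s;
    s.cv_pct = sigma / mean * 100.0;
    s.ci_pct = T99_DF4 * sigma / SQRT_N / mean * 100.0;
    s.pi_pct = T99_DF4 * sigma * SQRT_1_PLUS_INV_N / mean * 100.0;
    return s;
}
```

With only five runs per configuration the t value is large, which is why a CV of about 1% still yields a confidence margin of about 2%.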

Configuration:

architecture: aarch64
benchmark_mode: duration
clients: 1600
cluster_mode: False
data_size: 16
duration: 180
tls: False
valkey_benchmark_threads: 90
warmup: 30

Command	Metric	Pipeline	io_threads	unstable	`8ca73b3`	Diff	% Change
GET	rps	1	1	229532.856 (n=5, σ=2288.134, CV=1.00%, CI99%=±2.053%, PI99%=±5.028%, CI[224821.557, 234244.155], PI[217992.578, 241073.134])	229742.702 (n=5, σ=2668.886, CV=1.16%, CI99%=±2.392%, PI99%=±5.859%, CI[224247.430, 235237.974], PI[216282.089, 243203.315])	209.846	+0.091%
GET	rps	1	9	1500239.250 (n=5, σ=5846.989, CV=0.39%, CI99%=±0.802%, PI99%=±1.966%, CI[1488200.219, 1512278.281], PI[1470749.767, 1529728.733])	1490869.976 (n=5, σ=14738.836, CV=0.99%, CI99%=±2.036%, PI99%=±4.986%, CI[1460522.509, 1521217.443], PI[1416534.166, 1565205.786])	-9369.274	-0.625%
GET	rps	10	1	1252881.948 (n=5, σ=6082.645, CV=0.49%, CI99%=±1.000%, PI99%=±2.449%, CI[1240357.698, 1265406.198], PI[1222203.925, 1283559.971])	1261843.672 (n=5, σ=4111.061, CV=0.33%, CI99%=±0.671%, PI99%=±1.643%, CI[1253378.940, 1270308.404], PI[1241109.399, 1282577.945])	8961.724	+0.715%
GET	rps	10	9	2845821.550 (n=5, σ=25909.625, CV=0.91%, CI99%=±1.875%, PI99%=±4.592%, CI[2792473.274, 2899169.826], PI[2715145.495, 2976497.605])	2927557.850 (n=5, σ=22465.211, CV=0.77%, CI99%=±1.580%, PI99%=±3.870%, CI[2881301.669, 2973814.031], PI[2814253.810, 3040861.890])	81736.300	+2.872%
SET	rps	1	1	219963.862 (n=5, σ=1861.550, CV=0.85%, CI99%=±1.743%, PI99%=±4.268%, CI[216130.904, 223796.820], PI[210575.071, 229352.653])	221007.348 (n=5, σ=2049.630, CV=0.93%, CI99%=±1.910%, PI99%=±4.677%, CI[216787.131, 225227.565], PI[210669.969, 231344.727])	1043.486	+0.474%
SET	rps	1	9	1464119.024 (n=5, σ=16397.047, CV=1.12%, CI99%=±2.306%, PI99%=±5.648%, CI[1430357.277, 1497880.771], PI[1381419.971, 1546818.077])	1480615.428 (n=5, σ=14716.978, CV=0.99%, CI99%=±2.047%, PI99%=±5.013%, CI[1450312.968, 1510917.888], PI[1406389.863, 1554840.993])	16496.404	+1.127%
SET	rps	10	1	1043928.138 (n=5, σ=5226.437, CV=0.50%, CI99%=±1.031%, PI99%=±2.525%, CI[1033166.833, 1054689.443], PI[1017568.432, 1070287.844])	1056213.750 (n=5, σ=6521.983, CV=0.62%, CI99%=±1.271%, PI99%=±3.114%, CI[1042784.896, 1069642.604], PI[1023319.910, 1089107.590])	12285.612	+1.177%
SET	rps	10	9	1946280.350 (n=5, σ=19116.738, CV=0.98%, CI99%=±2.022%, PI99%=±4.954%, CI[1906918.722, 1985641.978], PI[1849864.447, 2042696.253])	1987682.874 (n=5, σ=16748.335, CV=0.84%, CI99%=±1.735%, PI99%=±4.250%, CI[1953197.821, 2022167.927], PI[1903212.090, 2072153.658])	41402.524	+2.127%

Configuration:

architecture: aarch64
benchmark_mode: duration
clients: 1600
cluster_mode: False
data_size: 96
duration: 180
tls: False
valkey_benchmark_threads: 90
warmup: 30

Command	Metric	Pipeline	io_threads	unstable	`8ca73b3`	Diff	% Change
GET	rps	1	1	222176.852 (n=5, σ=1206.090, CV=0.54%, CI99%=±1.118%, PI99%=±2.738%, CI[219693.496, 224660.208], PI[216093.896, 228259.808])	220376.370 (n=5, σ=2155.231, CV=0.98%, CI99%=±2.014%, PI99%=±4.932%, CI[215938.719, 224814.021], PI[209506.389, 231246.351])	-1800.482	-0.810%
GET	rps	1	9	1457394.624 (n=5, σ=11412.368, CV=0.78%, CI99%=±1.612%, PI99%=±3.949%, CI[1433896.400, 1480892.848], PI[1399835.964, 1514953.284])	1454072.198 (n=5, σ=12938.939, CV=0.89%, CI99%=±1.832%, PI99%=±4.488%, CI[1427430.743, 1480713.653], PI[1388814.227, 1519330.169])	-3322.426	-0.228%
GET	rps	10	1	1185527.750 (n=5, σ=7371.858, CV=0.62%, CI99%=±1.280%, PI99%=±3.136%, CI[1170348.993, 1200706.507], PI[1148347.541, 1222707.959])	1197302.726 (n=5, σ=6737.916, CV=0.56%, CI99%=±1.159%, PI99%=±2.838%, CI[1183429.263, 1211176.189], PI[1163319.821, 1231285.631])	11774.976	+0.993%
GET	rps	10	9	2230588.250 (n=5, σ=25678.874, CV=1.15%, CI99%=±2.370%, PI99%=±5.806%, CI[2177715.093, 2283461.407], PI[2101075.994, 2360100.506])	2288793.500 (n=5, σ=26069.578, CV=1.14%, CI99%=±2.345%, PI99%=±5.745%, CI[2235115.878, 2342471.122], PI[2157310.716, 2420276.284])	58205.250	+2.609%
SET	rps	1	1	213710.366 (n=5, σ=987.679, CV=0.46%, CI99%=±0.952%, PI99%=±2.331%, CI[211676.721, 215744.011], PI[208728.973, 218691.759])	213811.440 (n=5, σ=1010.977, CV=0.47%, CI99%=±0.974%, PI99%=±2.385%, CI[211729.824, 215893.056], PI[208712.542, 218910.338])	101.074	+0.047%
SET	rps	1	9	1430660.198 (n=5, σ=5161.627, CV=0.36%, CI99%=±0.743%, PI99%=±1.820%, CI[1420032.337, 1441288.059], PI[1404627.361, 1456693.035])	1447783.876 (n=5, σ=7657.616, CV=0.53%, CI99%=±1.089%, PI99%=±2.668%, CI[1432016.739, 1463551.013], PI[1409162.435, 1486405.317])	17123.678	+1.197%
SET	rps	10	1	1034570.686 (n=5, σ=2241.983, CV=0.22%, CI99%=±0.446%, PI99%=±1.093%, CI[1029954.413, 1039186.959], PI[1023263.173, 1045878.199])	1051506.962 (n=5, σ=3530.386, CV=0.34%, CI99%=±0.691%, PI99%=±1.693%, CI[1044237.848, 1058776.076], PI[1033701.342, 1069312.582])	16936.276	+1.637%
SET	rps	10	9	1830460.300 (n=5, σ=23823.894, CV=1.30%, CI99%=±2.680%, PI99%=±6.564%, CI[1781406.572, 1879514.028], PI[1710303.697, 1950616.903])	1886475.202 (n=5, σ=30806.601, CV=1.63%, CI99%=±3.362%, PI99%=±8.236%, CI[1823043.984, 1949906.420], PI[1731101.085, 2041849.319])	56014.902	+3.060%

rainsupreme

🙌 good stuff!

enjoy-binbin

LGTM, thanks! I am guessing #3360 will have some conflicts.

JimB123 · 2026-04-07T18:59:35Z

I didn't notice this PR before it was merged. But there is a concern about using inline functions in the .h file and eliminating the .c file. We need to be cautious with this approach.

The problem is that an inline function in the .h file MUST be inlined by the compiler. It will be instantiated separately in each compilation unit. This code bloat can lead to an increase in L1 cache usage, and an overall decrease in performance vs. functions which exist in a single compilation unit (in a .c file).

Now that we have LTO, there is limited benefit to using inline functions in the .h file.

Inline functions make the interface/API arguably harder to read - as now we have code mixed into the API.
LTO automatically inlines across compilation units - but it is bounded by configuration to ensure that we are not inlining large functions. We can use this configuration to limit impact on L1 cache.

JimB123 · 2026-04-07T19:24:53Z

    /* Compare function, returns 0 if the keys are equal. Defaults to just
     * comparing the pointers for equality. */


This comment is now incorrect, right? It looks like the compare function is supposed to return 1 if the keys are equal.

zuiderkwast

You're right.

JimB123 · 2026-04-07T19:36:11Z

    .entryGetKey = watchedKeyGetKey,
    .hashFunction = dictEncObjHash,
-    .keyCompare = hashtableEncObjKeyCompare,
+    .keyCompare = dictEncObjKeyCompare,


It's a bit confusing now that we have hashtable callbacks being provided as dict callback functions.

If we can relocate all of these utility helpers, it might be worth having equivalent functions with both names just to eliminate the confusion. Maybe something like:

    int hashtableCStrKeyCompare(const void *key1, const void *key2) {
        return strcmp(key1, key2) == 0;
    }
    int dictCStrKeyCompare(const void *key1, const void *key2) {
        return hashtableCStrKeyCompare(key1, key2);
    }

zuiderkwast

Now, dicts are hashtables and dictType is a synonym of hashtableType, so I think it's fine that there's only one copy of these. We could unify on one set of names though.

Another approach is that we continue moving towards hashtable terminology. We can replace all dictType occurrences with hashtableType and unify on the hashtable-prefixed functions, since it's the actual implementation. The only thing that is really dict specific is dictEntry.

JimB123

Some after-the-fact comments.

zuiderkwast · 2026-04-07T21:18:01Z

> But there is a concern about using inline functions in the .h file and eliminating the .c file. We need to be cautious with this approach.

Yes, the main reason for deleting the .c file is to avoid symbol collisions with libvalkey's copy of dict.c, which is the old dict implementation. If it wasn't for this reason, I would have kept these functions in dict.c.

> The problem is that an inline function in the .h file MUST be inlined by the compiler. It will be instantiated separately in each compilation unit. This code bloat can lead to an increase in L1 cache usage, and an overall decrease in performance vs. functions which exist in a single compilation unit (in a .c file).

Yes, but this is only with -O0 i.e. test builds, right? -O1 has dead code elimination.

Still, I think we should create a macro like dict_inline to use as a qualifier for these. It can expand to __attribute__((always_inline)) or similar depending on the compiler.
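
One possible shape for such a macro, as a sketch only: the name `dict_inline` comes from the comment above, but the expansion per compiler and the `sizedTable`/`tableSize` example are assumptions, not code from the PR.

```c
/* Pick the strongest "please inline" spelling the compiler offers.
 * Clang also defines __GNUC__, so the first branch covers both. */
#if defined(__GNUC__) || defined(__clang__)
#define dict_inline static inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#define dict_inline static __forceinline
#else
#define dict_inline static inline
#endif

/* Hypothetical table type, just enough to show the qualifier in use. */
typedef struct sizedTable {
    unsigned long used;
} sizedTable;

/* A trivial accessor of the kind that benefits from forced inlining,
 * including in unoptimized builds. */
dict_inline unsigned long tableSize(const sizedTable *t) {
    return t->used;
}
```

With GCC and Clang, `always_inline` applies even at -O0, which is the case the comment above is concerned with.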

JimB123 · 2026-04-09T16:43:09Z

> Yes, but this is only with -O0 i.e. test builds, right? -O1 has dead code elimination.

This isn't a question of dead code. Dead code never gets loaded into L1 cache.

Consider 3 files which include dict.h:

foo.c includes dict.h and calls dictReplace().
bar.c includes dict.h and calls dictReplace().
baz.c includes dict.h but never calls dictReplace().

In foo there will be an instantiated copy of dictReplace.
In bar there will be an instantiated copy of dictReplace.
In baz there will NOT be an instantiated copy of dictReplace - because the function isn't used, and the compiler will eliminate unused static functions.

The problem occurs when execution cycles back and forth between foo and bar. They have 2 different copies of this function and both get loaded into L1 cache.

If, instead, the code was contained in the .c file, the linker could decide (based on configuration) if the function is small enough to inline (essentially creating duplicates in L1) or large enough that we don't want to inline (resulting in a single copy).
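
The two layouts being compared can be sketched in a single file as below. File boundaries are marked with comments and all names are hypothetical. The sketch only shows where the function body lives in each layout; it does not measure code size or cache behavior.

```c
/* ---- shared.h (hypothetical) ---- */

/* Layout A: the body lives in the header. Each .c file that includes the
 * header and calls this function compiles its own private copy whenever
 * the call is not inlined at the call site. */
static inline int sumToA(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}

/* Layout B: the header only declares the function. */
int sumToB(int n);

/* ---- shared.c (hypothetical, the single definition) ---- */

/* Exactly one copy exists in the program unless the optimizer (for
 * example with LTO) decides to inline it into callers. */
int sumToB(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}
```

Both functions behave identically; the difference is only in how many copies of the machine code can end up in the final binary.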

zuiderkwast · 2026-04-09T19:06:58Z

OK, now I get what you mean.

I did think about this when deciding to go this way. The largest function is dictReplace, and I checked where it's called and concluded that it's very rarely used: only when updating cluster configs and by valkey-cli and valkey-benchmark for handling cluster nodes and slots.

Second largest is dictAddRaw. It's only used for clients blocked on keys, only in blocked.c. Additionally, it's used by dictAddOrFind, which is used for replica keys with expire and for the cluster nodes blacklist. Both are pretty rare operations.

Third largest is dictAdd. But it's really not large.

These functions are larger than the others just because they provide a different API than the hashtable. We've already converted the most critical ones to hashtable: the keyspace, hashes, sets, sorted sets, command lookup table, etc. We could phase out dictReplace and do these things differently at the call site and then delete dictReplace.

We could put these few nontrivial functions in a C file, but with a name that doesn't clash with the names used in libvalkey's dict, i.e. we can rename them with macros. But I'm not really convinced that it's necessary.
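
The macro-renaming idea could look something like the sketch below. The `kv` type, the `kvReplace` name and the `vk_` prefix are all hypothetical; the point is only that callers keep using the short name while the object file exports a prefixed symbol that cannot collide with another library's copy.

```c
#include <stddef.h>

/* ---- header side (hypothetical) ---- */
typedef struct kv {
    const char *key;
    void *val;
} kv;

/* Map the short name to a prefixed symbol. Every use of kvReplace after
 * this line, including the definition, becomes vk_kvReplace. */
#define kvReplace vk_kvReplace
int kvReplace(kv *slot, const char *key, void *val);

/* ---- single .c file side (hypothetical) ---- */

/* Because of the macro, this defines the symbol vk_kvReplace.
 * Returns 1 if the slot was empty, 0 if an existing pair was replaced. */
int kvReplace(kv *slot, const char *key, void *val) {
    int is_new = (slot->key == NULL);
    slot->key = key;
    slot->val = val;
    return is_new;
}
```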

JimB123 · 2026-04-09T21:48:20Z

Right. So back to my original statement:

> We need to be cautious with this approach.

In the old days, this technique (static inline in the .h file) used to make sense. But now, with LTO, there's no value in this approach. Using a .c file can be similarly optimized.

We need to be cautious because this change may represent tech debt. You surveyed the current usage and concluded that there's not currently an issue. This may not be true in the future.

BUT MY BIGGER CONCERN is that I want to ensure that other people don't copy this technique to other places. (Where it may represent a larger concern.) We need to question this technique if/when we see it.

## Problem

`Fix cluster` in `tests/unit/cluster/many-slot-migration.tcl` has been timing out daily on valgrind jobs since April 3, 2026. The test runs 10 cluster nodes under valgrind, migrating 40,000 keys across 1,000 slots, which is too much work for valgrind-instrumented builds.

The slowdown is caused by #3366 (dict→hashtable wrapper). Under `-O0` (valgrind builds), the `static inline` wrappers become real function calls that valgrind instruments, adding ~75% overhead to hot paths like `dictSize`. This compounds across 10 valgrind processes over a 20-minute migration test. No impact on production builds (`-O2` inlines everything).

## Fix

Scale the test workload down under valgrind: 10,000 keys / 250 slots instead of 40,000 / 1,000. Normal runs are unchanged. Still exercises the same cluster repair path.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Co-authored-by: sarthakaggarwal97 <sarthakaggarwal97@users.noreply.github.com>

This bug was introduced in #3366. Before PR #3366, hash-seed config was applied directly via hashtableSetHashFunctionSeed(), so clusterscanFingerprint() correctly used hash_function_seed to derive the fingerprint.

```c
if (server.hash_seed != NULL) {
    memset(hashseed, 0, sizeof(hashseed));
    getHashSeedFromString(hashseed, sizeof(hashseed), server.hash_seed);
    hashtableSetHashFunctionSeed(hashseed);
}
```

PR #3366 introduced a separate configurable_hash_seed for data hashtables and kept hash_function_seed as a random per-process value.

```c
/* Set the configured hash seed used by data hashtables (keys, sets, zsets,
 * hashes) or use the random seed if not configured. */
if (server.hash_seed) {
    uint8_t seed[16] = {0};
    getHashSeedFromString(seed, sizeof(seed), server.hash_seed);
    setConfigurableHashSeed(seed);
} else {
    setConfigurableHashSeed(hashtableGetHashFunctionSeed());
}
```

However, clusterscanFingerprint() was not updated accordingly: it still reads hash_function_seed, which is now random on every node. This makes fingerprints differ across nodes even when they share the same hash-seed config, causing cursors to restart on failover. CLUSTERSCAN was introduced in #2934.

Signed-off-by: Binbin <binloveplay1314@qq.com>
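
The failure mode described in this commit message can be illustrated with a small sketch. It is not the CLUSTERSCAN code: `nodeSeeds`, `fingerprintFromSeed`, `fingerprintBuggy` and `fingerprintFixed` are hypothetical, and FNV-1a is used only as a stand-in for however the real fingerprint is derived.

```c
#include <stddef.h>
#include <stdint.h>

#define SEED_LEN 16

typedef struct nodeSeeds {
    uint8_t random_seed[SEED_LEN];       /* differs per process */
    uint8_t configurable_seed[SEED_LEN]; /* same on nodes sharing hash-seed */
} nodeSeeds;

/* 64-bit FNV-1a over the seed bytes (stand-in fingerprint function). */
static uint64_t fingerprintFromSeed(const uint8_t seed[SEED_LEN]) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < SEED_LEN; i++) {
        h ^= seed[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Reads the per-process random seed, so nodes disagree. */
static uint64_t fingerprintBuggy(const nodeSeeds *n) {
    return fingerprintFromSeed(n->random_seed);
}

/* Reads the configured seed, so nodes with the same config agree. */
static uint64_t fingerprintFixed(const nodeSeeds *n) {
    return fingerprintFromSeed(n->configurable_seed);
}
```

Two nodes with different random seeds but the same configured seed produce different fingerprints with the first function and identical ones with the second.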

lucasyonge · 2026-05-14T15:33:08Z

@zuiderkwast @ranshid I just added the Valkey 10 tag to this PR. Let me know if you have any concerns. Thanks!

zuiderkwast requested a review from rainsupreme March 16, 2026 18:20

github-actions Bot assigned zuiderkwast Mar 16, 2026

zuiderkwast added 2 commits March 16, 2026 19:29

Let libvalkey use its own dict implementation

9871cc6

Stop overriding libvalkey's dict with valkey's. Remove the DICT_INCLUDE_DIR mechanism from libvalkey's build system since it is no longer needed. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

zuiderkwast force-pushed the thin-dict branch from e3dfd87 to bee4fc5 Compare March 16, 2026 19:31

zuiderkwast added the run-extra-tests and run-benchmark labels Mar 16, 2026

hpatro reviewed Mar 16, 2026

Comment thread src/dict.h

zuiderkwast marked this pull request as ready for review March 17, 2026 02:04

dvkashapov added run-benchmark and removed run-benchmark labels Mar 17, 2026

github-actions Bot removed the run-benchmark label Mar 17, 2026

rainsupreme reviewed Mar 17, 2026

Comment thread src/functions.c

dvkashapov approved these changes Mar 18, 2026

sarthakaggarwal97 approved these changes Mar 18, 2026

Comment thread src/dict.h

Comment thread src/valkey-benchmark.c

zuiderkwast added 2 commits March 19, 2026 02:22

Address review comments

5a3859b

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

Merge remote-tracking branch 'valkey/unstable' into thin-dict

49e86af

zuiderkwast added the run-benchmark label Mar 19, 2026

github-actions Bot removed the run-benchmark label Mar 19, 2026

rainsupreme approved these changes Mar 24, 2026

rainsupreme mentioned this pull request Mar 24, 2026

Remove unused dict callback parameter from functionReset and related functions #3357

Merged

zuiderkwast requested review from enjoy-binbin and madolson March 26, 2026 15:30

enjoy-binbin approved these changes Mar 31, 2026

zuiderkwast mentioned this pull request Apr 2, 2026

add key_len parameter to hashtable lookup api - enables copy-avoidance work #3434

Closed

zuiderkwast deleted the thin-dict branch April 2, 2026 10:30

hpatro mentioned this pull request Apr 7, 2026

[TEST-FAILURE] Gossip count scales with higher percentage of cluster-message-gossip-perc in tests/unit/cluster/packet.tcl #3454

Closed

JimB123 reviewed Apr 7, 2026

Comment thread src/server.h

JimB123 reviewed Apr 7, 2026

roshkhatri mentioned this pull request Apr 8, 2026

Deflake many-slot-migration under valgrind #3462

Merged

zuiderkwast mentioned this pull request Apr 9, 2026

Use attribute always inline for dict macro-like functions #3470

Closed

zuiderkwast mentioned this pull request Apr 13, 2026

Unique samples in hashtableSampleEntries #3460

Merged

sarthakaggarwal97 mentioned this pull request Apr 14, 2026

Merge unstable into 9.1 #3507

Closed

JimB123 mentioned this pull request Apr 24, 2026

Refactor dict restoring abstraction #3561

Closed

enjoy-binbin mentioned this pull request May 12, 2026

Fix CLUSTERSCAN fingerprint to use configurable_hash_seed #3679

Merged

lucasyonge added this to Valkey 10 May 14, 2026

zuiderkwast moved this to Merged in Valkey 10 May 14, 2026

sarthakaggarwal97 mentioned this pull request May 18, 2026

Revert "Update deps/libvalkey to version 0.5.0 (#3697)" on 9.1 #3755

Closed

9 participants
