Optimize WATCH duplicate key check from O(N) to O(1) using per-db hashtable by enjoy-binbin · Pull Request #3360 · valkey-io/valkey

enjoy-binbin · 2026-03-13T04:29:12Z

Previously, watchForKey() checked for duplicate watched keys by iterating
through the client's entire watched_keys list, an O(N) scan where N is the
total number of keys watched by the client. Every key passed to WATCH paid
for that scan, so a WATCH with many keys could become a slow command.
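
To make the cost of the old check concrete, here is a minimal, self-contained model of a list-based duplicate check. It is a sketch, not the code from src/multi.c: the names scan_wk, scan_client, scan_watch and scan_cost are hypothetical, and it assumes new entries are appended at the tail and the scan starts at the head.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a client's watched-key list. */
typedef struct scan_wk {
    char key[24];
    int dbid;
    struct scan_wk *next;
} scan_wk;

typedef struct {
    scan_wk *head, *tail;
    unsigned long long visited; /* list entries touched by duplicate checks */
} scan_client;

/* Watch (dbid, key) unless it is already watched.
 * Returns 1 if added, 0 if duplicate. The check walks the list from the head. */
static int scan_watch(scan_client *c, int dbid, const char *key) {
    for (scan_wk *w = c->head; w != NULL; w = w->next) {
        c->visited++;
        if (w->dbid == dbid && strcmp(w->key, key) == 0) return 0;
    }
    scan_wk *w = calloc(1, sizeof(*w));
    assert(w != NULL);
    snprintf(w->key, sizeof(w->key), "%s", key);
    w->dbid = dbid;
    if (c->tail) c->tail->next = w; else c->head = w;
    c->tail = w;
    return 1;
}

/* Watch key-0 .. key-(n-1) twice and return how many entries the checks visited. */
static unsigned long long scan_cost(int n) {
    scan_client c = {0};
    char key[24];
    for (int pass = 0; pass < 2; pass++) {
        for (int i = 0; i < n; i++) {
            snprintf(key, sizeof(key), "key-%d", i);
            scan_watch(&c, 0, key);
        }
    }
    unsigned long long visited = c.visited;
    while (c.head) { /* release the list */
        scan_wk *next = c.head->next;
        free(c.head);
        c.head = next;
    }
    return visited;
}
```

In this model the first pass visits n(n-1)/2 entries and the second n(n+1)/2, so scan_cost(n) is n squared. For n = 50000 that is 2.5 billion visits, which is the kind of growth the hashtable lookup avoids.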

This commit introduces a per-db hashtable (watched_keys_by_db) in the
client's multiState structure to enable O(1) duplicate key detection.
The hashtable is lazily allocated only when the client starts watching
keys, minimizing memory overhead for clients that don't use WATCH.

The per-db hashtable stores watchedKey* directly as the hashtable entry
since it already contains the key, so no custom destructors are needed.
Memory management remains centralized in the watched_keys list.
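
The ownership split described above can be sketched with a toy model. This is an illustration under assumptions, not the patch itself: lazy_wk, lazy_set, lazy_client and the fixed-capacity open-addressing table are all hypothetical, and allocating both the per-db array and each table on first use is a choice made here for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LAZY_DBS 16     /* hypothetical database count */
#define LAZY_SLOTS 1024 /* fixed capacity keeps the sketch short; a real table resizes */

/* Hypothetical stand-in for watchedKey: the key lives inside the entry. */
typedef struct lazy_wk {
    const char *key;      /* borrowed from the caller in this sketch */
    int dbid;
    struct lazy_wk *next; /* link in the owning list */
} lazy_wk;

/* Open-addressing set whose slots are borrowed lazy_wk pointers. */
typedef struct {
    lazy_wk *slots[LAZY_SLOTS];
    size_t used;
} lazy_set;

typedef struct {
    lazy_wk *watched; /* owning list: the only place entries are freed */
    lazy_set **by_db; /* NULL until the client first watches a key */
} lazy_client;

static uint32_t lazy_hash(const char *s) {
    uint32_t h = 2166136261u; /* FNV-1a offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u; /* FNV-1a prime */
    }
    return h;
}

/* Return the slot holding key, or the empty slot where it would be stored. */
static size_t lazy_slot(const lazy_set *s, const char *key) {
    size_t i = lazy_hash(key) % LAZY_SLOTS;
    while (s->slots[i] && strcmp(s->slots[i]->key, key) != 0) i = (i + 1) % LAZY_SLOTS;
    return i;
}

/* Returns 1 if the key was added, 0 if it was already watched in that db. */
static int lazy_watch(lazy_client *c, int dbid, const char *key) {
    assert(dbid >= 0 && dbid < LAZY_DBS);
    if (!c->by_db) { /* first WATCH on this client */
        c->by_db = calloc(LAZY_DBS, sizeof(*c->by_db));
        assert(c->by_db);
    }
    if (!c->by_db[dbid]) { /* first WATCH in this db */
        c->by_db[dbid] = calloc(1, sizeof(lazy_set));
        assert(c->by_db[dbid]);
    }
    lazy_set *set = c->by_db[dbid];
    size_t i = lazy_slot(set, key);
    if (set->slots[i]) return 0;        /* duplicate: one hashed lookup, no list walk */
    assert(set->used + 1 < LAZY_SLOTS); /* keep one slot empty so probing terminates */
    lazy_wk *wk = malloc(sizeof(*wk));
    assert(wk);
    wk->key = key;
    wk->dbid = dbid;
    wk->next = c->watched; /* the list owns the entry */
    c->watched = wk;
    set->slots[i] = wk;    /* the table only borrows the pointer */
    set->used++;
    return 1;
}

/* Free entries through the list only; the tables are dropped without touching entries. */
static void lazy_unwatch_all(lazy_client *c) {
    while (c->watched) {
        lazy_wk *next = c->watched->next;
        free(c->watched);
        c->watched = next;
    }
    if (c->by_db) {
        for (int i = 0; i < LAZY_DBS; i++) free(c->by_db[i]);
        free(c->by_db);
        c->by_db = NULL;
    }
}
```

Because the table never frees an entry, tearing it down needs no per-entry destructor, which mirrors the "no custom destructors" point above.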

This optimization is especially beneficial when a client watches many
keys across different databases, as the check no longer scales with
the total watched key count.

This might be a minor scenario, but there's no harm in optimizing it.

There is a test in multi.tcl that took 15s before this patch and only
50ms after it:

        set elements {}
        for {set i 0} {$i < 50000} {incr i} {
            lappend elements key-$i
        }
        r watch {*}$elements
        r watch {*}$elements

Previously, watchForKey() checked for duplicate watched keys by iterating through the client's entire watched_keys list with O(N) complexity, where N is the total number of keys watched by the client. So the time complexity for the WATCH command could be quite poor and become a slow command.

This commit introduces a per-db dictionary (watched_keys_by_db) in the client's multiState structure to enable O(1) duplicate key detection. The dictionary is lazily allocated only when the client starts watching keys, minimizing memory overhead for clients that don't use WATCH.

The per-db dict stores borrowed references (keys from watchedKey->key, values are watchedKey pointers), so no custom destructors are needed. Memory management remains centralized in the watched_keys list.

This optimization is especially beneficial when a client watches many keys across different databases, as the check no longer scales with the total watched key count.

This might be a minor scenario, but there's no harm in optimizing it.

There is a test in multi.tcl, before this patch, it took 15s, and after this patch, it only took 50ms.

```
set elements {}
for {set i 0} {$i < 50000} {incr i} {
    lappend elements key-$i
}
r watch {*}$elements
r watch {*}$elements
```

Signed-off-by: Binbin <binloveplay1314@qq.com>

dvkashapov

Overall LGTM, but I have a couple of suggestions/questions.

codecov · 2026-03-13T06:41:18Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.35%. Comparing base (ac5e44c) to head (9fe68f5).
⚠️ Report is 77 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #3360      +/-   ##
============================================
- Coverage     74.53%   74.35%   -0.18%     
============================================
  Files           130      130              
  Lines         72731    72752      +21     
============================================
- Hits          54208    54096     -112     
- Misses        18523    18656     +133

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/multi.c | `97.90% <100.00%> (+0.92%)` | ⬆️ |
| src/server.c | `89.49% <ø> (-0.06%)` | ⬇️ |
| src/server.h | `100.00% <ø> (ø)` | |

... and 27 files with indirect coverage changes


Signed-off-by: Binbin <binloveplay1314@qq.com>

madolson · 2026-03-16T22:45:04Z

This could use the hashtable interface instead of dict to avoid the extra dictEntry allocation (24 bytes) per watched key. The watchedKey* can be stored directly as the hashtable entry since it already contains the key:

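/* Return the key stored inside a watchedKey entry. */
static const void *watchedKeyGetKey(const void *entry) {
    const watchedKey *wk = entry;
    return wk->key;
}

/* Hashtable type whose entries are watchedKey pointers; the key is read from the entry. */
hashtableType watchedKeysHashtableType = {
    .hashFunction = dictEncObjHash,
    .entryGetKey = watchedKeyGetKey,
    .keyCompare = hashtableEncObjKeyCompare,
};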

Then hashtableFind(ht, key, NULL) / hashtableAdd(ht, wk) replace the dict calls, and the watchedKeysDictType in server.c can be removed.
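
The idea behind the suggestion, a table that derives the key from the entry through a callback instead of allocating a separate key/value node, can be sketched in isolation. The cb_* names below are hypothetical and the fixed-size probing table is a simplification; this is not the Valkey hashtable implementation, only a model of the callback pattern, and treating a compare result of 0 as "equal" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CB_SLOTS 64 /* fixed capacity for the sketch only */

/* Callbacks that tell the table how to hash keys and how to get a key from an entry. */
typedef struct {
    uint64_t (*hash)(const void *key);
    const void *(*entry_get_key)(const void *entry);
    int (*key_compare)(const void *a, const void *b); /* 0 means equal in this sketch */
} cb_type;

typedef struct {
    const cb_type *type;
    void *slots[CB_SLOTS]; /* entries stored directly, no wrapper node per entry */
    size_t used;
} cb_table;

/* Hypothetical stand-in for watchedKey. */
typedef struct {
    const char *key;
    int dbid;
} cb_wk;

static uint64_t cb_str_hash(const void *key) {
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
    for (const unsigned char *p = key; *p; p++) {
        h ^= *p;
        h *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
    }
    return h;
}

static const void *cb_wk_get_key(const void *entry) {
    const cb_wk *wk = entry;
    return wk->key;
}

static int cb_str_compare(const void *a, const void *b) {
    return strcmp(a, b);
}

static const cb_type cb_wk_type = {
    .hash = cb_str_hash,
    .entry_get_key = cb_wk_get_key,
    .key_compare = cb_str_compare,
};

/* Return the entry whose key equals key, or NULL if there is none. */
static void *cb_find(const cb_table *t, const void *key) {
    size_t i = t->type->hash(key) % CB_SLOTS;
    while (t->slots[i]) {
        const void *k = t->type->entry_get_key(t->slots[i]);
        if (t->type->key_compare(k, key) == 0) return t->slots[i];
        i = (i + 1) % CB_SLOTS;
    }
    return NULL;
}

/* Returns 1 if the entry was added, 0 if an entry with the same key already exists. */
static int cb_add(cb_table *t, void *entry) {
    const void *key = t->type->entry_get_key(entry);
    if (cb_find(t, key)) return 0;
    assert(t->used + 1 < CB_SLOTS); /* keep one slot empty so probing terminates */
    size_t i = t->type->hash(key) % CB_SLOTS;
    while (t->slots[i]) i = (i + 1) % CB_SLOTS;
    t->slots[i] = entry;
    t->used++;
    return 1;
}
```

Taking the 24-byte figure quoted above at face value, the 50,000-key test from the description would avoid roughly 50,000 × 24 = 1,200,000 bytes of per-entry allocations with this layout.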

Signed-off-by: Binbin <binloveplay1314@qq.com>


enjoy-binbin requested a review from zuiderkwast March 13, 2026 04:29

dvkashapov reviewed Mar 13, 2026


github-actions Bot assigned enjoy-binbin Mar 13, 2026

Fix crossslot test

99b5977

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Mar 13, 2026

Code review from Madelyn

b4db0d9

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin changed the title ~~Optimize WATCH duplicate key check from O(N) to O(1) using per-db dict~~ Optimize WATCH duplicate key check from O(N) to O(1) using per-db hashtable Mar 17, 2026

Merge branch 'unstable' into watch

9fe68f5

Signed-off-by: Binbin <binloveplay1314@qq.com>

zuiderkwast approved these changes Mar 30, 2026

enjoy-binbin added this to Valkey 9.1 Mar 31, 2026

enjoy-binbin merged commit 9586093 into valkey-io:unstable Mar 31, 2026
66 checks passed

github-project-automation Bot moved this to To be backported in Valkey 9.1 Mar 31, 2026

enjoy-binbin deleted the watch branch March 31, 2026 02:34

enjoy-binbin mentioned this pull request Mar 31, 2026

Replace dict with thin wrapper around hashtable #3366

Merged

sarthakaggarwal97 mentioned this pull request Apr 16, 2026

Backport Unstable to 9.1 for RC2 #3519

Merged

sarthakaggarwal97 added the release-notes This issue should get a line item in the release notes label Apr 28, 2026

sarthakaggarwal97 moved this from To be backported to Done in Valkey 9.1 May 16, 2026

enjoy-binbin merged 4 commits into valkey-io:unstable from enjoy-binbin:watch
