Optimize WATCH duplicate key check from O(N) to O(1) using per-db hashtable #3360
Merged
Previously, watchForKey() checked for duplicate watched keys by iterating
through the client's entire watched_keys list with O(N) complexity, where
N is the total number of keys watched by the client. Watching many keys
could therefore take quadratic time overall and make WATCH a slow command.
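To make the cost concrete, the old duplicate check can be pictured as a plain list walk. The sketch below is not Valkey's code: `watched_node` and `is_watched_linear` are hypothetical names, and keys are plain C strings rather than Valkey string objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one node of a client's watched-key list. */
typedef struct watched_node {
    int db_id;                 /* database the key was watched in */
    const char *key;           /* key name */
    struct watched_node *next; /* next key watched by the same client */
} watched_node;

/* Linear duplicate check: in the worst case it visits every key the client
 * already watches, so a single lookup costs O(N). */
int is_watched_linear(const watched_node *head, int db_id, const char *key) {
    assert(key != NULL);
    for (const watched_node *n = head; n != NULL; n = n->next) {
        if (n->db_id == db_id && strcmp(n->key, key) == 0) return 1;
    }
    return 0;
}
```

Calling such a check once per key in a single WATCH with N keys is what turns the whole command into O(N^2) work.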
This commit introduces a per-db dictionary (watched_keys_by_db) in the
client's multiState structure to enable O(1) duplicate key detection.
The dictionary is lazily allocated only when the client starts watching
keys, minimizing memory overhead for clients that don't use WATCH.
The per-db dict stores borrowed references (keys from watchedKey->key,
values are watchedKey pointers), so no custom destructors are needed.
Memory management remains centralized in the watched_keys list.
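The ownership split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch itself: `client_watch_state`, `db_index`, `watch_key`, `unwatch_all`, the fixed `DB_COUNT`, and the fixed-size chained table are all hypothetical, and keys are plain C strings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DB_COUNT 16  /* assumed number of databases */
#define BUCKETS 1024 /* fixed bucket count; a real table would resize */

/* Owned by the client's list; the per-db index only borrows it. */
typedef struct watched_key {
    int db_id;
    char *key;
    struct watched_key *next_in_list;   /* owning list link */
    struct watched_key *next_in_bucket; /* index chain link */
} watched_key;

typedef struct db_index {
    watched_key *buckets[BUCKETS];
} db_index;

typedef struct client_watch_state {
    watched_key *list_head; /* owns every watched_key */
    db_index **by_db;       /* NULL until the first WATCH; by_db[i] NULL until db i is used */
} client_watch_state;

/* FNV-1a hash reduced to a bucket number. */
static size_t bucket_of(const char *key) {
    uint64_t h = 14695981039346656037ULL;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 1099511628211ULL;
    }
    return (size_t)(h % BUCKETS);
}

/* Returns 1 if the key was newly watched, 0 if it was a duplicate. */
int watch_key(client_watch_state *c, int db_id, const char *key) {
    assert(db_id >= 0 && db_id < DB_COUNT);
    /* Lazy allocation: clients that never WATCH pay nothing. */
    if (c->by_db == NULL) c->by_db = calloc(DB_COUNT, sizeof(*c->by_db));
    assert(c->by_db != NULL);
    if (c->by_db[db_id] == NULL) c->by_db[db_id] = calloc(1, sizeof(db_index));
    assert(c->by_db[db_id] != NULL);

    /* Expected O(1) duplicate check: only one bucket of one db is inspected. */
    watched_key **bucket = &c->by_db[db_id]->buckets[bucket_of(key)];
    for (watched_key *it = *bucket; it != NULL; it = it->next_in_bucket) {
        if (strcmp(it->key, key) == 0) return 0;
    }

    watched_key *wk = malloc(sizeof(*wk));
    assert(wk != NULL);
    wk->db_id = db_id;
    wk->key = malloc(strlen(key) + 1);
    assert(wk->key != NULL);
    strcpy(wk->key, key);
    wk->next_in_list = c->list_head; /* the list takes ownership */
    c->list_head = wk;
    wk->next_in_bucket = *bucket;    /* the index stores a borrowed pointer */
    *bucket = wk;
    return 1;
}

void unwatch_all(client_watch_state *c) {
    /* Drop the index tables only; entries are never freed through the index. */
    if (c->by_db != NULL) {
        for (int i = 0; i < DB_COUNT; i++) free(c->by_db[i]);
        free(c->by_db);
        c->by_db = NULL;
    }
    /* The list is the single place where entries are released. */
    while (c->list_head != NULL) {
        watched_key *wk = c->list_head;
        c->list_head = wk->next_in_list;
        free(wk->key);
        free(wk);
    }
}
```

Because the index never frees anything it points to, tearing it down is just a matter of releasing the tables, which is the property that makes custom destructors unnecessary.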
This optimization is especially beneficial when a client watches many
keys across different databases, as the check no longer scales with
the total watched key count.
This might be a minor scenario, but there's no harm in optimizing it.
A test in multi.tcl took 15s before this patch and only 50ms after it:
```
set elements {}
for {set i 0} {$i < 50000} {incr i} {
lappend elements key-$i
}
r watch {*}$elements
r watch {*}$elements
```
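A rough comparison count suggests why the test above was slow before the patch. The estimate assumes that the old check scanned the list from the head, stopped at the first match, and appended new keys at the tail; none of this is stated in the description, so treat it as a back-of-the-envelope sketch. With N = 50000 keys, the first WATCH finds no duplicates and the second finds the i-th key at position i:

```latex
\begin{align*}
C_{\text{first}}  &= \sum_{i=1}^{N} (i-1) = \frac{N(N-1)}{2} = 1\,249\,975\,000 \\
C_{\text{second}} &= \sum_{i=1}^{N} i = \frac{N(N+1)}{2} = 1\,250\,025\,000 \\
C_{\text{total}}  &= C_{\text{first}} + C_{\text{second}} = N^2 = 2.5 \times 10^{9}
\end{align*}
```

With an O(1) expected-time lookup per key, the same two commands need on the order of 2N = 100000 lookups instead.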
Signed-off-by: Binbin <binloveplay1314@qq.com>
dvkashapov
reviewed
Mar 13, 2026
dvkashapov
left a comment
Member
Overall LGTM but have a couple of suggestions/questions
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage Diff | unstable | #3360 | +/- |
|---|---|---|---|
| Coverage | 74.53% | 74.35% | -0.18% |
| Files | 130 | 130 | |
| Lines | 72731 | 72752 | +21 |
| Hits | 54208 | 54096 | -112 |
| Misses | 18523 | 18656 | +133 |
Signed-off-by: Binbin <binloveplay1314@qq.com>
Member
This could use the
```
static const void *watchedKeyGetKey(const void *entry) {
    const watchedKey *wk = entry;
    return wk->key;
}

hashtableType watchedKeysHashtableType = {
    .hashFunction = dictEncObjHash,
    .entryGetKey = watchedKeyGetKey,
    .keyCompare = hashtableEncObjKeyCompare,
};
```
Then
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
zuiderkwast
approved these changes
Mar 30, 2026
Nikhil-Manglore
pushed a commit
to Nikhil-Manglore/valkey
that referenced
this pull request
Apr 7, 2026
…htable (valkey-io#3360)
sarthakaggarwal97
pushed a commit
to sarthakaggarwal97/valkey
that referenced
this pull request
Apr 16, 2026
…htable (valkey-io#3360)
madolson
pushed a commit
that referenced
this pull request
Apr 27, 2026
…htable (#3360)
This was referenced May 3, 2026
roshkhatri
pushed a commit
to roshkhatri/valkey
that referenced
this pull request
May 26, 2026
…htable (valkey-io#3360)
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Previously, watchForKey() checked for duplicate watched keys by iterating
through the client's entire watched_keys list with O(N) complexity, where
N is the total number of keys watched by the client. Watching many keys
could therefore take quadratic time overall and make WATCH a slow command.
This commit introduces a per-db hashtable (watched_keys_by_db) in the
client's multiState structure to enable O(1) duplicate key detection.
The hashtable is lazily allocated only when the client starts watching
keys, minimizing memory overhead for clients that don't use WATCH.
The per-db hashtable stores watchedKey* directly as the hashtable entry
since it already contains the key, so no custom destructors are needed.
Memory management remains centralized in the watched_keys list.
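Storing the entry itself works because the table can ask each entry for its key through a callback. The toy table below illustrates that idea only; it is not Valkey's hashtable API. `toy_type`, `toy_table`, `toy_find`, `toy_add`, and `watched_entry` are hypothetical names, the capacity is fixed, and keys are plain C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 64 /* fixed capacity, power of two; a real table would grow */

/* Callbacks that tell the table how to reach and compare keys. */
typedef struct toy_type {
    const void *(*entry_get_key)(const void *entry);
    uint64_t (*hash)(const void *key);
    int (*key_equal)(const void *a, const void *b);
} toy_type;

typedef struct toy_table {
    const toy_type *type;
    void *slots[SLOTS]; /* borrowed entry pointers; no separate key or value */
    size_t used;
} toy_table;

/* Linear probing lookup by key; returns the matching entry or NULL. */
void *toy_find(const toy_table *t, const void *key) {
    size_t i = (size_t)(t->type->hash(key) & (SLOTS - 1));
    while (t->slots[i] != NULL) {
        const void *slot_key = t->type->entry_get_key(t->slots[i]);
        if (t->type->key_equal(slot_key, key)) return t->slots[i];
        i = (i + 1) & (SLOTS - 1);
    }
    return NULL;
}

/* Returns 1 if the entry was added, 0 if an entry with the same key exists. */
int toy_add(toy_table *t, void *entry) {
    assert(t->used < SLOTS - 1); /* keep one slot empty so probing terminates */
    const void *key = t->type->entry_get_key(entry);
    if (toy_find(t, key) != NULL) return 0;
    size_t i = (size_t)(t->type->hash(key) & (SLOTS - 1));
    while (t->slots[i] != NULL) i = (i + 1) & (SLOTS - 1);
    t->slots[i] = entry;
    t->used++;
    return 1;
}

/* The entry already carries its key, so the table needs nothing else. */
typedef struct watched_entry {
    const char *key;
    int db_id;
} watched_entry;

static const void *watched_entry_get_key(const void *entry) {
    return ((const watched_entry *)entry)->key;
}

/* FNV-1a over a NUL-terminated string. */
static uint64_t str_hash(const void *key) {
    uint64_t h = 14695981039346656037ULL;
    for (const unsigned char *p = key; *p; p++) {
        h ^= *p;
        h *= 1099511628211ULL;
    }
    return h;
}

static int str_equal(const void *a, const void *b) {
    return strcmp(a, b) == 0;
}

const toy_type watched_entry_type = {
    .entry_get_key = watched_entry_get_key,
    .hash = str_hash,
    .key_equal = str_equal,
};
```

The type has no destructor callback at all: the table only holds pointers, so whoever owns the entries (here, the caller) remains responsible for releasing them.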
This optimization is especially beneficial when a client watches many
keys across different databases, as the check no longer scales with
the total watched key count.
This might be a minor scenario, but there's no harm in optimizing it.
A test in multi.tcl took 15s before this patch and only 50ms after it.