CLUSTERSCAN Command #2934

Merged
madolson merged 9 commits into valkey-io:unstable from nmvk:clusterscan
Mar 9, 2026

@nmvk nmvk commented Dec 15, 2025

Contributor

Implemented CLUSTERSCAN command for topology-aware scanning

Unlike SCAN, which is local to a single node, CLUSTERSCAN lets clients
iterate across slot boundaries and handles MOVED redirections.

Key details

  • Global cluster iteration via a fingerprint-{hashtag}-localcursor cursor
  • Scans one slot at a time
  • A scan starts with cursor 0
  • SLOT argument for scanning multiple slots in parallel
  • Reuses scanGenericCommand to build the response

Cursor format: fingerprint-{hashtag}-localcursor

  • Fingerprint is a hash of the node's DB seed that identifies the
    current memory layout. On mismatch, scan restarts from cursor 0
    rather than returning an error.
  • Fingerprint 0 indicates a cross-slot cursor (e.g., the initial cursor
    or a slot transition), for which validation is skipped.
  • Hashtag encodes the target slot
  • Local cursor tracks position within the slot
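
The cursor layout described above can be taken apart on the client side. The following is a minimal sketch, not the server's parser: the `scan_cursor` struct and the `parse_cursor` and `hashtag_slot` helpers are hypothetical names, the buffer sizes are arbitrary, and it assumes the fingerprint itself never contains the sequence `-{`. The slot is derived the way cluster key hashing works, CRC16 (XMODEM variant) of the hashtag modulo 16384. The bare initial cursor `0` is not in the three-part form, so the parser rejects it and a caller would treat it separately.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical client-side view of a CLUSTERSCAN cursor. */
typedef struct {
    char fingerprint[32]; /* opaque token; "0" means validation is skipped */
    char hashtag[16];     /* text between '{' and '}' */
    char local[32];       /* slot-local cursor, kept as text */
} scan_cursor;

/* CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0. */
static uint16_t crc16_xmodem(const char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Slot targeted by a hashtag: CRC16 reduced to 16384 slots. */
static int hashtag_slot(const char *tag) {
    return crc16_xmodem(tag, strlen(tag)) & 0x3FFF;
}

/* Splits "fingerprint-{hashtag}-localcursor".
 * Returns 0 on success, -1 if the text is malformed or a part is too long. */
static int parse_cursor(const char *s, scan_cursor *out) {
    const char *open = strstr(s, "-{");
    if (open == NULL) return -1;
    const char *close = strstr(open + 2, "}-");
    if (close == NULL) return -1;

    size_t fp_len = (size_t)(open - s);
    size_t tag_len = (size_t)(close - (open + 2));
    size_t local_len = strlen(close + 2);
    if (fp_len == 0 || tag_len == 0 || local_len == 0) return -1;
    if (fp_len >= sizeof(out->fingerprint) || tag_len >= sizeof(out->hashtag) ||
        local_len >= sizeof(out->local)) {
        return -1;
    }

    memcpy(out->fingerprint, s, fp_len);
    out->fingerprint[fp_len] = '\0';
    memcpy(out->hashtag, open + 2, tag_len);
    out->hashtag[tag_len] = '\0';
    memcpy(out->local, close + 2, local_len + 1); /* copies the terminator too */
    return 0;
}
```

For an illustrative cursor such as `aBcDeF-{06S}-48`, this yields fingerprint `aBcDeF`, hashtag `06S` (which hashes to slot 0), and local cursor `48`. Clients normally have no need to interpret the cursor and can pass it back unchanged; parsing is mainly useful for logging or debugging.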

Usage:

CLUSTERSCAN <cursor> [MATCH pattern] [COUNT count] [TYPE type] [SLOT number]
  CLUSTERSCAN 0                    # Start scanning from slot 0
  CLUSTERSCAN <cursor>             # Continue from cursor
  CLUSTERSCAN 0 SLOT 1000          # Start scanning specific slot
  CLUSTERSCAN <cursor> MATCH user:* COUNT 100
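
As a rough illustration of the syntax line above, the sketch below assembles the command text in the documented argument order (MATCH, COUNT, TYPE, SLOT). The `clusterscan_opts` struct and `format_clusterscan` function are hypothetical helpers, not part of any client library, and the space-joined output is only suitable for display or logging; a real client would send the arguments as separate protocol elements so that patterns containing spaces survive.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical option bundle: NULL or a negative value means "omit". */
typedef struct {
    const char *match; /* MATCH pattern */
    long count;        /* COUNT count */
    const char *type;  /* TYPE type */
    int slot;          /* SLOT number */
} clusterscan_opts;

/* Appends formatted text at *used; returns -1 if it does not fit. */
static int append(char *buf, size_t size, size_t *used, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *used, size - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *used) return -1;
    *used += (size_t)n;
    return 0;
}

/* Writes "CLUSTERSCAN <cursor> [MATCH ..] [COUNT ..] [TYPE ..] [SLOT ..]".
 * Returns 0 on success, -1 if buf is too small. */
static int format_clusterscan(char *buf, size_t size, const char *cursor,
                              const clusterscan_opts *o) {
    size_t used = 0;
    if (append(buf, size, &used, "CLUSTERSCAN %s", cursor) != 0) return -1;
    if (o->match != NULL && append(buf, size, &used, " MATCH %s", o->match) != 0) return -1;
    if (o->count >= 0 && append(buf, size, &used, " COUNT %ld", o->count) != 0) return -1;
    if (o->type != NULL && append(buf, size, &used, " TYPE %s", o->type) != 0) return -1;
    if (o->slot >= 0 && append(buf, size, &used, " SLOT %d", o->slot) != 0) return -1;
    return 0;
}
```

With `slot` set to 1000 and the other options omitted, the buffer holds `CLUSTERSCAN 0 SLOT 1000` for cursor `0`, matching the third usage line.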

This closes #33

codecov Bot commented Dec 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.83%. Comparing base (223bfa8) to head (22d1121).
⚠️ Report is 54 commits behind head on unstable.

Additional details and impacted files
@@              Coverage Diff              @@
##           unstable    #2934       +/-   ##
=============================================
+ Coverage          0   74.83%   +74.83%     
=============================================
  Files             0      129      +129     
  Lines             0    71630    +71630     
=============================================
+ Hits              0    53602    +53602     
- Misses            0    18028    +18028     
| Files with missing lines | Coverage Δ |
|---|---|
| src/cluster.c | 92.17% <100.00%> (ø) |
| src/commands.def | 100.00% <ø> (ø) |
| src/db.c | 94.36% <100.00%> (ø) |
| src/dict.c | 78.17% <100.00%> (ø) |
| src/hashtable.c | 92.87% <100.00%> (ø) |
| src/server.h | 100.00% <ø> (ø) |
| src/t_hash.c | 95.39% <100.00%> (ø) |
| src/t_set.c | 98.05% <100.00%> (ø) |
| src/t_zset.c | 96.98% <100.00%> (ø) |
| src/util.c | 67.31% <100.00%> (ø) |

... and 119 files with indirect coverage changes

@nmvk nmvk force-pushed the clusterscan branch 5 times, most recently from b9f30db to 3bbe5d2 Compare December 22, 2025 02:28
@nmvk nmvk requested a review from madolson December 22, 2025 04:44
@nmvk nmvk marked this pull request as ready for review December 22, 2025 04:44
@nmvk nmvk force-pushed the clusterscan branch 3 times, most recently from 527be33 to 9c8b1de Compare January 8, 2026 00:33

Copilot AI left a comment

Contributor

Pull request overview

This PR implements the CLUSTERSCAN command to enable topology-aware scanning across cluster slots, addressing Issue #33. Unlike the node-local SCAN command, CLUSTERSCAN allows clients to scan across the entire cluster by automatically advancing through slots using cursor-encoded slot information.

Changes:

  • Implements CLUSTERSCAN command with cursor format version-{hashtag}-localcursor for slot-aware routing
  • Refactors CRC16 slot table from header to source file with accessor function clusterGetSlotHashtag()
  • Extends scanGenericCommand to support slot-specific scanning with custom cursor prefixes
  • Adds comprehensive test coverage for basic functionality, MATCH/COUNT/TYPE options, MOVED redirects, and SLOT argument

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| src/cluster.c | Core CLUSTERSCAN implementation including cursor parsing, slot validation, and redirect handling |
| src/db.c | Extended scanGenericCommand signature to support slot filtering and cursor prefixing for CLUSTERSCAN |
| src/crc16_slottable.c | Moved CRC16 slot hashtag table from header to source with accessor function |
| src/crc16_slottable.h | Simplified header to declare clusterGetSlotHashtag function |
| src/commands/clusterscan.json | Command metadata defining arguments, ACL categories, and reply schema |
| src/commands.def | Auto-generated command table entry |
| src/server.h | Added function declarations for clusterscanCommand and clusterscanGetKeys |
| src/module.c | Updated to use clusterGetSlotHashtag accessor |
| src/valkey-benchmark.c | Updated to use clusterGetSlotHashtag accessor |
| src/Makefile | Added crc16_slottable.o to build |
| cmake/Modules/SourceFiles.cmake | Added crc16_slottable.c to CMake build |
| tests/unit/cluster/clusterscan.tcl | Comprehensive test suite covering cursor validation, SLOT argument, redirects, and options |
| tests/support/cluster.tcl | Added clusterscan to plain commands list for cluster client |
| src/t_zset.c, src/t_set.c, src/t_hash.c | Updated scanGenericCommand calls with new parameters |
| .config/typos.toml | Updated to exclude new source file from spell checking |

@madolson added the major-decision-pending, release-notes, and needs-doc-pr labels Feb 9, 2026

madolson commented Feb 9, 2026

Member

Core meeting:

  1. Instead of using a version, let's hash all of the relevant bits of information together to generate a "unique" fingerprint for the memory layout. Today that includes basically the hash table type and the DB seed. We can always mix in "more" information later. If the "version" changes, instead of throwing an error, we can just restart the slot local cursor back to zero. We should document this behavior.
  2. Directionally approved.

@valkey-io/core-team Pinging y'all for the major decision. We discussed it today in the US/EU meeting, but I'm pinging other folks as well.
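
The first point of the core meeting notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: FNV-1a is a stand-in hash chosen for brevity, the one-byte `table_type` and 16-byte seed are simplified inputs, and `layout_fingerprint` and `resume_position` are hypothetical names. Mapping a computed value of 0 to 1 is also an assumption, made so that 0 stays free to mean "skip validation".

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit, continuing from state h. A stand-in hash for this sketch. */
static uint64_t fnv1a64(uint64_t h, const void *data, size_t len) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL; /* FNV prime */
    }
    return h;
}

/* Mixes every input that affects the memory layout into one value.
 * More inputs can be mixed in later by chaining further fnv1a64 calls. */
static uint64_t layout_fingerprint(uint8_t table_type, const uint8_t seed[16]) {
    uint64_t h = fnv1a64(0xcbf29ce484222325ULL, &table_type, 1); /* FNV offset basis */
    h = fnv1a64(h, seed, 16);
    return h == 0 ? 1 : h; /* assumption: keep 0 reserved for "no validation" */
}

/* Decides where a slot-local scan resumes.
 * A mismatch restarts the slot at 0 instead of raising an error. */
static uint64_t resume_position(uint64_t cursor_fp, uint64_t node_fp,
                                uint64_t local_cursor) {
    if (cursor_fp == 0) return local_cursor; /* cross-slot cursor: not validated */
    if (cursor_fp != node_fp) return 0;      /* layout changed: restart this slot */
    return local_cursor;
}
```

Restarting at 0 trades some repeated keys for never failing the client, which fits SCAN-style iteration where duplicates are already possible.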

@nmvk nmvk force-pushed the clusterscan branch 3 times, most recently from be9e960 to 27316db Compare February 16, 2026 07:57
NOT_KEY handles this

Signed-off-by: nmvk <r@nmvk.com>
Update tests/unit/cluster/clusterscan.tcl

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Raghav <r@nmvk.com>
@zuiderkwast added the run-extra-tests label Mar 6, 2026
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
@madolson added the to-be-merged and run-extra-tests labels and removed the run-extra-tests label Mar 6, 2026
@madolson madolson moved this to In Progress in Valkey 9.1 Mar 9, 2026
@madolson madolson merged commit 8a4a7b2 into valkey-io:unstable Mar 9, 2026
140 of 141 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Valkey 9.1 Mar 9, 2026

ranshid commented Mar 9, 2026

Member

Should we close #33?

lemire pushed a commit to lemire/valkey that referenced this pull request Mar 9, 2026
Signed-off-by: Daniel Lemire <daniel@lemire.me>
JimB123 added a commit to JimB123/valkey that referenced this pull request Mar 10, 2026
nmvk added a commit to nmvk/valkey-doc that referenced this pull request Mar 13, 2026
Documentation changes for CLUSTERSCAN

valkey-io/valkey#2934

Signed-off-by: nmvk <r@nmvk.com>
madolson added a commit to madolson/valkey that referenced this pull request Mar 15, 2026
- Fix release date to Mon Mar 18 2026
- Fix typos: duplicate 'load', 'keyes' -> 'keys', duplicate 'INFO'
- Remove reverted contributor (arshidkv12, valkey-io#3137)
- Add 7 new release-notes entries from upstream/unstable merge:
  CLUSTERSCAN (valkey-io#2934), MSETEX (valkey-io#3121), availability-zone (valkey-io#3156),
  stream range optimization (valkey-io#3002), RDB as AOF preamble (valkey-io#1901),
  unsigned 64-bit module config (valkey-io#1546), fast_float -> ffc (valkey-io#3329)

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
madolson pushed a commit to valkey-io/valkey-doc that referenced this pull request Mar 15, 2026
JimB123 pushed a commit that referenced this pull request Mar 19, 2026
zuiderkwast added a commit that referenced this pull request May 6, 2026
Instead of scanning one slot at a time `CLUSTERSCAN` now scans the
entire contiguous range of slots owned by the current node.

Implementation details

* Re-sharding safe as the hash slot is updated based on the local cursor
  position.
* Fingerprint remains stable across the entire contiguous slot range
  instead of being reset per slot.
* Parsing/validation of parameters for the SCAN commands is refactored
  and moved to a separate function.

```
   > CLUSTERSCAN 0
   "0-{06S}-0"                      # start at slot 0

   > CLUSTERSCAN 0-{06S}-0
   "aBcDeF-{06S}-48"                # scanning slot 0...

   > CLUSTERSCAN aBcDeF-{06S}-48
   "aBcDeF-{1Y7}-16"                # slot 0 done, continues to slot 6 (same node, so the fingerprint is unchanged)

   > CLUSTERSCAN aBcDeF-{1Y7}-16
   "aBcDeF-{0or}-32"                # slot 6 done, continues to slot 100 (same node, so the fingerprint is unchanged)
   ...
   > CLUSTERSCAN aBcDeF-{...}-64
   "0-{8YG}-0"                      # end of the node's contiguous slot range reached, so the cursor crosses to the next node
```

Follow-up of #2934
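
The slot-to-slot transitions in the transcript above can be modeled with a small decision function. This is a sketch of one plausible reading, not the merged code: it assumes that slots without keys are skipped (which would explain the jumps from slot 0 to 6 to 100), that the fingerprint is kept while the next slot is still inside the node's contiguous range, and that the cursor falls back to fingerprint 0 at the first slot past that range. The `next_step` struct, `after_slot_done` function, and the `owned`/`nonempty` arrays are hypothetical.

```c
#include <stdbool.h>

#define NUM_SLOTS 16384

/* Hypothetical result of finishing one slot. */
typedef struct {
    int slot;              /* slot the next call targets, or -1 when the scan is complete */
    bool keep_fingerprint; /* false means the returned cursor carries fingerprint 0 */
} next_step;

/* Called once slot `current` has been fully scanned.
 * owned[s] tells whether this node owns slot s; nonempty[s] whether it holds keys. */
static next_step after_slot_done(const bool *owned, const bool *nonempty, int current) {
    int s = current + 1;
    /* Stay on this node while the range is contiguous, skipping empty slots. */
    while (s < NUM_SLOTS && owned[s]) {
        if (nonempty[s]) return (next_step){s, true};
        s++;
    }
    if (s >= NUM_SLOTS) return (next_step){-1, false}; /* whole keyspace covered */
    return (next_step){s, false}; /* first slot past the range: cross-node cursor */
}
```

With a node owning slots 0 through 200 and keys only in slots 0, 6, and 100, the function yields 6, then 100, then slot 201 with the fingerprint dropped, mirroring the shape of the transcript.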

---------

Signed-off-by: nmvk <r@nmvk.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
lucasyonge pushed a commit that referenced this pull request May 11, 2026
lucasyonge pushed a commit that referenced this pull request May 12, 2026
lucasyonge pushed a commit that referenced this pull request May 12, 2026
This bug was introduced in #3366.

Before PR #3366, hash-seed config was applied directly via
hashtableSetHashFunctionSeed(), so clusterscanFingerprint() correctly
used hash_function_seed to derive the fingerprint.
```c
if (server.hash_seed != NULL) {
    memset(hashseed, 0, sizeof(hashseed));
    getHashSeedFromString(hashseed, sizeof(hashseed), server.hash_seed);
    hashtableSetHashFunctionSeed(hashseed);
}
```

PR #3366 introduced a separate configurable_hash_seed for data
hashtables and kept hash_function_seed as a random per-process value.
```c
/* Set the configured hash seed used by data hashtables (keys, sets, zsets,
 * hashes) or use the random seed if not configured. */
if (server.hash_seed) {
    uint8_t seed[16] = {0};
    getHashSeedFromString(seed, sizeof(seed), server.hash_seed);
    setConfigurableHashSeed(seed);
} else {
    setConfigurableHashSeed(hashtableGetHashFunctionSeed());
}
```

However, clusterscanFingerprint() was not updated accordingly — it
still reads hash_function_seed, which is now random on every node.
This makes fingerprints differ across nodes even when they share the
same hash-seed config, causing cursors to restart on failover.

CLUSTERSCAN was introduced in #2934.
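
The failure mode described above can be reproduced in miniature. The sketch below is a simplified model, not the code of `clusterscanFingerprint()`: the `node_seeds` struct, the FNV-1a stand-in hash, and `cursor_survives_failover` are all hypothetical. It compares a fingerprint derived from a per-process random seed with one derived from the shared configured seed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two seeds a node holds after PR #3366. */
typedef struct {
    uint8_t process_seed[16];    /* random per process, differs on every node */
    uint8_t configured_seed[16]; /* derived from the shared hash-seed config */
} node_seeds;

/* FNV-1a over a 16-byte seed; a stand-in for the real fingerprint hash. */
static uint64_t seed_fingerprint(const uint8_t seed[16]) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < 16; i++) {
        h ^= seed[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* A cursor issued by the old primary stays valid on the new primary only if
 * both nodes compute the same fingerprint. */
static bool cursor_survives_failover(const node_seeds *old_primary,
                                     const node_seeds *new_primary,
                                     bool use_configured_seed) {
    const uint8_t *a = use_configured_seed ? old_primary->configured_seed
                                           : old_primary->process_seed;
    const uint8_t *b = use_configured_seed ? new_primary->configured_seed
                                           : new_primary->process_seed;
    return seed_fingerprint(a) == seed_fingerprint(b);
}
```

Two nodes that share a configured seed but have different process seeds keep cursors valid only when the fingerprint is taken from the configured seed, which is the behavior the fix restores.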

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Jun 25, 2026
…3675)

No bug exists — all NOT_KEY commands (CLUSTERSCAN, SSUBSCRIBE,
SPUBLISH, SUNSUBSCRIBE) have NULL getkeys_proc, so doesCommandHaveKeys
returns 0 and ACL correctly skips key checks. This PR makes the
handling more explicit and defensive for future commands that may
combine getkeys_proc with NOT_KEY key-specs.

Changes:

- Refactor doesCommandHaveKeys() from a one-line ternary into clearer
  if-else branches. Add an explicit check: if a command has getkeys_proc
  but all its key-specs are NOT_KEY, treat it as having no real keys.

- Add a defensive NOT_KEY check in ACLSelectorCheckKey() to skip
  key-pattern ACL validation for NOT_KEY entries, consistent with the
  existing skip in getKeysUsingKeySpecs().

- Add tests: verify COMMAND GETKEYS/GETKEYSANDFLAGS report "no key
  arguments" for NOT_KEY commands; verify restricted ACL users can
  run CLUSTERSCAN without key-permission errors.
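
The branch structure described in the first change can be pictured with a simplified model. This is not the Valkey source: `sketch_command`, `sketch_key_spec`, `SKETCH_NOT_KEY`, and `sketch_command_has_keys` are hypothetical stand-ins, the real function may consult flags not modeled here, and treating a command with `getkeys_proc` but no key-specs at all as having keys is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NOT_KEY (1u << 0) /* hypothetical flag bit for a NOT_KEY key-spec */

typedef struct {
    unsigned flags;
} sketch_key_spec;

/* Hypothetical, heavily reduced command descriptor. */
typedef struct {
    bool has_getkeys_proc;
    const sketch_key_spec *key_specs;
    size_t key_specs_num;
} sketch_command;

/* True only if there is at least one key-spec and every one is NOT_KEY. */
static bool all_specs_not_key(const sketch_command *cmd) {
    if (cmd->key_specs_num == 0) return false;
    for (size_t i = 0; i < cmd->key_specs_num; i++) {
        if (!(cmd->key_specs[i].flags & SKETCH_NOT_KEY)) return false;
    }
    return true;
}

/* Explicit if-else form of the "does this command have real keys" decision. */
static bool sketch_command_has_keys(const sketch_command *cmd) {
    if (cmd->has_getkeys_proc) {
        /* New defensive case: a key extractor exists, but nothing it could
         * return is a real key. */
        if (all_specs_not_key(cmd)) return false;
        return true;
    }
    /* No extractor: the command has keys only if some key-spec is a real key. */
    for (size_t i = 0; i < cmd->key_specs_num; i++) {
        if (!(cmd->key_specs[i].flags & SKETCH_NOT_KEY)) return true;
    }
    return false;
}
```

In this model a CLUSTERSCAN-like command (no extractor, one NOT_KEY spec) reports no keys both before and after the refactor, consistent with the statement that no bug existed; only the hypothetical extractor-plus-NOT_KEY combination changes outcome.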

Related: #2934, #3699

Signed-off-by: Binbin <binloveplay1314@qq.com>
Labels

major-decision-approved, needs-doc-pr, release-notes, run-extra-tests, to-be-merged

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[NEW] Cluster-wide SCAN

6 participants