
Add multi-database support to cluster mode #1671

Merged
madolson merged 61 commits into
valkey-io:unstable from
xbasel:multidb
May 4, 2025

Conversation

@xbasel

@xbasel xbasel commented Feb 5, 2025

Member

cluster: add multi-database support in cluster mode

Add multi-database support in cluster mode to align with standalone mode
and facilitate migration. Previously, cluster mode was restricted to a
single database (DB0). This change allows multiple databases while
preserving the existing slot-based key distribution.

Key Features:

  • Database-Agnostic Hashing. The hashing algorithm is unchanged.
    Identical keys always map to the same slot across all databases,
    ensuring consistent key distribution and compatibility with
    existing single-database setups.
  • Multi-DB commands support. SELECT, MOVE, and COPY are now supported in
    cluster mode.
  • Fully backward compatible with no API changes.
  • SWAPDB is not supported in cluster mode. It is unsafe due to inconsistency risks.
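
To make the database-agnostic hashing point concrete, here is a minimal Python sketch of the cluster key-to-slot mapping (CRC16/XMODEM modulo 16384, with hash-tag handling). It is an illustration, not the server's C code, and the function names are assumptions chosen for this example. Note that the slot function takes no database argument, so the same key lands in the same slot in every database.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: bytes) -> int:
    """Map a key to one of 16384 slots; the database index plays no role."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty hash tag replaces the full key as hash input.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384


if __name__ == "__main__":
    for name in (b"foo", b"{foo}bar", b"user:1000"):
        print(name.decode(), key_hash_slot(name))
```

Because the database is not an input, a client that already caches the slot map needs no change to route a key that lives in a database other than DB0.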

Command-Level Changes:

  • SELECT / MOVE / COPY are now supported in cluster mode.
  • MOVE / COPY (with db) are rejected (TRYAGAIN error) during slot migration to prevent multi-DB inconsistencies.
  • SWAPDB will return an error if used when cluster mode is enabled.
  • GETKEYSINSLOT, COUNTKEYSINSLOT and MIGRATE operate in the context of the currently selected database.
    For example, migrating all keys in a slot now requires repeating the operation for every database.
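
As a rough illustration of the per-database scope, the sketch below counts the keys of one slot across all databases by issuing SELECT followed by CLUSTER COUNTKEYSINSLOT for each index. `FakeNode` is a hypothetical in-memory stand-in used so the example runs without a server, and the `databases` argument is assumed to match the node's configured database count.

```python
from typing import Dict, List


class FakeNode:
    """Hypothetical in-memory stand-in for one cluster node."""

    def __init__(self, keys: Dict[int, Dict[int, List[str]]]):
        self.keys = keys  # db index -> slot -> key names
        self.db = 0       # currently selected database

    def execute(self, *args):
        name = str(args[0]).upper()
        if name == "SELECT":
            self.db = int(args[1])
            return "OK"
        if name == "CLUSTER" and str(args[1]).upper() == "COUNTKEYSINSLOT":
            slot = int(args[2])
            # The count covers only the currently selected database.
            return len(self.keys.get(self.db, {}).get(slot, []))
        raise ValueError(f"unsupported command in this sketch: {args!r}")


def count_keys_in_slot(node, slot: int, databases: int,
                       original_db: int = 0) -> Dict[int, int]:
    """Return {db: key count} for one slot, visiting every database."""
    counts = {}
    for db in range(databases):
        node.execute("SELECT", db)
        counts[db] = node.execute("CLUSTER", "COUNTKEYSINSLOT", slot)
    node.execute("SELECT", original_db)  # leave the connection where it started
    return counts


if __name__ == "__main__":
    node = FakeNode({0: {5: ["a", "b"]}, 3: {5: ["c"]}})
    print(count_keys_in_slot(node, slot=5, databases=4))
```

A slot is empty only when the count is zero in every database, which is why a check against a single database is no longer sufficient.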

Slot Migration Process:

  • Multi-DB support in cluster mode affects slot migration. Operators should now iterate over all the configured databases.
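
The iteration described above can be sketched as a command plan. The helper below is hypothetical: it assumes the key names per database were already collected (for example with CLUSTER GETKEYSINSLOT after a SELECT), that the destination database index mirrors the source, and it omits the CLUSTER SETSLOT steps that bracket a real migration.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def plan_key_migration(keys_by_db: Dict[int, List[str]], host: str, port: int,
                       timeout_ms: int = 5000, batch: int = 2) -> List[Command]:
    """Build SELECT + MIGRATE commands that move keys database by database."""
    plan: List[Command] = []
    for db in sorted(keys_by_db):
        keys = keys_by_db[db]
        if not keys:
            continue  # nothing stored in this database for the slot
        plan.append(("SELECT", str(db)))
        for i in range(0, len(keys), batch):
            chunk = keys[i:i + batch]
            # MIGRATE host port "" destination-db timeout KEYS key [key ...]
            plan.append(("MIGRATE", host, str(port), "", str(db),
                         str(timeout_ms), "KEYS", *chunk))
    return plan


if __name__ == "__main__":
    for command in plan_key_migration({0: ["a", "b", "c"], 2: ["x"]},
                                      "10.0.0.2", 7001):
        print(" ".join(repr(part) if part == "" else part for part in command))
```

Keeping the plan as plain tuples makes it easy to inspect or log before anything is sent to a node.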

Transaction Handling (MULTI/EXEC):

  • getNodeByQuery key lookup behavior changed:
    • No key lookups when queuing commands in MULTI, only cross-slot
      validation.
    • Key lookups happen at EXEC time.
    • SELECT inside MULTI/EXEC is now checked, ensuring key validation
      uses the selected DB at lookup.
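
The effect of tracking SELECT inside a transaction can be modelled in a few lines. This is a simplified Python model, not the C logic in getNodeByQuery: the function name is hypothetical, and it assumes the first argument of every non-SELECT command is its key.

```python
from typing import List, Tuple


def lookups_at_exec(current_db: int,
                    queued: List[Tuple[str, ...]]) -> List[Tuple[int, str]]:
    """Return the (db, key) pairs a key lookup would use at EXEC time."""
    db = current_db
    lookups = []
    for name, *args in queued:
        if name.upper() == "SELECT" and args:
            db = int(args[0])  # later commands are validated against this DB
        elif args:
            lookups.append((db, args[0]))  # assumption: first argument is the key
    return lookups


if __name__ == "__main__":
    queued = [("SET", "a", "1"), ("SELECT", "3"), ("GET", "a")]
    print(lookups_at_exec(0, queued))
```

The same key name is checked in database 0 before the SELECT and in database 3 after it, which is the distinction the change introduces.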

Valkey-cli:

  • valkey-cli has been updated to support resharding across all databases.

Configuration:

  • Introduce a new configuration, cluster-databases.
    It controls the maximum number of databases in cluster mode.

Implements #1319

@codecov

codecov Bot commented Feb 5, 2025


Codecov Report

Attention: Patch coverage is 91.00000% with 9 lines in your changes missing coverage. Please review.

Project coverage is 70.84%. Comparing base (2d200df) to head (f2ed97c).
Report is 7 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/valkey-cli.c | 80.00% | 8 Missing ⚠️ |
| src/cluster.c | 96.87% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1671      +/-   ##
============================================
- Coverage     71.02%   70.84%   -0.18%     
============================================
  Files           123      123              
  Lines         66116    66173      +57     
============================================
- Hits          46956    46879      -77     
- Misses        19160    19294     +134     
| Files with missing lines | Coverage Δ |
|---|---|
| src/cluster_legacy.c | 86.78% <100.00%> (+0.37%) ⬆️ |
| src/config.c | 78.39% <ø> (-0.05%) ⬇️ |
| src/db.c | 89.99% <100.00%> (+0.42%) ⬆️ |
| src/server.c | 87.94% <100.00%> (+0.03%) ⬆️ |
| src/server.h | 100.00% <ø> (ø) |
| src/valkey-benchmark.c | 62.42% <100.00%> (+0.24%) ⬆️ |
| src/cluster.c | 90.24% <96.87%> (+0.21%) ⬆️ |
| src/valkey-cli.c | 54.60% <80.00%> (-1.32%) ⬇️ |

... and 13 files with indirect coverage changes


@xbasel xbasel marked this pull request as draft February 6, 2025 10:01
@xbasel xbasel marked this pull request as ready for review February 10, 2025 21:37
@xbasel xbasel requested a review from zuiderkwast February 10, 2025 22:13
Comment thread src/db.c
@soloestoy soloestoy requested review from soloestoy and removed request for zuiderkwast February 12, 2025 06:28
Comment thread src/cluster.c Outdated
@soloestoy

Member

I'm happy that we did "Unified db rehash method for both standalone and cluster #12848" when developing kvstore, which made the implementation of multi-database support simpler.

@ranshid ranshid added the release-notes This issue should get a line item in the release notes label Feb 17, 2025

@hpatro hpatro left a comment

Contributor

We need to add history entries to the SWAPDB, SELECT, and MOVE JSON files to indicate cluster-mode support since 9.0.

Comment thread src/cluster_legacy.c Outdated
Comment thread tests/support/cluster.tcl Outdated
Comment thread tests/cluster/tests/05-cluster-multidatabases.tcl Outdated
Comment thread tests/cluster/tests/05-cluster-multidatabases.tcl Outdated
Comment thread tests/unit/cluster/cli.tcl Outdated
Comment thread src/cluster.c Outdated
Comment thread src/db.c
Comment thread tests/unit/lazyfree.tcl Outdated
@ranshid ranshid added the client-changes-needed Client changes may be required for this feature label Feb 24, 2025
Comment thread src/db.c Outdated
Comment thread src/cluster.c
@xbasel

xbasel commented Mar 3, 2025

Member Author

documentation: valkey-io/valkey-doc#242

@xbasel xbasel requested a review from a team March 5, 2025 11:38
@hwware

hwware commented Mar 5, 2025

Contributor

It looks like some test cases related to the multi-DB feature are still failing. Please fix them first. Thanks.

@xbasel xbasel marked this pull request as draft March 5, 2025 18:36
@xbasel xbasel force-pushed the multidb branch 4 times, most recently from 538e23e to 63151ae Compare March 6, 2025 12:14
xbasel added a commit to xbasel/valkey that referenced this pull request May 6, 2025
Re-adds a statement to restore the `singledb` config that was
accidentally removed in PR valkey-io#1671.

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
zuiderkwast pushed a commit that referenced this pull request May 6, 2025
Re-adds a statement to restore the `singledb` config that was
accidentally removed in PR #1671.

Fixes #2049

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
madolson added a commit that referenced this pull request May 6, 2025
One of the new tests that was added uses `CONFIG GET PORT`, which isn't
the right one for TLS.

Also removed some other uses of the helper that aren't actually needed.

Introduced as part of #1671.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
rainsupreme pushed a commit to rainsupreme/valkey that referenced this pull request May 14, 2025
## cluster: add multi-database support in cluster mode


---------

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
gmbnomis added a commit to gmbnomis/valkey that referenced this pull request Jun 11, 2025
To support multiple databases in cluster mode (see valkey-io#1671),
`getNodeByQuery` temporarily switches databases when tracking `SELECT`
statements during slot migration/import. The intended logic is to
revert any database change after the operation. However, this approach
is flawed: in some transactions the database change is not properly
reverted, causing the client to remain on the wrong database.

For example, if a transaction includes `SELECT` statements, the current
database may be changed even if the transaction is never executed (see
added test).

Fix the issue by saving the original database once and
restoring to it after a switch.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
zuiderkwast added a commit that referenced this pull request Jun 16, 2025
To support multiple databases in cluster mode (see #1671),
`getNodeByQuery` temporarily switches databases when tracking `SELECT`
statements during slot migration/import. The intended logic is to revert
any database change after the operation. However, this approach is
flawed: in some transactions the database change is not properly
reverted, causing the client to remain on the wrong database.

For example, if a transaction includes `SELECT` statements, the current
database may be changed even if the transaction is never executed (see
added test).

Fix the issue by saving the original database once and restoring to it
after a switch.

---------

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Signed-off-by: Simon Baatz <gmbnomis@users.noreply.github.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
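
The fix described in this commit message (save the original database once, restore it after any switch) can be illustrated with a small Python model. This is not the C change itself: `Client` and `validate_transaction` are hypothetical names, and the `try`/`finally` stands in for whatever cleanup path the real code uses.

```python
from typing import List, Tuple


class Client:
    """Hypothetical client state: only the selected database matters here."""

    def __init__(self, db: int = 0):
        self.db = db


def validate_transaction(client: Client,
                         queued: List[Tuple[str, ...]]) -> List[int]:
    """Walk queued commands, following SELECT, then restore the original DB."""
    original_db = client.db  # saved exactly once, before any temporary switch
    visited = []
    try:
        for name, *args in queued:
            if name.upper() == "SELECT" and args:
                client.db = int(args[0])  # temporary switch for validation only
            visited.append(client.db)
    finally:
        client.db = original_db  # restored on every exit path, including errors
    return visited


if __name__ == "__main__":
    client = Client(db=2)
    queued = [("SELECT", "5"), ("GET", "k"), ("SELECT", "7"), ("GET", "k")]
    print(validate_transaction(client, queued), client.db)
```

Saving the starting database a single time avoids the failure mode described above, where several SELECT statements in one transaction left the client on an intermediate database.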
enjoy-binbin added a commit to enjoy-binbin/valkey that referenced this pull request Jul 29, 2025
The comment has been outdated since valkey-io#1671, update it.

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Aug 5, 2025
…wed (#2391)

The comment has been outdated since #1671, update it.

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit to enjoy-binbin/valkey that referenced this pull request Oct 11, 2025
The test was accidentally removed in PR valkey-io#1671.

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Oct 14, 2025
The test was accidentally removed in PR #1671.

Signed-off-by: Binbin <binloveplay1314@qq.com>

Labels

client-changes-needed: Client changes may be required for this feature
needs-doc-pr: This change needs to update a documentation page. Remove label once doc PR is open.
release-notes: This issue should get a line item in the release notes

Projects

Status: Done
