Skip to content

Generate a new shard_id when the replica executes CLUSTER RESET SOFT#2283

Merged
hpatro merged 6 commits into
valkey-io:unstablefrom
enjoy-binbin:replica_reset_shardid
Jul 3, 2025
Merged

Generate a new shard_id when the replica executes CLUSTER RESET SOFT#2283
hpatro merged 6 commits into
valkey-io:unstablefrom
enjoy-binbin:replica_reset_shardid

Conversation

@enjoy-binbin

@enjoy-binbin enjoy-binbin commented Jun 28, 2025

Copy link
Copy Markdown
Member

When a replica doing the CLUSTER RESET SOFT, we should generate a
new shard_id for it. Since if the node was a replica, it inherits
the shard_id of its primary, and after the CLUSTER RESET SOFT, the
replica become a new empty primary, it should has its own shard_id.

Otherwise, when it joins back into the cluster or responds the PONG,
we will have two primaries in the same shard.

When a replica doing the CLUSTER RESET SOFT, we should generate a
new shard_id for it. Since if the node was a replica, it inherits
the shard_id of its primary, and after the CLUSTER RESET SOFT, the
replica become a new empty primary, it should has its own shard_id.

Otherwise, when it joins back into the cluster or responds the PONG,
we will have two primaries in the same shard.

Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin enjoy-binbin requested a review from hpatro June 28, 2025 15:01
@enjoy-binbin enjoy-binbin added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jun 28, 2025
Comment thread src/cluster_legacy.c Outdated
Signed-off-by: Binbin <binloveplay1314@qq.com>
@codecov

codecov Bot commented Jun 28, 2025

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.46%. Comparing base (0b0b813) to head (e8f2d6c).
⚠️ Report is 172 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2283      +/-   ##
============================================
- Coverage     71.52%   71.46%   -0.06%     
============================================
  Files           123      123              
  Lines         66926    67032     +106     
============================================
+ Hits          47868    47905      +37     
- Misses        19058    19127      +69     
Files with missing lines Coverage Δ
src/cluster_legacy.c 86.76% <100.00%> (-0.14%) ⬇️

... and 22 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@hpatro hpatro left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the change around shard id generation. But would like the extension flag marking part left as is.

Comment thread src/cluster_legacy.h Outdated
Comment thread src/cluster_legacy.c
Signed-off-by: Binbin <binloveplay1314@qq.com>
Comment thread src/cluster_legacy.c Outdated
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin

Copy link
Copy Markdown
Member Author

@hpatro and I discussed and decided to temporarily remove link->support_extension for now. After this merge, we will discuss it in another new PR.

The new link->extension is aim for this case:

  1. Node B cluster reset soft, and remove all the nodes in its cluster_nodes_dict
  2. Node A sends the PING, and Node B will reply a PONG even it does not know A, and in here, since Node B does not know Node A, it won't send the extension. But the link has mf_flags that we all know Node A can handle extension. So in these packets, Node A will know B change the role but not able to learn the shard-id

That is something like that. So if we don't have link->extension, Node B needs to wait until the cluster meets (I remember we have a compensation mechanism that will allow node B to eventually meet back into the cluster.) again before sending the extension

with link->support_extension time:

Testing unit/cluster/shardid-propagation
[ok]: The replica will have a new shard_id after cluster reset soft (65 ms)
[ok]: Check for memory leaks (pid 51984) (587 ms)
[ok]: Check for memory leaks (pid 51954) (312 ms)
[ok]: Check for memory leaks (pid 51924) (313 ms)
[ok]: Check for memory leaks (pid 51881) (293 ms)
[1/1 done]: unit/cluster/shardid-propagation (13 seconds)

without the link->support_extension

[ok]: The replica will have a new shard_id after cluster reset soft (3102 ms)
[ok]: Check for memory leaks (pid 54164) (301 ms)
[ok]: Check for memory leaks (pid 54127) (291 ms)
[ok]: Check for memory leaks (pid 54103) (294 ms)
[ok]: Check for memory leaks (pid 54050) (292 ms)
[1/1 done]: unit/cluster/shardid-propagation (16 seconds)

@hpatro hpatro merged commit e279b00 into valkey-io:unstable Jul 3, 2025
60 of 61 checks passed
@enjoy-binbin enjoy-binbin deleted the replica_reset_shardid branch July 4, 2025 01:54
@enjoy-binbin enjoy-binbin added the release-notes This issue should get a line item in the release notes label Jul 4, 2025
@enjoy-binbin enjoy-binbin moved this to Done in Valkey 9.0 Jul 4, 2025
enjoy-binbin added a commit that referenced this pull request Jul 16, 2025
… faster during handshake (#2310)

Add support_extension to cluster link, so that when myself receives a message from sender,
even if myself does not know the sender, for example during handshake the sender node is
still treated as random node, in thi case, the packet we reply to sender can also carry the
extension, since the sender link explicitly indicates that it supports the extension, by doing
this, we can speed up the spread of the extension.

Also see #2283 and #2279 for more details.
enjoy-binbin added a commit to enjoy-binbin/valkey that referenced this pull request Feb 12, 2026
When a cluster reset is performed on a replica node, a new shard ID is generated
because the node is about to become an empty primary node, see valkey-io#2283.

However, the log added in valkey-io#2510 caused some confusions. In clusterSetNodeAsPrimary
we will print:
```
serverLog(LL_NOTICE, "Reconfiguring node %.40s (%s) as primary for shard %.40s", n->name, humanNodename(n), n->shard_id);
```

In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new
shard ID, which causes us to print an error shard ID log first.

There is an exmaple, when a replica node performs a cluster reset, we will print:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Reconfiguring node af76a3e0ffcd77bd14fa47ce4d07ab2bdc78702f (xxx) as primary for shard ea528667634af8beed83adac2b9af8360769a1b4
```

But the node shard id is actually:
```
xxx> cluster myshardid
"52ede26d1554dd203161ba09011af14574b2cc84"
```

Now after a new shard ID is generated we will print a log, and we also move the
call to clusterSetNodeAsPrimary after the new shard id, so that we can have the
right one. After this PR:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Moving myself to a new shard bd31870ce73f5977084e6a46e337a4a1ad38fc66.
xxx * Reconfiguring node 1d54b904efd30cd9d7d1abbfd63c8fafbb62e1c8 (xxx) as primary for shard bd31870ce73f5977084e6a46e337a4a1ad38fc66
```

This is part of valkey-io#2989, but i guess we won't merge the extension fix in a short
time, so i am gonna extracting it separately as a log fix (or improvement).

Signed-off-by: Binbin <binloveplay1314@qq.com>
enjoy-binbin added a commit that referenced this pull request Feb 24, 2026
When a cluster reset is performed on a replica node, a new shard ID is generated
because the node is about to become an empty primary node, see #2283.

However, the log added in #2510 caused some confusions. In clusterSetNodeAsPrimary
we will print:
```
serverLog(LL_NOTICE, "Reconfiguring node %.40s (%s) as primary for shard %.40s", n->name, humanNodename(n), n->shard_id);
```

In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new
shard ID, which causes us to print an error shard ID log first.

There is an exmaple, when a replica node performs a cluster reset, we will print:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Reconfiguring node af76a3e0ffcd77bd14fa47ce4d07ab2bdc78702f (xxx) as primary for shard ea528667634af8beed83adac2b9af8360769a1b4
```

But the node shard id is actually:
```
xxx> cluster myshardid
"52ede26d1554dd203161ba09011af14574b2cc84"
```

Now after a new shard ID is generated we will print a log, and we also move the
call to clusterSetNodeAsPrimary after the new shard id, so that we can have the
right one. After this PR:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Moving myself to a new shard bd31870ce73f5977084e6a46e337a4a1ad38fc66.
xxx * Reconfiguring node 1d54b904efd30cd9d7d1abbfd63c8fafbb62e1c8 (xxx) as primary for shard bd31870ce73f5977084e6a46e337a4a1ad38fc66
```

This is part of #2989, but i guess we won't merge the extension fix in a short
time, so i am gonna extracting it separately as a log fix (or improvement).

Signed-off-by: Binbin <binloveplay1314@qq.com>
hpatro pushed a commit to hpatro/valkey that referenced this pull request Mar 5, 2026
…3192)

When a cluster reset is performed on a replica node, a new shard ID is generated
because the node is about to become an empty primary node, see valkey-io#2283.

However, the log added in valkey-io#2510 caused some confusions. In clusterSetNodeAsPrimary
we will print:
```
serverLog(LL_NOTICE, "Reconfiguring node %.40s (%s) as primary for shard %.40s", n->name, humanNodename(n), n->shard_id);
```

In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new
shard ID, which causes us to print an error shard ID log first.

There is an exmaple, when a replica node performs a cluster reset, we will print:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Reconfiguring node af76a3e0ffcd77bd14fa47ce4d07ab2bdc78702f (xxx) as primary for shard ea528667634af8beed83adac2b9af8360769a1b4
```

But the node shard id is actually:
```
xxx> cluster myshardid
"52ede26d1554dd203161ba09011af14574b2cc84"
```

Now after a new shard ID is generated we will print a log, and we also move the
call to clusterSetNodeAsPrimary after the new shard id, so that we can have the
right one. After this PR:
```
xxx * Cluster reset (user request from 'xxx').
xxx * Moving myself to a new shard bd31870ce73f5977084e6a46e337a4a1ad38fc66.
xxx * Reconfiguring node 1d54b904efd30cd9d7d1abbfd63c8fafbb62e1c8 (xxx) as primary for shard bd31870ce73f5977084e6a46e337a4a1ad38fc66
```

This is part of valkey-io#2989, but i guess we won't merge the extension fix in a short
time, so i am gonna extracting it separately as a log fix (or improvement).

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cluster release-notes This issue should get a line item in the release notes run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants