Logging fix or improvement around new shard ID generation#3192
Merged
Conversation
When a cluster reset is performed on a replica node, a new shard ID is generated because the node is about to become an empty primary node, see valkey-io#2283. However, the log added in valkey-io#2510 caused some confusions. In clusterSetNodeAsPrimary we will print: ``` serverLog(LL_NOTICE, "Reconfiguring node %.40s (%s) as primary for shard %.40s", n->name, humanNodename(n), n->shard_id); ``` In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new shard ID, which causes us to print an error shard ID log first. There is an exmaple, when a replica node performs a cluster reset, we will print: ``` xxx * Cluster reset (user request from 'xxx'). xxx * Reconfiguring node af76a3e0ffcd77bd14fa47ce4d07ab2bdc78702f (xxx) as primary for shard ea528667634af8beed83adac2b9af8360769a1b4 ``` But the node shard id is actually: ``` xxx> cluster myshardid "52ede26d1554dd203161ba09011af14574b2cc84" ``` Now after a new shard ID is generated we will print a log, and we also move the call to clusterSetNodeAsPrimary after the new shard id, so that we can have the right one. After this PR: ``` xxx * Cluster reset (user request from 'xxx'). xxx * Moving myself to a new shard bd31870ce73f5977084e6a46e337a4a1ad38fc66. xxx * Reconfiguring node 1d54b904efd30cd9d7d1abbfd63c8fafbb62e1c8 (xxx) as primary for shard bd31870ce73f5977084e6a46e337a4a1ad38fc66 ``` This is part of valkey-io#2989, but i guess we won't merge the extension fix in a short time, so i am gonna extracting it separately as a log fix (or improvement). Signed-off-by: Binbin <binloveplay1314@qq.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3192 +/- ##
============================================
+ Coverage 74.87% 74.92% +0.05%
============================================
Files 129 129
Lines 71328 71332 +4
============================================
+ Hits 53407 53449 +42
+ Misses 17921 17883 -38
🚀 New features to boost your workflow:
|
dvkashapov
approved these changes
Feb 12, 2026
Nikhil-Manglore
approved these changes
Feb 13, 2026
hpatro
pushed a commit
to hpatro/valkey
that referenced
this pull request
Mar 5, 2026
…3192) When a cluster reset is performed on a replica node, a new shard ID is generated because the node is about to become an empty primary node, see valkey-io#2283. However, the log added in valkey-io#2510 caused some confusions. In clusterSetNodeAsPrimary we will print: ``` serverLog(LL_NOTICE, "Reconfiguring node %.40s (%s) as primary for shard %.40s", n->name, humanNodename(n), n->shard_id); ``` In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new shard ID, which causes us to print an error shard ID log first. There is an exmaple, when a replica node performs a cluster reset, we will print: ``` xxx * Cluster reset (user request from 'xxx'). xxx * Reconfiguring node af76a3e0ffcd77bd14fa47ce4d07ab2bdc78702f (xxx) as primary for shard ea528667634af8beed83adac2b9af8360769a1b4 ``` But the node shard id is actually: ``` xxx> cluster myshardid "52ede26d1554dd203161ba09011af14574b2cc84" ``` Now after a new shard ID is generated we will print a log, and we also move the call to clusterSetNodeAsPrimary after the new shard id, so that we can have the right one. After this PR: ``` xxx * Cluster reset (user request from 'xxx'). xxx * Moving myself to a new shard bd31870ce73f5977084e6a46e337a4a1ad38fc66. xxx * Reconfiguring node 1d54b904efd30cd9d7d1abbfd63c8fafbb62e1c8 (xxx) as primary for shard bd31870ce73f5977084e6a46e337a4a1ad38fc66 ``` This is part of valkey-io#2989, but i guess we won't merge the extension fix in a short time, so i am gonna extracting it separately as a log fix (or improvement). Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When a cluster reset is performed on a replica node, a new shard ID is generated
because the node is about to become an empty primary node, see #2283.
However, the log added in #2510 caused some confusions. In clusterSetNodeAsPrimary
we will print:
In clusterReset, we first call clusterSetNodeAsPrimary and then generate a new
shard ID, which causes us to print an error shard ID log first.
There is an exmaple, when a replica node performs a cluster reset, we will print:
But the node shard id is actually:
Now after a new shard ID is generated we will print a log, and we also move the
call to clusterSetNodeAsPrimary after the new shard id, so that we can have the
right one. After this PR:
This is part of #2989, but i guess we won't merge the extension fix in a short
time, so i am gonna extracting it separately as a log fix (or improvement).