The smaller config epoch primary will become the replica when two primaries in the same shard by enjoy-binbin · Pull Request #2279 · valkey-io/valkey

enjoy-binbin · 2025-06-26T16:33:27Z

In #2261, there is a scenario that results in two primaries in a shard.

Let's say a shard has a primary(A) and a replica(B). Also let's assume that
cluster-allow-replica-migration is disabled.

Node A goes down, and Node B takes over primaryship.
Node A continues to be down while another Node C is added as a replica of B.
Node B goes down, and Node C takes over primaryship.
Node A and Node B come back up and start learning about the topology.
Node A comes up thinking it was the primary (but has an older config epoch compared to C).
Node A learns about Node C via gossip and assigns it a random shard_id.
Node A receives a direct ping from Node C.
a. Node C advertises the same set of slots that Node A was earlier owning.
b. Since Node A assigns a random shard ID to Node C, Node A thinks that it is still a
primary and it lost all its slots to Node C, which is in another shard.
Node A then updates the actual shard_id of Node C while processing shard_id in ping extensions.
Node A and Node C end up being primaries in the same shard while Node C continues to own slots.

So in the step 8, Node A and Node C has the same shard_id and both report themselves
as a primary, but Node C has a bigger config epoch, i guess in this case, we can try
to make Node A become a replica of Node C.

Fixes #2261.

An ASCII seq diagram:

+------+          +------+          +------+
|Node A|          |Node B|          |Node C|
+------+          +------+          +------+
   |                 |                 |
   | <Primary>       | <Replica>       |
   |                 |                 |
   X (fails)         |                 |
                     |---------------->|
                     |  (promoted to primary)
                     |                 |
                     X (fails)         |
                                       |
                                       |---------->
                                       | (promoted to primary)
   | (rejoins)                         |
   | (no extensions shared)            |
   |---------------------------------->|
   | (assigns random shard_id)|
   |<----------------------------------|
   | (direct ping: slot info)          |
   | Updates C’s shard_id correctly    |
   | Detects slot conflict             |
   |                                  
   |                                  
+-----------------------------------------------+
| Outcome: Nodes A & C both see themselves as   |
| primary temporarily until shard_id is updated |
| and converged correctly based on epoch value  |
+-----------------------------------------------+

…maries in the same shard Signed-off-by: Binbin <binloveplay1314@qq.com>

hpatro

Overall, the fix looks good to me.

Liked the GPT generated (slightly modified) ASCII seq diagram for this, pasting it here for others as well.

+------+          +------+          +------+
|Node A|          |Node B|          |Node C|
+------+          +------+          +------+
   |                 |                 |
   | <Primary>       | <Replica>       |
   |                 |                 |
   X (fails)         |                 |
                     |---------------->|
                     |  (promoted to primary)
                     |                 |
                     X (fails)         |
                                       |
                                       |---------->
                                       | (promoted to primary)
   | (rejoins)                         |
   | (no extensions shared)            |
   |---------------------------------->|
   | (assigns random shard_id)|
   |<----------------------------------|
   | (direct ping: slot info)          |
   | Updates C’s shard_id correctly    |
   | Detects slot conflict             |
   |                                  
   |                                  
+-----------------------------------------------+
| Outcome: Nodes A & C both see themselves as   |
| primary temporarily until shard_id is updated |
| and converged correctly based on epoch value  |
+-----------------------------------------------+

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin · 2025-06-28T04:24:17Z

I see the test New replica inherits multiple migrating slots is failing:

47461:M 28 Jun 2025 12:18:24.199 * Two primaries in same shard, and the sender has a greater config epoch. Reconfiguring myself as a replica of 8af67730a16d4a47a947271cca28a03ef9fdf337 ()
cluster_legacy.c:5995 'myself->numslots == 0' is not true

myself cluster nodes:

d2081b715fb4b174127ec7a99de4e180de345fbd 127.0.0.1:21120@31120,,tls-port=0,shard-id=9d3ebac222cc295191331bd3dbba06e607b42019 master - 0 1751084304199 3 connected 10923-16383
ee0d3a2618bf64993ce1e71569f845aa6e33f46b 127.0.0.1:21121@31121,,tls-port=0,shard-id=d3f24d198dac6233c89454ed22f1a3a85b1e0e9f master - 0 1751084304199 2 connected 5462-10922
8af67730a16d4a47a947271cca28a03ef9fdf337 127.0.0.1:21119@31119,,tls-port=0,shard-id=d5ada9160b2d0a5196dcdfe0325f6accf187dec0 master - 0 1751084304199 4 connected
21fe04bcaaf3eb6dfb6217e27b3100dccb0993e4 127.0.0.1:21118@31118,,tls-port=0,shard-id=d3f24d198dac6233c89454ed22f1a3a85b1e0e9f slave ee0d3a2618bf64993ce1e71569f845aa6e33f46b 0 1751084304199 2 connected
1497ec4b6d1ecdc419741306f8d1ae44d9e8fc9d 127.0.0.1:21117@31117,,tls-port=0,shard-id=9d3ebac222cc295191331bd3dbba06e607b42019 slave d2081b715fb4b174127ec7a99de4e180de345fbd 0 1751084304199 3 connected
7c0a9f9b0610f3a3c4d895282ed41f51f9f97a5d 127.0.0.1:21122@31122,,tls-port=0,shard-id=d5ada9160b2d0a5196dcdfe0325f6accf187dec0 myself,master - 0 0 1 connected 0-5461 [7->-ee0d3a2618bf64993ce1e71569f845aa6e33f46b] [13->-ee0d3a2618bf64993ce1e71569f845aa6e33f46b] [17->-ee0d3a2618bf64993ce1e71569f845aa6e33f46b]
```

I will take a look later this day.

enjoy-binbin · 2025-06-28T13:11:50Z

The problem seems to be, a primary used to had a larger config epoch, and then it became a replica, and then it update the shard_id, and then it doing the cluster reset soft and became a empty primary, but remain the larger config epoch and its shard_id, and causing this problem. IT is also the same issue, two primaries in the same shard, but the empty primary had a larger config epoch.

See ##2283. I think when a replica doing the CLUSTER RESET SOFT, we should generate a new shard_id for it.

zuiderkwast · 2025-07-04T22:46:23Z

and then it doing the cluster reset soft and became a empty primary, but remain the larger config epoch and its shard_id, and causing this problem

This test case doesn's use CLUSTER RESET SOFT. ❓

zuiderkwast

LGTM in general.

enjoy-binbin · 2025-07-05T03:01:28Z

This test case doesn's use CLUSTER RESET SOFT. ❓

This is the other test case, i handle it in #2283

    test "New replica inherits multiple migrating slots" {
        # Reset R3 to turn it into an empty node
        assert_equal {OK} [R 3 CLUSTER RESET]
        # Add R3 back as a replica of R0
        assert_equal {OK} [R 3 CLUSTER MEET [srv 0 "host"] [srv 0 "port"]]
        wait_for_role 0 master
        assert_equal {OK} [R 3 CLUSTER REPLICATE $R0_id]
        wait_for_role 3 slave
        # Validate final states
        wait_for_slot_state 3 "\[7->-$R1_id\] \[13->-$R1_id\] \[17->-$R1_id\]"
    }

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin · 2025-07-05T05:00:40Z

+        /* If after processing everything, we find that myself and sender are on the
+         * same shard and are both primaries, if myself config epoch is smaller, make
+         * it a replica. */
+        if (sender && nodeIsPrimary(myself) && nodeIsPrimary(sender) && areInSameShard(myself, sender) &&
+            nodeEpoch(sender) > nodeEpoch(myself)) {
+            clusterHandlePrimariesSameShardCollision(sender);
+        }


The new test added in ##2283 now currently will fail in here:

86095:M 05 Jul 2025 12:57:44.202 # === ASSERTION FAILED === 86095:M 05 Jul 2025 12:57:44.202 # ==> cluster_legacy.c:6020 'myself->numslots == 0' is not true ------ STACK TRACE ------ Backtrace: 0 valkey-server 0x000000010227faac updateShardId + 0 1 valkey-server 0x000000010227d12c clusterReadHandler + 7736 2 valkey-server 0x00000001022f8040 connSocketEventHandler + 180 3 valkey-server 0x00000001021bedd8 aeProcessEvents + 372 4 valkey-server 0x00000001021e8f50 main + 26864 5 dyld 0x0000000180b13154 start + 2476

That is because we drop the link->support_extension in here #2283 (comment)

The shard_id was not updated in time, which led to a misjudgment. So we either need to support link->support_extension or add node->numslots judgment here.

Signed-off-by: Binbin <binloveplay1314@qq.com>

… faster during handshake (#2310) Add support_extension to cluster link, so that when myself receives a message from sender, even if myself does not know the sender, for example during handshake the sender node is still treated as random node, in thi case, the packet we reply to sender can also carry the extension, since the sender link explicitly indicates that it supports the extension, by doing this, we can speed up the spread of the extension. Also see #2283 and #2279 for more details.

Signed-off-by: Binbin <binloveplay1314@qq.com>

codecov · 2025-07-16T04:47:25Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.46%. Comparing base (78e7208) to head (9c749e6).
Report is 7 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2279      +/-   ##
============================================
+ Coverage     71.42%   71.46%   +0.03%     
============================================
  Files           123      123              
  Lines         67135    67147      +12     
============================================
+ Hits          47952    47987      +35     
+ Misses        19183    19160      -23

Files with missing lines	Coverage Δ
src/cluster_legacy.c	`86.94% <100.00%> (-0.16%)`	⬇️

... and 23 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Binbin <binloveplay1314@qq.com>

The assert was added in valkey-io#2301 and we found that there are some situations would trigger assert and crash the server. The reason we added the assert is because, in the code: 1. sender_claimed_primary and sender are in the same shard 2. and sender is the old primary, sender_claimed_primary is the old replica 3. and now sender become a replica, sender_claimed_primary become a primary That means a failover happend in the shard, and sender should be the primary of sender_claimed_primary. But obviously this assumption may be wrong, we rely on shard_id to determine whether it is in a same shard, and assume that a shard can only have one primary. But this is wrong, from valkey-io#2279 we can know there will be a case that we can create two primaries in the same shard due to the untimely update of shard_id. So we can create a test that trigger the assert in this way: 1. pre condition: two primaries in the same shard, one has slots and one is empty. 2. replica doing a cluster failover 3. the empty primary doing a cluster replicate with the replica (new primary) We change the assert to an if condition to fix it. Signed-off-by: Binbin <binloveplay1314@qq.com>

…2431) The assert was added in #2301 and we found that there are some situations would trigger assert and crash the server. The reason we added the assert is because, in the code: 1. sender_claimed_primary and sender are in the same shard 2. and sender is the old primary, sender_claimed_primary is the old replica 3. and now sender become a replica, sender_claimed_primary become a primary That means a failover happend in the shard, and sender should be the primary of sender_claimed_primary. But obviously this assumption may be wrong, we rely on shard_id to determine whether it is in a same shard, and assume that a shard can only have one primary. But this is wrong, from #2279 we can know there will be a case that we can create two primaries in the same shard due to the untimely update of shard_id. So we can create a test that trigger the assert in this way: 1. pre condition: two primaries in the same shard, one has slots and one is empty. 2. replica doing a cluster failover 3. the empty primary doing a cluster replicate with the replica (new primary) We change the assert to an if condition to fix it. Closes #2423. Note that the test written here also exposes the issue in #2441, so these two may need to be addressed together. Signed-off-by: Binbin <binloveplay1314@qq.com>

…alkey-io#2431) The assert was added in valkey-io#2301 and we found that there are some situations would trigger assert and crash the server. The reason we added the assert is because, in the code: 1. sender_claimed_primary and sender are in the same shard 2. and sender is the old primary, sender_claimed_primary is the old replica 3. and now sender become a replica, sender_claimed_primary become a primary That means a failover happend in the shard, and sender should be the primary of sender_claimed_primary. But obviously this assumption may be wrong, we rely on shard_id to determine whether it is in a same shard, and assume that a shard can only have one primary. But this is wrong, from valkey-io#2279 we can know there will be a case that we can create two primaries in the same shard due to the untimely update of shard_id. So we can create a test that trigger the assert in this way: 1. pre condition: two primaries in the same shard, one has slots and one is empty. 2. replica doing a cluster failover 3. the empty primary doing a cluster replicate with the replica (new primary) We change the assert to an if condition to fix it. Closes valkey-io#2423. Note that the test written here also exposes the issue in valkey-io#2441, so these two may need to be addressed together. Signed-off-by: Binbin <binloveplay1314@qq.com>

…alkey-io#2431) The assert was added in valkey-io#2301 and we found that there are some situations would trigger assert and crash the server. The reason we added the assert is because, in the code: 1. sender_claimed_primary and sender are in the same shard 2. and sender is the old primary, sender_claimed_primary is the old replica 3. and now sender become a replica, sender_claimed_primary become a primary That means a failover happend in the shard, and sender should be the primary of sender_claimed_primary. But obviously this assumption may be wrong, we rely on shard_id to determine whether it is in a same shard, and assume that a shard can only have one primary. But this is wrong, from valkey-io#2279 we can know there will be a case that we can create two primaries in the same shard due to the untimely update of shard_id. So we can create a test that trigger the assert in this way: 1. pre condition: two primaries in the same shard, one has slots and one is empty. 2. replica doing a cluster failover 3. the empty primary doing a cluster replicate with the replica (new primary) We change the assert to an if condition to fix it. Closes valkey-io#2423. Note that the test written here also exposes the issue in valkey-io#2441, so these two may need to be addressed together. Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>

hpatro · 2025-10-10T18:38:23Z

A bit too late to include this change to 9.0 GA. Let's target it for 9.1

coderabbitai · 2026-05-28T08:11:47Z

📝 Walkthrough

Walkthrough

This PR adds automatic collision resolution when two primary nodes are detected in the same shard. A new handler reconfigures the local node as a replica when the sender has a higher config epoch, and an integration test validates the scenario across multiple failovers and reconnections.

Changes

Primary Collision Resolution

Layer / File(s)	Summary
Collision handler implementation and integration `src/cluster_legacy.c`	New `clusterHandlePrimariesSameShardCollision()` logs and reconfigures the local node as the sender's replica. Integrated into message processing after gossip and ping extension handling, with context comment on ping extension timing.
Collision scenario test `tests/unit/cluster/shardid-propagation.tcl`	Integration test drives topology through isolation, forced failovers, and reconnection control to create two primaries in the same shard, then asserts both nodes transition to replica roles.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

zuiderkwast

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: handling the scenario where two primaries exist in the same shard by making the smaller config epoch primary become a replica.
Description check	✅ Passed	The description is directly related to the changeset, providing detailed context about issue `#2261`, the reproduction scenario, and the proposed solution with an ASCII diagram.
Linked Issues check	✅ Passed	The PR implements the core requirement from `#2261`: when two primaries are detected in the same shard, the node with the smaller config epoch becomes a replica of the node with the larger epoch.
Out of Scope Changes check	✅ Passed	All changes are in-scope: the new collision handler addresses `#2261`'s primary issue, the test validates the fix, and a comment clarification does not alter runtime behavior.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cluster_legacy.c`:
- Around line 4416-4422: This change in cluster_legacy.c modifies shard
collision/failover behavior around the block that checks nodeIsPrimary(myself),
nodeIsPrimary(sender), areInSameShard(myself, sender) and calls
clusterHandlePrimariesSameShardCollision(sender); before merging, update the PR
description and add an explicit request for an `@core-team` architectural review
referencing the changed logic and the symbols
clusterHandlePrimariesSameShardCollision, nodeIsPrimary, areInSameShard, and
nodeEpoch so core architects can evaluate failover semantics; ensure the PR
summary explains the behavioral impact and links to this diff for clear review.
- Around line 2469-2475: The function clusterHandlePrimariesSameShardCollision
is a helper used only within this translation unit and should have internal
linkage; make it file-local by adding the static keyword to its declaration
(change the definition of clusterHandlePrimariesSameShardCollision to be static
void clusterHandlePrimariesSameShardCollision(clusterNode *sender)) so the
symbol is not exported, leaving the body and calls to clusterSetPrimary and
clusterDoBeforeSleep unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d792cf65-87a4-4fe6-8f47-c4bd699fa8af

📥 Commits

Reviewing files that changed from the base of the PR and between 7c3821f and 76f3a4e.

📒 Files selected for processing (2)

src/cluster_legacy.c
tests/unit/cluster/shardid-propagation.tcl

coderabbitai · 2026-05-28T08:17:43Z

+void clusterHandlePrimariesSameShardCollision(clusterNode *sender) {
+    serverLog(LL_NOTICE, "Two primaries in same shard, and the sender has a greater config epoch. "
+                         "Reconfiguring myself as a replica of %.40s (%s)",
+              sender->name, sender->human_nodename);
+    clusterSetPrimary(sender, 1, 0);
+    clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG | CLUSTER_TODO_UPDATE_STATE | CLUSTER_TODO_FSYNC_CONFIG);
+}


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Make the collision handler file-local with static.

Line 2469 introduces a helper that is only used inside src/cluster_legacy.c; it should be declared static to keep linkage local.

Proposed fix

-void clusterHandlePrimariesSameShardCollision(clusterNode *sender) { +static void clusterHandlePrimariesSameShardCollision(clusterNode *sender) {

As per coding guidelines: "Use static keyword for file-local functions in C code".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

void clusterHandlePrimariesSameShardCollision(clusterNode *sender) {

serverLog(LL_NOTICE, "Two primaries in same shard, and the sender has a greater config epoch. "

"Reconfiguring myself as a replica of %.40s (%s)",

sender->name, sender->human_nodename);

clusterSetPrimary(sender, 1, 0);

clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG | CLUSTER_TODO_UPDATE_STATE | CLUSTER_TODO_FSYNC_CONFIG);

}

static void clusterHandlePrimariesSameShardCollision(clusterNode *sender) {

serverLog(LL_NOTICE, "Two primaries in same shard, and the sender has a greater config epoch. "

"Reconfiguring myself as a replica of %.40s (%s)",

sender->name, sender->human_nodename);

clusterSetPrimary(sender, 1, 0);

clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG | CLUSTER_TODO_UPDATE_STATE | CLUSTER_TODO_FSYNC_CONFIG);

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cluster_legacy.c` around lines 2469 - 2475, The function clusterHandlePrimariesSameShardCollision is a helper used only within this translation unit and should have internal linkage; make it file-local by adding the static keyword to its declaration (change the definition of clusterHandlePrimariesSameShardCollision to be static void clusterHandlePrimariesSameShardCollision(clusterNode *sender)) so the symbol is not exported, leaving the body and calls to clusterSetPrimary and clusterDoBeforeSleep unchanged.

coderabbitai · 2026-05-28T08:17:43Z

+        /* If after processing everything, we find that myself and sender are on the
+         * same shard and are both primaries, if myself is a empty primary and myself
+         * config epoch is smaller, make it become a replica of sender. */
+        if (sender && nodeIsPrimary(myself) && nodeIsPrimary(sender) && areInSameShard(myself, sender) &&
+            myself->numslots == 0 && nodeEpoch(sender) > nodeEpoch(myself)) {
+            clusterHandlePrimariesSameShardCollision(sender);
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Request @core-team architectural review before merge.

This change alters shard collision/failover behavior in src/cluster_legacy.c and should explicitly get @core-team review.

As per coding guidelines: "Request @core-team architectural review for changes to cluster*.c, replication.c, rdb.c, or aof.c".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cluster_legacy.c` around lines 4416 - 4422, This change in cluster_legacy.c modifies shard collision/failover behavior around the block that checks nodeIsPrimary(myself), nodeIsPrimary(sender), areInSameShard(myself, sender) and calls clusterHandlePrimariesSameShardCollision(sender); before merging, update the PR description and add an explicit request for an `@core-team` architectural review referencing the changed logic and the symbols clusterHandlePrimariesSameShardCollision, nodeIsPrimary, areInSameShard, and nodeEpoch so core architects can evaluate failover semantics; ensure the PR summary explains the behavioral impact and links to this diff for clear review.

The smaller config epoch primary will become the replica when two pri…

d49eb01

…maries in the same shard Signed-off-by: Binbin <binloveplay1314@qq.com>

deepakrn reviewed Jun 27, 2025

View reviewed changes

Comment thread tests/unit/cluster/shardid-propagation.tcl Outdated

deepakrn reviewed Jun 27, 2025

View reviewed changes

Comment thread tests/unit/cluster/shardid-propagation.tcl Outdated

hpatro reviewed Jun 27, 2025

View reviewed changes

Comment thread src/cluster_legacy.c Outdated

Comment thread src/cluster_legacy.c Outdated

code review

990691b

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin marked this pull request as draft June 28, 2025 15:07

zuiderkwast reviewed Jul 4, 2025

View reviewed changes

Comment thread tests/unit/cluster/shardid-propagation.tcl

Merge remote-tracking branch 'upstream/unstable' into issue_2261

cc58cd4

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin commented Jul 5, 2025

View reviewed changes

enjoy-binbin added this to Valkey 9.0 Jul 5, 2025

enjoy-binbin moved this to Todo in Valkey 9.0 Jul 5, 2025

enjoy-binbin mentioned this pull request Jul 5, 2025

Add extensions supported to cluster link layer to propagate extension faster during handshake #2310

Merged

enjoy-binbin added 2 commits July 11, 2025 19:01

Merge remote-tracking branch 'upstream/unstable' into issue_2261

2dbc7aa

Signed-off-by: Binbin <binloveplay1314@qq.com>

Try to move the process of extension to the front

78bb09e

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jul 11, 2025

enjoy-binbin added 2 commits July 16, 2025 10:27

Merge remote-tracking branch 'upstream/unstable' into issue_2261

bf3f54c

Signed-off-by: Binbin <binloveplay1314@qq.com>

Dont move clusterProcessPingExtensions to the front

f72b1b5

Signed-off-by: Binbin <binloveplay1314@qq.com>

still need to check if myself is a empty primary

9c749e6

Signed-off-by: Binbin <binloveplay1314@qq.com>

enjoy-binbin marked this pull request as ready for review July 23, 2025 08:35

enjoy-binbin requested a review from PingXie July 23, 2025 08:35

hpatro self-requested a review August 1, 2025 20:34

enjoy-binbin mentioned this pull request Aug 5, 2025

Fix replica (the old primary) claims to sitll have slots after manual failover #2301

Merged

enjoy-binbin mentioned this pull request Aug 5, 2025

Change the same shard failover assert to if condition to avoid crash #2431

Merged

hpatro mentioned this pull request Sep 11, 2025

[BUG] Two primaries in same shard #2261

Open

enjoy-binbin added this to Valkey 9.1 Oct 11, 2025

enjoy-binbin removed this from Valkey 9.0 Oct 11, 2025

Merge branch 'unstable' into issue_2261

76f3a4e

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

Uh oh!

Conversation

enjoy-binbin commented Jun 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hpatro left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

enjoy-binbin commented Jun 28, 2025

Uh oh!

enjoy-binbin commented Jun 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zuiderkwast commented Jul 4, 2025

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

enjoy-binbin commented Jul 5, 2025

Uh oh!

enjoy-binbin Jul 5, 2025

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jul 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

hpatro commented Oct 10, 2025

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

enjoy-binbin commented Jun 26, 2025 •

edited

Loading

enjoy-binbin commented Jun 28, 2025 •

edited

Loading

codecov Bot commented Jul 16, 2025 •

edited

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading