@PingXie PingXie commented Jan 23, 2022

Fixes #10155

This PR improves the consistency of the "cluster nodes" output: regardless of whether the failed primary has a replica, "cluster nodes" executed on any remaining node in the cluster now includes the slot info for the failed primary.

Before the change:

...
72f66f4ecbcfc3055faae01a68bbeadc5079bb92 127.0.0.1:30001@40001 master,fail - 1642667704665 1642667703661 1 disconnected
...

After the change:

...
72f66f4ecbcfc3055faae01a68bbeadc5079bb92 127.0.0.1:30001@40001 master,fail - 1642667704665 1642667703661 1 disconnected 0-5460
...
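For illustration, here is a minimal client-side sketch (hypothetical, not part of this PR or any real client) of tokenizing one "cluster nodes" line. With this change, a failed primary's line keeps its trailing slot ranges, so the same parsing path works for healthy and failed primaries alike:

```python
# Hypothetical client-side parser for one "cluster nodes" line.
# Field positions follow the sample output above:
# id, addr, flags, master-id, ping-sent, pong-recv, epoch, link-state, slots...
def parse_cluster_nodes_line(line):
    fields = line.split()
    return {
        "id": fields[0],
        "addr": fields[1],
        "flags": fields[2].split(","),
        "link_state": fields[7],
        "slots": fields[8:],  # with this change, present even for a failed primary
    }

failed = ("72f66f4ecbcfc3055faae01a68bbeadc5079bb92 127.0.0.1:30001@40001 "
          "master,fail - 1642667704665 1642667703661 1 disconnected 0-5460")
node = parse_cluster_nodes_line(failed)
print(node["flags"], node["slots"])  # ['master', 'fail'] ['0-5460']
```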

Ping Xie added 4 commits January 23, 2022 02:28
in the "cluster nodes" output after an automatic failover. The idea is
to bring consistency to the behavior of "cluster nodes" regardless of
whether or not a replica exists for the failed primary.
@madolson madolson self-requested a review January 30, 2022 19:54
PingXie commented Feb 17, 2022

@madolson when you get a chance :-) This PR improves the "cluster nodes" output consistency, irrespective of the number of replicas in the cluster.

PingXie commented Feb 23, 2022

@zuiderkwast would really appreciate it if you could help review this PR. Thanks!

@zuiderkwast zuiderkwast left a comment


I understand that it can be useful to know which shard a failed master belonged to, but I have some concerns:

  1. Can there be multiple masters for the same slot range, where only one is not failed? What if a client fails to parse this output correctly and selects the failed master instead of the healthy one? Assume the following two lines are included in the output:

     72f66f4ecbcfc3055faae01a68bbeadc5079bb92 127.0.0.1:30001@40001 master,fail - 1642667704665 1642667703661 1 disconnected 0-5460
     ab87c879682639786876de7868976f9100c0c09a 127.0.0.1:30002@40002 master - 1642667704665 1642667703661 1 connected 0-5460
    

    If a client only looks for the string "master" and a slot range, then this change can cause the client to select the failed master. It is a breaking change.

  2. The intended solution to the problem of a master having no replicas is replica migration. Perhaps this automatic process can be improved so that replicas migrate to a master with fewer than N replicas (if the target is N replicas per master). Maybe it is not too difficult to add a config for that. WDYT?

    Perhaps we could even improve that algorithm so a new replica can decide to migrate itself to another master even before it starts replicating its first master. Or we could allow configuring a node as a replica without specifying which master (for example, allow CLUSTER REPLICATE without a node-id) and let it automatically decide which master to replicate. (Automatic is better than manual...)

  3. If we want to promote the new CLUSTER SHARDS command as a replacement for CLUSTER NODES and CLUSTER SLOTS, it would be good if failed masters could be included in the CLUSTER SHARDS output, under the correct shard. Otherwise, if CLUSTER NODES is the only command that provides this information, clients might still prefer it over the other commands.

  4. CLUSTER SHARDS is a new command, so we would not introduce a breaking change if we added this information there from the beginning, instead of changing CLUSTER NODES.
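To make concern 1 concrete, here is a hypothetical client-side sketch (the function and its naive/strict split are invented for illustration, not taken from any real client). An owner lookup that matches only the "master" flag plus a slot range picks the failed node; also filtering out the "fail" flag avoids the breaking behavior:

```python
# Hypothetical slot-owner lookup over "cluster nodes" output lines.
# naive=True mimics a client that only checks for "master" and a slot
# range; naive=False additionally skips nodes flagged as failed.
def pick_owner(lines, naive=False):
    for line in lines:
        fields = line.split()
        flags = fields[2].split(",")
        has_slots = len(fields) > 8
        if "master" in flags and has_slots:
            if naive or "fail" not in flags:
                return fields[0]
    return None

lines = [
    "72f66f4ecbcfc3055faae01a68bbeadc5079bb92 127.0.0.1:30001@40001 "
    "master,fail - 1642667704665 1642667703661 1 disconnected 0-5460",
    "ab87c879682639786876de7868976f9100c0c09a 127.0.0.1:30002@40002 "
    "master - 1642667704665 1642667703661 1 connected 0-5460",
]
print(pick_owner(lines, naive=True))  # the failed primary's id
print(pick_owner(lines))              # the healthy primary's id
```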

goto fmterr;
}
while(start <= stop) clusterAddSlot(n, start++);
/* Add the slot ranges to the officical slots bitmap only if the
Suggested change
/* Add the slot ranges to the officical slots bitmap only if the
/* Add the slot ranges to the official slots bitmap only if the


PingXie commented Feb 24, 2022

Thanks @zuiderkwast.

You had me on the compatibility concern. It is indeed a breaking change. I will close this PR next.

> allow CLUSTER REPLICATE without a node-id

We might want to generalize it a bit further and consider the case where an entire shard is lost (either because there is no replica or because all nodes in the shard are dead). Ideally, the implementation would also take availability zones (AZs) into account, in addition to the replica count. I guess I was looking for something like "CLUSTER FIX". Under the hood, the node receiving this command would seek out uncovered slot ranges first and, if none are found, try to become a replica of a primary that is missing its expected number of replicas. Taking this thought further, could the ultimate form of this idea be integrated into "CLUSTER MEET", perhaps as "CLUSTER MEET [FIX]"? In any case, this is not a trivial task when there are multiple new nodes roaming the cluster looking for jobs.
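A rough sketch of the hypothetical "CLUSTER FIX" decision order described above (no such command exists; the function name, inputs, and target default are all invented for illustration):

```python
# Hypothetical decision logic for a joining node: claim uncovered slot
# ranges first; only otherwise become a replica of the primary furthest
# below the target replica count. A real version would also weigh AZs
# and coordinate among multiple joining nodes.
def choose_role(covered_slots, replicas_by_primary, target_replicas=1):
    uncovered = [s for s in range(16384) if s not in covered_slots]
    if uncovered:
        return ("take_slots", uncovered)
    # Pick the primary missing the most replicas, if any is below target.
    primary, count = min(replicas_by_primary.items(), key=lambda kv: kv[1])
    if count < target_replicas:
        return ("replicate", primary)
    return ("idle", None)

print(choose_role(set(range(16384)), {"p1": 1, "p2": 0}))  # ('replicate', 'p2')
```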

> it would be good if failed masters can be included in the CLUSTER SHARDS output

This is a great idea. Will continue the discussion on the CLUSTER SHARDS thread.

@PingXie PingXie closed this Feb 24, 2022
@PingXie PingXie mentioned this pull request Feb 24, 2022

Successfully merging this pull request may close these issues.

Retain a down primary's slot info after automatic fail over

2 participants