Skip to content

replica redirect read&write to master in standalone mode#12192

Closed
soloestoy wants to merge 4 commits into
redis:unstablefrom
soloestoy:replica-redirect
Closed

replica redirect read&write to master in standalone mode#12192
soloestoy wants to merge 4 commits into
redis:unstablefrom
soloestoy:replica-redirect

Conversation

@soloestoy

@soloestoy soloestoy commented May 17, 2023

Copy link
Copy Markdown
Contributor

To implement #12097

  1. replica is able to redirect read and write commands to it's master in standalone mode
    • reply with "-MOVED -1 master-ip:port"
  2. add a config replica-enable-redirect to control whether to redirect or not, with the default setting being off
    • when enabled, the data access commands(read and write) will be redirected
  3. allow readonly and readwrite command in standalone mode, may be a breaking change
    • use READONLY command can allow read commands on a replica when replica-enable-redirect enabled

other tips:

  1. pubsub commands are not redirected
  2. the tls&tcp multi ports problem are not addressed

@soloestoy soloestoy added state:major-decision Requires core team consensus breaking-change This change can potentially break existing application labels May 17, 2023
@oranagra

Copy link
Copy Markdown
Member

generally LGTM.

I think the -1 in the slot number is acceptable, i suppose that for many clients will find it easier than supporting a new reply, and i hope that for some it'll even be implicitly supported in the current code, but let's validate that with client library maintainers.

not sure regarding timing (fitting this into 7.2), don't want to rush it and then regret.

@madolson

Copy link
Copy Markdown
Contributor

reuse MOVED or add a new reply REDIRECT, I see MOVED has slot arg, now I set -1, maybe REDIRECT without slot is better?

I have a strong preference to keep MOVED. It still seems easier long term for clients to just understand one error.

do we need redirect the pubsub commands? they are not read or write

We currently don't force pubsub commands to primaries, but maybe we should, I think that's a different conversation though.

the TLS problem, master and replica doesn't know the real work port each other

Not sure what to say about that.

@madolson

Copy link
Copy Markdown
Contributor

How is this a breaking change?

@soloestoy

Copy link
Copy Markdown
Contributor Author

How is this a breaking change?

readonly and readwrite cannot be executed in standalone mode before this, I'm not sure if it will really have an impact, but to be safe I labeled breaking change.

@yangbodong22011

Copy link
Copy Markdown
Contributor

I vote for MOVED, which will not bring new understanding costs.

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I like MOVED with slot = -1. I think it is intuitive.

Comment thread src/config.c Outdated
createBoolConfig("replica-serve-stale-data", "slave-serve-stale-data", MODIFIABLE_CONFIG, server.repl_serve_stale_data, 1, NULL, NULL),
createBoolConfig("replica-read-only", "slave-read-only", DEBUG_CONFIG | MODIFIABLE_CONFIG, server.repl_slave_ro, 1, NULL, NULL),
createBoolConfig("replica-ignore-maxmemory", "slave-ignore-maxmemory", MODIFIABLE_CONFIG, server.repl_slave_ignore_maxmemory, 1, NULL, NULL),
createBoolConfig("replica-redirect-read-write", NULL, MODIFIABLE_CONFIG, server.repl_replica_redirect_rw, 0, NULL, NULL),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is replica-redirect enough? It can be implicitly understood that it is only for read and write commands, just like for cluster redirects.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually am not sure it's clear enough! We've had folks ask if they could disable redirects for cluster mode (for mostly bad reasons), so they might think they could turn this off for cluster mode.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you want to add something like 'standalone' to the name to make it clear it's not a cluster config?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in my ideal world this config would also apply on cluster mode enabled as well, and it would be a cluster agnostic config. At that point maybe your naming would be sufficient, replica-enable-redirects maybe?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already the behaviour in cluster mode. Would the config have different defaults in cluster vs standalone mode. Btw disabling it in cluster mode is "for mostly bad reasons" you said above so I thought you don't want to allow that...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree it would be nice to have it cluster agnostic, and that then we have a problem with default.
unless we wanna make it default to yes in standalone too.

another bad option is to make it an enum config with [yes,no,cluster-only], the last one being the default.

@soloestoy soloestoy May 22, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want this, I prefer split into two configs, like replica-enable-redirect by default no and cluster-replica-enable-redirect by default yes.

@madolson

Copy link
Copy Markdown
Contributor

readonly and readwrite cannot be executed in standalone mode before this, I'm not sure if it will really have an impact, but to be safe I labeled breaking change.

Usually this is just a major decision. I don't think we would expect clients to be relying on this functionality for anything? I'm going to remove the tag.

@madolson madolson removed the breaking-change This change can potentially break existing application label May 19, 2023
Comment thread src/server.c
(is_write_command ||
(is_read_command && !(c->flags & CLIENT_READONLY)))) {
addReplyErrorSds(c,sdscatprintf(sdsempty(), "-MOVED -1 %s:%d",
server.masterhost, server.masterport));

@madolson madolson May 19, 2023

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we need to also cover the "announced" port for K8s type distributions, where this port is not the same port the end user would see. I guess announced IP/hostnames are also problems.

EDIT: To add some more context here. We've seen a number of users try to use the masterhost and masterport info field for topology discovery, which doesn't work correctly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. but for that we need a new config, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean master-announce-ip and master-announce-port?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something like that. Not sure how I feel about adding two new configurations that I would like to deprecate though. Ideally we would re-use the cluster-announce-* type configs. This relates to how I want to make sure we are aligned long term.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tls, we'd need master-announce-tls-port too then. Many new configs. Or maybe only these two: redirect host:port and redirect-tls host:port, i.e. redirect to any node, regardless of whether it's our master or not. Too flexible?

@soloestoy soloestoy May 23, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer @zuiderkwast 's idea, the redirect and redirect-tls has a clear scope, only affect redirect.

If we support master-announce-*, seems we need apply the announce ip:port to info role and other commands' reply. But this could cause some internal management services to become unavailable, since the internal services and user businesses possibly belong to different network, and the management may rely on the real ip:port not the announced ip:port in info and role etc.

@oranagra

Copy link
Copy Markdown
Member

we discussed this in a core-team meeting, we feel we'd like to completely design this feature, before implementing and committing to parts of it.

two concerns that were raised are:

  1. maybe MOVED response will need to carry more info (like protocol, TLS, or others)
  2. we think we'd like all multi-instance deployments (even a master+replica set) to look like a cluster, that means from topology discovery concerns, we might wanna present them as such to clients.
  3. the above discussion about instances in containers or behind NAT, and master-announce-*, etc. (including TLS), all of that should be designed before we take the first step.

further discussions are needed..

@soloestoy

Copy link
Copy Markdown
Contributor Author

it's ok we can design this feature completely and raise all concerns, this is a good way in scheme design stage.

and now this PR can work well when users deploy Redis on simple hosts, I think we have reached a consensus on this point. and the implementation now is forward compatible, .

about the concerns:

  1. extend the ability of MOVE is a good idea, but I think this is out of this PR. MOVE is a general redirection reply that affects not only standalone but also cluster, we can do it in 8.0
  2. I don't get this point, clients can use info replication and role to get the master-replicas topology, and get the replica-subreplicas topology recursively.
  3. this is a practical problem, but the current *-announce-* implement are not perfect either, sometimes it can be confusing.
    • For example when we config the announced ip:port, they will override the real ip:port for communication between Redis noeds, but users may cannot access the announced ip:port. In this case they need use cluster-announce-hostname, but IMHO the cluster-announce-hostname and cluster-announce-ip are confusing, users cannot distinguish which one is for server side and which one is for client side. Moreover, cluster-announce-hostname may need cluster-announce-hostname-port and cluster-announce-hostname-tls-port to work.
      image
    • Another situation is Redis nodes and client in different subnets as described in Feature Request: Accessing Redis Cluster running behind NAT gateway #7460, if users config cluster-announce-ip:port then the Redis noes cannot reach each other via the announced ip:port
      image
    • There are also many issues with the current announced ip:port mechanism, and we need to solve the problem before applying it here. I don't want this PR to become too complicated, I suggest we put these things aside for now and focus on satisfying the users of Simple Host first.

all in all, my opinion is we can design completely but implement step by step.

@monkey92t

monkey92t commented May 24, 2023

Copy link
Copy Markdown

First, it is necessary to clarify the meaning of "MOVE" in the context mentioned. Does it represent 301 or 302? Perhaps more people believe that "MOVE" represents 301, but in the case of redis-cluster, it has always been possible to have A -> MOVE -> B, which makes it look more like 302.

I think perhaps it should be called "REPLACER," which would indicate that the new address permanently replaces all the functionalities of the current node, rather than just being a redirection.

Regarding the seamless switch problem in Redis, I propose adding a "retired" status to the master node. If the current node is not the master node or there are no slave nodes available, an error will be returned..

When the master node is in the "retired" status, it can still accept read and write commands without the client being aware of this change. In this state, the master node handles read commands as usual, but forwards write commands to the slave node(s) (while keeping track of keys that the slave node(s) have not yet synchronized).

Once the slave node(s) reach an identical state with the master node, the master node responds to any command with "REPLACER". When the client receives the "REPLACER" status, it switches to the slave node, which provides all the functionalities of the original master node.

@madolson

madolson commented May 24, 2023

Copy link
Copy Markdown
Contributor

I will say, I think the incremental delivery we made this happen in makes me wish we had thought this through more holistically :D.

So, is our ideal cluster setup:

These configs are set as a block to indicate there is a different way to reach this node for replication and cluster. By default we assume a flat topology for all nodes, so this would also apply to clients, however we definitely assume the topology is flat for the nodes in the cluster.

cluster-announce-ip
cluster-announce-port
cluster-announce-tls-port
cluster-announce-bus-port

We also need a second set of configs, which are used when the route from the client to the cluster may not be the same as the route between nodes in the cluster.

cluster-announce-hostname: The client visible name displayed in redirects and topology.
<new> cluster-announced-client-ip: We're going to assume there is a fixed IP for both TLS and TCP.
<new> cluster-announced-client-tls-port: This is the port to show in clients in redirects and topology for tls clients.
<new> cluster-announce-client-port: This is the port to show to clients in redirects and topology.

I think in the fullness of time we want all of this configuration for cluster mode disabled as well, so I'm not clear why we wouldn't just copy the interface we are currently using.

I don't get this point, clients can use info replication and role to get the master-replicas topology, and get the replica-subreplicas topology recursively.

Do we want clients to be doing the replication info? I think we don't, I think we want them to call something like cluster shards to learn about the topology. It's an important call out here that cluster mode enabled doesn't have sub-replicas, so what is the right way to expose that. I see a few approaches:

  1. We have all nodes know the topology, through gossip or some other means.
  2. We push the clients to call multiple discovery commands or force it to a specific node.

I think we want to align on 1 to be our long term solution.

all in all, my opinion is we can design completely but implement step by step.

I think this is very much an engineering mindset, but it's not very coherent from project mindset. For performance improvements, that incremental approach makes more sense since, but when releasing a coherent feature, we should strive to build something that is consistent over a longer time frame.

@soloestoy

soloestoy commented May 25, 2023

Copy link
Copy Markdown
Contributor Author

These configs are set as a block to indicate there is a different way to reach this node for replication and cluster. By default we assume a flat topology for all nodes, so this would also apply to clients, however we definitely assume the topology is flat for the nodes in the cluster.

cluster-announce-ip
cluster-announce-port
cluster-announce-tls-port
cluster-announce-bus-port

We also need a second set of configs, which are used when the route from the client to the cluster may not be the same as the route between nodes in the cluster.

cluster-announce-hostname: The client visible name displayed in redirects and topology.
<new> cluster-announced-client-ip: We're going to assume there is a fixed IP for both TLS and TCP.
<new> cluster-announced-client-tls-port: This is the port to show in clients in redirects and topology for tls clients.
<new> cluster-announce-client-port: This is the port to show to clients in redirects and topology.

This makes sense, but there is a problem how to identify whether this connection is from a client or a node, some admin service may need get the announce topology not the announce client topology. I know some virtual network tech can do that, but there is currently no universal standardized technology to achieve this. Maybe we need some ways like a new command CLIENT COME-FROM, but it's a bit ugly, and command can lie.

I think in the fullness of time we want all of this configuration for cluster mode disabled as well, so I'm not clear why we wouldn't just copy the interface we are currently using.

Yes, I agree these configs can be used for more than just cluster, maybe in some day we can remove the cluster prefix.

I didn't understand the words copy the interface we are currently using, you mean adding master-announce-ip:port? I think it's unnecessary, master-replica is different with cluster-nodes, master doesn't need connect to replica, the REPLICAOF must use the NAT ip:port to connect master, so the -MOVED must reply with the NAT master ip:port.

Do we want clients to be doing the replication info? I think we don't

But now like sentinel is using these to get the topology, and in Redis perspective Sentinel is just a normal client.

I think this is very much an engineering mindset, but it's not very coherent from project mindset. For performance improvements, that incremental approach makes more sense since, but when releasing a coherent feature, we should strive to build something that is consistent over a longer time frame.

This feature is forward compatible, if we add some new configs or change internal mechanism, it can still work well for users, that's why I think we can do step by step.

@soloestoy

Copy link
Copy Markdown
Contributor Author

ping @redis/core-team @zuiderkwast, need your feedbacks

Comment thread redis.conf Outdated
Co-authored-by: Oran Agra <oran@redislabs.com>
@madolson

madolson commented Jun 20, 2023

Copy link
Copy Markdown
Contributor

Yes, I agree these configs can be used for more than just cluster, maybe in some day we can remove the cluster prefix.

I think it makes sense to keep cluster. While talking with our AWS customers, people seem to resonate calling any collection of nodes a cluster. The configs only make sense for more than 1 node.

I didn't understand the words copy the interface we are currently using, you mean adding master-announce-ip:port? I think it's unnecessary, master-replica is different with cluster-nodes, master doesn't need connect to replica, the REPLICAOF must use the NAT ip:port to connect master, so the -MOVED must reply with the NAT master ip:port.

I would really like to just use the current cluster announce configs, and only add a single new config which describes an unsharded cluster (in addition to the other configs I mentioned). I want to merge our two deployment configurations, cluster-enabled and disabled, as much as possible. From an end-user perspective the result is basically the same, but from a management and maintenance perspective, I think it will be good for us long term.

This feature is forward compatible, if we add some new configs or change internal mechanism, it can still work well for users, that's why I think we can do step by step.

I agree with this, but I just think it's short sighted. I don't want end users configuring stuff that stops being relevant quickly.

@soloestoy

Copy link
Copy Markdown
Contributor Author

I think it makes sense to keep cluster. While talking with our AWS customers, people seem to resonate calling any collection of nodes a cluster. They configs only make sense for more than 1 node.

Interesting, on the contrary, our users at Alibaba Cloud prefer to clearly distinguish between standalone and cluster because these two usage modes are very different.

I have experience with AWS ElastiCache and have consulted with some users. I find it misleading to call a standalone mode with a primary node and multiple secondary nodes a "cluster". I do not like this design.

I would really like to just use the current cluster announce configs, and only add a single new config which describes an unsharded cluster (in addition to the other configs I mentioned). I want to merge our two deployment configurations, cluster-enabled and disabled, as much as possible. From an end-user perspective the result is basically the same, but from a management and maintenance perspective, I think it will be good for us long term.

I don't want to mix up standalone and cluster. If you want to merge the configuration options for these two different modes, I still believe that removing the "cluster" prefix is a good approach because the announced IP port doesn't need to distinguish between deployment modes.

I don't want end users configuring stuff that stops being relevant quickly.

Yes, that's why I don't want to add too much configuration. The current approach is already sufficient for users who deploy directly on physical hosts and for those who deploy in NAT mode using replica-announce-ip. In my opinion, before we have figured out the correct and optimal use of announced IP:Port, it's better to maintain a relatively pure state and this is the correct way to avoid short-sighted.

@soloestoy

Copy link
Copy Markdown
Contributor Author
  • Does the redirect apply to all connections or just the one we received the reply for?
  • If we have a connection pool, do we need to drop all connections and reconnect?
  • What happens if the -MOVED target fails to resolve / connect / authenticate? Do we have to remember the old address and go back to it again, hoping we'll get better luck or at least -MOVED to a better endpoint? Otherwise, it's a dead end.

@yossigo Wouldn't cluster mode have these issues as well? These problems are not unique to standalone mode, and cluster mode will also encounter them. Moreover, these problems have already been solved in cluster mode and can be easily applied to standalone mode. As @yangbodong22011 said, this is not a difficult task.

@soloestoy

Copy link
Copy Markdown
Contributor Author

It looks like we all agree that standalone should be long-term supported, I think we can merge it, @redis/core-team any other questions?

@yossigo

yossigo commented Jul 3, 2023

Copy link
Copy Markdown
Collaborator

@soloestoy We agree about standalone mode being supported in the future, but I think merging this PR (as is or with modifications) requires consensus on other topics where we still don't have it.

I think we should at least be aligned about the answers to these questions:

  1. Should the standalone mode support high availability, or can we assume we have a non-sharded cluster for that?
  2. If high availability in standalone mode is supported, what does that look like on the protocol side? Arbitrary extensions, a subset of Redis Cluster Spec, or a full Redis Cluster Spec?
  3. Do we expect all standalone clients to support high availability (unlike in the current state where clients need to support Sentinel or Cluster mode explicitly)? If we do, and we adopt a full or partial cluster spec for standalone high availability, doesn't this mean we effectively want to move all clients to cluster mode?

@soloestoy

soloestoy commented Jul 3, 2023

Copy link
Copy Markdown
Contributor Author

@yossigo I couldn't fully understand what you were saying, I'm trying to provide some answers based on my understanding:

Firstly, of course, standalone should support high availability, which is unrelated to the Redis mode (standalone or cluster) and even Redis. All databases should support high availability.

Regarding Redis, it's not true that returning -MOVED in cluster mode means high availability. The -MOVED in cluster mode is essentially to indicate the change of the routing table, which includes failover.

Cluster supported self-failover from the beginning, which I think is a wise choice. However, Cluster also allows disabling automatic failover by setting cluster-replica-no-failover. In this case, is it not high availability if manual failover is performed through an external system?

As far as I know, many users build their own high availability systems, even without using Sentinel and Cluster's auto failover. These self-built high availability systems ensure their running Redis, whether standalone or cluster. I guess AWS is doing the same.

In addition, -MOVED is only a supplement to the high availability of the server-side master-replica role switch, or it is an optimization of Redis high availability switch. The -MOVED reply in Cluster can help clients better handle server-side switch. Standalone switch is still happening without -MOVED, and having -MOVED doesn't mean it becomes Cluster.

I still need to emphasize one point: -MOVED has two meanings, "Slot migration occurred" and "The master-replica role has changed". If stick to its literal meaning, -MOVED should only be used to describe slot migration, and master-replica role switch should be described by other words, such as -REDIRECT, even the switch in the cluster mode should use -REDIRECT. But I don't recommend doing so, as it would break existing systems and has no meaning.

@yossigo

yossigo commented Jul 9, 2023

Copy link
Copy Markdown
Collaborator

@yossigo I couldn't fully understand what you were saying, I'm trying to provide some answers based on my understanding:

Sorry about that. Please refer to specific unclear questions so I can try to explain myself more clearly.

Firstly, of course, standalone should support high availability, which is unrelated to the Redis mode (standalone or cluster) and even Redis. All databases should support high availability.

We can question that. Why? Today, standalone Redis does not support high availability. You need to use either Sentinel, which is compatible with standalone Redis, or Redis Cluster, which is incompatible. But if we had a non-sharded cluster, why would anyone who wants high availability choose standalone?

Cluster supported self-failover from the beginning, which I think is a wise choice. However, Cluster also allows disabling automatic failover by setting cluster-replica-no-failover. In this case, is it not high availability if manual failover is performed through an external system?

It's still highly available because the Redis Cluster Specification clearly defines how clients are expected to learn about and deal with failovers. Redis standalone doesn't have this capability today. Yes, we can add it - but that's my point: why should we do it if we already have it in Redis Cluster, and we're in consensus about creating a non-sharded cluster in the future?

As far as I know, many users build their own high availability systems, even without using Sentinel and Cluster's auto failover. These self-built high availability systems ensure their running Redis, whether standalone or cluster. I guess AWS is doing the same.

But if we had a much better Redis Cluster supporting non-sharded mode, why would those users still build their high-availability systems?

I still need to emphasize one point: -MOVED has two meanings, "Slot migration occurred" and "The master-replica role has changed". If stick to its literal meaning, -MOVED should only be used to describe slot migration, and master-replica role switch should be described by other words, such as -REDIRECT, even the switch in the cluster mode should use -REDIRECT. But I don't recommend doing so, as it would break existing systems and has no meaning.

If we agree that there are valid use cases for redirecting in standalone mode, using a different, dedicated reply makes sense.

@soloestoy

soloestoy commented Jul 10, 2023

Copy link
Copy Markdown
Contributor Author

@yossigo TBH, I'm afraid I don't totally agree with your definition of high availability, but I can follow your opinion and discuss it further.

It's still highly available because the Redis Cluster Specification clearly defines how clients are expected to learn about and deal with failovers.

From these words, in my understanding you are suggesting that high availability means that the server can provide switch information to allow the client to smoothly complete the switch, right? And then in this case, we can assume that high availability is unrelated to whether the server can detect and perform failover on its own, i.e. it is unrelated to whether cluster is configured with cluster-replica-no-failover, whether standalone uses Sentinel, and whether Failover Coordinator is used in Flotilla (IMHO, FC in Flotilla is like Sentinel in standalone. If unsharded-cluster is part of Flotilla, how does it perform failover?).

So the key point in high availability is whether Redis server can return switch information to the client, e.g. -MOVED. We can continue the discussion based on this POV. My opinion is that every form of Redis needs to support high availability, including standalone, cluster, and unsharded-cluster.

But if we had a non-sharded cluster, why would anyone who wants high availability choose standalone?

Users have the right to choose, and IIRC we have already reached a consensus that standalone needs to be supported for a long time, even if unsharded-cluster appears in the future. I still don't understand why we can't make standalone support high availability. Are we planning to abandon users who use standalone mode? @oranagra I also need your opinion on this issue.

However, please note that unsharded-cluster is only a concept, it has not started development and there are still many details that have not been touched upon, and it should be noted that there is a big difference between concepts and implementation. And it is uncertain when it can be implemented, considering this, I also hope that we can support high availability for standalone users as soon as possible.

@oranagra

Copy link
Copy Markdown
Member

we are certainly not gonna abandon standalone. the key point in my opinion remains that although there are several ways to manage instances, the clients should have just one interface, and that's what we need to fully design before we made any changes.

@soloestoy

soloestoy commented Jul 10, 2023

Copy link
Copy Markdown
Contributor Author

@oranagra can you elaborate on it? We agree standalone is long term supported, then high availability is necessary. I don't see any conflict with "the clients should have just one interface".

@yossigo

yossigo commented Jul 10, 2023

Copy link
Copy Markdown
Collaborator

@soloestoy

From these words, in my understanding you are suggesting that high availability means that the server can provide switch information to allow the client to smoothly complete the switch, right?

A highly available Redis deployment provides all the moving parts required to handle failover end-to-end.

So the key point in high availability is whether Redis server can return switch information to the client, e.g. -MOVED. We can continue the discussion based on this POV.

The ability to provide clients with the information necessary to handle failover is a key element in a highly available deployment - I agree with that. But server-side support is only one part. The other part is that clients need to properly support that. Currently, some clients do (those that support Sentinel or Cluster), and some clients don't (the rest).

My opinion is that every form of Redis needs to support high availability, including standalone, cluster, and unsharded-cluster.

In a world with unsharded-cluster and standalone server modes, why would we need to work out a way to deploy highly available standalone-mode Redis?

Yes, someone can build their own system based on standalone Redis and handle failovers. But why should they do that? And why should we ask clients that only support cluster or Sentinel to support yet another kind of deployment? After all, if someone builds their own H/A system based on standalone Redis and custom components, they could also use a load balancer or DNS or other network tricks to handle those redirects.

Users have the right to choose, and IIRC we have already reached a consensus that standalone needs to be supported for a long time, even if unsharded-cluster appears in the future. I still don't understand why we can't make standalone support high availability. Are we planning to abandon users who use standalone mode? @oranagra I also need your opinion on this issue.

Yes, they can choose:

  1. Run standalone Redis which, like today, doesn't support high availability out of the box.
  2. Run cluster mode Redis, which supports high availability and shading - but is not compatible with standalone Redis.
  3. Run a future non-sharded cluster Redis, which supports high availability but no sharding and is 100% compatible with standalone Redis.

You're proposing option "4". But why? And how exactly is it supposed to work? The way I understand it, it's a home-grown system and, as such, it can either be built not to require any client support (VIPs, DNS, etc.) or communicate to clients using the cluster protocol.

However, please note that unsharded-cluster is only a concept, it has not started development and there are still many details that have not been touched upon, and it should be noted that there is a big difference between concepts and implementation.

I agree! And based on this discussion, I think it's important to focus our efforts there so it's no longer just a concept.

@zuiderkwast

zuiderkwast commented Jul 10, 2023

Copy link
Copy Markdown
Contributor

Cluster supported self-failover from the beginning, which I think is a wise choice. However, Cluster also allows disabling automatic failover by setting cluster-replica-no-failover. In this case, is it not high availability if manual failover is performed through an external system?

It's still highly available because the Redis Cluster Specification clearly defines how clients are expected to learn about and deal with failovers. Redis standalone doesn't have this capability today. Yes, we can add it - but that's my point: why should we do it if we already have it in Redis Cluster, and we're in consensus about creating a non-sharded cluster in the future?

@yossigo I disagree with this answer. If the cluster can't do failovers, it's not highly available. If all replicas are configured with cluster-replica-no-failover, or if there are no replicas at all, the cluster is not highly available by itself. (Edit: I see now that the question does include an external system that performs manual failover, so the answer does make sense.)

I think we need to make a difference between "support HA" and "be HA". Standalone is not HA by itself, but it can be HA with help of sentinel or other sentinel-like tools. In the same way, maybe a cluster with cluster-replica-no-failover can be HA with help of some kind of daemon, K8s operator or such. But it's not HA by itself.

Why do we want to deprecate sentinel? I believe the reason is that it's not very elegant to require an external tool. A system that can be HA by itself is more elegant and user-friently. We can't forbid external sentinel-like tools but we don't need to actively support them.

@oranagra

Copy link
Copy Markdown
Member

Why do we want to deprecate sentinel?

We wanna deprecate Sentinel so that we don't have to keep maintaining that code (side by side with cluster).
ideally, the unsharded-cluster can fill that spot, maybe together with some adapter layer to provide sentinel interface to clients.

@soloestoy

Copy link
Copy Markdown
Contributor Author

Yes, they can choose:

  1. Run standalone Redis which, like today, doesn't support high availability out of the box.
  2. Run cluster mode Redis, which supports high availability and shading - but is not compatible with standalone Redis.
  3. Run a future non-sharded cluster Redis, which supports high availability but no sharding and is 100% compatible with standalone Redis.

What users can choose should be:

  1. Run standalone Redis which, optimized, support high availability.
  2. Run cluster mode Redis, which supports high availability and shading - but is not compatible with standalone Redis.
  3. Run a future non-sharded cluster Redis, which supports high availability but no sharding and is 100% compatible with standalone Redis.

And about point 3, I don't think unsharded cluster can be 100% compatible with standalone, like if a server in standalone mode does not support redirection, then the client in standalone mode will not be able to complete failover. As a result, accessing a server in unsharded cluster mode with a client in standalone mode cannot guarantee high availability.

it can either be built not to require any client support (VIPs, DNS, etc.)

VIP and DNS cannot support high availability in real-time. Since DNS cannot guarantee real-time updates, and there may be existing connections pointing to the backend after the VIP is updated, these will result in those old connections accessing the replica (master before failover). This is the problem that this PR aims to solve, allowing those old connections to be automatically switched to the new master (replica before failover), i.e. automatically routed to the new master's DNS, establishing new connections on the VIP will also access the new master.

@yossigo

yossigo commented Jul 11, 2023

Copy link
Copy Markdown
Collaborator

And about point 3, I don't think unsharded cluster can be 100% compatible with standalone, like if a server in standalone mode does not support redirection, then the client in standalone mode will not be able to complete failover. As a result, accessing a server in unsharded cluster mode with a client in standalone mode cannot guarantee high availability.

Why do you think there's no way to get unsharded cluster to be 100% compatible with standalone Redis? I think this is the key here.

If we do manage to achieve that, users will be able to connect to it using standalone clients for maximum compatibility but no failover support, or using cluster clients if they also wish to have failover support.

VIP and DNS cannot support high availability in real-time. Since DNS cannot guarantee real-time updates, and there may be existing connections pointing to the backend after the VIP is updated, these will result in those old connections accessing the replica (master before failover). This is the problem that this PR aims to solve, allowing those old connections to be automatically switched to the new master (replica before failover), i.e. automatically routed to the new master's DNS, establishing new connections on the VIP will also access the new master.

I was only providing this as an example of how someone who builds their own high availability can address that, without client support.

@yangbodong22011

yangbodong22011 commented Jul 12, 2023

Copy link
Copy Markdown
Contributor

If we do manage to achieve that, users will be able to connect to it using standalone clients for maximum compatibility but no failover support, or using cluster clients if they also wish to have failover support.

@yossigo Do you mean that we can use the client's cluster mode or standalone mode to access unsharded-cluster?

This may have the following details:

  1. Clients in cluster mode usually need to call cluster slots or cluster nodes to obtain and resolve the routing table. But I think unsharded-cluster does not need it, because it holds 16384 slots and does not need routing tables to inform.
  2. Clients in cluster mode usually restrict the execution of cross-slot commands (most clients check the key by themselves and report an error in advance, e.g. jedis), but standalone does not, and neither does unsharded-cluster.
  3. Clients in cluster mode usually restrict select and swapdb commands, but neither are standalone and unsharded-cluster.

So I think it is the right way to use the client's standalone mode to access unsharded-cluster (the only thing we need to do is to support the moved protocol for standalone, so that high availability can be achieved)

@zuiderkwast

Copy link
Copy Markdown
Contributor

There is an alternative to redirects that doesn't need any modification to clients: The server can close the connection after failover. If the client reconnects, it can be NAT-routed to the new master. If failover is triggered manually, cloing clients can also be done manually using client kill. Too brutal?

Regarding unsharded cluster, I opened a ticket to discuss it: #12408

@soloestoy

Copy link
Copy Markdown
Contributor Author

users will be able to connect to it using standalone clients for maximum compatibility but no failover support

@yossigo why? In my opinion, this is an irresponsible approach for users. Clients using standalone mode also need to support high availability, which is a very necessary feature.

@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

state:major-decision Requires core team consensus

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

9 participants