replica redirect read&write to master in standalone mode#12192
replica redirect read&write to master in standalone mode#12192soloestoy wants to merge 4 commits into
Conversation
|
generally LGTM. I think the -1 in the slot number is acceptable, i suppose that for many clients will find it easier than supporting a new reply, and i hope that for some it'll even be implicitly supported in the current code, but let's validate that with client library maintainers. not sure regarding timing (fitting this into 7.2), don't want to rush it and then regret. |
I have a strong preference to keep MOVED. It still seems easier long term for clients to just understand one error.
We currently don't force pubsub commands to primaries, but maybe we should, I think that's a different conversation though.
Not sure what to say about that. |
|
How is this a breaking change? |
|
|
I vote for MOVED, which will not bring new understanding costs. |
zuiderkwast
left a comment
There was a problem hiding this comment.
LGTM.
I like MOVED with slot = -1. I think it is intuitive.
| createBoolConfig("replica-serve-stale-data", "slave-serve-stale-data", MODIFIABLE_CONFIG, server.repl_serve_stale_data, 1, NULL, NULL), | ||
| createBoolConfig("replica-read-only", "slave-read-only", DEBUG_CONFIG | MODIFIABLE_CONFIG, server.repl_slave_ro, 1, NULL, NULL), | ||
| createBoolConfig("replica-ignore-maxmemory", "slave-ignore-maxmemory", MODIFIABLE_CONFIG, server.repl_slave_ignore_maxmemory, 1, NULL, NULL), | ||
| createBoolConfig("replica-redirect-read-write", NULL, MODIFIABLE_CONFIG, server.repl_replica_redirect_rw, 0, NULL, NULL), |
There was a problem hiding this comment.
Is replica-redirect enough? It can be implicitly understood that it is only for read and write commands, just like for cluster redirects.
There was a problem hiding this comment.
I actually am not sure it's clear enough! We've had folks ask if they could disable redirects for cluster mode (for mostly bad reasons), so they might think they could turn this off for cluster mode.
There was a problem hiding this comment.
So you want to add something like 'standalone' to the name to make it clear it's not a cluster config?
There was a problem hiding this comment.
I guess in my ideal world this config would also apply on cluster mode enabled as well, and it would be a cluster agnostic config. At that point maybe your naming would be sufficient, replica-enable-redirects maybe?
There was a problem hiding this comment.
This is already the behaviour in cluster mode. Would the config have different defaults in cluster vs standalone mode. Btw disabling it in cluster mode is "for mostly bad reasons" you said above so I thought you don't want to allow that...
There was a problem hiding this comment.
i agree it would be nice to have it cluster agnostic, and that then we have a problem with default.
unless we wanna make it default to yes in standalone too.
another bad option is to make it an enum config with [yes,no,cluster-only], the last one being the default.
There was a problem hiding this comment.
if we want this, I prefer split into two configs, like replica-enable-redirect by default no and cluster-replica-enable-redirect by default yes.
Usually this is just a major decision. I don't think we would expect clients to be relying on this functionality for anything? I'm going to remove the tag. |
| (is_write_command || | ||
| (is_read_command && !(c->flags & CLIENT_READONLY)))) { | ||
| addReplyErrorSds(c,sdscatprintf(sdsempty(), "-MOVED -1 %s:%d", | ||
| server.masterhost, server.masterport)); |
There was a problem hiding this comment.
Presumably we need to also cover the "announced" port for K8s type distributions, where this port is not the same port the end user would see. I guess announced IP/hostnames are also problems.
EDIT: To add some more context here. We've seen a number of users try to use the masterhost and masterport info field for topology discovery, which doesn't work correctly.
There was a problem hiding this comment.
good point. but for that we need a new config, right?
There was a problem hiding this comment.
you mean master-announce-ip and master-announce-port?
There was a problem hiding this comment.
Yeah, something like that. Not sure how I feel about adding two new configurations that I would like to deprecate though. Ideally we would re-use the cluster-announce-* type configs. This relates to how I want to make sure we are aligned long term.
There was a problem hiding this comment.
For tls, we'd need master-announce-tls-port too then. Many new configs. Or maybe only these two: redirect host:port and redirect-tls host:port, i.e. redirect to any node, regardless of whether it's our master or not. Too flexible?
There was a problem hiding this comment.
I prefer @zuiderkwast 's idea, the redirect and redirect-tls has a clear scope, only affect redirect.
If we support master-announce-*, seems we need apply the announce ip:port to info role and other commands' reply. But this could cause some internal management services to become unavailable, since the internal services and user businesses possibly belong to different network, and the management may rely on the real ip:port not the announced ip:port in info and role etc.
|
we discussed this in a core-team meeting, we feel we'd like to completely design this feature, before implementing and committing to parts of it. two concerns that were raised are:
further discussions are needed.. |
|
it's ok we can design this feature completely and raise all concerns, this is a good way in scheme design stage. and now this PR can work well when users deploy Redis on simple hosts, I think we have reached a consensus on this point. and the implementation now is forward compatible, . about the concerns:
all in all, my opinion is we can design completely but implement step by step. |
|
First, it is necessary to clarify the meaning of "MOVE" in the context mentioned. Does it represent 301 or 302? Perhaps more people believe that "MOVE" represents 301, but in the case of redis-cluster, it has always been possible to have A -> MOVE -> B, which makes it look more like 302. I think perhaps it should be called "REPLACER," which would indicate that the new address permanently replaces all the functionalities of the current node, rather than just being a redirection. Regarding the seamless switch problem in Redis, I propose adding a "retired" status to the master node. If the current node is not the master node or there are no slave nodes available, an error will be returned.. When the master node is in the "retired" status, it can still accept read and write commands without the client being aware of this change. In this state, the master node handles read commands as usual, but forwards write commands to the slave node(s) (while keeping track of keys that the slave node(s) have not yet synchronized). Once the slave node(s) reach an identical state with the master node, the master node responds to any command with "REPLACER". When the client receives the "REPLACER" status, it switches to the slave node, which provides all the functionalities of the original master node. |
|
I will say, I think the incremental delivery we made this happen in makes me wish we had thought this through more holistically :D. So, is our ideal cluster setup: These configs are set as a block to indicate there is a different way to reach this node for replication and cluster. By default we assume a flat topology for all nodes, so this would also apply to clients, however we definitely assume the topology is flat for the nodes in the cluster. We also need a second set of configs, which are used when the route from the client to the cluster may not be the same as the route between nodes in the cluster. I think in the fullness of time we want all of this configuration for cluster mode disabled as well, so I'm not clear why we wouldn't just copy the interface we are currently using.
Do we want clients to be doing the
I think we want to align on 1 to be our long term solution.
I think this is very much an engineering mindset, but it's not very coherent from project mindset. For performance improvements, that incremental approach makes more sense since, but when releasing a coherent feature, we should strive to build something that is consistent over a longer time frame. |
This makes sense, but there is a problem how to identify whether this connection is from a client or a node, some admin service may need get the announce topology not the announce client topology. I know some virtual network tech can do that, but there is currently no universal standardized technology to achieve this. Maybe we need some ways like a new command
Yes, I agree these configs can be used for more than just cluster, maybe in some day we can remove the I didn't understand the words
But now like sentinel is using these to get the topology, and in Redis perspective Sentinel is just a normal client.
This feature is forward compatible, if we add some new configs or change internal mechanism, it can still work well for users, that's why I think we can do step by step. |
|
ping @redis/core-team @zuiderkwast, need your feedbacks |
Co-authored-by: Oran Agra <oran@redislabs.com>
I think it makes sense to keep cluster. While talking with our AWS customers, people seem to resonate calling any collection of nodes a cluster. The configs only make sense for more than 1 node.
I would really like to just use the current cluster announce configs, and only add a single new config which describes an unsharded cluster (in addition to the other configs I mentioned). I want to merge our two deployment configurations, cluster-enabled and disabled, as much as possible. From an end-user perspective the result is basically the same, but from a management and maintenance perspective, I think it will be good for us long term.
I agree with this, but I just think it's short sighted. I don't want end users configuring stuff that stops being relevant quickly. |
Interesting, on the contrary, our users at Alibaba Cloud prefer to clearly distinguish between standalone and cluster because these two usage modes are very different. I have experience with AWS ElastiCache and have consulted with some users. I find it misleading to call a standalone mode with a primary node and multiple secondary nodes a "cluster". I do not like this design.
I don't want to mix up standalone and cluster. If you want to merge the configuration options for these two different modes, I still believe that removing the "cluster" prefix is a good approach because the announced IP port doesn't need to distinguish between deployment modes.
Yes, that's why I don't want to add too much configuration. The current approach is already sufficient for users who deploy directly on physical hosts and for those who deploy in NAT mode using replica-announce-ip. In my opinion, before we have figured out the correct and optimal use of announced IP:Port, it's better to maintain a relatively pure state and this is the correct way to avoid short-sighted. |
@yossigo Wouldn't cluster mode have these issues as well? These problems are not unique to standalone mode, and cluster mode will also encounter them. Moreover, these problems have already been solved in cluster mode and can be easily applied to standalone mode. As @yangbodong22011 said, this is not a difficult task. |
|
It looks like we all agree that standalone should be long-term supported, I think we can merge it, @redis/core-team any other questions? |
|
@soloestoy We agree about standalone mode being supported in the future, but I think merging this PR (as is or with modifications) requires consensus on other topics where we still don't have it. I think we should at least be aligned about the answers to these questions:
|
|
@yossigo I couldn't fully understand what you were saying, I'm trying to provide some answers based on my understanding: Firstly, of course, standalone should support high availability, which is unrelated to the Redis mode (standalone or cluster) and even Redis. All databases should support high availability. Regarding Redis, it's not true that returning Cluster supported self-failover from the beginning, which I think is a wise choice. However, Cluster also allows disabling automatic failover by setting As far as I know, many users build their own high availability systems, even without using Sentinel and Cluster's auto failover. These self-built high availability systems ensure their running Redis, whether standalone or cluster. I guess AWS is doing the same. In addition, I still need to emphasize one point: |
Sorry about that. Please refer to specific unclear questions so I can try to explain myself more clearly.
We can question that. Why? Today, standalone Redis does not support high availability. You need to use either Sentinel, which is compatible with standalone Redis, or Redis Cluster, which is incompatible. But if we had a non-sharded cluster, why would anyone who wants high availability choose standalone?
It's still highly available because the Redis Cluster Specification clearly defines how clients are expected to learn about and deal with failovers. Redis standalone doesn't have this capability today. Yes, we can add it - but that's my point: why should we do it if we already have it in Redis Cluster, and we're in consensus about creating a non-sharded cluster in the future?
But if we had a much better Redis Cluster supporting non-sharded mode, why would those users still build their high-availability systems?
If we agree that there are valid use cases for redirecting in standalone mode, using a different, dedicated reply makes sense. |
|
@yossigo TBH, I'm afraid I don't totally agree with your definition of high availability, but I can follow your opinion and discuss it further.
From these words, in my understanding you are suggesting that high availability means that the server can provide switch information to allow the client to smoothly complete the switch, right? And then in this case, we can assume that high availability is unrelated to whether the server can detect and perform failover on its own, i.e. it is unrelated to whether cluster is configured with So the key point in high availability is whether Redis server can return switch information to the client, e.g.
Users have the right to choose, and IIRC we have already reached a consensus that standalone needs to be supported for a long time, even if unsharded-cluster appears in the future. I still don't understand why we can't make standalone support high availability. Are we planning to abandon users who use standalone mode? @oranagra I also need your opinion on this issue. However, please note that unsharded-cluster is only a concept, it has not started development and there are still many details that have not been touched upon, and it should be noted that there is a big difference between concepts and implementation. And it is uncertain when it can be implemented, considering this, I also hope that we can support high availability for standalone users as soon as possible. |
|
we are certainly not gonna abandon standalone. the key point in my opinion remains that although there are several ways to manage instances, the clients should have just one interface, and that's what we need to fully design before we made any changes. |
|
@oranagra can you elaborate on it? We agree standalone is long term supported, then high availability is necessary. I don't see any conflict with "the clients should have just one interface". |
A highly available Redis deployment provides all the moving parts required to handle failover end-to-end.
The ability to provide clients with the information necessary to handle failover is a key element in a highly available deployment - I agree with that. But server-side support is only one part. The other part is that clients need to properly support that. Currently, some clients do (those that support Sentinel or Cluster), and some clients don't (the rest).
In a world with unsharded-cluster and standalone server modes, why would we need to work out a way to deploy highly available standalone-mode Redis? Yes, someone can build their own system based on standalone Redis and handle failovers. But why should they do that? And why should we ask clients that only support cluster or Sentinel to support yet another kind of deployment? After all, if someone builds their own H/A system based on standalone Redis and custom components, they could also use a load balancer or DNS or other network tricks to handle those redirects.
Yes, they can choose:
You're proposing option "4". But why? And how exactly is it supposed to work? The way I understand it, it's a home-grown system and, as such, it can either be built not to require any client support (VIPs, DNS, etc.) or communicate to clients using the cluster protocol.
I agree! And based on this discussion, I think it's important to focus our efforts there so it's no longer just a concept. |
@yossigo I disagree with this answer. If the cluster can't do failovers, it's not highly available. If all replicas are configured with I think we need to make a difference between "support HA" and "be HA". Standalone is not HA by itself, but it can be HA with help of sentinel or other sentinel-like tools. In the same way, maybe a cluster with Why do we want to deprecate sentinel? I believe the reason is that it's not very elegant to require an external tool. A system that can be HA by itself is more elegant and user-friently. We can't forbid external sentinel-like tools but we don't need to actively support them. |
We wanna deprecate Sentinel so that we don't have to keep maintaining that code (side by side with cluster). |
What users can choose should be:
And about point 3, I don't think unsharded cluster can be 100% compatible with standalone, like if a server in standalone mode does not support redirection, then the client in standalone mode will not be able to complete failover. As a result, accessing a server in unsharded cluster mode with a client in standalone mode cannot guarantee high availability.
VIP and DNS cannot support high availability in real-time. Since DNS cannot guarantee real-time updates, and there may be existing connections pointing to the backend after the VIP is updated, these will result in those old connections accessing the replica (master before failover). This is the problem that this PR aims to solve, allowing those old connections to be automatically switched to the new master (replica before failover), i.e. automatically routed to the new master's DNS, establishing new connections on the VIP will also access the new master. |
Why do you think there's no way to get unsharded cluster to be 100% compatible with standalone Redis? I think this is the key here. If we do manage to achieve that, users will be able to connect to it using standalone clients for maximum compatibility but no failover support, or using cluster clients if they also wish to have failover support.
I was only providing this as an example of how someone who builds their own high availability can address that, without client support. |
@yossigo Do you mean that we can use the client's cluster mode or standalone mode to access unsharded-cluster? This may have the following details:
So I think it is the right way to use the client's standalone mode to access unsharded-cluster (the only thing we need to do is to support the |
|
There is an alternative to redirects that doesn't need any modification to clients: The server can close the connection after failover. If the client reconnects, it can be NAT-routed to the new master. If failover is triggered manually, cloing clients can also be done manually using client kill. Too brutal? Regarding unsharded cluster, I opened a ticket to discuss it: #12408 |
@yossigo why? In my opinion, this is an irresponsible approach for users. Clients using standalone mode also need to support high availability, which is a very necessary feature. |
|
|


To implement #12097
replica-enable-redirectto control whether to redirect or not, with the default setting being offreadonlyandreadwritecommand in standalone mode, may be a breaking changereplica-enable-redirectenabledother tips: