This issue is called PSYNC3 because, just as PSYNC2 identified a set of different replication improvements, the ones described here are the next improvements planned for Redis replication. The main focus of the improvements described here is:
- To allow AOF to also retain replication metadata, like RDB does.
- Make Redis Sentinel and Redis Cluster failovers safe when master instances are configured without persistence, or with limited persistence (RDB).
This issue deprecates #2087 because, AFAIK, this is just a better version of what was proposed there. PSYNC3 features are based on improvements in PSYNC2 that were not available back when protected restarts were proposed.
The individual features that make up PSYNC3 are:
- AOF annotations with replication IDs and offsets.
- A Redis Sentinel and Cluster feature that marks rebooted master instances as failed, triggering a failover.
- A Redis replication change that allows slaves to only connect and continue the replication from instances that are successors of the same replication history.
AOF annotations
Starting with PSYNC2 in Redis 4.0, RDB files are able to store replication information such as the replication ID and offset. This allows many things, including, after a reboot, continuing the replication incrementally from the master without a full resynchronization.
AOF should be optionally able to do the same. For every replicated command we should be able to also emit the corresponding master replication offset. Moreover when the AOF is created, if empty, and every time the replication ID changes, the new replication ID should be emitted as well.
One possibility is to declare, as the first thing read when the AOF file is loaded, that it is an annotated AOF file, via a single command like AOFCONFIG annotated 1. When this option is turned on, the first argument of every AOF command describes the replication offset after the command execution. So conceptually every command is like the following:
932434 SET foo bar
932460 INCR mykey
Moreover, when the replication ID or offset changes because of a new PSYNC attempt, a replication role change, or any other similar event, a REPLCONF SET-ID-AND-OFFSET is emitted in the AOF file.
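To make the annotation idea concrete, here is a minimal Python sketch that replays the conceptual, space-separated form shown above and tracks the replication ID and offset as it goes. It is only an illustration: a real AOF is encoded in the Redis protocol rather than as plain text lines, and the argument layout of REPLCONF SET-ID-AND-OFFSET (replication ID followed by offset) is an assumption, since this issue does not define it. The names ReplState and replay_annotated are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ReplState:
    replid: Optional[str] = None  # last replication ID seen in the file
    offset: int = 0               # replication offset after the last entry


def replay_annotated(lines: Iterable[str]) -> Tuple[ReplState, List[List[str]]]:
    """Replay a conceptual annotated AOF.

    Returns the final replication state and the list of plain commands,
    with the leading offset annotation stripped from each one.
    """
    state = ReplState()
    commands: List[List[str]] = []
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # ignore blank lines
        # Assumed form: REPLCONF SET-ID-AND-OFFSET <replid> <offset>
        if (len(parts) == 4
                and parts[0].upper() == "REPLCONF"
                and parts[1].upper() == "SET-ID-AND-OFFSET"):
            state.replid = parts[2]
            state.offset = int(parts[3])
            continue
        # Annotated command: the first argument is the replication
        # offset reached after executing the command.
        state.offset = int(parts[0])
        commands.append(parts[1:])
    return state, commands
```

After loading, the final ReplState is what the instance would use to attempt an incremental resynchronization, in the same way it uses the ID and offset stored in an RDB file today.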
Detecting rebooted instances as failing
Redis Sentinel is already able to detect reboots of Redis instances, by checking differences in the runid INFO field. Sentinel (and later Cluster) should be able to flag a rebooted master as failing, so that the reboot event can trigger a failover even if the instance was unavailable only for a short time compared to the failover unavailability trigger setting (down-after-milliseconds).
Because the replication link is often a more reliable channel for propagating writes than non-AOF persistence (or no persistence at all), this would result in better real-world consistency for Redis instances configured with RDB or no persistence, since a restart will pass the master role to a slave.
However, for this to work, slaves should also refuse to connect to a rebooted master if it looks unreliable from the point of view of the slave's old history. This leads to the third feature:
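The reboot check itself can be sketched in a few lines of Python. This is not Sentinel's actual code: RebootDetector and parse_run_id are hypothetical names, and the only thing taken from Redis is that the INFO output carries the run ID on a line of the form run_id:<value>.

```python
from typing import Dict, Optional


def parse_run_id(info_text: str) -> Optional[str]:
    """Extract the run_id field from the text of an INFO reply."""
    for line in info_text.splitlines():
        if line.startswith("run_id:"):
            return line.split(":", 1)[1].strip()
    return None


class RebootDetector:
    """Remembers the last run ID seen for each monitored instance."""

    def __init__(self) -> None:
        self._last_runid: Dict[str, str] = {}

    def observe(self, instance: str, info_text: str) -> bool:
        """Return True when the instance reports a different run ID
        than the one seen previously, meaning it has restarted."""
        runid = parse_run_id(info_text)
        if runid is None:
            return False  # nothing to compare against
        previous = self._last_runid.get(instance)
        self._last_runid[instance] = runid
        return previous is not None and previous != runid
```

Under this proposal, a True result for a master would be enough to mark it as failing, regardless of how long it was actually unreachable.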
Slave ability to strictly follow history when reconnecting to the master
Slaves know the old master replication ID and the offset they are up to. When strict history is configured, a slave should only accept successful PSYNC2 replies, or should accept full resynchronizations only when the full sync is needed because:
- The slave had no past history at all, it's a new slave.
- There was no replication backlog to serve the slave, but otherwise the ID matches, and the master offset is in the future.
Otherwise, if the full synchronization is triggered because the master recognizes the ID, but the offset of the master is in the past compared to what the slave is reporting (so the slave is more up to date), or the master does not know the replication ID at all (the trivial case: a master without persistence is rebooted), the slave should stop the synchronization attempt and retry later as usual. In the meantime the failover will promote a new master and the replication should be able to continue.
However, Sentinel and Redis Cluster must be able to override this setting. After a failover happens, a variant of SLAVEOF should be able to force the instance to accept the new master.
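The rules above amount to a small decision function. The Python sketch below is one possible reading of them, not a definitive implementation: every name in it (Decision, strict_history_decision, and all the parameters) is hypothetical, the force flag stands in for the SLAVEOF variant that Sentinel and Cluster would use, and treating equal offsets as acceptable is an assumption, since the text only distinguishes a master offset "in the future" from one "in the past".

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"            # go ahead with the synchronization
    RETRY_LATER = "retry-later"  # drop the attempt and retry as usual


def strict_history_decision(
    slave_replid: Optional[str],  # None means the slave has no history
    slave_offset: int,
    partial_resync_ok: bool,      # master accepted the incremental sync
    master_knows_replid: bool,    # master recognizes the slave's ID
    master_offset: int,
    force: bool = False,          # override requested after a failover
) -> Decision:
    """Decide whether a slave in strict-history mode follows the master."""
    if force:
        return Decision.ACCEPT
    if partial_resync_ok:
        return Decision.ACCEPT
    # From here on the master is offering a full resynchronization.
    if slave_replid is None:
        return Decision.ACCEPT  # brand new slave, nothing to lose
    if master_knows_replid and master_offset >= slave_offset:
        # Same history, the master is just missing the backlog.
        return Decision.ACCEPT
    # Unknown ID, or the master is behind the slave: wait for a failover.
    return Decision.RETRY_LATER
```

For example, a slave at offset 500 that is offered a full sync by a master that knows its ID but sits at offset 400 gets RETRY_LATER, while the same call with force=True gets ACCEPT.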
Conclusions
The implementation of the above changes should significantly improve the reliability of clusters of Redis instances in HA setups whenever AOF is not used, and should also improve synchronization times when AOF is used. This feature currently has no planned ETA and must be designed with care in every detail, so the first step is to follow up with a design document similar to the one for PSYNC2.