Description
Today, Redis propagates a relative expire time (for example SET key value EX 100, meaning expire 100 seconds from now) to replicas as a relative expire, but stores it in the AOF and RDB as an absolute time. The motivation for relative propagation (#5171 (comment)) is that clocks might be skewed between two nodes, so we want the key to expire at roughly the same time on both. Since we expect the AOF and RDB to be loaded much later, that constraint doesn't apply there. This introduces a couple of weird notions:
- A replica might retain the data much longer than the primary, since it could have significant replication lag (for example, right after a full sync).
- Data is still sometimes replicated with an absolute time. If part of the application uses relative times and part uses absolute times, there will be odd discrepancies in how long data lives on the replica.
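To make the first point concrete, here is a minimal sketch of the arithmetic. It is not Redis code: the function names and the numbers are hypothetical, and it assumes the primary and replica clocks are perfectly in sync so that lag is the only variable.

```python
def replica_expiry_relative(set_at_ms: int, ttl_ms: int, lag_ms: int) -> int:
    """Relative propagation: the replica restarts the TTL countdown
    at the moment the command arrives, so lag pushes the deadline out."""
    return set_at_ms + lag_ms + ttl_ms


def replica_expiry_absolute(set_at_ms: int, ttl_ms: int, lag_ms: int) -> int:
    """Absolute propagation: the replica reuses the deadline computed
    on the primary, so lag does not shift it."""
    return set_at_ms + ttl_ms


# Hypothetical numbers: a 100 second TTL applied 30 seconds late.
set_at_ms = 1_000_000
ttl_ms = 100_000
lag_ms = 30_000

primary_deadline = set_at_ms + ttl_ms
extra_relative = replica_expiry_relative(set_at_ms, ttl_ms, lag_ms) - primary_deadline
extra_absolute = replica_expiry_absolute(set_at_ms, ttl_ms, lag_ms) - primary_deadline

print(f"extra lifetime on replica (relative): {extra_relative} ms")
print(f"extra lifetime on replica (absolute): {extra_absolute} ms")
```

With relative propagation the replica keeps the key for the full lag beyond the primary's deadline; with absolute propagation the remaining difference comes only from clock skew, which this sketch deliberately leaves out.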
There is also a second weird issue: an expire can cause the replica to present a view of the data that never existed on the primary. Some workloads send some requests to the primary and some to the replica, so it's weird that the replica may be "ahead" of the primary because it logically expired a key that the primary still holds.
My suggestions:
- Always replicate the expire as an absolute time. This should solve the two issues mentioned in the list above.
- Have a flag that makes expire "linearizable" across the cluster. Replicas will no longer make independent decisions about whether to show data, so you will not have a view on the replica that is inconsistent with the primary. I suggest making this a config, because I can imagine some workloads that don't care and would prefer the key simply be knocked out of the cache.
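The two suggestions could be sketched roughly as follows. This is an illustration only, not how Redis implements propagation: to_absolute, replica_shows_key, and the linearizable_expire flag are hypothetical names, and the rewrite handles only the basic EXPIRE key seconds and PEXPIRE key milliseconds forms. PEXPIREAT is the existing Redis command that sets an absolute deadline in milliseconds.

```python
from typing import List


def to_absolute(args: List[str], now_ms: int) -> List[str]:
    """Rewrite a relative expire into PEXPIREAT, using the primary's clock.

    Commands other than the basic EXPIRE/PEXPIRE forms pass through unchanged.
    """
    name = args[0].upper()
    if name == "EXPIRE" and len(args) == 3:
        return ["PEXPIREAT", args[1], str(now_ms + int(args[2]) * 1000)]
    if name == "PEXPIRE" and len(args) == 3:
        return ["PEXPIREAT", args[1], str(now_ms + int(args[2]))]
    return list(args)


def replica_shows_key(deadline_ms: int, replica_now_ms: int,
                      linearizable_expire: bool) -> bool:
    """Decide whether a replica should still serve a key that has a deadline.

    With the (hypothetical) flag on, the replica never hides the key on its
    own; it keeps serving it until the primary sends an explicit delete.
    With the flag off, the replica hides the key once its local clock
    passes the deadline.
    """
    if linearizable_expire:
        return True
    return replica_now_ms < deadline_ms


# The primary rewrites the command before propagating it.
propagated = to_absolute(["EXPIRE", "session:1", "100"], now_ms=1_000_000)

# A replica whose clock is already past the deadline.
deadline_ms = int(propagated[2])
hidden_by_default = not replica_shows_key(deadline_ms, deadline_ms + 1, False)
served_when_linearizable = replica_shows_key(deadline_ms, deadline_ms + 1, True)
```

The trade-off in the second function is the one described in the suggestion: with the flag on, the replica may serve a key slightly past its deadline, but it never shows a state the primary has not reached yet.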