Monotonic clock and updates to AE #7644
Conversation
Rebase to latest OSS
looks good indeed. TBH we really need to reduce calls to |
@guybe7 You are correct. The monotonic clock should never be used in an attempt to "deduce the wall-clock time". It is only useful for determining elapsed time (aka deltas) - which is a common need. Re: reduction of calls to |
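To illustrate the delta-only usage pattern, here is a minimal sketch using the `getMonotonicUs`/`monotime` names from the PR description (the surrounding function and the printed message are purely illustrative, and the code assumes it is linked against the PR's monotonic clock source):

```c
#include <stdint.h>
#include <stdio.h>

/* Provided by the monotonic clock API added in this PR: micro-seconds
 * elapsed from an arbitrary (unspecified) starting point. */
typedef uint64_t monotime;
extern monotime getMonotonicUs(void);

void timedSection(void) {
    monotime start = getMonotonicUs();

    /* ... work being measured ... */

    uint64_t elapsed_us = getMonotonicUs() - start;  /* delta only, never wall-clock */
    printf("section took %llu us\n", (unsigned long long)elapsed_us);
}
```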
What a great idea Jim! Disclosure: this is an AWS contribution, so we already run a similar variant of this internally and have for a while. Although I think this is specifically beneficial here, I also agree that it is probably useful to replace some of the other usages of mstime/gettimeofday within Redis when precise time isn't necessary and we can get a little performance boost. I am good with the current implementation as is, but this is a change worthy of consensus. The specific concerns I had that it would be great for others to validate:
oranagra left a comment:
@JimB123 thanks a lot for this contribution.
I agree with everyone that it's an important step forward, and we can merge it as is and later worry about using it in more places.
I also agree with Guy and others that in some (many) places we should just cache the mstime/ustime of the current command rather than sample it repeatedly, but that's certainly another topic.
Regarding styling:
I personally think ifdefs should not induce code indentation. Instead i like the #else and #endif to have a comment indicating to which #if they belong.
I usually prefer statements that change execution flow (goto, return, continue, break) to be on a line of their own (not with the if), except in a repeated error handling code which would add too many lines and hide the actual logic code.
But these are not a "one size fits all" (specifically since this is a new file), it's up to you to decide if you wanna change it.
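For illustration only, here is a small fragment in the style described above: no indentation induced by the preprocessor conditionals, and a comment on #else/#endif naming the #if they close. The macro name and the helper function are placeholders, not the PR's actual code:

```c
#include <stdint.h>
#include <time.h>

#if defined(USE_PROCESSOR_CLOCK)
static uint64_t getMonotonicUs(void) {
    return readProcessorClockUs();   /* placeholder for the RDTSC/CNTVCT path */
}
#else /* USE_PROCESSOR_CLOCK */
static uint64_t getMonotonicUs(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000 + (uint64_t)ts.tv_nsec / 1000;
}
#endif /* USE_PROCESSOR_CLOCK */
```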
I saw the PR description says the default is to use clock_gettime, but correct me if i'm wrong, you mean that it's the "fallback", right? i.e. by default it will use the faster HW specific methods and only fallback to POSIX if they're missing at compile time.
@oranagra Originally, I coded this to use |
I fail to see what the risks are of enabling it by default (@madolson can you elaborate?). Many will not be aware of this flag and will just remain with a less efficient Redis.
@yossigo what do you think? The second bullet brings me back to the assertions. Again, assert is more suitable for checking a condition that should logically never happen; in this case it checks external conditions (parsing a file), and on failure it should print a message to stderr and call |
@JimB123 This is a great contribution, thanks!
@yossigo so just to be sure, you support making it enabled in a default build, and having it simply print an error and exit if there is a problem (i.e. no fallback at runtime is needed).
@oranagra no. I don't have a strong opinion about it being built by default; I know it can have undesired side effects in some cases (e.g. depending on the CPU / firmware / hypervisor in use, in case a CPU is hotswapped, etc.). But if we do compile support for it, I don't see a reason to exit if we fail to initialize. We can just produce a warning and fall back to |
ohh, missed one word in your previous post. Personally i don't really like an important optimization to be opt-in. i think there's a high chance that many will not enjoy it.
@oranagra I submit that this is an acceptable use of assert. These conditions should never happen unless there is a fundamental error in the code or compilation options. I agree that this is an "external condition" in that it's checking the expected functionality of the operating system (at startup). But this is not "parsing a file" as there is no "file". Linux "files" under
I don't see much difference between
While I think use of |
Note that supporting run-time fallback would probably offset the value of providing |
@oranagra My main concern was to make sure Redis kept building out of the box and didn't exhibit weird clock behavior. There seem to be a lot of weird compatibility issues around which timers are supported on different OSs. It looks like it's gotten much more consistent in the last couple of years. If someone is running an old enough x86 processor, it might not actually work correctly as the TSCs between cores aren't synced. So I thought letting people slowly test it out with an opt-in flag for a while might work, and then at some point we build it by default. Looking back at the internal review, I was also concerned it might not initialize correctly. So the alternative approach is to make sure at startup everything works as expected and otherwise fall back to clock_gettime(). I was also okay with that approach, but it's slightly more complex.
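A rough sketch of that alternative (validate at startup, otherwise fall back) might look like the following; the function and variable names here are hypothetical and this is not the PR's actual code:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

typedef uint64_t (*monotonicFn)(void);

static uint64_t monotonicPosixUs(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000 + (uint64_t)ts.tv_nsec / 1000;
}

/* Hypothetical: returns nonzero and fills *fn only if the processor clock
 * (TSC/CNTVCT) can be calibrated and trusted on this host. */
extern int tryInitProcessorClock(monotonicFn *fn);

static monotonicFn monotonicUs = monotonicPosixUs;

void monotonicInit(void) {
    monotonicFn hw;
    if (tryInitProcessorClock(&hw)) {
        monotonicUs = hw;    /* fast path, validated at startup */
    } else {
        fprintf(stderr, "Processor clock unusable, falling back to clock_gettime\n");
    }
}
```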
Thank you all for the positive reception of this feature. It's nice to see that there are no issues/concerns with the functional interface or the changes to
@madolson, as primary reviewer for this, please reach consensus (or make an executive decision) regarding the following open questions:
Hello, I guess I'll post the update:
We also need to document it somewhere (maybe in the README).
The code used to decide on the next time to wake on a timer with microsecond accuracy, but when deciding to go to sleep it used millisecond accuracy (with truncation). This means that it would wake up too early, see that there's no timer to process, and go to sleep again for 0ms, again and again, until the right microsecond arrived. I.e. for a 100ms timer, it would sleep for 99ms, but then do a busy loop through the kernel in the last millisecond, triggering many calls to beforeSleep. The fix is to change all the logic in ae.c to work with microseconds, which is good since most of the ae backends support micro (or even nano) seconds. However, the epoll backend doesn't support microseconds, so to avoid this problem it needs to round upwards, rather than truncate. Issue created by the monotonic timer PR #7644 (redis 6.2). Before that, all the timers in ae.c were in milliseconds (using mstime), so when it requested the backend to sleep till the next timer event, it would have worked ok.
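A minimal sketch of the rounding change described above, assuming a helper that converts the microsecond delay computed in ae.c into the millisecond timeout that epoll_wait() accepts (the helper name is illustrative, not the actual patch):

```c
/* Convert "microseconds until the nearest timer" into an epoll_wait() timeout.
 * Truncating would wake up before the timer is due and busy-loop through
 * beforeSleep; rounding upwards sleeps at least as long as requested. */
static int usToEpollTimeoutMs(long long us_until_timer) {
    if (us_until_timer < 0) return -1;            /* no timer: block indefinitely */
    return (int)((us_until_timer + 999) / 1000);  /* round up, don't truncate */
}
```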
Sorry for asking a maybe stupid question... Does it mean that as of Redis 6.2, Redis uses a monotonic clock? Or is it a config option... Does it depend on the platform? Thanks in advance
@jhmilan there is an internal API which provides a monotonic clock for some (not all) Redis functions. Eventually most of Redis should use that API. Currently most of Redis uses gettimeofday.
Behind the API is a clock implementation which is determined at compile time. There are options to directly use the X86 TSC, the ARM equivalent, or a POSIX implementation. Note that, other than slight performance variations across platforms, this has no impact on the Redis client experience.
Thanks for your quick reply @JimB123
@jhmilan the expiration mechanism always uses wall-clock (not monotonic), and i'm not sure it should change.
@oranagra sure. This is a very interesting read where Martin Kleppmann talks about this topic: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
From this I understand that it is possible that a given key expires before its expected TTL. If that is the case, then, as explained in the article, Redlock could violate the safety property for the lock acquisition.
that's all true (except for the clock jumping backwards, since a proper configuration should use slewing), but the question is if / how this breaks redlock (which i don't know much about).
Say you have 5 Redis nodes for your Redlock implementation. Then 2 of them are down. You acquire a lock for the resource "foo" for 30 seconds (it basically means setting the key "foo" in a majority of nodes with a 30-second TTL). Now imagine that the 2 nodes that were down come back to life again, and suppose there is a clock shift in node B which makes the TTL expire. Basically it would mean that 2/5 nodes (a minority) would hold the lock, but someone trying to acquire "foo" would succeed since 3/5 nodes would set that key. In this situation 2 clients could have acquired the same lock simultaneously. The chances for this to happen are quite low I think (especially for short-lived TTLs), but still, it makes Redlock unsafe and not eligible for some scenarios where consistency is crucial. By using a monotonic clock for the expiration mechanism that issue would be gone. I guess that this is not important when using Redis as a cache layer, where you would just be missing some key that you can easily refresh.
Small thought here... Regardless of the Redlock use case, in my opinion the TTL mechanism should be reliable, and the fact that it uses the wall-clock makes it weak. I never thought about it very thoroughly, but it scopes Redis to just being a simple cache layer and no more than this. It is a bit disappointing.
Wall-clocks have been getting more reliable and consistent. Within AWS, you can configure NTP to use AWS time sync which provides a very accurate global clock which correctly slews with changes. I don't think the answer to more precise time measurement is to necessarily use the hardware clocks/monotonic APIs, especially for distributed systems.
@madolson "more reliable" is not enough in my opinion. Your argument is that "you can set up the system in a way that makes this problem less likely" and, ok, but this is not sufficient I would say. It is not about being against NTP and wall clocks in general, it is just about this specific use case in Redis TTL. I think it should be (100%) reliable. A TTL should never ever expire before its expiration time. It never ever should be affected by something external. |
@jhmilan This is not as simple as it looks; you are looking at it from the perspective of Redlock, but Redis has other concerns as well.
Thanks for your response. And please, don't take this the wrong way.
My only suggestion here is adding some disclaimer in the documentation clearly stating the limitations and risks around TTLs, clocks, etc. (or maybe they are there and I missed them). As a user of Redis I assumed that TTLs are consistent. It is not something that you think of upfront. So discovering that wall-clock issues can affect Redis TTLs scares you a bit as a user. It would be nice to know exactly what the potential problems are, how you can prevent them, what the implications are, what you should/should not do, etc... And particularly, the Redlock docs must warn about it very clearly. I love Redis and I think it is a great solution. My problem has been that I implemented something that Redis has brought up publicly (Redlock) and, after implementing it, I discovered that something in the guts of Redis makes this (official) solution weak.
Any light you can bring in that sense would be very welcome. Thank you very much in advance.
Hi. Regarding Redlock, it could be that something has changed since it was designed. Well, many things have changed, but i can't think of something that broke it. Regarding other documentation about TTL, i'm not certain what's already documented and where it could be added.
I will, for sure. Thanks for clarifying this to me.
Update adds a general source for retrieving a monotonic time. In addition, AE has been updated to utilize the new monotonic clock for timer processing. This performance improvement is **not** enabled in a default build due to various H/W compatibility concerns, see README.md for details. It does, however, replace the default use of gettimeofday with clock_gettime and somewhat improves performance.
This update provides the following:
1. An interface for retrieving a monotonic clock. getMonotonicUs returns a uint64_t (aka monotime) with the number of micro-seconds from an arbitrary point. No more messing with tv_sec/tv_usec. Simple routines are provided for measuring elapsed milli-seconds or elapsed micro-seconds (the most common use case for a monotonic timer). No worries about time moving backwards.
2. High-speed assembler implementation for x86 and ARM. The standard method for retrieving the monotonic clock is POSIX.1b (1993): clock_gettime(CLOCK_MONOTONIC, timespec*). However, most modern processors provide a constant speed instruction clock which can be retrieved in a fraction of the time that it takes to call clock_gettime. For x86, this is provided by the RDTSC instruction. For ARM, this is provided by the CNTVCT_EL0 instruction. As a compile-time option, these high-speed timers can be chosen. (Default is POSIX clock_gettime.)
3. Refactor of event loop timers. The timer processing in ae.c has been refactored to use the new monotonic clock interface. This results in simpler/cleaner logic and improved performance.
Background
Redis needs an internal API for using a monotonic clock. Currently, most of Redis uses gettimeofday. This is a problem because time of day can be altered due to system time changes. The system time can be altered manually or by NTP. There is actually code, included in ae.c, which checks to see if the clock has moved backwards. However, most of Redis uses gettimeofday for timing, etc. - simply assuming that it is a monotonic clock.
This update provides the following:
1. An interface for retrieving a monotonic clock. getMonotonicUs returns a uint64_t (aka monotime) with the number of micro-seconds from an arbitrary point. No more messing with tv_sec/tv_usec. Simple routines are provided for measuring elapsed milli-seconds or elapsed micro-seconds (the most common use case for a monotonic timer). No worries about time moving backwards.
2. High-speed assembler implementation for x86 and ARM (see the sketch below). The standard method for retrieving the monotonic clock is POSIX.1b (1993): clock_gettime(CLOCK_MONOTONIC, timespec*). However, most modern processors provide a constant speed instruction clock which can be retrieved in a fraction of the time that it takes to call clock_gettime. For x86, this is provided by the RDTSC instruction. For ARM, this is provided by the CNTVCT_EL0 instruction. As a compile-time option, these high-speed timers can be chosen. (Default is POSIX clock_gettime.)
3. Refactor of event loop timers. The timer processing in ae.c has been refactored to use the new monotonic clock interface. This results in simpler/cleaner logic and improved performance.
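As a rough illustration of item 2 (not the PR's exact code), reading the raw constant-rate counters with GCC/Clang intrinsics and inline assembly; converting the resulting ticks to micro-seconds still requires the counter frequency:

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>
static inline uint64_t readRawClockTicks(void) {
    return __rdtsc();                                      /* x86 time-stamp counter */
}
#elif defined(__aarch64__)
static inline uint64_t readRawClockTicks(void) {
    uint64_t ticks;
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(ticks));  /* ARM virtual counter */
    return ticks;
}
#endif
```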
Performance implications
With clock_gettime, timer processing is improved about 18% due to streamlined logic in ae.c.
Test details
Simple test program creating an event loop and 3 timers - each firing with 0ms delay. Linux host, with processor: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz. The events each incremented a counter. A beforeSleep proc checked elapsed time (using the monotonic clock) until 5 seconds had passed. Compared average count of timers fired.
Note: on my test machine, clock_gettime is measured to run about 5% slower than gettimeofday. However it is expected that this varies from system to system and that they are nominally similar in terms of performance.
Future updates
Introduction of this API provides the opportunity to begin replacing ustime and mstime calls throughout the code. With the huge performance difference provided by using the processor clock, this will provide benefits to any feature which relies on time-based code.
For those unfamiliar with the x86 TSC
The x86 maintains a counter which is based on a constant clock rate at the nominal clocking of the processor. Even though individual cores are running at various and changing clock rates, they all share the same constant-rate clock. On multi-CPU systems, all CPUs are sync'd to a common clock source.
Though you can still find articles about older CPUs that did not support a constant-rate TSC, all of the reasonably modern x86 (Intel and AMD) processors support a constant-rate TSC. On Linux, you can check for the constant_tsc flag in /proc/cpuinfo.
Note also, though the TSC runs at a constant rate at the nominal speed of the processor, the processor itself doesn't actually know its speed. The OS will report the speed in /proc/cpuinfo (Linux). On Windows, the QueryPerformanceFrequency function may be used. The ticks-per-second needs to be used to convert the TSC to common units of time (milli-seconds/micro-seconds).
The TSC is not guaranteed to be wall-clock precise - it may suffer from clock-drift and is not corrected by NTP. It's fine for timing short intervals, but (for example) system uptime should be measured with a different mechanism (like gettimeofday).
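For example, a hedged sketch of that ticks-to-time conversion, assuming the ticks-per-second value has already been read from the OS at startup (the 2.5 GHz value below is only a placeholder, and the calibration step itself is omitted):

```c
#include <stdint.h>

/* Constant-rate counter frequency, obtained from the OS at startup
 * (e.g. derived from /proc/cpuinfo); placeholder value only. */
static uint64_t ticks_per_second = 2500000000ULL;

static inline uint64_t ticksToUs(uint64_t ticks) {
    /* Split into whole seconds and remainder to avoid 64-bit overflow
     * when multiplying by 1,000,000. */
    uint64_t secs = ticks / ticks_per_second;
    uint64_t rem  = ticks % ticks_per_second;
    return secs * 1000000ULL + (rem * 1000000ULL) / ticks_per_second;
}
```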