-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Fix busy loop in ae.c when timer event is about to fire #8764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The code used to decide on the next time to wake on a timer with microsecond accuracy, but when deciding to go to sleep it used milliseconds accuracy (with truncation), this means that it would wake up too early, see that there's no timer to process, and go to sleep again for 0ms again and again until the right microsecond arrived. i.e. a timer for 100ms, would sleep for 99ms, but then do a busy loop though the kernel in the last millisecond. The fix is to change all the logic in ae.c to work with microseconds, which is good since most of the ae backends support micro (or even nano) seconds. however the epoll backend, doesn't support micro, so to avoid this problem it needs to round upwards, rather than truncate. Issue created by the monotonic timer PR #7644 (redis 6.2) Before that, all the timers in ae.c were in milliseconds (using mstime), so when it requested the backend to sleep till the next timer event, it would have worked ok.
yossigo
approved these changes
Apr 12, 2021
JimB123
approved these changes
Apr 12, 2021
Contributor
JimB123
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Merged
JackieXie168
pushed a commit
to JackieXie168/redis
that referenced
this pull request
Sep 8, 2021
The code used to decide on the next time to wake on a timer with microsecond accuracy, but when deciding to go to sleep it used milliseconds accuracy (with truncation), this means that it would wake up too early, see that there's no timer to process, and go to sleep again for 0ms again and again until the right microsecond arrived. i.e. a timer for 100ms, would sleep for 99ms, but then do a busy loop through the kernel in the last millisecond, triggering many calls to beforeSleep. The fix is to change all the logic in ae.c to work with microseconds, which is good since most of the ae backends support micro (or even nano) seconds. however the epoll backend, doesn't support micro, so to avoid this problem it needs to round upwards, rather than truncate. Issue created by the monotonic timer PR redis#7644 (redis 6.2) Before that, all the timers in ae.c were in milliseconds (using mstime), so when it requested the backend to sleep till the next timer event, it would have worked ok.
panjf2000
added a commit
to panjf2000/redis
that referenced
this pull request
Jan 29, 2024
`epoll_pwait2()` was introduced in Linux kernel 5.11, it extends `epoll_wait()`/`epoll_pwait()` by allowing the `timeout` to be specified with a higher resolution, as a `timespec` type, providing nanosecond precision. By using `epoll_pwait2()`, we can resolve the problem in redis#8764 faultlessly, once and for all. Furthermore, it would also offer the possibility for us to handle timers in a higher resolution in the future: from microseconds to nanoseconds (if necessary).
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 25, 2025
The code used to decide on the next time to wake on a timer with microsecond accuracy, but when deciding to go to sleep it used milliseconds accuracy (with truncation), this means that it would wake up too early, see that there's no timer to process, and go to sleep again for 0ms again and again until the right microsecond arrived. i.e. a timer for 100ms, would sleep for 99ms, but then do a busy loop through the kernel in the last millisecond, triggering many calls to beforeSleep. The fix is to change all the logic in ae.c to work with microseconds, which is good since most of the ae backends support micro (or even nano) seconds. however the epoll backend, doesn't support micro, so to avoid this problem it needs to round upwards, rather than truncate. Issue created by the monotonic timer PR redis#7644 (redis 6.2) Before that, all the timers in ae.c were in milliseconds (using mstime), so when it requested the backend to sleep till the next timer event, it would have worked ok.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The code used to decide on the next time to wake on a timer with
microsecond accuracy, but when deciding to go to sleep it used
milliseconds accuracy (with truncation), this means that it would wake
up too early, see that there's no timer to process, and go to sleep
again for 0ms again and again until the right microsecond arrived.
i.e. a timer for 100ms, would sleep for 99ms, but then do a busy loop
through the kernel in the last millisecond, triggering many calls to
beforeSleep.
The fix is to change all the logic in ae.c to work with microseconds,
which is good since most of the ae backends support micro (or even nano)
seconds. however the epoll backend, doesn't support micro, so to avoid
this problem it needs to round upwards, rather than truncate.
Issue created by the monotonic timer PR #7644 (redis 6.2)
Before that, all the timers in ae.c were in milliseconds (using
mstime), so when it requested the backend to sleep till the next timer
event, it would have worked ok.