Skip to content

Don't read from timerfd and eventfd on each EventLoop tick#9192

Merged
normanmaurer merged 1 commit into4.1from
eventfd_timerfd
May 31, 2019
Merged

Don't read from timerfd and eventfd on each EventLoop tick#9192
normanmaurer merged 1 commit into4.1from
eventfd_timerfd

Conversation

@normanmaurer
Copy link
Copy Markdown
Member

Motivation:

We do not need to issue a read on timerfd and eventfd when the EventLoop wakes up if we register these as Edge-Triggered. This removes the overhead of 2 syscalls and so helps to reduce latency.

Modifications:

  • Ensure we register the timerfd and eventfd with EPOLLET flag
  • If eventfd_write fails with EAGAIN, call eventfd_read and try eventfd_write again as we only use it as wake-up mechanism.

Result:

Less syscalls and so reducing overhead.

Co-authored-by: Carl Mastrangelo carl@carlmastrangelo.com

@normanmaurer
Copy link
Copy Markdown
Member Author

/cc @johnou

Copy link
Copy Markdown
Member

@carl-mastrangelo carl-mastrangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, but LGTM.

Motivation:

We do not need to issue a read on timerfd and eventfd when the EventLoop wakes up if we register these as Edge-Triggered. This removes the overhead of 2 syscalls and so helps to reduce latency.

Modifications:

- Ensure we register the timerfd and eventfd with EPOLLET flag
- If eventfd_write fails with EAGAIN, call eventfd_read and try eventfd_write again as we only use it as wake-up mechanism.

Result:

Less syscalls and so reducing overhead.

Co-authored-by: Carl Mastrangelo <carl@carlmastrangelo.com>
@normanmaurer
Copy link
Copy Markdown
Member Author

@netty-bot test this please (test failure unrelated)

Copy link
Copy Markdown
Member

@njhill njhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though I'm not an epoll syscall expert

Copy link
Copy Markdown
Contributor

@franz1981 franz1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one!

@normanmaurer normanmaurer self-assigned this May 28, 2019
@normanmaurer normanmaurer requested a review from ejona86 May 30, 2019 18:00
netty_unix_errors_throwChannelExceptionErrorNo(env, "eventfd_write() failed: ", errno);
uint64_t val;

for (;;) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to get into some type of weird busy loop here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so eventfd_write isn't going to continuously return -1 and eventfd_read 0.

@normanmaurer normanmaurer merged commit f6cf681 into 4.1 May 31, 2019
@normanmaurer normanmaurer deleted the eventfd_timerfd branch May 31, 2019 04:59
normanmaurer added a commit that referenced this pull request May 31, 2019
Motivation:

We do not need to issue a read on timerfd and eventfd when the EventLoop wakes up if we register these as Edge-Triggered. This removes the overhead of 2 syscalls and so helps to reduce latency.

Modifications:

- Ensure we register the timerfd and eventfd with EPOLLET flag
- If eventfd_write fails with EAGAIN, call eventfd_read and try eventfd_write again as we only use it as wake-up mechanism.

Result:

Less syscalls and so reducing overhead.

Co-authored-by: Carl Mastrangelo <carl@carlmastrangelo.com>
amir73il pushed a commit to amir73il/man-pages that referenced this pull request Oct 8, 2024
For the moment, the edge-triggered epoll generates an event for each
receipt of a chunk of data, that is to say, epoll_wait() will return
and tell us a monitored file descriptor is ready whenever there is a
new activity on that FD since we were last informed about that FD.
This is not a real _edge_ implementation for epoll, but it's been
working this way for years and plenty of projects are relying on it
to eliminate the overhead of one system call of read(2) per wakeup event.

There are several renowned open-source projects relying on this feature
for notification function (with eventfd): register eventfd with EPOLLET
and avoid calling read(2) on the eventfd when there is wakeup event (eventfd being written).
Examples: nginx [1], netty [2], tokio [3], libevent [4], ect. [5]
These projects are widely used in today's Internet infrastructures.
Thus, changing this behavior of epoll ET will fundamentally break them
and cause a significant negative impact.
Linux has changed it for pipe before [6], breaking some Android libraries,
which had got "reverted" somehow. [7] [8]

Nevertheless, the paragraph in the manual pages describing this
characteristic of epoll ET seems ambiguous, I think a more explict
sentence should be used to clarify it. We're improving the notification
mechanism for libuv recently by exploiting this feature with eventfd,
which brings us a significant performance boost. [9]

Therefore, we (as well as the maintainers of nginx, netty, tokio, etc.)
would have a sense of security to build an enhanced notification function
based on this feature if there is a guarantee of retaining this implementation
of epoll ET for the backward compatibility in the man pages.

[1]: https://github.com/nginx/nginx/blob/efc6a217b92985a1ee211b6bb7337cd2f62deb90/src/event/modules/ngx_epoll_module.c#L386-L457
[2]: netty/netty#9192
[3]: https://github.com/tokio-rs/mio/blob/309daae21ecb1d46203a7dbc0cf4c80310240cba/src/sys/unix/waker.rs#L111-L143
[4]: https://github.com/libevent/libevent/blob/525f5d0a14c9c103be750f2ca175328c25505ea4/event.c#L2597-L2614
[5]: libuv/libuv#4400 (comment)
[6]: https://lkml.iu.edu/hypermail/linux/kernel/2010.1/04363.html
[7]: torvalds/linux@3a34b13
[8]: torvalds/linux@3b84482
[9]: libuv/libuv#4400 (comment)

Signed-off-by: Andy Pan <i@andypan.me>
Cc: <linux-api@vger.kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Message-ID: <20240801-epoll-et-desc-v5-1-7fcb9260a3b2@andypan.me>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants