Migrate event loops from file events to buffer events

*Title*: *Migrate event loops from file events to buffer events*

*Description*:

This could save at least three syscalls per request (two `writev()` calls and one `readv()`).

Currently Envoy relies on Libevent to implement its event loops and uses it in the "readiness" paradigm. That is, Libevent (on behalf of the kernel) notifies the application (Envoy) that a certain file descriptor is ready to be read or written; the application then makes a syscall to read from or write to the file descriptor. Envoy spends ~20% of a request's time span waiting for these syscalls to return.
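To make the "readiness" flow concrete, here is a minimal sketch using plain `poll()` on a local socket pair instead of Libevent. The helper name `readinessRoundTrip` is made up for illustration and is not Envoy code; the point is only that every transfer costs a readiness notification plus a separate syscall to move the data.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <string>

// Hypothetical helper: sends a short `msg` through a local socket pair and
// returns what was read back, using the readiness pattern for both directions.
std::string readinessRoundTrip(const std::string& msg) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
  std::string result;

  // Step 1: ask the kernel whether fds[0] can be written (first syscall).
  pollfd out{fds[0], POLLOUT, 0};
  if (poll(&out, 1, 1000) == 1 && (out.revents & POLLOUT)) {
    // Step 2: a separate syscall actually moves the data.
    ssize_t sent = write(fds[0], msg.data(), msg.size());

    // Step 3: wait until the peer end becomes readable.
    pollfd in{fds[1], POLLIN, 0};
    if (sent > 0 && poll(&in, 1, 1000) == 1 && (in.revents & POLLIN)) {
      // Step 4: yet another syscall to fetch the bytes.
      char buf[256];
      ssize_t n = read(fds[1], buf, sizeof(buf));
      if (n > 0) result.assign(buf, static_cast<size_t>(n));
    }
  }
  close(fds[0]);
  close(fds[1]);
  return result;
}
```

Calling `readinessRoundTrip("ping")` issues two `poll()` calls plus one `write()` and one `read()`; the last two are the kind of per-request syscalls this issue is about.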

Libevent also provides an API called "bufferevents" for event loops working in the "completeness" paradigm (or buffered/ringed IO). That is, an application registers read and write buffers, and Libevent then notifies the application that data is available in the read buffer or that the chunk of data put in the write buffer has been consumed. If the underlying OS supports this paradigm (as with Microsoft's IOCP), fewer computing resources are spent on syscalls like `writev()` or `readv()` because no context switches are needed. Otherwise it's still file events under the hood.
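The shape of such a buffer-oriented interface can be sketched without Libevent. The class below is a hypothetical model, not the bufferevent API: `BufferedConn`, `enqueue`, `runOnce` and `bufferedEcho` are names invented for this example. The application only touches buffers and callbacks, while this particular fallback still performs `poll()`, `read()` and `write()` internally, i.e. the "file events under the hood" case.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <functional>
#include <string>
#include <utility>

// Hypothetical buffered connection (illustration only).
class BufferedConn {
 public:
  using ReadCb = std::function<void(std::string& input)>;
  using DrainedCb = std::function<void()>;

  BufferedConn(int fd, ReadCb on_read, DrainedCb on_drained)
      : fd_(fd), on_read_(std::move(on_read)), on_drained_(std::move(on_drained)) {}

  // Queues data in the write buffer; no syscall happens here.
  void enqueue(const std::string& data) { output_ += data; }

  // One loop iteration of the readiness-based fallback.
  bool runOnce(int timeout_ms) {
    short events = POLLIN;
    if (!output_.empty()) events |= POLLOUT;
    pollfd p{fd_, events, 0};
    if (poll(&p, 1, timeout_ms) <= 0) return false;
    if (p.revents & POLLOUT) {
      ssize_t n = ::write(fd_, output_.data(), output_.size());
      if (n > 0) {
        output_.erase(0, static_cast<size_t>(n));
        if (output_.empty()) on_drained_();  // write buffer fully consumed
      }
    }
    if (p.revents & POLLIN) {
      char buf[4096];
      ssize_t n = ::read(fd_, buf, sizeof(buf));
      if (n > 0) {
        input_.append(buf, static_cast<size_t>(n));
        on_read_(input_);  // data is available in the read buffer
      }
    }
    return true;
  }

 private:
  int fd_;
  ReadCb on_read_;
  DrainedCb on_drained_;
  std::string input_, output_;
};

// Hypothetical driver: pushes `msg` through a socket pair via two BufferedConns.
std::string bufferedEcho(const std::string& msg) {
  if (msg.empty()) return "";
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";
  std::string received;
  bool drained = false;
  BufferedConn sender(fds[0], [](std::string&) {}, [&] { drained = true; });
  BufferedConn receiver(
      fds[1], [&](std::string& in) { received += in; in.clear(); }, [] {});
  sender.enqueue(msg);
  for (int i = 0; i < 10 && (!drained || received.size() < msg.size()); ++i) {
    sender.runOnce(100);
    receiver.runOnce(100);
  }
  close(fds[0]);
  close(fds[1]);
  return received;
}
```

With a completion-capable backend, only the body of `runOnce` would need to change; the callbacks and buffers seen by the application could stay the same.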

Linux supports the "completeness" paradigm natively too with its `io_uring` interface, but Libevent hasn't been updated to support it yet. There is an [issue](https://github.com/libevent/libevent/issues/1019) for that though.

Perhaps we could modify Envoy's event loop to work in the "completeness" paradigm by relying on "bufferevents" for streaming connections and hoping that io_uring/[ioring](https://windows-internals.com/i-o-rings-when-one-i-o-operation-is-not-enough/) support is added soon. I don't yet know whether "bufferevents" incur additional overhead compared with the traditional "readiness" approach. With my [quick hack](https://github.com/rojkov/envoy/commit/55743bd296ee523405e6041e7942e9ecc06a94cc) Envoy even runs ~5% faster (14500 rps vs 15500 rps), but I presume that's because I have a few things broken, e.g. flow control.

Alternatively we could abstract Libevent out somehow and resort to home-grown event loops that use io_uring/ioring directly when available and fall back to Libevent's "readiness" API otherwise. This approach would probably also make it easier to implement, as an extension, a hardware-accelerated event loop that bypasses the kernel completely for network transfers.
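A rough sketch of what that abstraction boundary might look like follows. All names here (`IoBackend`, `CompletionBackend`, `ReadinessBackend`, `selectBackend`) are hypothetical and exist in neither Envoy nor Libevent, and the capability probe is reduced to a boolean parameter because how to detect io_uring/ioring support is left open.

```cpp
#include <memory>
#include <string>

// Hypothetical backend interface (illustration only).
class IoBackend {
 public:
  virtual ~IoBackend() = default;
  virtual std::string name() const = 0;
  // True when the OS completes I/O on the caller's behalf.
  virtual bool completionBased() const = 0;
};

// Placeholder for a backend that would drive io_uring/ioring directly.
class CompletionBackend : public IoBackend {
 public:
  std::string name() const override { return "completion"; }
  bool completionBased() const override { return true; }
};

// Placeholder for a backend that would keep using Libevent's file events.
class ReadinessBackend : public IoBackend {
 public:
  std::string name() const override { return "readiness"; }
  bool completionBased() const override { return false; }
};

// Picks the completion backend when the platform supports it and falls back
// to the readiness backend otherwise. `completion_supported` stands in for a
// real runtime probe, which this sketch does not attempt.
std::unique_ptr<IoBackend> selectBackend(bool completion_supported) {
  if (completion_supported) return std::make_unique<CompletionBackend>();
  return std::make_unique<ReadinessBackend>();
}
```

A hardware-accelerated loop could then, in principle, be registered as a third implementation of the same interface.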

/cc @antoniovicente 

Migrate event loops from file events to buffer events #17922
