Epoll: Avoid modifying EPOLLIN interest when in ET mode by njhill · Pull Request #9348 · netty/netty

njhill · 2019-07-10T23:03:42Z

Motivation

An inbound analog of #9347. This is only applicable when autoRead is turned off. AFAIU it shouldn't be necessary to toggle the EPOLLIN flag via successive calls to epoll_ctl between reads.

Modifications

Don't clear EPOLLIN flag when in ET mode with autoRead == false. Don't process EPOLLIN events when there is no read pending, just set the maybeMoreDataToRead flag.

Result

Fewer syscalls when using epoll transport without autoRead. I have not done any performance tests but expect the saving might be significant in cases where read() isn't often called in channelRead(...) or
channelReadComplete(...) - i.e. two syscalls per read.

Motivation An inbound analog of netty#9347. This is only applicable when autoRead is turned off. AFAIU it shouldn't be necessary to toggle the EPOLLIN flag via successive calls to epoll_ctl between reads. Modifications Don't clear EPOLLIN flag when in ET mode with autoRead == false. Don't process EPOLLIN events when there is no read pending, just set the maybeMoreDataToRead flag. Result Fewer syscalls when using epoll transport without autoRead. I have not done any performance tests but expect the saving might be significant in cases where read() isn't often called in channelRead(...) or channelReadComplete(...) - i.e. two syscalls per read.

netty-bot · 2019-07-10T23:04:04Z

Can one of the admins verify this patch?

njhill · 2019-07-10T23:12:16Z

cc @ninja-

normanmaurer · 2019-07-11T13:32:58Z

@netty-bot test this please

ninja- · 2019-07-11T13:39:10Z

@njhill yeah, looks like a good change for my main app. Looking forward to it :P

njhill · 2019-07-11T17:55:56Z

Pushed another commit with small adjustments - noticed that the flags field could have been read from outside the event loop.

normanmaurer · 2019-07-12T09:11:32Z

@netty-bot test this please

normanmaurer

Great idea!

njhill · 2019-07-12T17:12:27Z

Thanks @normanmaurer, I'm now even more curious what your thoughts are about #9349 :)

normanmaurer · 2019-07-12T18:55:52Z

@netty-bot test this please

njhill · 2019-07-12T20:02:27Z

Pushed one more tiny refinement (hopefully last!), sorry for the churn.

normanmaurer · 2019-07-12T20:09:34Z

@netty-bot test this please

carl-mastrangelo

LGTM

normanmaurer · 2019-07-13T05:49:05Z

@njhill @carl-mastrangelo after thinking a bit more about this I am not sure if this is really a good idea.

The problem is that this may result in a lot of wakeups even if we are not interested in data (if I am not wrong). Each time new data is received for a connection will have us wakeup just to do nothing. This sounds like it may be very bad in some setups, especially if you have huge receive buffers. Or I am missing anything ?

normanmaurer · 2019-07-13T05:50:02Z

This of course is only bad if the user stops reading for some extended period of time

carl-mastrangelo · 2019-07-13T06:07:11Z

@normanmaurer My recollection is that epoll doesn't re notify you until you consume all the data on the FD (which I could be wrong about).

carl-mastrangelo · 2019-07-13T06:11:14Z

this is hinted by the man page in the Q&A section 9: http://man7.org/linux/man-pages/man7/epoll.7.html

normanmaurer · 2019-07-13T06:14:42Z

@carl-mastrangelo if this is the case ( I thought it will also wakeup for new data, just not until new arrived) we have a problem here https://github.com/netty/netty/blob/4.1/transport-native-epoll/src/main/java/io/netty/channel/epoll/EpollEventLoop.java#L389 as we never read the eventfd value and so would not get another wakup, right ?

normanmaurer · 2019-07-13T07:20:28Z

@carl-mastrangelo @njhill at least for eventfd it seems my understanding holds.

    public void testET() throws Throwable {
        final FileDescriptor epoll = Native.newEpollCreate();
        final FileDescriptor eventFd = Native.newEventFd();
        final FileDescriptor timerFd = Native.newTimerFd();
        final EpollEventArray array = new EpollEventArray(1024);
        try {
            Native.epollCtlAdd(epoll.intValue(), eventFd.intValue(), Native.EPOLLIN | Native.EPOLLET);
            final AtomicReference<Throwable> causeRef = new AtomicReference<Throwable>();
            final AtomicInteger integer = new AtomicInteger();
            final Thread t = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        for (int i = 0; i < 2; i++) {
                            int ready = Native.epollWait(epoll, array, timerFd, -1, -1);
                            assertEquals(1, ready);
                            assertEquals(eventFd.intValue(), array.fd(0));
                            integer.incrementAndGet();
                        }
                    } catch (IOException e) {
                        causeRef.set(e);
                    }
                }
            });
            t.start();
            Native.eventFdWrite(eventFd.intValue(), 1);
            while (integer.get() != 1) {
                Thread.sleep(100);
            }
            // Ensure there is no other wakeup
            Thread.sleep(1000);
            assertEquals(1, integer.get());
            Native.eventFdWrite(eventFd.intValue(), 1);
            t.join();
            Throwable cause = causeRef.get();
            if (cause != null) {
                throw cause;
            }
            assertEquals(2, integer.get());
        } finally {
            array.free();
            epoll.close();
            eventFd.close();
            timerFd.close();
        }
    }

For sockets it may be different ?

carl-mastrangelo · 2019-07-13T07:24:33Z

@normanmaurer I tried too, and noticed the same thing. I think you are right, this would cause too much churn.

However, I think reverting the code in clearEpollIn0() might make this work. As long as the unsafe remembers if there is data, and the channel is not in auto read mode, I think the fd can remove its interest in EPOLLIN.

normanmaurer · 2019-07-13T07:31:44Z

@carl-mastrangelo i will also retry my test with a socket fd to see if there is any difference

njhill · 2019-07-15T06:41:36Z

@normanmaurer you make a good point :) I had assumed the same as @carl-mastrangelo but after more research it's funny how confusing/misleading and often contradictory the various explanations are on this particular aspect of the behaviour (including the manpages). And it sounds possible that it differs between fd types and/or between the in and out cases. So I agree a socket fd test would be the best next step (hoping that eventfd is special...), I will aim to try too if you haven't already by the time I wake up!

If it turns out to be the case for sockets too then it probably rules out the hoped-for simplification of eliminating all epoll_ctl calls, but I think some could still be avoided - specifically not setting EPOLLIN where it's done currently and instead waiting until the processing is complete, before calling epoll_wait again ... i.e. there's no point in toggling from unset->set->unset while the EL is still active.

njhill · 2019-07-16T05:55:21Z

I realized the change wasn't even behaving as intended due to an incorrectly inverted check, tests are still fine with that fixed (pushed update), but @normanmaurer it does seem that you're right re new data constituting an edge in the socket case too. I will try to rework it per prior comment.

normanmaurer · 2019-07-23T19:40:39Z

@njhill let me know once I should have a look again.

njhill · 2019-07-23T21:41:00Z

@normanmaurer see #9397. We could probably close this one.

normanmaurer · 2019-07-24T05:24:14Z

Ok lets do it.

njhill mentioned this pull request Jul 10, 2019

Epoll: Avoid modifying EPOLLOUT interest when in ET mode #9347

Closed

njhill requested review from Scottmitch and normanmaurer July 10, 2019 23:11

normanmaurer requested review from carl-mastrangelo and ejona86 July 11, 2019 13:41

Minor thread-safety related adjustments

2cb7d0b

normanmaurer approved these changes Jul 12, 2019

View reviewed changes

normanmaurer added this to the 4.1.38.Final milestone Jul 12, 2019

Further adjustment to ensure there aren't problems with LT + autoread

bbc1c4a

franz1981 approved these changes Jul 12, 2019

View reviewed changes

Remove redundant EPOLLET flag check (also done inside clearEpollIn0)

d372161

carl-mastrangelo approved these changes Jul 13, 2019

View reviewed changes

Invert incorrect EPOLLET flag check in clearEpollIn0()

75383c1

normanmaurer mentioned this pull request Jul 17, 2019

Add testcase to prove that ET semantics for eventFD are correct #9385

Merged

normanmaurer closed this Jul 24, 2019

normanmaurer removed this from the 4.1.38.Final milestone Jul 24, 2019

Uh oh!

Conversation

njhill commented Jul 10, 2019

Uh oh!

netty-bot commented Jul 10, 2019

Uh oh!

njhill commented Jul 10, 2019

Uh oh!

normanmaurer commented Jul 11, 2019

Uh oh!

ninja- commented Jul 11, 2019

Uh oh!

njhill commented Jul 11, 2019

Uh oh!

normanmaurer commented Jul 12, 2019

Uh oh!

normanmaurer left a comment

Choose a reason for hiding this comment

Uh oh!

njhill commented Jul 12, 2019

Uh oh!

normanmaurer commented Jul 12, 2019

Uh oh!

njhill commented Jul 12, 2019

Uh oh!

normanmaurer commented Jul 12, 2019

Uh oh!

carl-mastrangelo left a comment

Choose a reason for hiding this comment

Uh oh!

normanmaurer commented Jul 13, 2019

Uh oh!

normanmaurer commented Jul 13, 2019

Uh oh!

carl-mastrangelo commented Jul 13, 2019

Uh oh!

carl-mastrangelo commented Jul 13, 2019

Uh oh!

normanmaurer commented Jul 13, 2019

Uh oh!

normanmaurer commented Jul 13, 2019

Uh oh!

carl-mastrangelo commented Jul 13, 2019

Uh oh!

normanmaurer commented Jul 13, 2019

Uh oh!

njhill commented Jul 15, 2019

Uh oh!

njhill commented Jul 16, 2019

Uh oh!

normanmaurer commented Jul 23, 2019

Uh oh!

njhill commented Jul 23, 2019

Uh oh!

normanmaurer commented Jul 24, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants