Skip to content

Fix event loop shutdown timing fragility#9639

Merged
normanmaurer merged 1 commit intomasterfrom
shutdown_fix_port
Oct 8, 2019
Merged

Fix event loop shutdown timing fragility#9639
normanmaurer merged 1 commit intomasterfrom
shutdown_fix_port

Conversation

@normanmaurer
Copy link
Copy Markdown
Member

Motivation

The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.

The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I think even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.

It also means shutdowns are unnecessarily delayed by up to 1 second.

Modifications

  • Add/extend unit tests to expose the issue
  • Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
    to explicitly add no-op tasks to the taskQueue so that the subsequent
    event loop iteration doesn't enter blocking wait (as looks like was
    originally intended)

Results

Faster and more robust shutdown of event loops, allows removal of the default wait timeout.
This is a port of #9616

Motivation

The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.

The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I _think_ even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.

It also means shutdowns are unnecessarily delayed by up to 1 second.

Modifications

- Add/extend unit tests to expose the issue
- Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
to explicitly add no-op tasks to the taskQueue so that the subsequent
event loop iteration doesn't enter blocking wait (as looks like was
originally intended)

Results

Faster and more robust shutdown of event loops, allows removal of the default wait timeout.
This is a port of #9616
@normanmaurer
Copy link
Copy Markdown
Member Author

@njhill PTAL... this is a port of #9616

@normanmaurer
Copy link
Copy Markdown
Member Author

@netty-bot test this please

Copy link
Copy Markdown
Member

@njhill njhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@normanmaurer normanmaurer merged commit ec8e8bd into master Oct 8, 2019
@normanmaurer normanmaurer deleted the shutdown_fix_port branch October 8, 2019 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants