Skip to content

Conversation

@BewareMyPower
Copy link
Contributor

Fixes #24894

Motivation

Currently the quietPeriod parameter in Netty graceful shutdown in milliseconds is determined by the minimal value of 5000 and brokerShutdownTimeoutMs / 4 (default: 15000), which is unnecessarily large and could wait for too long with no pending tasks in the executor.

Modifications

  • Modified the quietPeriod argument as a small value (1 ms)
  • Added BrokerEventLoopShutdownTest to avoid the regression
  • Added logs for how long that each event loop's gracefully shutdown takes.

It should be noted that setting 1 ms as the quietPeriod makes sense. It has been explained in #24894 (comment) and more details can be found in https://github.com/netty/netty/blob/6bb1a6bcae503423343b9155ac48b161f60368b5/common/src/main/java/io/netty/util/concurrent/SingleThreadEventExecutor.java

Here is the best practice suggested by AI:

Consider a very short quiet period if applicable: If your application is lightweight and you are certain no business logic is left to execute, you can use a very small value to terminate the event loop group immediately.

Verifying this change

After this change, the BrokerEventLoopShutdownTest can pass and the logs might look like:

2025-10-27T20:29:09,129 - INFO  - [globalEventExecutor-9-2:BrokerService] - Event loop acceptor shut down after 104 ms
2025-10-27T20:29:09,130 - INFO  - [globalEventExecutor-9-2:BrokerService] - Event loop worker shut down after 105 ms

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@BewareMyPower BewareMyPower added type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages release/4.1.2 release/4.0.8 labels Oct 27, 2025
@BewareMyPower BewareMyPower self-assigned this Oct 27, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Oct 27, 2025
@BewareMyPower BewareMyPower added this to the 4.2.0 milestone Oct 27, 2025
@lhotari
Copy link
Member

lhotari commented Oct 27, 2025

There are multiple test failures with similar IllegalArgumentException:

  Error:  Tests run: 4, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 23.36 s <<< FAILURE! -- in org.apache.pulsar.broker.loadbalance.SimpleBrokerStartTest
  Error:  org.apache.pulsar.broker.loadbalance.SimpleBrokerStartTest.testHasNICSpeed -- Time elapsed: 5.531 s <<< FAILURE!
  org.apache.pulsar.broker.PulsarServerException: java.lang.IllegalArgumentException: timeout: 0 (expected >= quietPeriod (1))
  	at org.apache.pulsar.broker.PulsarService.close(PulsarService.java:479)
  	at org.apache.pulsar.broker.loadbalance.SimpleBrokerStartTest.testHasNICSpeed(SimpleBrokerStartTest.java:70)
  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
  	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
  	at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
  	at java.base/java.lang.Thread.run(Thread.java:1583)
  Caused by: java.lang.IllegalArgumentException: timeout: 0 (expected >= quietPeriod (1))
  	at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:624)
  	at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:163)
  	at org.apache.pulsar.broker.service.BrokerService.shutdownEventLoopGracefully(BrokerService.java:968)
  	at org.apache.pulsar.broker.service.BrokerService.closeAsync(BrokerService.java:866)
  	at org.apache.pulsar.broker.PulsarService.closeAsync(PulsarService.java:596)
  	at org.apache.pulsar.broker.PulsarService.closeAsync(PulsarService.java:494)
  	at org.apache.pulsar.broker.PulsarService.close(PulsarService.java:465)
  	... 11 more

@BewareMyPower
Copy link
Contributor Author

Okay, let me take a look

@BewareMyPower BewareMyPower marked this pull request as draft October 28, 2025 02:10
@BewareMyPower BewareMyPower marked this pull request as ready for review October 28, 2025 07:52
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.27%. Comparing base (14b0821) to head (7bdcd6d).
⚠️ Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
...rg/apache/pulsar/broker/service/BrokerService.java 86.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #24895       +/-   ##
=============================================
+ Coverage     38.41%   74.27%   +35.85%     
- Complexity    13133    33485    +20352     
=============================================
  Files          1856     1913       +57     
  Lines        145160   149331     +4171     
  Branches      16846    17334      +488     
=============================================
+ Hits          55760   110909    +55149     
+ Misses        81815    29580    -52235     
- Partials       7585     8842     +1257     
Flag Coverage Δ
inttests 26.37% <6.66%> (-0.05%) ⬇️
systests 22.76% <6.66%> (-0.10%) ⬇️
unittests 73.81% <86.66%> (+39.29%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...rg/apache/pulsar/broker/service/BrokerService.java 83.52% <86.66%> (+25.20%) ⬆️

... and 1412 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lhotari lhotari merged commit 411aea9 into apache:master Oct 28, 2025
158 of 165 checks passed
lhotari pushed a commit that referenced this pull request Oct 28, 2025
…for event loop shutdown (#24895)

(cherry picked from commit 411aea9)
lhotari pushed a commit that referenced this pull request Oct 28, 2025
…for event loop shutdown (#24895)

(cherry picked from commit 411aea9)
@BewareMyPower BewareMyPower deleted the bewaremypower/improve-close branch October 29, 2025 01:27
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 29, 2025
…for event loop shutdown (apache#24895)

(cherry picked from commit 411aea9)
(cherry picked from commit 61f6fcd)
guptas6est pushed a commit to Nordix/pulsar that referenced this pull request Oct 30, 2025
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 6, 2025
…for event loop shutdown (apache#24895)

(cherry picked from commit 411aea9)
(cherry picked from commit 61f6fcd)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-picked/branch-4.0 cherry-picked/branch-4.1 doc-not-needed Your PR changes do not impact docs ready-to-test release/4.0.8 release/4.1.2 type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Broker close time is too long even if there is no connection

3 participants