-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[improve][broker]Call scheduleAtFixedRateNonConcurrently for scheduled tasks, instead of scheduleAtFixedRate #24596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improve][broker]Call scheduleAtFixedRateNonConcurrently for scheduled tasks, instead of scheduleAtFixedRate #24596
Conversation
|
Could we just fix the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid making a major behavioral change that could cause breaking changes, we should only make scheduleAtFixedRate -> scheduleWithFixedDelay changes to the tasks where the execution interval is relatively short.
For scheduled tasks that continue use scheduleAtFixedRate, it would be useful to implement a wrapper that keeps state of in progress execution and skips that task and logs a warning if the previous task is still in execution. We already have the org.apache.pulsar.common.util.Runnables class in pulsar-common and we could add this logic there. The existing catchingAndLoggingThrowables method could take another argument boolean skipConcurrentExecution that would handle skipping a concurrent call to the Runnable instance while the previous execution is happening.
WDYT?
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java
Outdated
Show resolved
Hide resolved
|
I think maybe we don't need to replace all |
@dao-jun I don't think that anyone is suggesting to replace |
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java
Outdated
Show resolved
Hide resolved
|
@poorbarcode The description refers |
BewareMyPower
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR involves many changes.
Share the comparison of these two methods from Gemini AI:
scheduleAtFixedRateis best for:
Time-sensitive tasks that need a fixed schedule (e.g., cron jobs, hourly chimes). Tasks where a fixed pause between executions is important, especially if task duration is unpredictable.
scheduleWithFixedDelayis best for
Tasks where a fixed pause between executions is important, especially if task duration is unpredictable.The choice between scheduleAtFixedRate and scheduleWithFixedDelay depends entirely on your specific use case.
Since the issue is exactly with checkMessageExpiry, please only change its related logic.
|
Even with this change, it's nearly impossible to prevent new code from calling |
@BewareMyPower I think that analysis misses the detail that |
|
Actually the original analysis includes that point, but it's too long that I didn't include it. However, back to the issue, users should not configure a short interval like 1s and it's impossible because it can only be configured by
P.S. From the screenshot, message expiry check is scheduled every 5 minutes, which is a default value. It also shows no queued tasks being executed immediately.
|
@BewareMyPower I agree that it doesn't fix the actual issue. That's why I'd recommend using Without preventing concurrent job execution, the problem would get worse since another task would start executing the same task concurrently. That could already happen in cases where we have very short delays and lets say use synchronization to run the task one-by-one. All later coming tasks would get back logged in the executor which is a problem that should be solved. I have understood that it's the problem that @poorbarcode would like to address. |
|
Yes, adding a wrapper of |
@BewareMyPower @lhotari Thanks for your suggestion, I will improve the implementation |
Finished, please review again |
...on/src/main/java/org/apache/pulsar/common/util/SingleThreadSafeScheduledExecutorService.java
Outdated
Show resolved
Hide resolved
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #24596 +/- ##
============================================
+ Coverage 73.57% 74.30% +0.73%
- Complexity 32624 33185 +561
============================================
Files 1877 1882 +5
Lines 139502 146862 +7360
Branches 15299 16875 +1576
============================================
+ Hits 102638 109133 +6495
- Misses 28908 29052 +144
- Partials 7956 8677 +721
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Please update the change description. |
Thanks for mentioning this, please review again |
|
@lhotari please review this pr again |
…d tasks, instead of scheduleAtFixedRate (apache#24596) (cherry picked from commit f41687c) (cherry picked from commit c604530)
…d tasks, instead of scheduleAtFixedRate (apache#24596) (cherry picked from commit f41687c) (cherry picked from commit c604530)
…d tasks, instead of scheduleAtFixedRate (apache#24596)
Motivation
Background of
scheduleWithFixedDelayandscheduleWithFixedRatescheduleAtFixedRate: Executes tasks at fixed intervals (period) regardless of task execution time,scheduleWithFixedDelay: Trigger Mechanism: Waits for the previous task to complete, then delays by the fixed interval (delay) before the next execution.Issue: broker ran checkMessageExpiry many times in 1s
When system resources are tight, it leads to a large backlog of tasks
checkMessageExpiry, which are executed multiple times at the same time, causing even greater pressureModifications
Use
scheduleAtFixedRateNonConcurrentlyto overridescheduleWithFixedRate, which executes tasks at fixed intervals (period) regardless of task execution time, and never causes backlogsDocumentation
docdoc-requireddoc-not-neededdoc-completeMatching PR in forked repository
PR in forked repository: x