Skip to content

Conversation

@lhotari
Copy link
Member

@lhotari lhotari commented Aug 12, 2025

Fixes #16421
Fixes #24656

Motivation

This is the 2nd implementation PR for "PIP-430: Pulsar Broker cache improvements: refactoring eviction and adding a new cache strategy based on expected read count"

Modifications

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari added this to the 4.1.0 milestone Aug 12, 2025
@github-actions github-actions bot added PIP doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. labels Aug 12, 2025
@codecov-commenter
Copy link

codecov-commenter commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 70.53388% with 287 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.18%. Comparing base (829df71) to head (bbca447).
⚠️ Report is 49 commits behind head on master.

Files with missing lines Patch % Lines
...mledger/impl/ActiveManagedCursorContainerImpl.java 63.61% 122 Missing and 21 partials ⚠️
.../ActiveManagedCursorContainerNavigableSetImpl.java 37.50% 68 Missing and 7 partials ⚠️
...eeper/mledger/impl/ManagedCursorContainerImpl.java 89.43% 13 Missing and 2 partials ⚠️
...mledger/impl/cache/RangeEntryCacheManagerImpl.java 41.66% 12 Missing and 2 partials ⚠️
...per/mledger/impl/cache/RangeCacheRemovalQueue.java 83.33% 5 Missing and 5 partials ⚠️
...keeper/mledger/impl/cache/RangeEntryCacheImpl.java 92.00% 2 Missing and 4 partials ⚠️
...rg/apache/pulsar/broker/service/BrokerService.java 57.14% 6 Missing ⚠️
...kkeeper/mledger/impl/ManagedLedgerFactoryImpl.java 20.00% 4 Missing ⚠️
...ache/pulsar/broker/ManagedLedgerClientFactory.java 57.14% 2 Missing and 1 partial ⚠️
...pache/bookkeeper/mledger/ManagedLedgerFactory.java 0.00% 2 Missing ⚠️
... and 7 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #24623      +/-   ##
============================================
- Coverage     74.26%   74.18%   -0.09%     
- Complexity    33213    33377     +164     
============================================
  Files          1885     1893       +8     
  Lines        146953   147722     +769     
  Branches      16928    17080     +152     
============================================
+ Hits         109136   109582     +446     
- Misses        29116    29395     +279     
- Partials       8701     8745      +44     
Flag Coverage Δ
inttests 26.64% <35.11%> (+0.05%) ⬆️
systests 22.74% <32.54%> (+0.11%) ⬆️
unittests 73.67% <70.53%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...ache/bookkeeper/mledger/EntryReadCountHandler.java 100.00% <100.00%> (ø)
...apache/bookkeeper/mledger/ManagedLedgerConfig.java 96.57% <100.00%> (+0.01%) ⬆️
...bookkeeper/mledger/ManagedLedgerFactoryConfig.java 100.00% <ø> (ø)
.../org/apache/bookkeeper/mledger/impl/EntryImpl.java 89.39% <100.00%> (+2.47%) ⬆️
...ookkeeper/mledger/impl/ManagedCursorContainer.java 87.50% <ø> (-9.82%) ⬇️
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 78.09% <100.00%> (+0.08%) ⬆️
...org/apache/bookkeeper/mledger/impl/OpAddEntry.java 77.46% <100.00%> (+0.54%) ⬆️
...he/bookkeeper/mledger/impl/ReadOnlyCursorImpl.java 95.83% <100.00%> (+0.37%) ⬆️
...kkeeper/mledger/impl/cache/EntryCacheDisabled.java 75.55% <100.00%> (ø)
...keeper/mledger/impl/cache/EntryLengthFunction.java 100.00% <100.00%> (ø)
... and 26 more

... and 82 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lhotari lhotari marked this pull request as draft August 13, 2025 11:51
@lhotari
Copy link
Member Author

lhotari commented Aug 14, 2025

Here's the JMH results for the latest changes. The new getNumberOfCursorsAtSamePositionOrBefore behavior required for cacheEvictionByExpectedReadCount in ActiveManagedCursorContainerImpl is not causing an overhead compared to the previous ManagedCursorContainer tracking for slowest reader position.

Here's a visualization of the results:
JMH Visualizer: ActiveManagedCursorContainerBenchmark

text results

One interesting detail is that ActiveManagedCursorContainerImpl's algorithm would be faster than ManagedCursorContainer's algorithm for tracking the slowest cursor, assuming that most cursors are moving forward (which is the case).

@lhotari lhotari marked this pull request as ready for review August 14, 2025 11:44
@lhotari
Copy link
Member Author

lhotari commented Aug 18, 2025

The recently added "experimental" test case BrokerEntryCacheMultiBrokerTest.testTailingReadsRollingRestart shows the real value of PIP-430 changes. It demonstrates a rolling restart where a producer is producing to a topic and there are 10 connected consumers with unique subscriptions. During the 30 second test period the producer produces as fast as it can and the consumers consume as fast as they can. There are 2 brokers in the cluster and a broker is restarted 3 times during the test period.

The with PIP-430 defaults, Cache hits 98.56%, Cache misses 1.44% :

2025-08-18T19:39:44,894 - INFO  - [main:BrokerEntryCacheMultiBrokerTest] - Produced 21655 and Consumed 206425 messages (across 10 consumers with unique subscriptions) in total. Number of BK reads 74 with 2972 entries. Cache hits 98.56%, Cache misses 1.44%, Number of restarts 3

with cacheEvictionByExpectedReadCount=false, Cache hits 81.66%, Cache misses 18.34%

2025-08-18T19:41:17,119 - INFO  - [main:BrokerEntryCacheMultiBrokerTest] - Produced 20767 and Consumed 197466 messages (across 10 consumers with unique subscriptions) in total. Number of BK reads 848 with 36220 entries. Cache hits 81.66%, Cache misses 18.34%, Number of restarts 3

with cacheEvictionByExpectedReadCount=false and managedLedgerCacheEvictionExtendTTLOfRecentlyAccessed=false, Cache hits 72.97%, Cache misses 27.03%

2025-08-18T19:42:34,214 - INFO  - [main:BrokerEntryCacheMultiBrokerTest] - Produced 20888 and Consumed 198727 messages (across 10 consumers with unique subscriptions) in total. Number of BK reads 1309 with 53708 entries. Cache hits 72.97%, Cache misses 27.03%, Number of restarts 3

The results vary quite a lot and these are values from single runs. However, it demonstrates a significant improvement in reducing the amount of cache misses in rolling restarts when PIP-430 defaults are used.

@lhotari
Copy link
Member Author

lhotari commented Aug 22, 2025

Test logging was slightly impacting the results (I noticed it with async-profiler) and I made changes to configure Log4J logging in tests so that it can be diverted to file based logging instead of default console logging.

The results are very similar on Dell XPS 2019 i9-9980HK (on Linux):

description round(avg(hits), 2) round(avg(misses), 2) avg(produced) avg(consumed) avg(bk_reads) avg(bk_read_entries)
PIP430 72.28 27.72 1518034.2 15213392.6 44556.2 4192224.4
PR12258 51.55 48.45 1510802.8 15130023.2 79476.8 7317555.2
cacheEvictionByMarkDeletedPosition 37.11 62.89 1398036.4 14010866.0 95174.8 8732727.4
PIP430disabled 25.79 74.21 1224613.0 12278182.8 99706.4 9163995.0

detailed:

description produced consumed consumers bk_reads bk_read_entries hits misses restarts ts
PIP430disabled 1299793 13040831 10 118895 10998176 15.66 84.34 3 2025-08-22 11:04:20.294
PIP430disabled 1225471 12291491 10 114912 10573318 13.98 86.02 3 2025-08-22 11:04:54.714
PIP430disabled 1268051 12703645 10 113809 10258553 19.25 80.75 3 2025-08-22 11:05:28.823
PIP430disabled 1183615 11856173 10 69339 6553542 44.72 55.28 3 2025-08-22 11:06:02.859
PIP430disabled 1146135 11498774 10 81577 7436386 35.33 64.67 3 2025-08-22 11:06:37.03
cacheEvictionByMarkDeletedPosition 1549368 15521428 10 70895 6243987 59.77 40.23 3 2025-08-22 11:07:19.82
cacheEvictionByMarkDeletedPosition 1394511 13970634 10 109494 10119628 27.57 72.43 3 2025-08-22 11:07:54.142
cacheEvictionByMarkDeletedPosition 1397873 14014243 10 88050 8209983 41.42 58.58 3 2025-08-22 11:08:28.093
cacheEvictionByMarkDeletedPosition 1360366 13636976 10 121868 11250491 17.5 82.5 3 2025-08-22 11:09:02.185
cacheEvictionByMarkDeletedPosition 1288064 12911049 10 85567 7839548 39.28 60.72 3 2025-08-22 11:09:36.4
PR12258 1549010 15497061 10 25529 2378106 84.65 15.35 3 2025-08-22 11:10:18.945
PR12258 1547521 15499860 10 101612 9481406 38.83 61.17 3 2025-08-22 11:10:52.938
PR12258 1517693 15201768 10 96395 8716974 42.66 57.34 3 2025-08-22 11:11:26.707
PR12258 1510279 15137430 10 99797 9095553 39.91 60.09 3 2025-08-22 11:12:00.482
PR12258 1429511 14313997 10 74051 6915737 51.69 48.31 3 2025-08-22 11:12:34.675
PIP430 1548919 15534290 10 11922 1099187 92.92 7.08 3 2025-08-22 11:13:16.659
PIP430 1490663 14934342 10 69328 6533232 56.25 43.75 3 2025-08-22 11:13:50.878
PIP430 1531667 15346428 10 60535 5760473 62.46 37.54 3 2025-08-22 11:14:24.318
PIP430 1549077 15522591 10 35566 3315903 78.64 21.36 3 2025-08-22 11:14:58.129
PIP430 1469845 14729312 10 45430 4252327 71.13 28.87 3 2025-08-22 11:15:31.849

@lhotari
Copy link
Member Author

lhotari commented Aug 22, 2025

This test is flaky, will need to address it.

  Error:  Tests run: 3, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 56.264 s <<< FAILURE! - in org.apache.pulsar.broker.cache.BrokerEntryCacheTest
  Error:  org.apache.pulsar.broker.cache.BrokerEntryCacheTest.testTailingReadsClearsCacheAfterCacheTimeout  Time elapsed: 5.235 s  <<< FAILURE!
  org.opentest4j.AssertionFailedError: 
  
  expected: 0L
   but was: 51L
  	at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
  	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
  	at org.apache.pulsar.broker.cache.BrokerEntryCacheTest.testTailingReadsClearsCacheAfterCacheTimeout(BrokerEntryCacheTest.java:573)
  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
  	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
  	at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
  	at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
  	at java.base/java.lang.Thread.run(Thread.java:1583)

@lhotari
Copy link
Member Author

lhotari commented Aug 22, 2025

I also ran the test run on GCP on e2-highcpu-8 VM. The base VM setup was done using these instructions.
Tested hash was 9f185b0

$ ENABLE_MANUAL_TEST=true NETTY_LEAK_DETECTION=off mvn -PtestLoggingToFile -pl pulsar-broker test -Dtest=BrokerEntryCacheRollingRestartTest
...
$ { cat pulsar-broker/target/rolling_restarts_result_header.txt; cat pulsar-broker/target/rolling_restarts_result_*.csv; } | duckdb -markdo
wn -c "select description, round(avg(hits), 2), round(avg(misses),2), avg(produced), avg(consumed), avg(bk_reads), avg(bk_read_entries) from read_csv('/dev/stdin') group by descr
iption order by 3"
description round(avg(hits), 2) round(avg(misses), 2) avg(produced) avg(consumed) avg(bk_reads) avg(bk_read_entries)
PIP430 64.4 35.6 945271.4 9488097.4 36004.8 3389183.8
PIP430disabled 17.26 82.74 987600.6 9914821.4 87639.0 8237855.4
PR12258 16.69 83.31 736788.6 7405417.2 64793.0 6165019.0
cacheEvictionByMarkDeletedPosition 10.44 89.56 883692.8 8873488.2 84476.6 7935516.6
$ { cat pulsar-broker/target/rolling_restarts_result_header.txt; cat pulsar-broker/target/rolling_restarts_result_*.csv; } | duckdb -markdown -c "select * from read_csv('/dev/stdin')"
description produced consumed consumers bk_reads bk_read_entries hits misses restarts ts
PR12258 722205 7260204 10 62770 5900925 18.72 81.28 3 2025-08-22 14:37:33.051
PR12258 727260 7312068 10 66696 6409363 12.35 87.65 3 2025-08-22 14:38:07.886
PR12258 770503 7738888 10 62813 5959073 23.0 77.0 3 2025-08-22 14:38:42.591
PR12258 746141 7494393 10 67358 6394109 14.68 85.32 3 2025-08-22 14:39:17.402
PR12258 717834 7221533 10 64328 6161625 14.68 85.32 3 2025-08-22 14:39:52.448
cacheEvictionByMarkDeletedPosition 990178 9953989 10 87859 8372283 15.89 84.11 3 2025-08-22 14:40:36.216
cacheEvictionByMarkDeletedPosition 888499 8926319 10 92909 8506371 4.7 95.3 3 2025-08-22 14:41:10.719
cacheEvictionByMarkDeletedPosition 857009 8594563 10 77840 7443520 13.39 86.61 3 2025-08-22 14:41:45.061
cacheEvictionByMarkDeletedPosition 853380 8572404 10 83868 7838027 8.57 91.43 3 2025-08-22 14:42:19.706
cacheEvictionByMarkDeletedPosition 829398 8320166 10 79907 7517382 9.65 90.35 3 2025-08-22 14:42:54.178
PIP430disabled 1093913 10998750 10 104001 9850732 10.44 89.56 3 2025-08-22 14:43:37.596
PIP430disabled 965939 9693564 10 73744 6934155 28.47 71.53 3 2025-08-22 14:44:12.082
PIP430disabled 982209 9855468 10 93342 8687182 11.85 88.15 3 2025-08-22 14:44:46.605
PIP430disabled 973272 9764521 10 99398 9403644 3.7 96.3 3 2025-08-22 14:45:21.343
PIP430disabled 922670 9261804 10 67710 6313564 31.83 68.17 3 2025-08-22 14:45:55.891
PIP430 1030064 10348227 10 40013 3680055 64.44 35.56 3 2025-08-22 14:46:39.512
PIP430 978801 9819258 10 38060 3589311 63.45 36.55 3 2025-08-22 14:47:14.053
PIP430 941428 9452911 10 41558 3970945 57.99 42.01 3 2025-08-22 14:47:48.721
PIP430 846183 8495437 10 27239 2568891 69.76 30.24 3 2025-08-22 14:48:23.226
PIP430 929881 9324654 10 33154 3136717 66.36 33.64 3 2025-08-22 14:48:58.114

@lhotari lhotari requested a review from pdolif August 27, 2025 19:41
Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The improvements looks super-impressive! Having the micro-benchmarks make a great case.

Only minor comments

@lhotari lhotari merged commit 490ba0c into apache:master Aug 28, 2025
98 of 100 checks passed
@lhotari
Copy link
Member Author

lhotari commented Sep 3, 2025

While doing some validation, I noticed that Pulsar has a long time issue that consumers aren't able to keep with producers even when there are available resources unless dispatcherMaxReadBatchSize (and possibly also dispatcherMaxReadSizeBytes) are tuned. The default value for dispatcherMaxReadBatchSize is 100 which seems to be too small. Increasing it to 1000 could help in increasing cache hits and performance.
It's also useful to set preciseDispatcherFlowControl=true while increasing dispatcherMaxReadBatchSize so that unnecessary reads are avoided.

I created #24695 about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. PIP ready-to-test type/PIP

Projects

None yet

5 participants