Skip to content

Conversation

@lhotari
Copy link
Member

@lhotari lhotari commented Apr 25, 2024

Motivation

There's currently some memory leaks in tests and while investigating the issue, I found out that there's a large number of uncompleted CompletableFutures in the heap dump.

Currently the metadata store doesn't complete all pending operations when it is closed.
There are multiple problems:

  • MpscUnboundedArrayQueue.drain will limit the number of entries to 4096 at a time
  • MpscUnboundedArrayQueue.drain uses relaxedPoll which is eventually consistent
  • There might be batches in flight, which need to be terminated

Modifications

  • replace drain with a while loop in close
  • replace drain with a for loop in batch flushing
  • add isClosed checks

Additional Context

CompletableFutures in the heap dump:

image

The instances are related to org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$1819+0x00007f08a8b65ee8:
image

org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$1819+0x00007f08a8b65ee8 seems to be this code block:

markPartitionedTopicDeletedAsync(topic).whenCompleteAsync((markResult, markExc) -> {
final boolean mdFound;
if (markExc != null) {
if (markExc.getCause() instanceof MetadataStoreException.NotFoundException) {
mdFound = false;
} else {
log.error("Failed to mark the topic {} as deleted", topic, markExc);
future.completeExceptionally(markExc);
return;
}
} else {
mdFound = true;
}
supplier.get().whenComplete((deleteResult, deleteExc) -> {
if (deleteExc != null && mdFound) {
unmarkPartitionedTopicDeletedAsync(topic)
.thenRun(() -> future.completeExceptionally(deleteExc))
.exceptionally(ex -> {
log.warn("Failed to unmark the topic {} as deleted", topic, ex);
future.completeExceptionally(deleteExc);
return null;
});
} else if (deleteExc != null) {
future.completeExceptionally(deleteExc);
} else {
future.complete(deleteResult);
}
});
});

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari added this to the 3.3.0 milestone Apr 25, 2024
@lhotari lhotari self-assigned this Apr 25, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 25, 2024
@lhotari
Copy link
Member Author

lhotari commented Apr 25, 2024

It seems that the OOME is another issue. #22586

@coderzc coderzc modified the milestones: 3.3.0, 3.4.0 May 8, 2024
@Technoboy- Technoboy- closed this May 13, 2024
@Technoboy- Technoboy- reopened this May 13, 2024
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 75.00000% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 71.02%. Comparing base (bbc6224) to head (edb72fd).
Report is 260 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #22585      +/-   ##
============================================
- Coverage     73.57%   71.02%   -2.55%     
+ Complexity    32624     5630   -26994     
============================================
  Files          1877     1891      +14     
  Lines        139502   152144   +12642     
  Branches      15299    18133    +2834     
============================================
+ Hits         102638   108061    +5423     
- Misses        28908    35445    +6537     
- Partials       7956     8638     +682     
Flag Coverage Δ
inttests 26.99% <23.52%> (+2.40%) ⬆️
systests 24.62% <62.50%> (+0.29%) ⬆️
unittests 72.20% <100.00%> (-0.64%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...he/pulsar/metadata/impl/AbstractMetadataStore.java 85.29% <ø> (+0.73%) ⬆️
...ta/impl/batching/AbstractBatchedMetadataStore.java 84.67% <75.00%> (-11.38%) ⬇️

... and 363 files with indirect coverage changes

@merlimat merlimat merged commit f7d35e5 into apache:master Jun 14, 2024
hanmz pushed a commit to hanmz/pulsar that referenced this pull request Feb 12, 2025
…osing metadata store (apache#22585)

Co-authored-by: Matteo Merli <mmerli@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants