Skip to content

Conversation

@poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Jun 13, 2023

Motivation

When the replicator.producer retries to start[1] the topic close[2] executed concurrently, there is an orphan replicator after this topic is closed. You can reproduce by the test testRetryStartProducerStoppedByTopicRemove

time restart producer of a replicator topic.close
1 Compare and set state: Starting -> Stopped
2 Add a delay task to try to start the producer again
3 Close topic
4 The delayed task executed
5 If fail, retry again and loop

[1]: https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractReplicator.java#L151
[2]: https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractReplicator.java#L163

Modifications

If the topic was already closed, stop to retry.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode requested a review from Technoboy- June 13, 2023 07:05
@poorbarcode poorbarcode changed the title [fix][broker] release orphan replicator after topic closed [fix] [broker] release orphan replicator after topic closed Jun 13, 2023
@Technoboy- Technoboy- changed the title [fix] [broker] release orphan replicator after topic closed [fix][broker] release orphan replicator after topic closed Jun 13, 2023
@Technoboy- Technoboy- closed this Jun 13, 2023
@Technoboy- Technoboy- reopened this Jun 13, 2023
@codecov-commenter
Copy link

codecov-commenter commented Jun 15, 2023

Codecov Report

❌ Patch coverage is 68.18182% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.96%. Comparing base (51c2bb4) to head (f8c08af).
⚠️ Report is 2182 commits behind head on master.

Files with missing lines Patch % Lines
...ache/pulsar/broker/service/AbstractReplicator.java 65.00% 5 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20567       +/-   ##
=============================================
+ Coverage     33.50%   72.96%   +39.46%     
- Complexity    12053    31956    +19903     
=============================================
  Files          1613     1867      +254     
  Lines        126120   138653    +12533     
  Branches      13749    15233     +1484     
=============================================
+ Hits          42254   101172    +58918     
+ Misses        78332    29444    -48888     
- Partials       5534     8037     +2503     
Flag Coverage Δ
inttests 24.13% <18.18%> (-0.05%) ⬇️
systests 25.01% <0.00%> (?)
unittests 72.26% <68.18%> (+40.33%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...service/nonpersistent/NonPersistentReplicator.java 63.36% <100.00%> (+4.95%) ⬆️
...roker/service/persistent/PersistentReplicator.java 68.79% <100.00%> (+23.82%) ⬆️
...ache/pulsar/broker/service/AbstractReplicator.java 76.03% <65.00%> (+22.18%) ⬆️

... and 1516 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@poorbarcode poorbarcode merged commit d85736c into apache:master Jun 15, 2023
Technoboy- pushed a commit that referenced this pull request Jun 15, 2023
Motivation: When the `replicator.producer` retries to start<sup>[1]</sup> the topic close<sup>[2]</sup> executed concurrently, there is an orphan replicator after this topic is closed.
Modifications: If the topic was already closed, stop to retry.
poorbarcode added a commit that referenced this pull request Jun 15, 2023
Motivation: When the `replicator.producer` retries to start<sup>[1]</sup> the topic close<sup>[2]</sup> executed concurrently, there is an orphan replicator after this topic is closed.
Modifications: If the topic was already closed, stop to retry.
(cherry picked from commit d85736c)
if (optional.isEmpty()) {
return false;
}
return optional.get() == localTopic;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that we want to use reference equality here. I would think we always want to shut down the AbastractReplicator class and let it get recreated when a topic is closed.

@michaeljmarshall
Copy link
Member

If the topic was already closed, stop to retry.

@poorbarcode - if the topic is closed, I think we should close the replicator to prevent any kind of race conditions in the event the topic is re-loaded to the same broker.

nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 3, 2023
)

Motivation: When the `replicator.producer` retries to start<sup>[1]</sup> the topic close<sup>[2]</sup> executed concurrently, there is an orphan replicator after this topic is closed.
Modifications: If the topic was already closed, stop to retry.
(cherry picked from commit d85736c)
(cherry picked from commit db7832b)
tisonkun pushed a commit to tisonkun/pulsar that referenced this pull request Jul 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants