[fix] [broker] Part-1: Replicator can not created successfully due to an orphan replicator in the previous topic owner #21946
@poorbarcode Does this PR fix the issue mentioned in #21203?
Yes, the current PR also fixes the issue that #21203 tries to fix.
codelipenghui left a comment
Is it possible to add a test to cover this case?
And it looks like we can simplify the fix by adding a new method terminate() to the replicator so that we don't need to mix the closeProducer and closeReplicator logic.
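The separation suggested in this comment could look roughly like the sketch below. It is a simplified, hypothetical model and not the actual AbstractReplicator code: the class name ReplicatorSketch, the method startProducer, and the exact transitions are assumptions made for illustration. Only the names terminate, disconnect, Starting, Started, Producer_Stopped and Closed come from this PR.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ReplicatorSketch {
    // Hypothetical, simplified states; the names follow the PR description,
    // the transitions are an assumption of this sketch.
    enum State { Starting, Started, Producer_Stopped, Closed }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.Producer_Stopped);

    /** Tries to (re)start the internal producer; refused once the replicator is Closed. */
    public boolean startProducer() {
        if (!state.compareAndSet(State.Producer_Stopped, State.Starting)) {
            return false;
        }
        // Producer creation would happen here; the sketch treats it as instantaneous.
        // terminate() or disconnect() may have run in between, so only move Starting -> Started.
        return state.compareAndSet(State.Starting, State.Started);
    }

    /** Closes only the internal producer; the replicator may be started again later. */
    public boolean disconnect() {
        return state.compareAndSet(State.Started, State.Producer_Stopped)
                || state.compareAndSet(State.Starting, State.Producer_Stopped);
    }

    /** Closes the replicator for good; no transition in this sketch leaves Closed. */
    public void terminate() {
        state.set(State.Closed);
    }

    public State getState() {
        return state.get();
    }

    public static void main(String[] args) {
        ReplicatorSketch replicator = new ReplicatorSketch();
        System.out.println(replicator.startProducer()); // true
        System.out.println(replicator.disconnect());    // true, producer can be restarted
        System.out.println(replicator.startProducer()); // true
        replicator.terminate();
        System.out.println(replicator.startProducer()); // false, Closed is terminal
    }
}
```

Keeping terminate apart from disconnect means a late start request can tell "the producer is temporarily stopped" from "the replicator is gone", which is the distinction a single Stopped state cannot express.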
Rebase master
… an orphan replicator in the previous topic owner (apache#21946)
Because there are too many conflicts and there are no new releases for
… an orphan replicator in the previous topic owner (apache#21946) (cherry picked from commit 4924052) (cherry picked from commit 670aff0)
Motivation
There is a race condition that leaves an orphan replicator on the original owner of a topic and prevents the new owner of the topic from starting a replicator, due to
org.apache.pulsar.broker.service.BrokerServiceException$NamingException Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic.
Scenario 1
Scenario 2
replication_clusters.
The current PR focuses on Scenario 1.
Steps of Scenario 1
thread start replicator, unload bundle
- pulsar.repl
- closing
- replicator.disconnect
- replicator.stat --> Stopped
- replicator.stat --> Starting
- replicator.stat --> Started
- readMoreEntries: since there are no entries to read, this request is left pending
- pulsar.repl
- Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic
Modifications
- Split Replicator.State.Stopped into Producer_Stopped and Closed.
- Add terminate to close the Replicator.
- disconnect is only used to close the internal producer.
A case that hit this issue
Picture-1: An orphan producer was left in the old broker; it is not associated with any topic/replicator.
Picture-2: After the topic is transferred to the new broker, it can not start a new Replicator successfully.
Since the scenario is too complex, I can not add a test, but I reproduced Scenario 1 locally.
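Since no automated test accompanies the change, the following is a minimal, deterministic replay of one reading of Scenario 1, not a reproduction against a real broker. Everything in it is hypothetical: the class OrphanReplicatorReplay, the in-memory set standing in for the remote cluster's producer-name check, the cluster names c1 and c2, and the flag that switches between a restartable Stopped state and a terminal Closed state.

```java
import java.util.HashSet;
import java.util.Set;

public class OrphanReplicatorReplay {
    // Hypothetical stand-in for the remote cluster's check on producer names.
    static final Set<String> connectedProducers = new HashSet<>();

    /** Returns false when a producer with this name is already connected. */
    static boolean connect(String producerName) {
        return connectedProducers.add(producerName);
    }

    /**
     * Replays the interleaving once and reports whether the new owner can
     * start its replicator. closedIsTerminal selects whether the unload
     * leaves the old replicator in a terminal state.
     */
    static boolean newOwnerCanStart(boolean closedIsTerminal) {
        connectedProducers.clear();
        String name = "pulsar.repl.c1-->c2"; // illustrative cluster names

        // 1. The unload on the old owner stops the replicator.
        String state = closedIsTerminal ? "Closed" : "Stopped";

        // 2. A delayed start task on the old owner runs after the unload.
        if (state.equals("Stopped")) {
            state = "Started";
            connect(name); // no topic owns this producer any more: an orphan
        }

        // 3. The new owner starts its own replicator with the same producer name.
        return connect(name);
    }

    public static void main(String[] args) {
        System.out.println(newOwnerCanStart(false)); // false: name still held by the orphan
        System.out.println(newOwnerCanStart(true));  // true: the late start was refused
    }
}
```

The replay only shows why a restartable Stopped state lets a late start slip through after the unload; it does not model the pending readMoreEntries request or any real networking.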


#21948 fixes the following issues:
topic.unfenceTopicToResume after topic.close failed.
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: x