Skip to content

[fix][broker] Fix leader broker cannot be determined when the advertised address and advertised listeners are configured#21894

Merged
lhotari merged 14 commits into
apache:masterfrom
lhotari:lh-fix-leaderbroker-election-with-advertised-address
Jan 19, 2024
Merged

[fix][broker] Fix leader broker cannot be determined when the advertised address and advertised listeners are configured#21894
lhotari merged 14 commits into
apache:masterfrom
lhotari:lh-fix-leaderbroker-election-with-advertised-address

Conversation

@lhotari

@lhotari lhotari commented Jan 13, 2024

Copy link
Copy Markdown
Member

Fixes #21897

Motivation

There's currently a problem that the leader broker cannot be determined when the advertised address and advertised listeners are configured.
The workaround to the problem is to properly configure an internal advertised listener which matches the advertised address. However, applying the workaround is brittle.

The code base has had inconsistent ways for a unique identifier for a broker. In some cases, there's a dummy "http://" prefix for the identifier even when the port is the https port. This is very confusing.
This PR makes the broker identifier match the "lookup service address" in all cases. The "lookup service address" has been also renamed in this PR to "broker id" since that is the true meaning of the previously called "lookup service address".

Modifications

  • consistently use the "broker id" as the broker's identifier in Pulsar load balancing.
    • Get rid of the dummy "http://" prefix (which isn't a http url since it might point to a https port)
  • for backwards compatibility & mixed version support, add brokerId field to the leader broker information and keep the previous serviceUrl without changing it's meaning and content.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: lhotari#170

@lhotari lhotari added the type/bug The PR fixed a bug or issue reported a bug label Jan 13, 2024
@lhotari lhotari added this to the 3.3.0 milestone Jan 13, 2024
@lhotari lhotari self-assigned this Jan 13, 2024
@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Jan 13, 2024
@lhotari lhotari requested a review from merlimat January 13, 2024 01:22
@lhotari

lhotari commented Jan 13, 2024

Copy link
Copy Markdown
Member Author

bug report on Pulsar Slack, https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1705041494476059
Waiting for the user to create an issue. In the meantime, error messages (to make this PR searchable with error messages):

05:19:57.541 [pulsar-2-5] WARN  org.apache.pulsar.broker.namespace.NamespaceService - The current leader broker LeaderBroker(serviceUrl=http://xx.yy.zz.xy:8080/) isn't active. Handling load manager decisions in a decentralized way. NamespaceBundle[tenant/ns/0x2fffffff_0x3269f69d]

...

05:19:55.176 [pulsar-2-6] WARN  org.apache.pulsar.broker.namespace.NamespaceService - Broker http://xx.yy.zz.xy:8080/ (xx.yy.zz.xy:8080) couldn't be found in available brokers prod-xx-1-pulsar-broker-10:8080,prod-xx-1-pulsar-broker-yy-7:8080,prod-xx-1-pulsar-broker-zz-5:8080,prod-xx-1-pulsar-broker-7:8080,prod-xx-1-pulsar-broker-zz-2:8080,prod-xx-1-pulsar-broker-yy-1:8080,prod-xx-1-pulsar-broker-yy-4:8080,prod-xx-1-pulsar-broker-2:8080,prod-xx-1-pulsar-broker-12:8080,prod-xx-1-pulsar-broker-yy-2:8080,prod-xx-1-pulsar-broker-15:8080

UPDATE: reported as #21897

@codecov-commenter

codecov-commenter commented Jan 13, 2024

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 75.25253% with 49 lines in your changes missing coverage. Please review.

Project coverage is 73.60%. Comparing base (c834feb) to head (96fe11b).
Report is 1277 commits behind head on master.

Files with missing lines Patch % Lines
...sar/broker/loadbalance/impl/LoadManagerShared.java 51.21% 17 Missing and 3 partials ⚠️
...rg/apache/pulsar/broker/web/PulsarWebResource.java 60.86% 6 Missing and 3 partials ⚠️
...apache/pulsar/broker/loadbalance/LeaderBroker.java 14.28% 5 Missing and 1 partial ⚠️
...ache/pulsar/broker/namespace/NamespaceService.java 79.31% 4 Missing and 2 partials ⚠️
...che/pulsar/broker/loadbalance/NoopLoadManager.java 60.00% 2 Missing ⚠️
.../loadbalance/extensions/manager/UnloadManager.java 77.77% 0 Missing and 2 partials ⚠️
...pache/pulsar/broker/admin/impl/NamespacesBase.java 80.00% 1 Missing ⚠️
...ce/extensions/reporter/BrokerLoadDataReporter.java 80.00% 0 Missing and 1 partial ⚠️
...extensions/reporter/TopBundleLoadDataReporter.java 80.00% 0 Missing and 1 partial ⚠️
...broker/loadbalance/impl/SimpleLoadManagerImpl.java 94.11% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #21894      +/-   ##
============================================
- Coverage     73.72%   73.60%   -0.12%     
- Complexity    32405    32406       +1     
============================================
  Files          1859     1861       +2     
  Lines        138273   138592     +319     
  Branches      15158    15182      +24     
============================================
+ Hits         101939   102015      +76     
- Misses        28488    28686     +198     
- Partials       7846     7891      +45     
Flag Coverage Δ
inttests 24.12% <22.22%> (-0.16%) ⬇️
systests 23.65% <19.19%> (-0.25%) ⬇️
unittests 72.90% <73.23%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...n/java/org/apache/pulsar/broker/PulsarService.java 82.31% <100.00%> (+0.16%) ⬆️
...g/apache/pulsar/broker/admin/impl/BrokersBase.java 70.86% <100.00%> (+0.25%) ⬆️
...lsar/broker/loadbalance/LeaderElectionService.java 100.00% <100.00%> (ø)
...ker/loadbalance/extensions/BrokerRegistryImpl.java 84.76% <100.00%> (ø)
...dbalance/extensions/ExtensibleLoadManagerImpl.java 79.56% <100.00%> (-0.40%) ⬇️
...xtensions/channel/ServiceUnitStateChannelImpl.java 84.39% <100.00%> (-0.07%) ⬇️
...e/extensions/policies/IsolationPoliciesHelper.java 81.81% <100.00%> (ø)
...roker/loadbalance/impl/ModularLoadManagerImpl.java 84.65% <100.00%> (-0.20%) ⬇️
...er/loadbalance/impl/ModularLoadManagerWrapper.java 94.44% <100.00%> (+3.14%) ⬆️
...rg/apache/pulsar/broker/service/BrokerService.java 81.09% <ø> (-0.16%) ⬇️
... and 14 more

... and 82 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dragosvictor dragosvictor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix! These URLs were always confusing.

Left a couple of comments below.

Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java Outdated
Comment thread pulsar-broker/src/test/java/org/apache/pulsar/broker/SLAMonitoringTest.java Outdated
@lhotari

lhotari commented Jan 14, 2024

Copy link
Copy Markdown
Member Author

Nice fix! These URLs were always confusing.

Left a couple of comments below.

Thanks for the review and suggestions. I made the suggested changes.

@lhotari

lhotari commented Jan 14, 2024

Copy link
Copy Markdown
Member Author

There's also a test case that reproduces the issue.

@lhotari lhotari requested a review from dragosvictor January 14, 2024 19:04

@dragosvictor dragosvictor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java Outdated
@lhotari lhotari requested a review from mattisonchao January 15, 2024 14:52
@Meet0861

Copy link
Copy Markdown

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@lhotari

lhotari commented Jan 18, 2024

Copy link
Copy Markdown
Member Author

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@Meet0861 I don't think that a cherry-pick patch could be applied cleanly. A backport would be needed.

@lhotari

lhotari commented Jan 19, 2024

Copy link
Copy Markdown
Member Author

There seems to be 2 other patches that are related and might be needed when backporting to older branches: #21015 and #21633 .

lhotari added a commit that referenced this pull request Jan 21, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
@lhotari lhotari modified the milestones: 3.3.0, 3.2.0 Jan 21, 2024
lhotari added a commit that referenced this pull request Jan 26, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
@lhotari

lhotari commented Jan 26, 2024

Copy link
Copy Markdown
Member Author

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@Meet0861 I don't think that a cherry-pick patch could be applied cleanly. A backport would be needed.

@Meet0861 The support for 2.10.x and 2.11.x has ended according to the Apache Pulsar Release Policy. There's a mailing list discussion about making an explicit decision about 2.10 and 2.11 . Please participate in the discussion.

lhotari added a commit that referenced this pull request Jan 26, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 1, 2024
…sed address and advertised listeners are configured (apache#21894)

(cherry picked from commit 3158fd3)
(cherry picked from commit 358d122)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…sed address and advertised listeners are configured (apache#21894)

(cherry picked from commit 3158fd3)
(cherry picked from commit 358d122)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] The leader broker identifier becomes unresolvable if the internal listener is not configured to match the advertised address.

10 participants