-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[fix] Fix mixed lookup/partition metadata requests causing reliability issues and incorrect responses #24832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix] Fix mixed lookup/partition metadata requests causing reliability issues and incorrect responses #24832
Conversation
pulsar-common/src/main/java/org/apache/pulsar/common/protocol/PulsarDecoder.java
Show resolved
Hide resolved
ad79165 to
b66dee8
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #24832 +/- ##
=============================================
+ Coverage 30.69% 74.24% +43.55%
- Complexity 51 33787 +33736
=============================================
Files 1854 1913 +59
Lines 144987 149178 +4191
Branches 16818 17304 +486
=============================================
+ Hits 44502 110760 +66258
+ Misses 93567 29584 -63983
- Partials 6918 8834 +1916
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
nodece
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Although part of the issue has been addressed, I'm still concerned that asynchronous handling may lead to accessing variables after cmd have been cleared (for example, in the interceptor).
Do you mean BTW, I checked the code in Pulsar that |
|
Deep copy is a good idea. |
…y issues and incorrect responses (apache#24832) (cherry picked from commit 4457b08) (cherry picked from commit 4600efe)
…y issues and incorrect responses (apache#24832) (cherry picked from commit 4457b08) (cherry picked from commit 5c4ee53)
…y issues and incorrect responses (apache#24832) (cherry picked from commit 4457b08) (cherry picked from commit 4600efe)
…y issues and incorrect responses (apache#24832) (cherry picked from commit 4457b08)
…y issues and incorrect responses (apache#24832) (cherry picked from commit 4457b08) (cherry picked from commit 5c4ee53)
Motivation
PulsarDecoder unnecessarily keeps a reference to the last parsed message. It's better to clear it immediately to avoid unnecessary memory growth when there are multiple connections.
This mainly impacts
GET_TOPICS_OF_NAMESPACE_RESPONSEwhich can be large.While making this change, it was noticed that this revealed a major reliability issue in Pulsar. Lookup requests and Partition metadata requests were getting reused and potentially mixed up in cases where there would be multiple concurrent lookup requests in progress on the same connection. The impact of this has been that lookup or partition metadata requests have failed and also that they have returned incorrect information.
The problem in these cases has been a violation of the rule documented in #18998. The implementations of
handle*methods shouldn't keep references to the command after the method returns. When the command is handle asynchronously, a copy is usually needed if the asynchronous handler makes a reference to the command.Modifications
cmd.clear()in the finally block ofchannelRead.handleLookup,handlePartitionMetadataRequestandhandleCommandWatchTopicListDocumentation
docdoc-requireddoc-not-neededdoc-complete