-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[improve][client] Deduplicate getTopicsUnderNamespace in BinaryProtoLookupService #24962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improve][client] Deduplicate getTopicsUnderNamespace in BinaryProtoLookupService #24962
Conversation
…ookupService Deduplicate concurrent getTopicsUnderNamespace requests with identical parameters(namespace, mode, topicsPattern, topicsHash) into a single CompletableFuture to reduce duplicate broker lookups. Follows existing deduplication pattern used by getBroker() and getPartitionedTopicMetadata(). Add unit tests verifying deduplication behavior and cleanup after completion. Signed-off-by: Vinkal Chudgar <vinkal.chudgar@gmail.com>
@vinkal-chudgar This change itself is reasonable, but I'm just wondering if this PR is addressing a real problem. How common would it be to have multiple consumers with exactly the same topics pattern using the same Pulsar client instance? |
|
@vinkal-chudgar Would you be interested in handling #24963. I noticed that bug while looking into the details. |
|
@lhotari - To the best of my knowledge, yes, this PR addresses a real case that occurs within a single 1. Is this a real problem?The client currently creates a new future and starts a new request for every call to 2. How common is “same pattern, same client”?It depends on the application, but it is not unusual. Pulsar encourages reusing one
Even if this is not constant traffic, when it happens it amplifies requests. The fix is small and keeps behavior consistent with the other two lookup paths. 3. Do I have this use case myself?No, I do not have a personal production deployment that hits this case. I did see it surface in CI: while working on unrelated changes, a CI run failed on P.S. I have created Issue #24964 with all the required details |
@vinkal-chudgar I was interested to hear what the possible real world impact could have been if you faced this in production. I do agree that it makes sense to make this change to be consistent. |
…ice-gettopics-deduplication
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #24962 +/- ##
=============================================
+ Coverage 38.69% 74.32% +35.62%
- Complexity 13362 34046 +20684
=============================================
Files 1863 1920 +57
Lines 145975 150178 +4203
Branches 16928 17424 +496
=============================================
+ Hits 56487 111620 +55133
+ Misses 81858 29695 -52163
- Partials 7630 8863 +1233
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
…ookupService (apache#24962) Signed-off-by: Vinkal Chudgar <vinkal.chudgar@gmail.com> (cherry picked from commit 1902735) (cherry picked from commit d90da78)
…ookupService (apache#24962) Signed-off-by: Vinkal Chudgar <vinkal.chudgar@gmail.com> (cherry picked from commit 1902735) (cherry picked from commit d90da78)
Fixes #24964
Motivation
BinaryProtoLookupService#getTopicsUnderNamespacecurrently does not deduplicate concurrent requests with identical parameters (namespace,mode,topicsPattern,topicsHash). For each call, it creates a newCompletableFutureand immediately invokes the helper to obtain a connection and send a request, which can cause request amplification when multiple components query the same namespace simultaneously.Current behavior in code
BinaryProtoLookupService#getTopicsUnderNamespaceallocates a new future and invokes the private helper unconditionally (no shared in-flight future).BinaryProtoLookupService#getBrokeruseslookupInProgress.computeIfAbsent(...)with removal inwhenComplete(...).BinaryProtoLookupService#getPartitionedTopicMetadatausespartitionedMetadataInProgress.computeIfAbsent(...)with removal inwhenComplete(...).This inconsistency means
BinaryProtoLookupService#getTopicsUnderNamespacecan issue duplicate network requests for identical inputs, increasing broker load, whileBinaryProtoLookupService#getBrokerandBinaryProtoLookupService#getPartitionedTopicMetadataalready deduplicate.Real impact scenarios in the current codebase
Pattern consumers recheck together.
PatternMultiTopicsConsumerImplcallsclient.getLookup().getTopicsUnderNamespace(...)during periodic rechecks. When multiple pattern consumers recheck at the same time, they make concurrent calls with identical (namespace,mode,topicsPattern,topicsHash). Without deduplication, each call sends a separate request to the brokerSeveral pattern consumers start at once.
During initial subscribe,
PulsarClientImpl#patternTopicSubscribeAsyncalso callsBinaryProtoLookupService#getTopicsUnderNamespace. Creating several pattern consumers around the same time leads to multiple concurrent calls with identical parameters (namespace,mode,topicsPattern,topicsHash); without deduplication, each call becomes a separate broker request.Effect
BinaryProtoLookupService#getBrokerandBinaryProtoLookupService#getPartitionedTopicMetadata, which already coalesce identical in-flight requests.What this PR changes
This PR adds deduplication to
BinaryProtoLookupService#getTopicsUnderNamespace. Concurrent calls with the same (namespace,mode,topicsPattern,topicsHash) now share oneCompletableFutureand one network request, reducing duplicate broker traffic and client work. The implementation follows the same pattern already used byBinaryProtoLookupService#getBrokerandBinaryProtoLookupService#getPartitionedTopicMetadata(coalesce viacomputeIfAbsent, remove onwhenComplete), keeping retry/backoff logic unchanged.Modifications
Added deduplication map
Implemented composite key class
Added
TopicsUnderNamespaceKey(static inner class) to uniquely identify requests by (namespace,mode,topicsPattern,topicsHash)Refactored
BinaryProtoLookupService#getTopicsUnderNamespacemethod:BinaryProtoLookupService#getTopicsUnderNamespacenow:ConcurrentHashMap.computeIfAbsent(...)to atomically coalesce identical in-flight requests and return the sameCompletableFuture.whenComplete(...)on the shared future to remove the map entry after completion (success or failure), matching the pattern used byBinaryProtoLookupService#getBrokerandBinaryProtoLookupService#getPartitionedTopicMetadata.Added unit tests in BinaryProtoLookupServiceTest
Verifying this change
Additional validation:
BinaryProtoLookupServiceTest#testGetTopicsUnderNamespaceDeduplicationVerifies that two in-flight calls with the same (
namespace,mode,topicsPattern,topicsHash) return the sameCompletableFutureand only one call is made to the connection pool; after the shared future completes, a new call with the same parameters returns a newCompletableFutureand makes one more call to the connection pool.BinaryProtoLookupServiceTest#testGetTopicsUnderNamespaceDeduplicationDifferentHashVerifies that calls with different
topicsHashvalues are not combined: each returns a differentCompletableFutureand makes a separate call to the connection pool. Cleanup is per key; completing one future does not affect the other in-flight call.(Tests are deterministic: no sleeps or timing assumptions; they do not execute the network request path)
Personal CI Results
Tested in Personal CI fork: vinkal-chudgar#4
Status: All checks have passed (50 successful checks, 2 skipped)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
docdoc-requireddoc-not-neededdoc-completeMatching PR in forked repository
PR in forked repository: vinkal-chudgar#4