Skip to content

Conversation

@poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Jun 24, 2025

Motivation

Background of deduplication: Always regard markers as non-repetitive
Background of replicated subscription: The internal producer of the replicator will send markers to replicate the subscription ack state to the remote side.

Reproduce steps of the issue

  1. Replicator replicate a user message(user-msg-1) to the remote cluster
  2. Replicator sends a replication marker(marker-1) to the remote cluster
  3. Replicator replicate a user message(user-msg-2) to the remote cluster
  4. The messages [marker, user-msg-2] have not received the send receipt yet.
  5. A disconnection of the internal producer occurs.
  6. Resend the 2 messages: [user-msg-1, marker, user-msg-1].
  7. The remote broker is handling the write of marker-1, because it regards all markers as non-repetitive.
    a. Has not replied to the response to the source cluster yet.
  8. Received the send receipt of user-msg-1, the producer gets a message ID whose value is -1:-1 because it is repeated. It is faster than the message marker-1 because it does not write BK.
    a. Attempts to remove the first pending send. But fails because the first pending send has not received a send receipt yet.
    b. Reconnects.
    c. Loop to Step-4

The send receipt of marker-2 will never be received due to infinite reconnection.

Scenarios that are affected

Background: There are 2` types of Maker: TXN and Replicated Subscription

  • TXN internal component only sends markers, never sends user messages
  • Replicator sends both markers and user messages, which may encounter the issue.

Solutions

  1. Let all receipts of sent in order:
    a. It can hardly be implemented, because the broker should maintain all pending sends for all producers
    b. It will break the behavior defined in the previous version.
    c. It only works for a few scenarios that are used by Replication.
  2. Let the replicator handle the receipts who have out of order(what the current PR does)

Modifications

Fix the issue with solution 2

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

…ve sending of user messages and replication markers
@poorbarcode poorbarcode added this to the 4.1.0 milestone Jun 24, 2025
@poorbarcode poorbarcode self-assigned this Jun 24, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jun 24, 2025
@codecov-commenter
Copy link

codecov-commenter commented Jun 24, 2025

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.27%. Comparing base (bbc6224) to head (874caba).
⚠️ Report is 1253 commits behind head on master.

Files with missing lines Patch % Lines
...pulsar/client/impl/GeoReplicationProducerImpl.java 77.77% 0 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #24453      +/-   ##
============================================
+ Coverage     73.57%   74.27%   +0.70%     
+ Complexity    32624     2342   -30282     
============================================
  Files          1877     1868       -9     
  Lines        139502   145511    +6009     
  Branches      15299    16645    +1346     
============================================
+ Hits         102638   108085    +5447     
+ Misses        28908    28879      -29     
- Partials       7956     8547     +591     
Flag Coverage Δ
inttests 26.66% <0.00%> (+2.08%) ⬆️
systests 23.34% <0.00%> (-0.98%) ⬇️
unittests 73.77% <77.77%> (+0.92%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...pulsar/client/impl/GeoReplicationProducerImpl.java 51.63% <77.77%> (ø)

... and 1088 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@poorbarcode poorbarcode requested a review from gaoran10 July 7, 2025 04:11
@poorbarcode poorbarcode merged commit af27c43 into apache:master Jul 7, 2025
104 of 107 checks passed
poorbarcode added a commit that referenced this pull request Jul 8, 2025
…ve sending of user messages and replication markers (#24453)

(cherry picked from commit af27c43)
poorbarcode added a commit that referenced this pull request Jul 8, 2025
…ve sending of user messages and replication markers (#24453)

(cherry picked from commit af27c43)
poorbarcode added a commit that referenced this pull request Jul 8, 2025
…ve sending of user messages and replication markers (#24453)

(cherry picked from commit af27c43)
codelipenghui pushed a commit to codelipenghui/incubator-pulsar that referenced this pull request Jul 15, 2025
…ve sending of user messages and replication markers (apache#24453)
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
…ve sending of user messages and replication markers (apache#24453)
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025
…ve sending of user messages and replication markers (apache#24453)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants