@codelipenghui codelipenghui commented Oct 15, 2025

Summary

Fixes a critical race condition where ledger trimming could delete ledgers while cursors still pointed to them, causing:

  • Negative backlog values from getNumberOfEntriesInBacklog
  • Cursors pointing to non-existent ledgers after topic reload, which blocked data cleanup

Here is an example of a topic's internal stats that shows the issue:

{
  "entriesAddedCounter" : 947356,
  "numberOfEntries" : 14437764,
  "totalSize" : 7582862344,
  "currentLedgerEntries" : 402806,
  "currentLedgerSize" : 221336947,
  "lastLedgerCreatedTimestamp" : "2025-10-14T16:05:55.171Z",
  "waitingCursorsCount" : 0,
  "pendingAddEntriesCount" : 3,
  "lastConfirmedEntry" : "339371:402802",
  "state" : "LedgerOpened",
  "ledgers" : [ {
    "ledgerId" : 333163,
    "entries" : 76809,
    "size" : 173537945,
    "offloaded" : false,
    "underReplicated" : false
  }, ... {
    "ledgerId" : 339362,
    "entries" : 98219,
    "size" : 47366060,
    "offloaded" : false,
    "underReplicated" : false
  }, {
    "ledgerId" : 339365,
    "entries" : 544553,
    "size" : 288444337,
    "offloaded" : false,
    "underReplicated" : false
  }, {
    "ledgerId" : 339371,
    "entries" : 0,
    "size" : 0,
    "offloaded" : false,
    "underReplicated" : false
  } ],
  "cursors" : {
    "subscription" : {
      "markDeletePosition" : "333160:740812",
      "readPosition" : "339371:402809",
      "waitingReadOp" : true,
      "pendingReadOps" : 0,
      "messagesConsumedCounter" : 947397,
      "cursorLedger" : 339366,
      "cursorLedgerLastEntry" : 1044,
      "individuallyDeletedMessages" : "[(333160:742256..333160:742354],(333163:-1..333163:76808],(335756:-1..335756:153],(335761:-1..335761:16],(335765:-1..335765:193],(335770:-1..335770:16],(335773:-1..335773:13],(335776:-1..335776:32],(335779:-1..335779:98],(335783:-1..335783:19],(335787:-1..335787:120],(335792:-1..335792:350],(335795:-1..335795:40],(335797:-1..335797:71],(335799:-1..335799:161],(335801:-1..335801:32],(335803:-1..335803:100],(335805:-1..335805:293],(335807:-1..335807:30],(335809:-1..335809:58],(335811:-1..335811:1126],(335813:-1..335813:250],(335815:-1..335815:175],(335817:-1..335817:153],(335819:-1..335819:4853],(335821:-1..335821:291],(335823:-1..335823:121],(335825:-1..335825:369],(335827:-1..335827:8149],(335829:-1..335829:208],(335833:-1..335833:5],(335835:-1..335835:16138],(336566:-1..336566:0],(339193:-1..339193:281215],(339197:-1..339197:528770],(339201:-1..339201:567213],(339205:-1..339205:526913],(339209:-1..339209:221685],(339213:-1..339213:513837],(339217:-1..339217:218062],(339221:-1..339221:528247],(339225:-1..339225:281650],(339229:-1..339229:407551],(339233:-1..339233:414568],(339238:-1..339238:412851],(339242:-1..339242:72557],(339249:-1..339249:530444],(339257:-1..339257:523422],(339261:-1..339261:523159],(339265:-1..339265:552096],(339279:-1..339279:61660],(339283:-1..339283:529604],(339289:-1..339289:556833],(339293:-1..339293:541437],(339297:-1..339297:534261],(339306:-1..339306:294508],(339307:-1..339307:188358],(339314:-1..339314:81424],(339319:-1..339319:198963],(339323:-1..339323:521405],(339327:-1..339327:521142],(339342:-1..339342:379129],(339346:-1..339346:531631],(339350:-1..339350:519627],(339354:-1..339354:464938],(339358:-1..339358:252548],(339362:-1..339362:98218],(339365:-1..339365:544552],(339371:-1..339371:402720]]",
      "lastLedgerSwitchTimestamp" : "2025-10-14T15:55:55.155Z",
      "state" : "Open",
      "active" : true,
      "numberOfEntriesSinceFirstNotAckedMessage" : 14437771,
      "totalNonContiguousDeletedMessagesRange" : 69,
      "subscriptionHavePendingRead" : true,
      "subscriptionHavePendingReplayRead" : false,
      "properties" : { }
    }
  },
  "schemaLedgers" : [ ],
  "compactedLedger" : {
    "ledgerId" : -1,
    "entries" : -1,
    "size" : -1,
    "offloaded" : false,
    "underReplicated" : false
  }
}

Root Cause

The issue occurs when:

  1. Messages are acknowledged → cursor position advances in memory immediately
  2. Cursor state persists to BookKeeper asynchronously (can be slow)
  3. Ledger trimming runs during the persistence delay, using the in-memory cursor position
  4. Ledgers get deleted before cursor state is durably saved
  5. On topic reload, cursor reverts to old persistent position pointing to deleted ledgers
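The ordering above can be sketched as a toy model (hypothetical names, not the actual ManagedLedger code; positions are collapsed to a single long):

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the race: on ack, the in-memory mark-delete position advances
// immediately, while the durable position only advances once the asynchronous
// BookKeeper write completes.
public class TrimRaceSketch {
    static final AtomicLong inMemoryMarkDelete = new AtomicLong(100);
    static final AtomicLong persistentMarkDelete = new AtomicLong(100);

    // Ack path: memory first; the persistent position is updated later,
    // in the callback of the asynchronous cursor-ledger write.
    static void acknowledgeUpTo(long pos) {
        inMemoryMarkDelete.set(pos);
        // ... async persist to BookKeeper still in flight ...
    }

    // Unsafe check (pre-fix behavior): trimming by the in-memory position can
    // delete a ledger that the durable cursor state still points to, so a
    // topic reload lands the cursor on a deleted ledger.
    static boolean canTrimUnsafe(long ledgerLastPos) {
        return ledgerLastPos <= inMemoryMarkDelete.get();
    }

    // Safe check (post-fix behavior): only trim up to what is durable.
    static boolean canTrimSafe(long ledgerLastPos) {
        return ledgerLastPos <= persistentMarkDelete.get();
    }

    public static void main(String[] args) {
        acknowledgeUpTo(500);                   // persistence not yet complete
        System.out.println(canTrimUnsafe(400)); // true: would delete a still-referenced ledger
        System.out.println(canTrimSafe(400));   // false: ledger kept until persist completes
    }
}
```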

The Fix

Changed maybeUpdateCursorBeforeTrimmingConsumedLedger() in ManagedLedgerImpl.java:2704-2705 to use the persistent cursor position instead of the in-memory position, keeping it consistent with the mark-delete entry update in CursorContainer performed by cursor.delete().

// Before:
Position lastAckedPosition = cursor.getMarkDeletedPosition();

// After:
Position lastAckedPosition = cursor.getPersistentMarkDeletedPosition() != null
        ? cursor.getPersistentMarkDeletedPosition() : cursor.getMarkDeletedPosition();

This ensures ledgers are only deleted after cursor advancement has been durably persisted to BookKeeper.
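In isolation, the fallback reads like this (a hypothetical standalone rendering, with Position simplified to a record):

```java
// Hypothetical standalone rendering of the fix's fallback: prefer the durably
// persisted mark-delete position, and fall back to the in-memory one only
// when nothing has been persisted yet (e.g. a fresh cursor).
public class PersistentPositionFallback {
    record Position(long ledgerId, long entryId) {}

    static Position effectiveTrimPosition(Position persistent, Position inMemory) {
        return persistent != null ? persistent : inMemory;
    }

    public static void main(String[] args) {
        Position inMemory = new Position(339371, 402802);
        Position durable = new Position(333160, 740812);
        // Durable state wins when it exists; trimming never runs ahead of it.
        System.out.println(effectiveTrimPosition(durable, inMemory));
        // With no durable state yet, the in-memory position is the best available.
        System.out.println(effectiveTrimPosition(null, inMemory));
    }
}
```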

Test Coverage

Added testCursorPointsToDeletedLedgerAfterTrim() in ManagedLedgerTest.java which:

  • Simulates BookKeeper persistence delay (30 seconds)
  • Acknowledges messages asynchronously during the delay
  • Triggers ledger trimming
  • Verifies ledgers are NOT trimmed when persistent position hasn't advanced yet
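The shape of that test can be sketched deterministically with latches (a hedged sketch, not the actual ManagedLedgerTest code): block "persistence" behind a latch, acknowledge in the meantime, and check that a trim decision keyed to the persistent position does not advance until the latch is released.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the test's structure: the persister thread stands in for the
// delayed BookKeeper write; latches make the interleaving deterministic.
public class TrimDuringPersistDelay {

    // Returns {persistent position during the delay, persistent position after}.
    static long[] run() throws InterruptedException {
        AtomicLong inMemoryPos = new AtomicLong(100);
        AtomicLong persistentPos = new AtomicLong(100);
        CountDownLatch allowPersist = new CountDownLatch(1);
        CountDownLatch persistDone = new CountDownLatch(1);

        // Simulated slow BookKeeper write: held back until released.
        Thread persister = new Thread(() -> {
            try {
                allowPersist.await();
                persistentPos.set(inMemoryPos.get());
                persistDone.countDown();
            } catch (InterruptedException ignored) {
            }
        });
        persister.start();

        inMemoryPos.set(500); // ack advances memory immediately

        // Trim attempt during the delay: must not pass durable state.
        long duringDelay = persistentPos.get(); // still 100

        allowPersist.countDown(); // persistence completes
        persistDone.await();
        long afterPersist = persistentPos.get(); // now 500
        return new long[] { duringDelay, afterPersist };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = run();
        System.out.println("trim up to " + r[0] + " during delay, " + r[1] + " after");
    }
}
```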

Verification

Without the fix, the test fails because:

  • First ledger gets trimmed even though persistent cursor still points to it
  • Creates the exact race condition seen in production

With the fix, the test passes because:

  • Trimming respects the persistent cursor position
  • Ledgers are only deleted after cursor state is durably saved

Documentation

- [ ] `doc`
- [ ] `doc-required`
- [x] `doc-not-needed`
- [ ] `doc-complete`

… ledgers

Problem:
- getNumberOfEntriesInBacklog(isPrecise=false) could return negative values
- Cursors could point to deleted ledgers after trim operations
- Root cause: persistence ordering race between cursor advancement and ledger trimming

Root Cause:
When messages are acknowledged:
1. Cursor position advances in memory immediately
2. Cursor state persists to BookKeeper asynchronously
3. If ledger trimming occurs during the persistence delay, it uses the in-memory position
4. Ledgers get deleted before the cursor state is durably saved
5. On topic reload, cursor reverts to old persistent position pointing to deleted ledgers

The Fix:
Changed maybeUpdateCursorBeforeTrimmingConsumedLedger() to use the persistent cursor
position (getPersistentMarkDeletedPosition) instead of the in-memory position
(getMarkDeletedPosition) when determining which ledgers can be safely trimmed.

This ensures ledgers are only deleted after the cursor advancement has been durably
persisted to BookKeeper, preventing the cursor from pointing to deleted ledgers.

Test Coverage:
Added testCursorPointsToDeletedLedgerAfterTrim() which:
- Simulates BookKeeper persistence delay (30 seconds)
- Acknowledges messages asynchronously during the delay
- Triggers ledger trimming
- Verifies ledgers are NOT trimmed when persistent position hasn't advanced

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions

@codelipenghui Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@codelipenghui codelipenghui self-assigned this Oct 15, 2025
@codelipenghui codelipenghui added ready-to-test type/bug The PR fixed a bug or issue reported a bug area/broker category/reliability The function does not work properly in certain specific environments or failures. e.g. data lost labels Oct 15, 2025
@codelipenghui codelipenghui added this to the 4.2.0 milestone Oct 15, 2025
@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Oct 15, 2025
@codecov-commenter

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.30%. Comparing base (cc5e479) to head (c3c24de).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
...che/bookkeeper/mledger/impl/ManagedLedgerImpl.java 0.00% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #24855       +/-   ##
=============================================
+ Coverage     38.36%   74.30%   +35.93%     
- Complexity    13123    33451    +20328     
=============================================
  Files          1856     1913       +57     
  Lines        145070   149281     +4211     
  Branches      16836    17325      +489     
=============================================
+ Hits          55662   110923    +55261     
+ Misses        81843    29521    -52322     
- Partials       7565     8837     +1272     
Flag Coverage Δ
inttests 26.47% <0.00%> (+0.09%) ⬆️
systests 22.74% <0.00%> (-0.03%) ⬇️
unittests 73.81% <0.00%> (+39.28%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...che/bookkeeper/mledger/impl/ManagedLedgerImpl.java 81.23% <0.00%> (+30.90%) ⬆️

... and 1417 files with indirect coverage changes


Member

@lhotari lhotari left a comment


LGTM, good work @codelipenghui

@lhotari lhotari merged commit a091ea7 into apache:master Oct 15, 2025
119 of 132 checks passed
@lhotari
Member

lhotari commented Oct 15, 2025

@codelipenghui Does this problem apply to 3.0.x and 3.3.x?

@codelipenghui
Contributor Author

@lhotari Yes, I think so.

@codelipenghui codelipenghui deleted the fix/cursor-deleted-ledger-negative-backlog branch October 15, 2025 15:34
codelipenghui added a commit that referenced this pull request Oct 15, 2025
… ledgers (#24855)

Co-authored-by: Claude <noreply@anthropic.com>
(cherry picked from commit a091ea7)
Technoboy- pushed a commit that referenced this pull request Oct 23, 2025
… ledgers (#24855)

Co-authored-by: Claude <noreply@anthropic.com>
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 23, 2025
… ledgers (apache#24855)

Co-authored-by: Claude <noreply@anthropic.com>
(cherry picked from commit d56758f)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 6, 2025
… ledgers (apache#24855)

Co-authored-by: Claude <noreply@anthropic.com>
(cherry picked from commit d56758f)

Labels

area/broker category/reliability The function does not work properly in certain specific environments or failures. e.g. data lost cherry-picked/branch-4.0 cherry-picked/branch-4.1 doc-not-needed Your PR changes do not impact docs ready-to-test release/4.0.8 release/4.1.2 type/bug The PR fixed a bug or issue reported a bug
