Skip to content

Conversation

@codelipenghui
Copy link
Contributor

Motivation

The mark delete position may point to ledgers that have been cleaned up/deleted. This causes:

  1. pulsar_storage_backlog_age_seconds metric returns -1 instead of actual backlog age after topic unloading
  2. Prometheus monitoring shows incorrect backlog age for lagged subscriptions

You can use the newly added test to reproduce the issue.

Verifying this change

Added new test

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@codelipenghui codelipenghui added ready-to-test type/bug The PR fixed a bug or issue reported a bug area/broker labels Jul 16, 2025
@codelipenghui codelipenghui self-assigned this Jul 16, 2025
@codelipenghui codelipenghui added this to the 4.1.0 milestone Jul 16, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jul 16, 2025
@codelipenghui
Copy link
Contributor Author

/pulsarbot run-failure-checks

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.29%. Comparing base (bbc6224) to head (15639cf).
Report is 1195 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #24518      +/-   ##
============================================
+ Coverage     73.57%   74.29%   +0.72%     
+ Complexity    32624    32498     -126     
============================================
  Files          1877     1868       -9     
  Lines        139502   145946    +6444     
  Branches      15299    16737    +1438     
============================================
+ Hits         102638   108431    +5793     
- Misses        28908    28922      +14     
- Partials       7956     8593     +637     
Flag Coverage Δ
inttests 26.72% <0.00%> (+2.14%) ⬆️
systests 23.26% <0.00%> (-1.07%) ⬇️
unittests 73.80% <100.00%> (+0.95%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...sar/broker/service/persistent/PersistentTopic.java 80.31% <100.00%> (+1.85%) ⬆️

... and 1101 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codelipenghui codelipenghui merged commit 7282c06 into apache:master Jul 17, 2025
99 of 105 checks passed
@codelipenghui codelipenghui deleted the penghui/fix-backlog-age branch July 17, 2025 18:42
lhotari pushed a commit that referenced this pull request Jul 17, 2025
…tion point to a deleted ledger (#24518)

(cherry picked from commit 7282c06)
lhotari pushed a commit that referenced this pull request Jul 17, 2025
…tion point to a deleted ledger (#24518)

(cherry picked from commit 7282c06)
priyanshu-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 22, 2025
…tion point to a deleted ledger (apache#24518)

(cherry picked from commit 7282c06)
(cherry picked from commit 6c8d558)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jul 24, 2025
…tion point to a deleted ledger (apache#24518)

(cherry picked from commit 7282c06)
(cherry picked from commit 6c8d558)
lhotari pushed a commit to lhotari/pulsar that referenced this pull request Jul 25, 2025
…tion point to a deleted ledger (apache#24518)

(cherry picked from commit 7282c06)
@lhotari
Copy link
Member

lhotari commented Jul 25, 2025

@codelipenghui When backporting this to branch-3.0, the added test fails. (backport in commit lhotari@eaa1670)

[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 32.705 s <<< FAILURE! - in org.apache.pulsar.broker.service.persistent.PersistentTopicProtectedMethodsTest
[ERROR] org.apache.pulsar.broker.service.persistent.PersistentTopicProtectedMethodsTest.testEstimatedTimeBasedBacklogQuotaCheckWithTopicUnloading  Time elapsed: 10.282 s  <<< FAILURE!
org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a org.apache.pulsar.broker.service.persistent.PersistentTopicProtectedMethodsTest expected [true] but found [false] within 10 seconds.
        at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
        at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
        at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
        at org.apache.pulsar.broker.service.persistent.PersistentTopicProtectedMethodsTest.testEstimatedTimeBasedBacklogQuotaCheckWithTopicUnloading(PersistentTopicProtectedMethodsTest.java:130)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
        at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
        at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
        at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.AssertionError: expected [true] but found [false]
        at org.testng.Assert.fail(Assert.java:110)
        at org.testng.Assert.failNotEquals(Assert.java:1577)
        at org.testng.Assert.assertTrue(Assert.java:56)
        at org.testng.Assert.assertTrue(Assert.java:66)
        at org.apache.pulsar.broker.service.persistent.PersistentTopicProtectedMethodsTest.lambda$testEstimatedTimeBasedBacklogQuotaCheckWithTopicUnloading$2(PersistentTopicProtectedMethodsTest.java:132)
        at org.awaitility.core.AssertionCondition.lambda$new$0(AssertionCondition.java:53)
        at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:248)
        at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:235)
        ... 4 more

Do you have a chance to backport this to branch-3.0?

@lhotari
Copy link
Member

lhotari commented Aug 26, 2025

@codelipenghui Do you have a chance to backport this to branch-3.0?

codelipenghui added a commit to codelipenghui/incubator-pulsar that referenced this pull request Aug 26, 2025
…th for backlog quota check

Cherry-picked PR apache#24518 to branch-3.0 with necessary API compatibility fixes:
- Cast ManagedLedger to ManagedLedgerImpl for getFirstPosition() and getNextValidPosition()
- Fixed admin client URL path from "backlogQuotaCheck" to "backlog-quota-check" to match server endpoint

This ensures the backlog quota check test passes and admin.brokers().backlogQuotaCheck() works correctly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit a20358067641808d8f7a792090f070741685602f)
lhotari pushed a commit that referenced this pull request Aug 27, 2025
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 29, 2025
… delete position point to a deleted ledger (apache#24518) (apache#24671)

(cherry picked from commit 41c32d6)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 3, 2025
… delete position point to a deleted ledger (apache#24518) (apache#24671)

(cherry picked from commit 41c32d6)
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Sep 8, 2025
… delete position point to a deleted ledger (apache#24518) (apache#24671)

Signed-off-by: Zixuan Liu <nodeces@gmail.com>
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants