Skip to content

Conversation

@coderzc
Copy link
Member

@coderzc coderzc commented Jun 23, 2025

Motivation

When some ledgers are not offloaded successfully, they will be removed from the metadata store so that seek by timestamp might fail.

Error opening ledger for reading at position 4098289:438 - org.apache.bookkeeper.mledger.ManagedLedgerException$LedgerNotExistException: No such ledger exists on Metadata Server

skipNonRecoverableData is true, but it does not handle the seek case. The workaround is to see by message id.

Modifications

When skipNonRecoverableData is true, OpFindNewest can auto skip non-recoverable data.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jun 23, 2025
@coderzc coderzc closed this Jun 23, 2025
@coderzc coderzc reopened this Jun 23, 2025
@coderzc coderzc self-assigned this Jun 23, 2025
@coderzc coderzc requested review from BewareMyPower, Technoboy-, Copilot, dao-jun, lhotari, poorbarcode and shibd and removed request for Copilot and dao-jun June 23, 2025 11:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Enhance OpFindNewest to skip non-recoverable ledger data during timestamp-based seeks and add end-to-end tests to verify this behavior.

  • Enable autoSkipNonRecoverableData in tests and broker config
  • Add testSeekByTimestampWithSkipNonRecoverableData in SubscriptionSeekTest
  • Add new tests in PersistentMessageFinderTest for auto-skip logic
  • Extend OpFindNewest to handle non-recoverable exceptions and advance search accordingly

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pulsar-broker/src/test/java/org/apache/pulsar/broker/service/SubscriptionSeekTest.java Enable auto-skip and add a timestamp-seek test that deletes ledgers mid-stream
pulsar-broker/src/test/java/org/apache/pulsar/broker/service/PersistentMessageFinderTest.java Configure autoSkipNonRecoverableData and add two tests for message finding with deleted ledgers
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpFindNewest.java Implement handling of NonRecoverableLedgerException in binary search states
Comments suppressed due to low confidence (2)

pulsar-broker/src/test/java/org/apache/pulsar/broker/service/SubscriptionSeekTest.java:627

  • [nitpick] The variable name l is ambiguous and may be mistaken for the digit 1; rename it to nextIndex or similar for clarity.
            int l = i;

pulsar-broker/src/test/java/org/apache/pulsar/broker/service/SubscriptionSeekTest.java:50

  • Apache Commons Lang3 does not provide an ArraySorter utility class; consider using java.util.Arrays.sort(timestamps) to avoid an invalid import.
import org.apache.commons.lang3.ArraySorter;

@coderzc coderzc requested a review from codelipenghui June 23, 2025 11:57
@codelipenghui
Copy link
Contributor

codelipenghui commented Jun 23, 2025

When some ledgers are not offloaded successfully, they will be removed from the metadata store so that seek by timestamp might fail.

@coderzc Could you please elaborate above description? It's not clear about why the ledger will be removed from the metadata store when they are not offloaded successfully.

As I understand, it should be an expected behavior for the ledger removed from the metadata storage. It might cause by BUGs, or users deleted the ledger manually without deleting the topic, right?

@coderzc
Copy link
Member Author

coderzc commented Jun 24, 2025

As I understand, it should be an expected behavior for the ledger removed from the metadata storage. It might cause by BUGs, or users deleted the ledger manually without deleting the topic, right?

Yes, this is an expected behavior, this fix only to unblock seeking by timestamp for topic with non-recoverable ledger. for why this expected behavior occurs, we need to continue investigating.

@coderzc coderzc force-pushed the fix_findnewest_nonRecoverableLedger branch from be0ff77 to 71f263f Compare June 24, 2025 03:56
@coderzc coderzc force-pushed the fix_findnewest_nonRecoverableLedger branch from 71f263f to 54bcb7b Compare June 24, 2025 04:03
@coderzc coderzc force-pushed the fix_findnewest_nonRecoverableLedger branch from 8a7f517 to 17074c2 Compare June 24, 2025 04:05
@coderzc coderzc requested a review from poorbarcode June 24, 2025 04:15
@coderzc
Copy link
Member Author

coderzc commented Jun 25, 2025

@coderzc coderzc merged commit 3e36632 into apache:master Jun 27, 2025
99 of 102 checks passed
@coderzc coderzc added the type/bug The PR fixed a bug or issue reported a bug label Jun 27, 2025
coderzc added a commit that referenced this pull request Jun 27, 2025
coderzc added a commit that referenced this pull request Jun 27, 2025
coderzc added a commit that referenced this pull request Jun 27, 2025
@coderzc coderzc deleted the fix_findnewest_nonRecoverableLedger branch June 27, 2025 02:06
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Jun 27, 2025
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Jun 30, 2025
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jun 30, 2025
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Jun 30, 2025
coderzc added a commit that referenced this pull request Jul 2, 2025
coderzc added a commit that referenced this pull request Jul 2, 2025
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Jul 28, 2025
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Jul 28, 2025
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants