Skip to content

Conversation

@Denovo1998
Copy link
Contributor

@Denovo1998 Denovo1998 commented Oct 16, 2025

Fixes #24849

Motivation

Detailed log as follows: https://gist.github.com/Denovo1998/0707fff19079a6694e4f7edbceddc10a

  • Previously the test fenced the topic (by opening/closing the ManagedLedger externally) before the producer sent any message. If the current ledger is empty (LAC = -1), the broker’s recovery path deletes that empty ledger. Meanwhile, the managed ledger’s metadata update loop still tries to read/write that deleted ledger, leading to BKNoSuchLedgerExistsOnMetadataServerException and putting the ManagedLedger into a Fenced state.
  • Recovery then fails to complete within the producer’s send timeout, so the first send blocks and times out, making the test flaky or consistently failing.
  • By moving the fence after the first message is successfully written, the current ledger is non-empty, recovery proceeds by rolling to a new ledger instead of deleting the empty one, and the test can reliably verify the sequenceId behavior across a transient send error.

Modifications

  • Reordered the test flow in SequenceIdWithErrorTest:
    • Create the producer and send the first message ("Hello-0") before fencing, ensuring the current ledger is not empty.
    • Then fence the topic by externally opening/closing the ManagedLedger (via ManagedLedgerClientFactory), which triggers broker recovery as intended.
    • Continue sending the remaining messages and verify their values and sequenceId.
  • Added a timeout to consumer.receive(..., 10, TimeUnit.SECONDS) to avoid unbounded waits in abnormal conditions.
  • Ensured resources are properly closed (ManagedLedger clientFactory.close() and eventLoopGroup.shutdownGracefully().get()), and the Pulsar client is closed at the end of the test.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: Denovo1998#12

…publish to avoid empty-ledger deletion and send timeout
@codecov-commenter
Copy link

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.29%. Comparing base (ee33c99) to head (8a47bc6).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #24861      +/-   ##
============================================
- Coverage     74.45%   74.29%   -0.16%     
- Complexity    33531    33848     +317     
============================================
  Files          1913     1913              
  Lines        149280   149312      +32     
  Branches      17324    17330       +6     
============================================
- Hits         111140   110933     -207     
- Misses        29341    29550     +209     
- Partials       8799     8829      +30     
Flag Coverage Δ
inttests 26.52% <ø> (-0.40%) ⬇️
systests 22.82% <ø> (-0.07%) ⬇️
unittests 73.81% <ø> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 104 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lhotari lhotari added this to the 4.2.0 milestone Oct 17, 2025
@lhotari lhotari merged commit b0ce575 into apache:master Oct 17, 2025
56 checks passed
Technoboy- pushed a commit that referenced this pull request Oct 23, 2025
…publish to avoid empty-ledger deletion and send timeout (#24861)
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 23, 2025
…publish to avoid empty-ledger deletion and send timeout (apache#24861)

(cherry picked from commit 9ebf5fa)
Technoboy- pushed a commit that referenced this pull request Oct 27, 2025
…publish to avoid empty-ledger deletion and send timeout (#24861)
lhotari pushed a commit that referenced this pull request Oct 28, 2025
…publish to avoid empty-ledger deletion and send timeout (#24861)

(cherry picked from commit b0ce575)
priyanshu-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 29, 2025
…publish to avoid empty-ledger deletion and send timeout (apache#24861)

(cherry picked from commit b0ce575)
(cherry picked from commit fe84bdb)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 30, 2025
…publish to avoid empty-ledger deletion and send timeout (apache#24861)

(cherry picked from commit b0ce575)
(cherry picked from commit fe84bdb)
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Oct 30, 2025
…publish to avoid empty-ledger deletion and send timeout (apache#24861)

(cherry picked from commit b0ce575)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 6, 2025
…publish to avoid empty-ledger deletion and send timeout (apache#24861)

(cherry picked from commit 9ebf5fa)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky-test: SequenceIdWithErrorTest.testCheckSequenceId

4 participants