Skip to content

Conversation

@Denovo1998
Copy link
Contributor

@Denovo1998 Denovo1998 commented Sep 21, 2025

Fixes #xyz

Main Issue: #xyz

PIP: #xyz

Motivation

log: https://gist.github.com/Denovo1998/c48c58c1de0270a4008ae77b6161adfb

  • Running the full ExtensibleLoadManagerImplTest occasionally fails at testDisableBroker. After calling disableBroker(), the broker unregisters and the channel cleans ownerships. However, subsequent checkOwnershipAsync/getOwnerAsync against the disabled broker could still trigger a liveness verification on metadata-based table view, which throws a MetadataStoreException (wrapped by ExecutionException), causing flaky failures in full-suite runs.
  • Meanwhile, ServiceUnitStateChannelTest has explicit expectations on disabled/closed channels:
    • isOwner should still return true for Owned or Splitting states if the local broker is the designated owner/source,
    • getOwnerAsync should not complete immediately for Assigning (dst=local) or Releasing (dst=local), i.e., the future should remain unfinished to simulate “waiting for ownership”.
  • We need to unify the disabled/closed-channel behavior to short-circuit ownership queries without metadata liveness checks, while preserving the above per-state semantics, eliminating flaky behaviors and aligning with test expectations.

Modifications

  • Refined ServiceUnitStateChannelImpl#getActiveOwnerAsync to handle the disabled/closed channel states without performing liveness verification:
    • Owned, Splitting: return the owner (if any) as-is, so isOwner() on a disabled/closed channel remains true when appropriate.
    • Assigning, Releasing:
      • If the dst is the local broker, return an unfinished future (via dedupeGetOwnerRequest) to reflect “waiting for ownership”, matching testActiveGetOwner’s expectation (future is not done).
      • If the dst is another broker, return that broker ID to facilitate upper-layer redirection.
      • If no dst present, return Optional.empty().
    • Other states (Init, Free, Deleted, …): return Optional.empty().
  • For non-disabled/closed channels, keep the original behavior unchanged:
    • If the broker registry is not registered:
      • On metadata-based table view, still fail with MetadataStoreException (unchanged).
      • On in-memory table view, return owner directly (unchanged).
    • Otherwise, continue to dedupe and validate liveness using registry.lookupAsync(newOwner) (unchanged).
  • This change:
    • Prevents MetadataStoreException in disabled/closed scenarios, so testDisableBroker no longer fails randomly.
    • Keeps isOwner semantics on disabled/closed channels for Owned/Splitting.
    • Keeps Assigning/Releasing(dst=local) getOwnerAsync as an unfinished future, matching test expectations.
    • Does not affect the normal STARTED channel behavior nor the existing metadata-based/in-memory table view behaviors.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: Denovo1998#11

@Denovo1998 Denovo1998 changed the title Fix test disable broker [fix][broker] Flaky-test: ExtensibleLoadManagerImplTest.testDisableBroker Sep 21, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 21, 2025
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 22.22222% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.23%. Comparing base (0c6ba1c) to head (e01a9c3).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...xtensions/channel/ServiceUnitStateChannelImpl.java 22.22% 3 Missing and 4 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #24770       +/-   ##
=============================================
+ Coverage     39.04%   74.23%   +35.18%     
- Complexity    13345    33266    +19921     
=============================================
  Files          1844     1901       +57     
  Lines        144311   148412     +4101     
  Branches      16730    17208      +478     
=============================================
+ Hits          56353   110171    +53818     
+ Misses        80460    29465    -50995     
- Partials       7498     8776     +1278     
Flag Coverage Δ
inttests 26.71% <0.00%> (-0.06%) ⬇️
systests 22.76% <0.00%> (-0.08%) ⬇️
unittests 73.77% <22.22%> (+38.59%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...xtensions/channel/ServiceUnitStateChannelImpl.java 86.30% <22.22%> (+9.29%) ⬆️

... and 1397 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@BewareMyPower BewareMyPower merged commit e44e084 into apache:master Sep 28, 2025
148 of 154 checks passed
lhotari pushed a commit that referenced this pull request Oct 4, 2025
lhotari pushed a commit that referenced this pull request Oct 4, 2025
lhotari pushed a commit that referenced this pull request Oct 8, 2025
lhotari pushed a commit that referenced this pull request Oct 8, 2025
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 15, 2025
…oker (apache#24770)

(cherry picked from commit e44e084)
(cherry picked from commit ff70e66)
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 15, 2025
…oker (apache#24770)

(cherry picked from commit e44e084)
(cherry picked from commit edeb4e4)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Oct 30, 2025
…oker (apache#24770)

(cherry picked from commit e44e084)
(cherry picked from commit ff70e66)
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Oct 30, 2025
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Nov 6, 2025
…oker (apache#24770)

(cherry picked from commit e44e084)
(cherry picked from commit edeb4e4)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants