-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[fix][ml] Fix asyncReadEntries might never complete if empty entries are read from BK #24515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix][ml] Fix asyncReadEntries might never complete if empty entries are read from BK #24515
Conversation
dao-jun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Outdated
Show resolved
Hide resolved
|
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #24515 +/- ##
============================================
+ Coverage 73.57% 74.31% +0.74%
+ Complexity 32624 32552 -72
============================================
Files 1877 1869 -8
Lines 139502 146089 +6587
Branches 15299 16760 +1461
============================================
+ Hits 102638 108569 +5931
+ Misses 28908 28900 -8
- Partials 7956 8620 +664
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
@lhotari Could you take a look again? |
lhotari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to submit these minor comments earlier.
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Outdated
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Outdated
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Outdated
Show resolved
Hide resolved
|
@BewareMyPower The "EntryCacheCreator" solution added in this PR is obsolete. There's already a way to intercept reads in Pulsar tests with PulsarMockReadHandleInterceptor. example usage: pulsar/managed-ledger/src/test/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerTest.java Lines 3148 to 3156 in 86b036b
Since the changes to add the "EntryCacheCreator" collide with #24363 changes, I'd suggest that we remove the "EntryCacheCreator" and update the tests added in this PR to use PulsarMockReadHandleInterceptor. I can do a PR to handle that. |
|
the replacement using ReadHandleInterceptor is bkc.setReadHandleInterceptor(new PulsarMockReadHandleInterceptor() {
@Override
public CompletableFuture<LedgerEntries> interceptReadAsync(long ledgerId, long firstEntry, long lastEntry,
LedgerEntries entries) {
if (firstRead.compareAndSet(true, false)) {
// LedgerEntriesImpl doesn't allow empty entries list.
// Implementing a dummy LedgerEntries that returns an empty list.
return CompletableFuture.completedFuture(new LedgerEntries() {
@Override
public org.apache.bookkeeper.client.api.LedgerEntry getEntry(long entryId) {
return null;
}
@Override
public Iterator<org.apache.bookkeeper.client.api.LedgerEntry> iterator() {
return EmptyIterator.INSTANCE;
}
@Override
public void close() {
}
});
} else {
return CompletableFuture.completedFuture(entries);
}
}
}); |
|
Let me try this approach tomorrow |
…are read from BK (apache#24515) (cherry picked from commit eea7c13) (cherry picked from commit c887259)
…are read from BK (apache#24515) (cherry picked from commit eea7c13) (cherry picked from commit c887259)
…are read from BK (apache#24515) (cherry picked from commit eea7c13) (cherry picked from commit f1974aa)
…are read from BK (apache#24515) (cherry picked from commit eea7c13) (cherry picked from commit c887259)
…are read from BK (apache#24515) (cherry picked from commit eea7c13) (cherry picked from commit f1974aa)
…are read from BK (apache#24515) (cherry picked from commit eea7c13)
…are read from BK (apache#24515) (cherry picked from commit eea7c13)
Main Issue: #24497
Motivation
When the
OpReadEntry#readEntriesCompleteis triggered byRangeEntryCacheImpl#readFromStoragewith an empty list of entries, NPE will happen here:pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Line 95 in bc5b933
Then
updateReadPositionandcheckReadCompletionwill be skipped, where the result is very similar to #24497 (comment)asyncReadEntrieswill never completeentriesReadCountwill be updated but thereadPositionwon'tpendingReadOpswill not be reduced to 0Unfortunately, from the existing analysis, the issue of 24497 might not be caused by this bug, but this NPE is still possible.
Modifications
Remove the confusing
lastPositionargument ininternalReadEntriesComplete. It's only set when all entries have been filtered out and it needs to reset thereadPositioninManagedLedgerImpl#internalReadFromLedger. It can be replaced by a simple combination ofupdateReadPositionandcheckReadCompletion.In
internalReadEntriesComplete, just callcheckReadCompletionwhen no entries are read to trigger another round of read.In addition, catch exceptions when triggering user-provided callback and print logs with better information. Without this change, the exception will be caught by
SingleThreadTask#safeRunTaskand the log will beError while running task:that does not have description for a specific cursor or read position.Verifying this change
Make
EntryCacheconfigurable in tests add addtestReadNoEntriesto verify that theasyncReadEntrieswill eventually succeed even if one read operation returns an empty entry list.Documentation
docdoc-requireddoc-not-neededdoc-completeMatching PR in forked repository
PR in forked repository: