Make peer recovery send file chunks async#44040
Conversation
This reverts commit 782920c.
|
Pinging @elastic/es-distributed |
|
Jenkins run elasticsearch-ci/2 (both unrelated failures) |
original-brownbear
left a comment
There was a problem hiding this comment.
Some question/suggestion type comments, in general this looks very nice I think :)
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileReader.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/MultiFileReader.java
Outdated
Show resolved
Hide resolved
|
@original-brownbear Thank you for reviewing. I have addressed your comments. Would you mind taking another look? |
|
Please hold off the review. There's a dead-lock issue after I pushed 5c1977a. |
|
This is ready again. |
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
henningandersen
left a comment
There was a problem hiding this comment.
Thanks for looking into this @dnhatn , I have left a few initial comments to consider.
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/RecoverySourceHandler.java
Outdated
Show resolved
Hide resolved
|
@henningandersen I've pushed e8dfd75 to make MultiFileSender an AsyncIOProcessor. Can you please take another look? Thank you! |
original-brownbear
left a comment
There was a problem hiding this comment.
Looks good in general :)
One suggestion (that might not be so trivial so feel free to ignore for now) and +1 to Yannick's comment on the exception ignoring . We should def. not do that imo.
| } | ||
| listener.onResponse(null); | ||
| } catch (Exception ignored) { | ||
| // we can safely ignore this exception as it happens after we have released the resource and notified the caller. |
There was a problem hiding this comment.
++ I don't think we should have this. If we run into a Exception in the listener.onResponse we should fix the listener to handle that exception? Otherwise we're adding another situation where a listener behaves unexpectedly and we won't even see a log line for it?
| } | ||
|
|
||
| private void addItem(long requestSeqId, StoreFileMetaData md, Exception failure) { | ||
| processor.put(new FileChunkResponseItem(requestSeqId, md, failure), e -> { assert e == null : e; }); |
There was a problem hiding this comment.
I wonder if we should somehow assert that this never blocks? (since AsyncIOProcessor might be using blocking ArrayBlockingQueue#put on the item). It seems to me logically that's not an option with the current code, but maybe something we should make sure of to avoid (admittedly super unlikely) deadlocks in the future?
There was a problem hiding this comment.
The current implementation is perfectly fine as we control the number of possible items in the queue. We can assert the remaining capacity before putting an item but this option subjects to a race condition. Another option is to use "offer" instead of "put" in AsyncIOProcessor if the assertion is enabled and blocking is not allowed. I will think about it more.
There was a problem hiding this comment.
Yea, I didn't have a good idea on how to do this now either. Just figured I'd raise it :) Either way, once we start using this in tests that use DeterministicTaskQueue we'll probably be guarded against deadlocks/blocking anyway :)
|
@ywelsch @original-brownbear Can you take another look? Thank you |
original-brownbear
left a comment
There was a problem hiding this comment.
LGTM, thanks @dnhatn !
|
Thanks everyone :) |
|
I have reverted this change on master and 7.x as the new assertion was tripped. |
Relates #36981
Relates #36195
Supersedes #39769