Don't block on peer recovery on the target side#37076
Conversation
Today we block using the generic threadpool on the target side until the source side has fully executed the recovery. We still block on the source side executing the recovery in a blocking fashion but there is no reason to block on the target side. This will release generic threads early if there are many concurrent recoveries happen. Relates to elastic#36195
|
Pinging @elastic/es-distributed |
ywelsch
left a comment
There was a problem hiding this comment.
The response handler will need to run under GENERIC as both the happy and the retry/fail path do expensive blocking calls. As a follow-up, we can investigate whether we can get rid of RecoveryTarget.cancellableThreads.
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java
Show resolved
Hide resolved
|
@DaveCTurner @ywelsch I pushed changes |
|
folks I added back the cancelableThreads for now, I am not 100% convinced it's not needed and I would like to investigate it in a followup. I will also try to remove it entirely from the |
ywelsch
left a comment
There was a problem hiding this comment.
LGTM.
Yet, give how delicate the place of this call is I would like to only do it in 7.0 and only if we don't see issues backport it maybe later
++
Today we block using the generic thread-pool on the target side until the source side has fully executed the recovery. We still block on the source side executing the recovery in a blocking fashion but there is no reason to block on the target side. This will release generic threads early if there are many concurrent recoveries happen. Relates to #36195
Today we block using the generic thread-pool on the target side
until the source side has fully executed the recovery. We still
block on the source side executing the recovery in a blocking fashion
but there is no reason to block on the target side. This will
release generic threads early if there are many concurrent recoveries
happen.
Relates to #36195