Do not mutate RecoveryResponse#37204
Conversation
|
Pinging @elastic/es-distributed |
server/src/main/java/org/elasticsearch/indices/recovery/RecoveryResponse.java
Outdated
Show resolved
Hide resolved
| out.writeVLong(phase1Time); | ||
| out.writeVLong(phase1ThrottlingWaitTime); | ||
| if (out.getVersion().before(Version.V_7_0_0)) { | ||
| out.writeVLong(0L); // phase1ThrottlingWaitTime - not used |
There was a problem hiding this comment.
it's not used here because it's wrongly implemented (we add the throttle time to the source shard instead of the target shard, see createRecoverySourceHandler in PeerRecoverySourceService. Instead I think it should be send back to the target shard and added there at the end of recovery.
Let's keep the phase1ThrottlingWaitTime here for now, and follow-up with a fix for the actual stats.
There was a problem hiding this comment.
Hmm, we send the source throttle time each FileChunk request to the target
Should phase1ThrottlingWaitTime be the total throttle time on both source and target? Note that we currently use RecoveryReponse only for logging purpose. Anyway, I added phase1ThrottlingWaitTime back.
There was a problem hiding this comment.
Should phase1ThrottlingWaitTime be the total throttle time on both source and target?
yes, we can transfer the knowledge of the source throttle time as part of the response object back to the target, and then add it to the target throttle time there. An alternative would be to transfer info about source throttle time with each file chunk that we send.
|
@ywelsch Thanks for looking. I have addressed your comments. |
ywelsch
left a comment
There was a problem hiding this comment.
s/mutable/mutate/ in PR title :-)
|
@elasticmachine run gradle build tests 2 |
|
@ywelsch thanks for reviewing. |
Today we create a global instance of RecoveryResponse then mutate it when executing each recovery step. This is okay for the current sequential recovery flow but not suitable for an asynchronous recovery which we are targeting. With this commit, we return the result of each step separately, then construct a RecoveryResponse at the end. Relates #37174
Today we create a global instance of RecoveryResponse then mutate it when executing each recovery step. This is okay for the current sequential recovery flow. However, this is not suitable for an asynchronous recovery which we are targeting. With this commit, we return the result of each step separately and construct a RecoveryResponse at the end.
Relates #37174