lfs/transfer_queue: support multiple retries per object#1535
Merged
lfs/transfer_queue: support multiple retries per object#1535
Conversation
ttaylorr
commented
Sep 21, 2016
| for _, t := range batch { | ||
| q.retry(t.(Transferable)) | ||
| var errOnce sync.Once | ||
| for _, o := range batch { |
Contributor
Author
There was a problem hiding this comment.
If eventually we end up grouping retried objects and "fresh" objects in the same batch, we should only fail objects that have exceeded their retry budget, and not all of the items in the batch at once. To keep the old behavior of reporting to the errorc channel the same, I wrapped it in a sync.Once, so it can only happen once.
technoweenie
approved these changes
Sep 22, 2016
Contributor
technoweenie
left a comment
There was a problem hiding this comment.
This is a great small step forward 👍
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull-request teaches the
lfs/transfer_queuehow to retry an object more than once per transfer.Previously
The transfer queue used to follow these steps when executing a transfer:
Add()batchApiRoutineorindividualApiRoutinebased on its capabilities.retryif it failed with a retriable error.retryCollectorThis works fine, but limits the transfer queue to:
Now
Now, the transfer queue knows how to immediately retry an object. It works like this:
Add()it to the next batch, or API channel (in legacy mode)batchApiRoutinefunction to receive a new batch.(The ability to flush a batch was introduced in #1528, and enables the ability to immediately retry an object in a batch.)
In order to make sure that all items are processed before exiting, the way we treat the internal waitgroup
q.waitchanged slightly. Previously,q.waitwas incremented every time we try to preform a transfer on an item. To prevent a situation where theWaitGroupwould reach zero while waiting in between failing an object and retrying it, theWaitGroupis now only incremented the first time an object begins a transfer.The transferqueue also now keeps track of the number of retries made per OID in order to prevent infinitely retrying that transfer. Currently, the maximum number of retries per object is set to 1, in order to keep the same behavior as from before this PR which is why there are no new tests.
I've also left some line comments throughout the code to clarify some further things.
/cc @technoweenie @rubyist @sinbad for 👀 and 💭