Skip to content

feat: Add ResumableUpload interface#5299

Open
innovate-invent wants to merge 7 commits intorclone:masterfrom
innovate-invent:resumable
Open

feat: Add ResumableUpload interface#5299
innovate-invent wants to merge 7 commits intorclone:masterfrom
innovate-invent:resumable

Conversation

@innovate-invent
Copy link
Contributor

What is the purpose of this change?

This adds an interface for fs to implement resumable uploads. This includes generic uploaders that do not require a backend to implement the resumable interface to benefit from the functionality.

This was split out of the TUS http uploader PR to reduce its size

Was the change discussed in an issue or in the forum before?

https://forum.rclone.org/t/http-upload-contribution/13271/69

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate.
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)

@ivandeex
Copy link
Member

ivandeex commented May 3, 2021

Competes with #4547 (Resumer interface)

@ivandeex ivandeex marked this pull request as draft May 3, 2021 09:44
@innovate-invent
Copy link
Contributor Author

@ivandeex @ncw Can this be approved before I go to the trouble of implementing the tests?
There is a competing PR and it would be very disappointing to polish this any more than it already is to not have it be accepted.

I am also not sure why the CI failed.

@ivandeex ivandeex mentioned this pull request May 3, 2021
5 tasks
@ivandeex
Copy link
Member

ivandeex commented May 3, 2021

@innovate-invent
Please refer to #5154 and the discussion in #4547 for my entry requirements to any sequential resumer interface in rclone.

UPDATE
A concrete implementation can be found at https://github.com/rclone/rclone/pull/5185/commits

@innovate-invent
Copy link
Contributor Author

@innovate-invent
Please refer to #5154 and the discussion in #4547 for my entry requirements to any sequential resumer interface in rclone.

Are you referring to the ability to serve a list of hashes? That should be trivial to implement.

@ivandeex ivandeex marked this pull request as ready for review May 4, 2021 05:26
@ivandeex
Copy link
Member

ivandeex commented May 4, 2021

@innovate-invent
Please refer to #5154 and the discussion in #4547 for my entry requirements to any sequential resumer interface in rclone.

Are you referring to the ability to serve a list of hashes? That should be trivial to implement.

Based on discussion in #4547 where you request to remove any mentions of optional rolling hashes from the upload interface I can conclude that implementing it with TUS is not that trivial.

@innovate-invent
Copy link
Contributor Author

@innovate-invent
Please refer to #5154 and the discussion in #4547 for my entry requirements to any sequential resumer interface in rclone.

Are you referring to the ability to serve a list of hashes? That should be trivial to implement.

Based on discussion in #4547 where you request to remove any mentions of optional rolling hashes from the upload interface I can conclude that implementing it with TUS is not that trivial.

Nowhere do I request removing the rolling hash functionality, just relocate. I should also point out that the ResumableUpload interface was designed to serve more than the tus protocol. Implementing hashing in a Hasher interface would be trivial, and the tus uploader service would not use it.

@ivandeex
Copy link
Member

ivandeex commented May 6, 2021

Nowhere do I request removing the rolling hash functionality, just relocate

You request to relocate it to something that does not exist on master or in any pull request as of the time of this writing.

@innovate-invent
Copy link
Contributor Author

innovate-invent commented May 6, 2021

Nowhere do I request removing the rolling hash functionality, just relocate

You request to relocate it to something that does not exist on master or in any pull request as of the time of this writing.

I think you are misunderstanding me. I am not requesting that it be removed from your PR, I am requesting that it be moved out of the resumer interface to elsewhere in the code in your PR.

@ivandeex
Copy link
Member

ivandeex commented May 6, 2021

Nowhere do I request removing the rolling hash functionality, just relocate

You request to relocate it to something that does not exist on master or in any pull request as of the time of this writing.

I think you are misunderstanding me. I am not requesting that it be removed from your PR, I am requesting that it be moved out of the resumer interface to elsewhere in the code in your PR.

PR #4547 is not mine.
It was submitted in September 2020 by github user @mcalman and developed by him since then.
I sent him a request in the beginning of 2021 to add support for features of the chunker backend I think are good. He accepted it.
I also requested to develop unit tests. That request is still open, I am still waiting for the developer (as of the time of this post).
I did not request the interface be composable or whatever since the code seemed clear to me and passed my requerements.
Architectural stuff belongs with overall project health and is under @ncw control.
You or whoever on github can request the author whatever you like. It's up to @mcalman to accept it or reject.
Please be clear and ready to explain your position.
Thanks.

@innovate-invent
Copy link
Contributor Author

@ivandeex I implemented a CRC on the fragment uploader that is available to the backend if it is able to use it

@ivandeex
Copy link
Member

ivandeex commented May 14, 2021

I implemented a CRC on the fragment uploader that is available to the backend if it is able to use it

Thank you for this work.
Resumable uploading is a frequently demanded feature.
AFAIK S3, Dropbox, Local, FTP, SFTP (and probably GDrive, I'm not sure) support it natively.
Using chunker as a helper layer we could cover the rest (like we already do to add MD5/SHA1 support to lacking backends).

Sadly my backlog for 1.56 and 1.57 is pretty crowded.
After that I will sketch a chunker counterpart for your solution. We'll see how it works together.
Expect my review comments then.

Would be great if you could fix the PR test.

@innovate-invent
Copy link
Contributor Author

@ncw I would prefer a decision on which competing PR you will prefer before continuing to build out the tests and backend implementations.

@innovate-invent
Copy link
Contributor Author

@ncw Is there any chance you will be able to review this soon?

@innovate-invent
Copy link
Contributor Author

@ncw Hi Nick, does this mean you have decided to discard my PR? #4547 (review)

@ivandeex
Copy link
Member

ivandeex commented Oct 16, 2021

@ncw @innovate-invent @mcalman

I have an idea to merge #4547 after the release 1.57 - see #4547 (comment).
The merge will give us the draft API/framework, and basic unit tests to secure against breakage.

Then between 1.57 and 1.58, while resume exists solely in beta builds, we will have enough time (2-3 months) to review and merge a series of patches:

  • this one can bring more structure to the Resume API (instead of creating a competing interface),
    secured by unit tests against breakage
  • my commit d14ff61 will improve resume for the local backend (aka resumable downloads) by improving resume checksum selection
    it's currently living on the fs: add Resumer interface #4547 branch but I will make it into independent PR and rebase on top of cleaner API
  • chunker: implement Resumer interface #5185 will let us cover backends where resume is not yet directly available
  • I guess there will be more

Thoughts?

@ivandeex ivandeex added this to the v1.58 milestone Oct 18, 2021
@ivandeex ivandeex mentioned this pull request Oct 18, 2021
@ivandeex
Copy link
Member

ivandeex commented Nov 10, 2021

@innovate-invent
According to your comment your TUS PR (submitted 2 years ago, never able to build) depends on accepting this PR (submitted 6 months ago, much more generic than rclone serve).
This PR according to your words has super structure and thoroughly engineered interfaces.
Unfortunately it's hard to review because it does not build due to import loop cycle.
And it lacks documentation and unit tests.
You promise to add them only if @ncw closes out another PR which started 6 months before yours, because (to your words) its interfaces are bad and because it has a framework while it shouldn't have frameworks.
Thus you created a review cycle, stopping both.
You had several months to at least make your patch build - without prior conditions.
Could you please not create more cycles please? Maybe it's time to get rid of some?

P.S. Resumable transfers is one of the highest demanded features (after iCloud backend and FsNotify).

@innovate-invent
Copy link
Contributor Author

This project began Dec 17 2019, the TUS PR was the original PR before I split out the resumable interface to try and make it easier to review. If you have a read through https://forum.rclone.org/t/http-upload-contribution/13271/69, the interface I am proposing is a second draft that @ncw was supposed to review before I committed to and built out all the tests and documentation. We went back and forth on the specifics of the interface in that thread, refining the constraints and requirements. There are some far reaching implications with how resumable uploads are handled, especially surrounding how state is managed. As far as I am concerned, the resumable interface is still in the development phase and doesn't warrant a complete, let alone compile-able, implementation. I need @ncw to add his notes on the interface before we commit to anything. He has been largely MIA since we left off on the forum.

While I would value input on the FragmentUploader and concatUploader, the main item I am waiting for feedback on is the interfaces:

https://github.com/rclone/rclone/pull/5299/files#diff-c7b54bb28e1654d0ca47364a2c97456c124ee0d3a65854db18d116c509d89136R935-R955

I also can't justify investing any more time in this until I have assurances that my work will not be simply discarded in the end.

@ivandeex
Copy link
Member

Again.
The resume function has been widely requested by users.
I am personally using the alternative PR as a rolling patch for releases on my fork.
It has already saved me many hours resuming multi-gig downloads broken due to a network error.
I would gladly use this PR's patch if it built and passed practice tests.
However, this patch is not even POC or MVP atm

This patch by nature is pretty invasive. Reviewing it just as a design pattern, a piece of code, is not enough.
"Reviewing" in this case rather means "pair coding", unless we want to introduce subtle bugs.
Unfortunately I don't feel that @innovate-invent nor @ncw seem to have time or motivation to get this rolling.

If nothing reasonable happens here in 3 months, I am going to allocate time next January
and produce new PR taking the best (to my taste) from the two alternatives
and replace them both
cc @ncw

@ivandeex
Copy link
Member

ivandeex commented Nov 17, 2021

@innovate-invent
The alternative PR

will not be merged on rclone master until 1.58, nor will it be released with 1.58 i.e. at least 3 months from now.

@ivandeex ivandeex modified the milestones: v1.58, v1.59 Nov 19, 2021
@click0
Copy link

click0 commented Jun 4, 2022

Up.
I really miss uploading large files via FTPS.
I have to use lftp ....

@innovate-invent
Copy link
Contributor Author

This patch by nature is pretty invasive. Reviewing it just as a design pattern, a piece of code, is not enough.
"Reviewing" in this case rather means "pair coding", unless we want to introduce subtle bugs.
Unfortunately I don't feel that @innovate-invent nor @ncw seem to have time or motivation to get this rolling.

I am actually quite motivated to complete this PR. I just have no interest investing tremendous amounts of time on something that will be discarded in the end. This feature needs to be well integrated into the Fs interface, I am not expecting "pair coding" but simply a confirmation on the interface changes. Once the Fs interface is finalized, I can then begin implementing it more fully.

My last communication with @ncw was a back and forth on the interface design. This PR, in its current state, is a demonstration of how the interface would be implemented. I did go ahead and write some utility classes to further demonstrate how the abstractions I am proposing would interact. It doesn't matter how many PRs are opened reimplementing solutions, until the interface is confirmed by @ncw nobody is getting anything merged.

@ncw ncw modified the milestones: v1.59, v1.60 Jul 9, 2022
@ncw ncw removed this from the v1.60 milestone Dec 5, 2022
@Rootax
Copy link

Rootax commented Jul 30, 2023

Is this dead ? It seems a pretty important feature still missing, and potentially needed for other stuff like generic // chunck uploads...

@gfelbing gfelbing mentioned this pull request Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants