chunker: implement Resumer interface #5154

@ivandeex

Description

What is your current rclone version?

1.54.1

What problem are you trying to solve?

This ticket requests the Resume feature in the chunker backend.

from #87 (comment) by @mcalman:

I'm interested in addressing the case when an upload is interrupted for a large file, and must be restarted. It would be nice if the user was able to resume uploading that file from where they left off [...]
I have been looking into using the chunker backend to support an upload resume feature. I noticed that when an upload is done with chunker and is quit during a file upload, the chunks that have already been uploaded are left on the remote, but then ignored. I have been working on modifying rclone chunker to check for these existing chunks, and if present, use them rather than re-upload those chunks.

Note that here we ask only for sequential resume, which is unrelated to the multi-thread upload feature covered by requests #5041 (for chunker) and #4798 (general discussion).

How do you think rclone should be changed to solve that?

from the 1st #4547 (comment):

We can't just chain to the lower backend in the general case. If a file is chunked, its remote will chain to a small metadata object (or nothing if metadata is disabled). If it's not chunked, it can become chunked after resume, but we can't predict that [in the general case].

from the 2nd #4547 (comment):

Chunker can tolerate objects uploaded from multiple clients thanks to transactions [and save partially uploaded chunks per transaction].
Later, upon a resume request it can select the "best" incomplete transaction given the rolling hash state and size of already uploaded chunks.

from the 3rd #4547 (comment):

Golang's Hash interface allows saving/restoring intermediate hash state for any (TBC) type of hash.
[The common Resume handler will] keep it in the resume metadata json together with hash name,
[and will] negotiate with [chunker] whether operation should be continued from the last point or [retry] from the start

from the 4th #4547 (comment):

The use of intermediate (aka rolling or accrued) hashsums will prevent the following scenario:

  • user uploads a large file
  • network broken, upload canceled
  • source file is changed or another attempt is changing the partial upload on target
  • user asks to resume a file
  • rclone resumes (here we could have checked the validity of the partial upload and restarted from scratch)
  • after some hours rclone finds that fingerprint is wrong

from the 5th #4547 (comment):

[Let's] add a new per-transaction control chunk to save info about partial hash and [probably] hashes of uploaded chunks.

[Let's also] add code that selects the transaction to resume given a partial hash and the total size uploaded so far. Maybe select the "best" partial transaction (when rename is fast) or just pick a single partial transaction ID (when it's slow).

The implementation will obey the Resumer interface developed by PR #4547.

In the case of chunker, the resumer cache usage can be somewhat reduced because already uploaded chunks are isolated on the remote and marked with a "transaction ID". The resumer proper will just re-check them based on negotiation with chunker.

NOTE This change will create a new version of the chunker metadata and grow the number of tested combinations. I think we can commit this together with other chunker PRs on a dedicated branch which will produce a beta release for public beta-testing. Later we can merge these commits together from there on the master branch using a single metadata version number.
