Feat/sync retry download by benedikt-schaber · Pull Request #23 · huggingface/hf-hub

benedikt-schaber · 2023-09-11T08:44:03Z

What this PR does

This PR is related to #18. It does not support resuming downloads; rather, it adds retry capabilities to the sync API. It adds a max_retries field to ApiBuilder and Api, which download then uses.
We stream the download to a tempfile and, if we encounter an error, retry at most max_retries times, using range requests to fetch only the data we do not already have.
A new struct, ResumableReader, keeps track of the number of bytes we downloaded successfully; it is basically a reduced version of Progressbar.
We assume that all bytes we did receive are correct.
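
As an illustration of the retry flow described above, here is a minimal sketch that uses only the standard library. It is not the code from this PR: FlakyBody, range_header, download_with_retries and the open_from callback are hypothetical names, and the callback merely stands in for issuing a request with a Range header. Like the description, it assumes that every byte already written is correct.

```rust
use std::io::{self, Read, Seek, Write};

/// Hypothetical stand-in for a response body that breaks after `fail_after` bytes.
struct FlakyBody {
    data: Vec<u8>,
    pos: usize,
    fail_after: Option<usize>,
}

impl Read for FlakyBody {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let mut end = self.data.len();
        if let Some(limit) = self.fail_after {
            if self.pos >= limit {
                return Err(io::Error::new(io::ErrorKind::ConnectionReset, "simulated drop"));
            }
            end = end.min(limit);
        }
        let n = buf.len().min(end - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

/// Value of a Range header asking for everything from `offset` onwards.
fn range_header(offset: u64) -> String {
    format!("bytes={offset}-")
}

/// Copy a body into `out`, reopening it at the current write offset after a
/// failure, at most `max_retries` times. `open_from(offset)` is assumed to
/// return a reader over the bytes starting at `offset`.
fn download_with_retries<W, F, R>(out: &mut W, max_retries: usize, mut open_from: F) -> io::Result<u64>
where
    W: Write + Seek,
    F: FnMut(u64) -> io::Result<R>,
    R: Read,
{
    let mut attempt = 0;
    loop {
        // Everything before the cursor has already been written.
        let offset = out.stream_position()?;
        let result = match open_from(offset) {
            Ok(mut body) => io::copy(&mut body, out),
            Err(e) => Err(e),
        };
        match result {
            Ok(_) => return out.stream_position(),
            Err(_) if attempt < max_retries => attempt += 1,
            Err(e) => return Err(e),
        }
    }
}
```

Passing a std::io::Cursor over a Vec as out, together with a callback whose first FlakyBody fails part-way, makes it possible to check that the second call is opened at the byte where the first one stopped.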

Left ToDo

Use Progressbar with ResumableReader (easy)
Proper Testing

Testing

Testing this functionality properly is a bit harder, since we need to force the request to fail. Further, it would be nice to test that a retry does indeed not start again from the beginning of the file.

There are different ways we could go about achieving this. Here are some I evaluated:

mock an HTTP server
use a mocking library to mock ureq client and response
use something like a hexagonal architecture approach to reduce required mocking

Using a mock server would be simplest. However, if I understood correctly, many existing frameworks, like wiremock or httpmock, do not let us return an OK response (206 in the range case) and then fail while streaming the body. So I think we would be forced to implement a proper server on our own.
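
To give a sense of how small such a hand-written server could be, here is a rough sketch using only std::net. All names (spawn_truncating_server, serve_truncated, fetch_raw) are made up for illustration. The server handles a single connection, ignores the requested Range and always claims the full range, so it is a test double rather than a correct HTTP server. How a particular client surfaces the truncated body is not shown here.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Accept one connection, promise `promised` body bytes, send only `sent`,
/// then close the connection.
fn spawn_truncating_server(promised: usize, sent: usize) -> io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?; // port chosen by the OS
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        if let Ok((stream, _)) = listener.accept() {
            let _ = serve_truncated(stream, promised, sent);
        }
    });
    Ok(addr)
}

fn serve_truncated(stream: TcpStream, promised: usize, sent: usize) -> io::Result<()> {
    // Read and discard the request headers up to the blank line.
    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 || line == "\r\n" {
            break;
        }
    }
    let mut stream = reader.into_inner();
    let last = promised.saturating_sub(1);
    write!(
        stream,
        "HTTP/1.1 206 Partial Content\r\n\
         Content-Range: bytes 0-{last}/{promised}\r\n\
         Content-Length: {promised}\r\n\
         Connection: close\r\n\r\n"
    )?;
    stream.write_all(&vec![b'x'; sent])?;
    Ok(()) // dropping `stream` here closes the connection mid-body
}

/// Bare-bones client used to observe what the server actually sent.
fn fetch_raw(addr: SocketAddr) -> io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(b"GET /file HTTP/1.1\r\nHost: localhost\r\nRange: bytes=0-\r\n\r\n")?;
    let mut raw = Vec::new();
    stream.read_to_end(&mut raw)?;
    Ok(raw)
}
```

A test would point the real client at the returned address and expect the first attempt to fail after sent bytes.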

The second would be a bit annoying, since we use not only the client but also the response (although we just turn that into a Reader), but it would be doable. Of course, we would not test whether the Reader we get from the request truly behaves as we expect (and as our mock does), but I think this is acceptable.

The third would be much like the second, reducing what we need to mock but also changing some things about the architecture, which may or may not be acceptable.

So this is the main area where I am looking for feedback, since I think whatever choice is taken will affect the rest of the project.

Additionally, if we also want to test that it does not just always resume from the start, we could run into issues with std::io::copy, since we do not know its buffer size. But that should be rather easy to resolve, so I will look into it once a basic testing strategy is outlined.
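
One possible way around the unknown buffer size, sketched here as an assumption rather than as what the PR ended up doing, is a small copy loop with a caller-chosen chunk size, so that a test knows exactly how many bytes were written before a failure. copy_chunked and FailAfter are hypothetical names.

```rust
use std::io::{self, Read, Write};

/// Copy in chunks of at most `chunk` bytes. On failure, return the number of
/// bytes written so far together with the error.
fn copy_chunked<R: Read, W: Write>(
    reader: &mut R,
    writer: &mut W,
    chunk: usize,
) -> Result<u64, (u64, io::Error)> {
    let mut buf = vec![0u8; chunk.max(1)]; // a zero-length buffer would look like EOF
    let mut written = 0u64;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(written),
            Ok(n) => {
                writer.write_all(&buf[..n]).map_err(|e| (written, e))?;
                written += n as u64;
            }
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err((written, e)),
        }
    }
}

/// Test double: passes through at most `remaining` bytes from `inner`,
/// then fails every further read.
struct FailAfter<R> {
    inner: R,
    remaining: usize,
}

impl<R: Read> Read for FailAfter<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.remaining == 0 {
            return Err(io::Error::new(io::ErrorKind::ConnectionReset, "simulated drop"));
        }
        let cap = buf.len().min(self.remaining);
        let n = self.inner.read(&mut buf[..cap])?;
        self.remaining -= n;
        Ok(n)
    }
}
```

With a FailAfter that allows 40 bytes and a chunk size of 16, the loop writes 16, 16 and 8 bytes before the simulated error, so the expected resume offset is exactly 40.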

Narsil · 2023-09-21T06:51:03Z

+        let mut reader = reader::ResumableReader::new(reader, current);
+        std::io::copy(&mut reader, file)?;


Given this is only linearly increasing, couldn't we just Seek into the file?
That would make ResumableReader unnecessary (less code is almost always better).

@McPatate FYI

benedikt-schaber · 2023-09-21

Yes, you are right. file is already at the right position, but we could SeekFrom::End(0) to get the length and use that as current. I will implement the change later today.

Narsil · 2023-09-21

You don't need SeekFrom::End, I think; the response Content-Length should already give that information.
And even then you don't need it either, since you would be using an increasing Seek.

Ohh, I just realized what you meant. SeekFrom::End would be equal to the partial length. I see, yes, that works too (actually, wouldn't the cursor already be there naturally, given io::copy's nature?).

benedikt-schaber · 2023-09-21

The cursor is already there, but we need its byte position to plug into the Range header. We could also use SeekFrom::Current(0) (or the equivalent .stream_position()). SeekFrom::End(0) would have the advantage that we could also use it directly with .incomplete files once we implement that; however, that would also be easily achieved by just doing it once at the beginning.

SeekFrom::Current(0) might have slightly better performance (I would have to check how it is implemented), so we might prefer it.

Or am I missing another method of retrieving the necessary byte offset?
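
For reference, the options discussed in this thread can be compared side by side. This is a small standard-library sketch; the helper name offsets is made up, and it works on anything that implements Seek, such as a File or an in-memory Cursor.

```rust
use std::io::{self, Seek, SeekFrom};

/// Return the offset as reported by stream_position(), SeekFrom::Current(0)
/// and SeekFrom::End(0), in that order.
/// Note: the last call moves the cursor to the end of the stream.
fn offsets<S: Seek>(file: &mut S) -> io::Result<(u64, u64, u64)> {
    let by_stream_position = file.stream_position()?;
    let by_current = file.seek(SeekFrom::Current(0))?;
    let by_end = file.seek(SeekFrom::End(0))?;
    Ok((by_stream_position, by_current, by_end))
}
```

Right after a sequential write all three agree. They differ once the cursor is not at the end, which is the case for a freshly opened partial file: the first two report the cursor, while SeekFrom::End(0) reports the length.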

Narsil · 2023-09-26

Sorry for the delay. Maybe file.metadata() then? I don't know which is the simplest.
Also, is there a way to fuse the retry loop with the outer "regular" logic?

benedikt-schaber · 2023-10-03

Please also excuse the late response. If I am not mistaken, file.metadata() would require us to first call file.sync_all(), so I do not think it is desirable. I have used stream_position for now (see next commit).
I'll look into combining the logic this evening.


Narsil

After waaaayy too long, sorry about that, LGTM.

benedikt-schaber · 2024-12-26T00:00:14Z

Hey, thanks for coming back to it.
I had a very busy time back then, so definitely not the best communication on my part, sorry.
I hope I can contribute more and more efficiently in the future.

Narsil · 2024-12-27T10:31:17Z

No, totally my bad, better late than never in any case.

Thanks a lot

benedikt-schaber marked this pull request as ready for review September 11, 2023 08:44

Narsil reviewed Sep 21, 2023

benedikt-schaber and others added 5 commits December 26, 2024 00:38

Add max_retires field to sync api

aeb37b8

Add basic retry capability to sync download

5fc0470

Add simple test for download with retries

e3a2a01

without simulating a failure

Use stream position when continuing download

8a9c54d

and remove ResumableReader

Fix conflict.

733600d

Narsil force-pushed the feat/sync-retry-download branch from 6b7d7e9 to 733600d Compare December 25, 2024 23:40

Narsil approved these changes Dec 25, 2024

Narsil merged commit edda880 into huggingface:main Dec 25, 2024

benedikt-schaber deleted the feat/sync-retry-download branch December 26, 2024 08:05

Narsil merged 5 commits into huggingface:main from benedikt-schaber:feat/sync-retry-download
