feat: Change which download errors are retryable#3229
feat: Change which download errors are retryable#3229Bravo555 merged 2 commits intothin-edge:mainfrom
Conversation
Robot Results
|
Codecov ReportAll modified and coverable lines are covered by tests ✅ Additional details and impacted files📢 Thoughts on this report? Let us know! |
Whereas previously we considered only some errors to be permanent and the rest to be retryable, that logic was inverted so we check if an error is one of a few that is transient, and the rest are considered permanent. Biggest changes: - connect errors are considered transient - timeouts are considered transient Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
5ee377a to
8e0ab51
Compare
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
| let mut downloader = Downloader::new(target_path, None, CloudRootCerts::from([])); | ||
| downloader.set_backoff(ExponentialBackoff { | ||
| current_interval: Duration::ZERO, | ||
| max_elapsed_time: Some(Duration::ZERO), |
There was a problem hiding this comment.
I'm a bit puzzled by this change. As far as I understand, this disables the backoff since it will cause next_backoff to always return None, so we will never retry?
There was a problem hiding this comment.
Yes. As now the connect errors are considered transient, it causes DNS resolution failure to be considered transient, which caused the test to hang instead of immediately returning.
I think the intention behind setting current_interval to zero would be to disable retries in the test, but it wasn't actually doing anything, and only max_elapsed_time set to zero actually disables the retries.
But another question is should DNS resolution failure be considered transient. On one hand we can timeout during the query because the network is down, then we should retry, but if the network is up and the domain does not exist, then we should return ASAP.
I'll check if we can extract information from reqwest::Error so that we can distinguish between these 2 cases.
There was a problem hiding this comment.
But another question is should DNS resolution failure be considered transient.
Yes DNS resolution errors are transient and should be retried, see #2321
There was a problem hiding this comment.
I'm a bit puzzled by this change. As far as I understand, this disables the backoff since it will cause
next_backoffto always returnNone, so we will never retry?
I was puzzled too, till I noticed this is only in the scope of a test.
Proposed changes
Whereas previously we considered only some errors to be permanent and the rest to be retryable, that logic was inverted so we check if an error is one of a few that is transient, and the rest are considered permanent.
Biggest changes:
Types of changes
Paste Link to the issue
Checklist
cargo fmtas mentioned in CODING_GUIDELINEScargo clippyas mentioned in CODING_GUIDELINESFurther comments