fix: use proper HTTP clients to fetch files#66
Conversation
Add ``--timeout`` option in chisel cut so that users can specify their preferred limit for each request to avoid context deadline error [0]. Duration (with unit) should be specified as the argument of the new option. A timeout of zero means no timeout. Default timeout is 60 seconds. Additionally, increase the hardcoded Timeout in the http client in archive.go to match the default 60 seconds limit. Fixes canonical#14. References: - [0] canonical#14
cjdcordeiro
left a comment
There was a problem hiding this comment.
Nice, thanks. Just a couple of comments and a question: is it possible to have a test for this in archive_test?
--timeout option in cut command--timeout option in cut command
Pass --real-archive option in go test to run this test with real archives.
|
Added a test for the new option. Please make sure to pass the |
Cool! |
The --real-archive flag specified in archive_test is used to test real archives and timeout limits. Run these tests in the github workflow.
|
I was a bit hesitant to run the tests with Additionally, the new |
There was a problem hiding this comment.
I'd prefer the timeout to be passed down via archive.Options but that may be tricky as in the Archive interface we don't use http.Client directly but rather just the global httpDo() function. Apparently it's possible to create interruptible requests via NewRequestWithContext() but that would need to be cancelled from another goroutine.
In conclusion, LGTM. :-)
niemeyer
left a comment
There was a problem hiding this comment.
There's a detail I mention below about the option name, but reviewing this code made me realize that the implementation is currently misusing a concept that I've ported from earlier applications that I wrote a while ago. Note how we have two HTTP clients: httpCilent, and bulkClient, with different timeouts explicitly defined. The distinction exists precisely because one of them provides interactivity for smaller API-like content, while the latter provides a configuration for bulk data (IOW, large downloads, such as packages and Content files).
So this change is in effect working around the fact we're misusing these two values: it's strugling with the client made for quick interactions, and attempting to turn it into the bluk client. We might as well actually use the two clients properly instead.
So here is the proposal: can we please drop the explicit flag for now, and make this PR simply change the implementation to use the two different clients correctly, for the proper cases?
We can do that by simply introducing a flag parameter to the fetch() function:
type fetchFlags uint
const (
fetchBulk fetchFlags = 1 << iota
fetchDefault fetchFlags = 0
)
func (a ...) fetch(..., flags fetchFlags) ... { ... }
Then, packages can use the bulk client, while indexes can use the short timeout one.
How does that sound?
This reverts commit 62b704e.
|
Hiya @niemeyer, thanks for the insight! I also wondered what the intentions of Although you suggested using the shorter timeout Perhaps, using the bulk client for downloading indexes doesn't seem intuitive or natural. Let me know if you want to use the shorter timeout |
--timeout option in cut command
niemeyer
left a comment
There was a problem hiding this comment.
Thanks for the changes.
Fixes #14.
As suggested below by Gustavo, use the short timeout
httpClientand long timeoutbulkClientin archive.go properly. Use the quickhttpClientto fetch small files such as Release files. For packages and indexes, use thebulkClientas the files can be quite big.