CacheableResponse forces download of entire HTTP response #2034

@katrinafyi

Originally posted by @jcharaoui in #27

I'm having a similar issue where a website I want to check has a lot of download links, but lychee is struggling to complete checking all the links and timing out on many of them.

Example URL: https://dist.torproject.org/torbrowser/16.0a2/tor-browser-linux-x86_64-16.0a2.tar.xz

When lychee checks this link with the GET method, it downloads the whole file; with the HEAD method, the check completes in a matter of milliseconds.

The inconsistency observed is likely due to variations in the configurations of the various HTTP daemons out there.

It looks like the entire response body is cached, which forces the whole body to be read even when it's not needed (i.e., when fragment checking is disabled).

impl CacheableResponse {
    async fn try_from(response: Response) -> Result<Self> {
        let status = response.status();
        let headers = response.headers().clone();
        let url = response.url().clone();
        let text = response.text().await.map_err(ErrorKind::ReadResponseBody)?;
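
One possible direction is to read the body only when a later step actually needs it. The sketch below illustrates that idea with standard-library types only; it is not lychee's or reqwest's API. `MockResponse`, `CachedResponse`, `CountingReader`, `cache_response`, and the `need_body` flag are all hypothetical names introduced for illustration.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for an HTTP response whose body is streamed on demand.
struct MockResponse<R: Read> {
    status: u16,
    body: R,
}

/// Hypothetical cached view of a response.
/// `text` is `None` when the body was never read.
#[derive(Debug)]
struct CachedResponse {
    status: u16,
    text: Option<String>,
}

/// Reader wrapper that counts how many bytes were pulled from the body,
/// so the example can show whether the body was touched at all.
struct CountingReader<R> {
    inner: R,
    bytes_read: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes_read += n;
        Ok(n)
    }
}

/// Build the cached view. The body is read only if `need_body` is true
/// (assumption: this would correspond to fragment checking being enabled).
fn cache_response<R: Read>(
    mut response: MockResponse<R>,
    need_body: bool,
) -> io::Result<CachedResponse> {
    let text = if need_body {
        let mut buf = String::new();
        response.body.read_to_string(&mut buf)?;
        Some(buf)
    } else {
        // Skip the body entirely; status (and, in a real client, headers)
        // are already available without downloading it.
        None
    };
    Ok(CachedResponse {
        status: response.status,
        text,
    })
}
```

Calling `cache_response(response, false)` returns the status without pulling a single byte from the body, while `cache_response(response, true)` reads it in full. A real fix would have to make the same decision before awaiting the response text.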
