Skip to content

Do not assume input to be a URL if the local path doesn't exist #1595

@mre

Description

@mre

Context

At the moment, lychee uses the http:// scheme as a fallback for local input paths that don't exist:

> lychee foo
Error: Network error

Caused by:
    0: error sending request for url (http://foo/)
    1: client error (Connect)
    2: dns error: failed to lookup address information: nodename nor servname provided, or not known
    3: failed to lookup address information: nodename nor servname provided, or not known

The idea was to model the behavior after curl, which does the same:
https://github.com/curl/curl/blob/70ac27604a2abfa809a7b2736506af0da8c3c8a9/lib/urlapi.c#L1104-L1124

Issue

Getting the scheme guessing correct is quite tricky.

Furthermore, it can be misleading if the user assumes a local path exists, but doesn't. In this case, we make an unexpected network request.
This can even go unnoticed in CI/CD in case there happens to be a URL with the same name.

For example, assume we want to check a ZIP archive (which isn't supported, but might be in the future).
Furthermore, assume the file doesn't exist.
If we run:

lychee --dump-inputs url.zip
http://url.zip/

This would assume http://url.zip/ instead! 💥
(Yes, .zip is a TLD.)

Proposal

I propose to remove the fallback to http. This is a breaking change, which we should make before 1.0.

Necessary changes

Remove the condition here and return an error instead:

// Invalid path; check if a valid URL can be constructed from the input
// by prefixing it with a `http://` scheme.
// Curl also uses http (i.e. not https), see
// https://github.com/curl/curl/blob/70ac27604a2abfa809a7b2736506af0da8c3c8a9/lib/urlapi.c#L1104-L1124
let url = Url::parse(&format!("http://{value}")).map_err(|e| {
ErrorKind::ParseUrl(e, "Input is not a valid URL".to_string())
})?;
InputSource::RemoteUrl(Box::new(url))

Add a test to prove correct behavior.

Pull requests greatly appreciated.

Metadata

Metadata

Assignees

No one assigned

    Type

    No fields configured for Task.

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions