User agent not set when fetching command-line URLs #1886

@katrinafyi

These two commands behave differently. The second sets the User-Agent header to lychee/0.21.0 (the default value), but the first does not set it at all.

cargo run -- https://webhook.site/a1895fe4-5d07-47e2-bbfe-28982b72d16f
echo 'https://webhook.site/a1895fe4-5d07-47e2-bbfe-28982b72d16f' | cargo run -- -

By going to the view link, you can see the two requests. (The link might expire - you can make your own at https://webhook.site).

This is related to the url_contents function, which implements the input-fetching logic. It is a separate code path from the main link-checking request logic, which explains the discrepancy. (IMO, it would be good to unify the code paths; this would also help with recursion.)
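
To illustrate what unifying the two code paths could look like, here is a minimal, std-only sketch. Everything in it is hypothetical: `Request`, `RequestBuilder`, `fetch_input_request`, and `check_link_request` are illustrative stand-ins, not lychee's actual types or functions. The point is only that both paths build their requests from one shared configuration, so the User-Agent cannot be set on one path and missing on the other.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an outgoing HTTP request (not lychee's real type).
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
    headers: BTreeMap<String, String>,
}

/// Hypothetical shared builder holding the settings both code paths need.
#[derive(Debug, Clone)]
struct RequestBuilder {
    user_agent: String,
}

impl RequestBuilder {
    fn new(user_agent: &str) -> Self {
        Self { user_agent: user_agent.to_string() }
    }

    /// Every request, regardless of which path asks for it, gets the same headers.
    fn build(&self, url: &str) -> Request {
        let mut headers = BTreeMap::new();
        headers.insert("User-Agent".to_string(), self.user_agent.clone());
        Request { url: url.to_string(), headers }
    }
}

/// Assumed analogue of the input-fetching path (URLs given on the command line).
fn fetch_input_request(builder: &RequestBuilder, url: &str) -> Request {
    builder.build(url)
}

/// Assumed analogue of the main link-checking path.
fn check_link_request(builder: &RequestBuilder, url: &str) -> Request {
    builder.build(url)
}
```

With a single builder created from the configured user agent (for example `RequestBuilder::new("lychee/0.21.0")`), both functions produce requests carrying an identical User-Agent header for the same URL.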

The missing User-Agent header is the underlying cause of #1767 (comment), where a Wikipedia page given on the command line appears to contain no links at all.

This issue was suggested by @thomas-zahner in #1883 (comment).
