
planemo lint --urls UserAgent? #578

@peterjc


Just noticed via https://travis-ci.org/peterjc/galaxy_blast/jobs/162157454 that, however planemo accesses the URL, it can be blocked, apparently based on the User-Agent. Here it is linting https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_makeblastdb.xml

Applying linter tool_urls... FAIL
.. ERROR: HTTP Error 403 accessing https://www.ncbi.nlm.nih.gov/books/NBK279690/
.. INFO: URL OK http://dx.doi.org/10.1186/s13742-015-0080-7
.. INFO: URL OK http://dx.doi.org/10.1186/1471-2105-10-421
.. INFO: URL OK http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus

Curiously, the URL https://www.ncbi.nlm.nih.gov/books/NBK279690/ is fine in web browsers or with curl, but wget also gets the 403 error:

$ wget https://www.ncbi.nlm.nih.gov/books/NBK279690/
--2016-09-23 14:54:10--  https://www.ncbi.nlm.nih.gov/books/NBK279690/
Resolving www.ncbi.nlm.nih.gov... 130.14.29.110, 2607:f220:41e:4290::110
Connecting to www.ncbi.nlm.nih.gov|130.14.29.110|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2016-09-23 14:54:11 ERROR 403: Forbidden.

You might argue that when linting the user-facing documentation (the reStructuredText in the <help> tag) we should fetch the link with a standard web-browser User-Agent, while for package download URLs the default (arbitrary) User-Agent makes sense, since those URLs would be used for programmatic downloads (e.g. in Python when installing via the ToolShed).
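
To illustrate the idea, here is a minimal sketch of a URL check that can optionally send a browser-like User-Agent. It is not planemo's actual code: the names `build_request`, `check_url` and `BROWSER_USER_AGENT` are hypothetical, the User-Agent string is just an example value, and I am assuming plain `urllib` from the Python 3 standard library.

```python
import urllib.error
import urllib.request

# Hypothetical browser-like User-Agent string, chosen for illustration only.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0"
)


def build_request(url, browser_like=False):
    """Return a Request, optionally carrying a browser-like User-Agent."""
    headers = {"User-Agent": BROWSER_USER_AGENT} if browser_like else {}
    return urllib.request.Request(url, headers=headers)


def check_url(url, browser_like=False, timeout=10):
    """Return (ok, message) for a URL, loosely mirroring the lint output."""
    request = build_request(url, browser_like=browser_like)
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True, "URL OK %s" % url
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 403.
        return False, "HTTP Error %s accessing %s" % (e.code, url)
    except urllib.error.URLError as e:
        # No usable response at all (DNS failure, refused connection, ...).
        return False, "URL Error %s accessing %s" % (e.reason, url)
```

With something like this, links found in the <help> text could be checked with `check_url(url, browser_like=True)`, while package download URLs keep the default `check_url(url)`. Whether NCBI would actually accept this particular User-Agent string is untested.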
