Closed
Description
Linting this file fails: https://github.com/peterjc/pico_galaxy/blob/801daf8dc7932a087eb83a96d0be1e99ed0447c3/tools/chromosome_diagram/chromosome_diagram.xml
$ planemo --version
planemo, version 0.33.0.dev0
$ planemo shed_lint --tools --urls tools/chromosome_diagram/ ; echo "Return code $?"
(snip)
Applying linter tool_urls... FAIL
.. ERROR: URL Error <urlopen error unknown url type: pmid> accessing pmid:19304878
.. INFO: URL OK http://dx.doi.org/10.1093/bioinformatics/btp163
Failed linting
Return code 1
Or,
$ planemo lint --urls tools/chromosome_diagram/ ; echo "Return code $?"
...
Applying linter tool_urls... FAIL
.. ERROR: URL Error <urlopen error unknown url type: pmid> accessing pmid:19304878
.. INFO: URL OK http://dx.doi.org/10.1093/bioinformatics/btp163
Failed linting
Return code 1
This is triggered by the RST help text in the tool XML file:
Cock et al 2009. Biopython: freely available Python tools for computational
molecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.
http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.
It appears that pmid:19304878 is wrongly being picked up as a URL, even though it has no double slash after the colon.
planemo/planemo/shed/__init__.py, line 137 in 6a6f164:
# http://stackoverflow.com/questions/7676255/find-and-replace-urls-in-a-block-of-te
>>> import re
>>> HTTP_REGEX_PATTERN = re.compile(r"""(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>\[\]]+|\(([^\s()<>\[\]]+|(\([^\
... \s()<>\[\]]+\)))*\))+(?:\(([^\s()<>\[\]]+|(\([^\s()<>\[\]]+\)))*\)|[^\s`!(){};:'".,<>?\[\]]))""")
>>> HTTP_REGEX_PATTERN.findall("\nSee pmid:12345678 for details.")
[('pmid:12345678', '', '', '', '')]
>>> HTTP_REGEX_PATTERN.findall("\nSee http://example.org or pmid:12345678.")
[('http://example.org', '', '', '', ''), ('pmid:12345678', '', '', '', '')]
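For what it's worth, here is a minimal sketch of why the match happens and one possible way around it. `SCHEME_BRANCH` is my own simplified stand-in for the first alternative of the pattern above, not the full planemo regex. `CHECKABLE_PREFIXES` and `checkable_urls` are hypothetical names, not existing planemo code. The idea is that the scheme part accepts either one to three slashes or any letter, digit or `%` after the colon, so `pmid:1...` qualifies. Filtering candidates by an allow-list of prefixes before calling `urlopen` would avoid the error.

```python
import re

# Simplified stand-in for the first alternative of the pattern above (an
# assumption for illustration): a scheme-like word, a colon, then EITHER
# one to three slashes OR a single letter, digit or percent sign.
SCHEME_BRANCH = re.compile(r"(?i)\b[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])")

# Hypothetical allow-list of prefixes the linter could reasonably try to open.
CHECKABLE_PREFIXES = ("http://", "https://", "ftp://")


def checkable_urls(candidates):
    """Keep only candidates that start with an allow-listed scheme prefix."""
    return [c for c in candidates if c.lower().startswith(CHECKABLE_PREFIXES)]


# The digit after the colon satisfies the second option, so no slash is needed.
print(SCHEME_BRANCH.match("pmid:12345678") is not None)

# Only the real URL survives the filter.
print(checkable_urls(["http://example.org", "pmid:12345678"]))
```

Note that in this sketch scheme-less matches such as `www.example.org` would also be skipped. Whether that is acceptable, or whether they should be given an `http://` prefix first, is a separate design choice.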