Link Checker broken due to excessive requests to GitHub #474

@effigies

GitHub seems to be rate-limiting requests, and we're now getting a lot of results like:

URL        `https://github.com/bids-standard/bids-specification/pull/13'
Name       `#13'
Parent URL file:///root/build/site/CHANGES.html, line 1067, col 41
Real URL   https://github.com/bids-standard/bids-specification/pull/13
Check time 0.397 seconds
Size       1KB
Result     Error: 429 too many requests

See, for example, https://app.circleci.com/pipelines/github/bids-standard/bids-specification/1076/workflows/7acda833-9ecf-4c1a-a6f7-51f45641fbc1/jobs/2434.

We're running linkchecker twice, though I don't understand the explanatory comment on the second invocation:

linkchecker -t 1 ~/build/site/
# check external separately by pointing to all *html so no
# failures for local file:/// -- yoh found no better way,
linkchecker -t 1 --check-extern --ignore-url 'file:///.*' --ignore-url https://fonts.gstatic.com ~/build/site/*html ~/build/site/*/*.html

@yarikoptic Any thoughts? Maybe we can adjust the commands to ensure there's no duplication? If linkchecker could output a list of targets that could be combined and deduplicated, that might be a useful approach.
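
To make the deduplication idea concrete, here is a rough sketch that collects the external link targets from the built HTML itself rather than from linkchecker output. Everything in it is an assumption for illustration: the helper names `external_links` and `unique_external_links` are hypothetical, it uses only the Python standard library, and it walks the site directory recursively, which is broader than the `*html` and `*/*.html` globs in the command above.

```python
from html.parser import HTMLParser
from pathlib import Path


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(html_text):
    """Return the set of http(s) targets in one HTML document.

    Fragments are dropped so that page#a and page#b count as one request.
    """
    parser = _HrefCollector()
    parser.feed(html_text)
    return {
        href.split("#", 1)[0]
        for href in parser.hrefs
        if href.startswith(("http://", "https://"))
    }


def unique_external_links(site_dir):
    """Sorted union of external links across every *.html file under site_dir."""
    links = set()
    for page in Path(site_dir).rglob("*.html"):
        links |= external_links(page.read_text(encoding="utf-8", errors="replace"))
    return sorted(links)
```

Something like `unique_external_links("build/site")` would then yield each external URL once, no matter how many pages link to it. Whether linkchecker can be pointed at such a list directly is something I have not verified.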
