External link support was not built because fetching remote content is slow and flaky. Ideas:
- Support sitemap.xml only, or at least attempt to use it as a fast path.
- Maybe have the user cache/store sitemaps for all external domains so flakiness can be kept in check.
- Add a subcommand to generate sitemap.xml for one's own static site.
Why do it this way? Because our actual use case is only checking links from docs.sentry.io to sentry.io. Both are static sites we control, so we could make sure everything has sitemaps and still get away with very fast builds. sentry.io already has a sitemap.
However, a general-purpose external link checker would probably need to support real HTTP requests, maybe backed by a local cache file. Also, sitemap.xml doesn't work for anchor checking, since it lists pages and not the anchors inside them.
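
A minimal sketch of the sitemap fast path, in Python for illustration only (not the tool's actual implementation). It assumes the sitemap has already been fetched or read from a local cache as text; `urls_from_sitemap`, `check_link`, and the sample URLs are hypothetical names made up for this example. Trailing slashes are normalized as a simplifying assumption, and sitemap index files are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text: str) -> set[str]:
    """Collect every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        # Assumption: URLs that differ only by a trailing slash are the same page.
        loc.text.strip().rstrip("/")
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text
    }


def check_link(link: str, known_urls: set[str]) -> bool:
    """Return True if the page the link points to is listed in the sitemap.

    The fragment is dropped before the lookup: a sitemap lists pages,
    so it cannot tell us whether an anchor exists on that page.
    """
    page, _fragment = urldefrag(link)
    return page.rstrip("/") in known_urls


# Hypothetical sample data, standing in for a cached sitemap.xml.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://sentry.io/pricing/</loc></url>
  <url><loc>https://sentry.io/welcome/</loc></url>
</urlset>"""

known = urls_from_sitemap(sample)
```

With the sitemap loaded once into a set, each external link becomes a constant-time lookup with no network request, which is what keeps builds fast. Anchors would still need the real page content.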