SEO issue: do not use useLocation() to compute canonical urls #9170

@slorber

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

The way we compute the canonical url today:

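import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import useBaseUrl from '@docusaurus/useBaseUrl';
import {useLocation} from '@docusaurus/router';
import {applyTrailingSlash} from '@docusaurus/utils-common';

function useDefaultCanonicalUrl() {
  const {
    siteConfig: {url: siteUrl, baseUrl, trailingSlash},
  } = useDocusaurusContext();
  // Problem: pathname is read from the current browser URL, not from the route.
  const {pathname} = useLocation();
  const canonicalPathname = applyTrailingSlash(useBaseUrl(pathname), {
    trailingSlash,
    baseUrl,
  });
  return siteUrl + canonicalPathname;
}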

Using useLocation().pathname works in most cases, but it is fragile: the pathname is a dynamic value that depends on the current browser URL. The canonical URL written to the static HTML files may be correct, but once React hydrates, it is recomputed from the browser URL and can change to something else.

Notably, this happens when you configure aliases in your CDN or reverse proxy. If a doc exists at /doc1 and you also serve it at /doc1alias, then visiting /doc1alias yields a canonical URL of /doc1alias after React hydrates, instead of /doc1 (i.e. two canonical URLs for the same doc).
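To make the alias case concrete, here is a minimal sketch in plain TypeScript. It is not Docusaurus code: buildCanonicalUrl is a hypothetical, simplified stand-in for the hook above (no baseUrl or trailingSlash handling), and https://example.com is a placeholder site URL.

```typescript
// Hypothetical, simplified stand-in for the canonical URL computation.
// Assumption: no baseUrl or trailingSlash handling, just origin + pathname.
function buildCanonicalUrl(siteUrl: string, pathname: string): string {
  // Strip trailing slashes from the site URL so we never emit "//".
  const origin = siteUrl.replace(/\/+$/, "");
  const path = pathname.startsWith("/") ? pathname : `/${pathname}`;
  return origin + path;
}

const siteUrl = "https://example.com"; // placeholder site URL

// Build time: the page is rendered for the route it was generated from.
const staticCanonical = buildCanonicalUrl(siteUrl, "/doc1");

// After hydration: the pathname comes from the browser, which is on the alias.
const hydratedCanonical = buildCanonicalUrl(siteUrl, "/doc1alias");

console.log(staticCanonical); // https://example.com/doc1
console.log(hydratedCanonical); // https://example.com/doc1alias
```

The same function yields two different canonical URLs for the same document, purely because its pathname input changed between build and hydration.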

I'm not sure it's a big deal for SEO, since crawlers probably extract the static canonical URL from the page, which is correct before React hydration, but we should still try to find a solution.

Note that such reverse proxy aliases might be common. They are also discussed as a good solution if you want to have docs version aliases: see also #9049

Similarly, hreflang values depend on useLocation and can be wrong on aliased documents.
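The hreflang case can be sketched the same way. The helper below is a hypothetical simplification, not the Docusaurus implementation: it assumes the default locale is served at the site root and every other locale under /<locale>, and it uses a placeholder site URL and locale list.

```typescript
// Simplified, hypothetical model of hreflang URL generation.
// Assumption: default locale at the site root, other locales under /<locale>.
function buildHreflangUrls(
  siteUrl: string,
  locales: string[],
  defaultLocale: string,
  pathname: string,
): Record<string, string> {
  const origin = siteUrl.replace(/\/+$/, "");
  const alternates: Record<string, string> = {};
  for (const locale of locales) {
    const prefix = locale === defaultLocale ? "" : `/${locale}`;
    alternates[locale] = origin + prefix + pathname;
  }
  return alternates;
}

// Pathname taken from the route the page was built for.
const fromRoute = buildHreflangUrls("https://example.com", ["en", "fr"], "en", "/doc1");

// Pathname taken from the browser while visiting the alias.
const fromAlias = buildHreflangUrls("https://example.com", ["en", "fr"], "en", "/doc1alias");

console.log(fromRoute.fr); // https://example.com/fr/doc1
console.log(fromAlias.fr); // https://example.com/fr/doc1alias
```

Because every alternate is derived from the same pathname, an aliased pathname leaks into all locale variants at once.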

Related to #9128

Reproducible demo

No response

Steps to reproduce

We don't have any doc alias in our prod website, but the 404 case is a great example.

Take a look at https://docusaurus.io/not/found/path

  • before hydration, canonical URL is https://docusaurus.io/404.html
  • after hydration, canonical URL is https://docusaurus.io/not/found/path
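One way to check this kind of mismatch is to compare the canonical link in the served HTML with the one in the hydrated DOM. The sketch below works on HTML strings; extractCanonicalHref is a hypothetical helper, the two HTML snippets are hand-written illustrations (not actual page output), and only the two URLs come from the observation above.

```typescript
// Hypothetical helper: returns the href of the first <link rel="canonical"> tag.
// Regex-based on purpose to stay dependency-free; a real check would use an HTML parser.
function extractCanonicalHref(html: string): string | undefined {
  const linkTag = html.match(/<link\b[^>]*\brel=["']canonical["'][^>]*>/i)?.[0];
  if (!linkTag) {
    return undefined;
  }
  return linkTag.match(/\bhref=["']([^"']*)["']/i)?.[1];
}

// Illustrative snippets standing in for the page head before and after hydration.
const beforeHydration =
  '<head><link rel="canonical" href="https://docusaurus.io/404.html"></head>';
const afterHydration =
  '<head><link rel="canonical" href="https://docusaurus.io/not/found/path"></head>';

const canonicalBefore = extractCanonicalHref(beforeHydration);
const canonicalAfter = extractCanonicalHref(afterHydration);

// The metadata is stable only if both values are identical.
const isStable = canonicalBefore === canonicalAfter;
console.log({canonicalBefore, canonicalAfter, isStable});
```

In a browser, the "after" value can be read directly with document.querySelector('link[rel="canonical"]') once the page has hydrated.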

Expected behavior

The canonical URL, hreflang, and other metadata derived from the pathname should always be the same before and after React hydration.

Actual behavior

The values differ before and after hydration.

Your environment

No response

Self-service

  • I'd be willing to fix this bug myself.

Metadata

Labels

bug: An error in the Docusaurus core causing instability or issues with its execution
