Stamp sitemap index entries with per-file lastmod #16837

Merged
ematipico merged 1 commit into
withastro:mainfrom
jdevalk:sitemap-index-per-file-lastmod
May 25, 2026
Conversation

@jdevalk jdevalk commented May 22, 2026

Contributor

Fixes #16838.

Changes

The problem (#16838): @astrojs/sitemap writes per-URL <lastmod> into the child sitemaps but never into the <sitemap> entries of sitemap-index.xml. The index gets a <lastmod> only if you set the global lastmod option, and then every entry carries the same date. So the index cannot tell a crawler which child sitemap actually changed — even though the freshness data is already computed and sitting in the child sitemaps.

This PR derives each index entry's <lastmod> from the child sitemap it points to:

  • Each <sitemap> entry is stamped with the most recent <lastmod> among the URLs that land in that file. URLs are written in source order, limit per file, so the date is computed from items.slice(i * limit, (i + 1) * limit).
  • Works for both chunked (chunks) and non-chunked output, and stays accurate when a sitemap overflows into multiple numbered files.
  • When a child sitemap has no per-URL lastmod, the entry falls back to the configured lastmod option — existing behaviour preserved.
  • customSitemaps entries keep using the global lastmod (there are no items to derive a date from).
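
The per-file derivation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `SitemapUrl` and `IndexEntry` shapes, the `newestLastmod` and `buildIndexEntries` helpers, and the `sitemap-${i}.xml` naming are all hypothetical and chosen only to mirror the `items.slice(i * limit, (i + 1) * limit)` slicing and the fallback to a configured `lastmod`.

```typescript
// Hypothetical shapes: only the fields this sketch needs.
interface SitemapUrl {
  url: string;
  lastmod?: string; // ISO 8601 date string, when known
}

interface IndexEntry {
  loc: string;
  lastmod?: string;
}

// Newest lastmod among the given items, or undefined if none carries one.
function newestLastmod(items: SitemapUrl[]): string | undefined {
  let newest: number | undefined;
  for (const item of items) {
    if (item.lastmod === undefined) continue;
    const time = Date.parse(item.lastmod);
    if (Number.isNaN(time)) continue; // skip unparseable dates
    if (newest === undefined || time > newest) newest = time;
  }
  return newest === undefined ? undefined : new Date(newest).toISOString();
}

// One index entry per child file. Items are assumed to be written in
// source order, `limit` per file, so file i holds items[i*limit, (i+1)*limit).
function buildIndexEntries(
  items: SitemapUrl[],
  limit: number,
  baseUrl: string,
  fallbackLastmod?: string, // stands in for the configured global lastmod
): IndexEntry[] {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const fileCount = Math.ceil(items.length / limit);
  const entries: IndexEntry[] = [];
  for (let i = 0; i < fileCount; i++) {
    const slice = items.slice(i * limit, (i + 1) * limit);
    // Prefer the date derived from the file's own URLs; otherwise fall back.
    const lastmod = newestLastmod(slice) ?? fallbackLastmod;
    entries.push({
      loc: `${baseUrl}/sitemap-${i}.xml`,
      ...(lastmod ? { lastmod } : {}),
    });
  }
  return entries;
}
```

With a `limit` of 2 and five items, the sketch yields three entries: a file whose URLs carry dates gets the newest of them, and a file whose URLs carry none gets the fallback value (or no `lastmod` at all if no fallback is passed).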

Before / after, for the reproduction in #16838:

<!-- before -->
<sitemap><loc>https://example.com/sitemap-0.xml</loc></sitemap>

<!-- after -->
<sitemap><loc>https://example.com/sitemap-0.xml</loc><lastmod>2024-09-15T00:00:00.000Z</lastmod></sitemap>

The changeset is patch. It is a behaviour change for anyone setting per-URL lastmod via serialize (their index now carries accurate per-file dates), so happy to bump to minor if preferred.

Testing

New test/index-lastmod.test.ts:

  • Chunked — distinct lastmod values across blog/glossary chunks; asserts each index entry surfaces the newest date in its child sitemap, and that a chunk with no per-URL lastmod falls back to the configured lastmod.
  • Non-chunked, multiple files: with entryLimit: 1, each URL gets its own file; asserts every index entry's lastmod equals the date in the child sitemap it points to (exercises the per-file slicing for i > 0).

Full @astrojs/sitemap suite passes (40/40). biome, eslint, knip, and tsc -b are clean.

Docs

No docs change needed — this refines the existing lastmod behaviour with no new or changed API surface. The lastmod option keeps working as a fallback for child sitemaps without per-URL dates.

The sitemap index gave every `<sitemap>` entry the same global `lastmod`,
so crawlers could not tell which child sitemaps actually changed. Each
index entry is now stamped with the newest `lastmod` of the URLs in the
child sitemap it points to, falling back to the configured `lastmod`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

changeset-bot Bot commented May 22, 2026

🦋 Changeset detected

Latest commit: 0b6f306

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 8 packages

| Name | Type |
| --- | --- |
| @astrojs/sitemap | Patch |
| @test/sitemap-chunks | Patch |
| @test/sitemap-dynamic | Patch |
| @test/sitemap-i18n-fallback | Patch |
| @test/sitemap-ssr | Patch |
| @test/sitemap-static | Patch |
| @test/sitemap-trailing-slash | Patch |
| @test/astro-vercel-integration-assets | Patch |

@github-actions github-actions Bot added the pkg: integration Related to any renderer integration (scope) label May 22, 2026
@ematipico

Member

The problem with this PR is that there's no issue filed that actually shows there's a problem. And I admit, the listed changes don't help frame the issue. Hence, I don't know what I'm reviewing.

jdevalk commented May 22, 2026

Contributor Author

Thanks @ematipico, fair — there was no issue and the framing was thin. Fixed both:

The short version of what's being reviewed: index <lastmod> should reflect the child sitemap it points to (so crawlers can tell which child changed), and today it never does.

@ematipico ematipico left a comment

Member

Thank you!

@ematipico ematipico merged commit 783c4a6 into withastro:main May 25, 2026
24 checks passed
@astrobot-houston astrobot-houston mentioned this pull request May 25, 2026
@jdevalk jdevalk deleted the sitemap-index-per-file-lastmod branch May 25, 2026 10:40

Development

Successfully merging this pull request may close these issues.

@astrojs/sitemap: <lastmod> is missing from sitemap-index.xml entries