Skip to content

Improve content collection styles and scripts build perf#10959

Merged
ematipico merged 7 commits intomainfrom
improve-cc-handling
May 8, 2024
Merged

Improve content collection styles and scripts build perf#10959
ematipico merged 7 commits intomainfrom
improve-cc-handling

Conversation

@bluwy
Copy link
Copy Markdown
Member

@bluwy bluwy commented May 6, 2024

Changes

This PR refactors how we handle propagated styles and scripts in the Rollup generateBundle phase.

Context

A propagated module, e.g./Users/.../blog.mdx?astroPropagatedAsset, contains styles and scripts. The styles and scripts are only attached to the head if that blog.mdx is actually rendered, prevent other content styles and scripts from leaking.

Before


The data structure we used to track propagated modules to it styles and scripts is something like this:

Map {"/src/pages/index.astro" => Map {
    "/Users/.../blog1.mdx?astroPropagatedAsset" => { styles: Set {}, scripts: Set {} }
    "/Users/.../blog2.mdx?astroPropagatedAsset" => { styles: Set {}, scripts: Set {} } 
  },
  "/src/pages/other.astro" => Map {
    "/Users/.../blog1.mdx?astroPropagatedAsset" => { styles: Set {}, scripts: Set {} }
  }
}

This is great because we have information of which top-level pages uses what propagated modules and its styles and scripts. However, (correct me if I'm wrong) I noticed we're not using this information at all.

The only place we consume the information is here, but you can notice that both propagated styles searching and propagated scripts searching do the same logic, so in practice the work done in all places are kinda negated.

After

This PR simplifies and flattens it to:

// internals.propagatedStylesMap
Map {
    "/Users/.../blog1.mdx?astroPropagatedAsset" => Set {}
    "/Users/.../blog2.mdx?astroPropagatedAsset" => Set {}
 }

// internals.propagatedScriptsMap
Map {
    "/Users/.../blog1.mdx?astroPropagatedAsset" => Set {}
    "/Users/.../blog2.mdx?astroPropagatedAsset" => Set {}
 }

This way there's only a single "propagated module to styles/scripts" map, each key deduplicated, and we can save the work to figure out which page uses what propagated module, which was very expensive.

This is actually the pattern that Content Collection Cache used, and seems like it's on the right track abstracting this. So this PR also unifies how it's handled with CCC.

Performance

The goal of this refactor is to reduce calls to getParentModuleInfos, before it took 7.5s, now it takes 0.05s. Seems like making the work O(n^2) to O(n) is quite significant.

Testing

Existing tests should pass. I think we have some comprehensive tests around this. I also clicked around the local docs build and all looks good as before.

Docs

Added a changeset because I'm a bit afraid this may break some edge cases.

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented May 6, 2024

🦋 Changeset detected

Latest commit: 0cdf248

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added the pkg: astro Related to the core `astro` package (scope) label May 6, 2024
It was actually a bug. There was an empty module script injected.
@github-actions github-actions bot added the pkg: integration Related to any renderer integration (scope) label May 6, 2024
@natemoo-re
Copy link
Copy Markdown
Member

Oooh that's great, nice work identifying this! I wouldn't be surprised if there were a few other inefficient data structures lurking around this part of the codebase...

@bluwy bluwy added this to the 4.8.0 milestone May 7, 2024
@bluwy
Copy link
Copy Markdown
Member Author

bluwy commented May 7, 2024

Marked this for 4.8 as I'd like to merged this in a minor together (to be safe)

Copy link
Copy Markdown
Member

@sarah11918 sarah11918 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, @bluwy , and happy to have content collections improvements! Tiny tweak to the changeset mostly so I didn't have to do any grammar research. 😅 (And, collections plural). Otherwise, approved for docs! 🥳

Co-authored-by: Sarah Rainsberger <sarah@rainsberger.ca>
@ematipico ematipico merged commit 685fc22 into main May 8, 2024
@ematipico ematipico deleted the improve-cc-handling branch May 8, 2024 09:24
@astrobot-houston astrobot-houston mentioned this pull request May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pkg: astro Related to the core `astro` package (scope) pkg: integration Related to any renderer integration (scope)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants