Skip to content

perf: make fast-glob always ignore node_modules while allowing patterns to include node_modules#7436

Closed
brillout wants to merge 3 commits intovitejs:mainfrom
brillout:feat/perf-glob
Closed

perf: make fast-glob always ignore node_modules while allowing patterns to include node_modules#7436
brillout wants to merge 3 commits intovitejs:mainfrom
brillout:feat/perf-glob

Conversation

@brillout
Copy link
Contributor

Description

This is a performance improvement on top of @frandiox's PR #6056.

Additional context

Hydrogen and vps frameworks (will) use import.meta.glob for crawling inside node_modules/, e.g. import.meta.glob('node_modules/some-framework/**/*.page.js'). This PR increases the performance by, potentially, several orders of magnitude.

What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

Before submitting the PR, please make sure you do the following

  • Read the Contributing Guidelines.
  • Read the Pull Request Guidelines and follow the Commit Convention.
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. fixes #123).
  • Ideally, include relevant tests that fail without this PR but pass with it.

@patak-cat
Copy link
Member

Interesting optimization. Did you perform some profiling? It would be great to have some numbers for real apps.

One issue is that this is potentially a breaking change in case a pattern will look into node_modules/package/node_modules, no? Should we detect this case and bailout? Maybe it isn't needed, I don't think nobody is using it in this way.

Added to discuss with the team.

@patak-cat patak-cat added the p2-nice-to-have Not breaking anything but nice to have (priority) label Mar 24, 2022
@brillout
Copy link
Contributor Author

The number of files in node_modules is much higher than the number of files in user land.

For example, the demo vps framework https://github.com/brillout/vite-plugin-ssr/tree/0.4.0/examples/framework. The framework/node_modules/ has 3102 files while framework/ without node_modules/ has only 15 files. So that's 200x less files for fast-glob to go through. For Hydrogen and future vps frameworks, we can expect this to be much more than 200x.

This means fast-glob will be much faster if we skip node_modules.

One issue is that this is potentially a breaking change in case a pattern will look into node_modules/package/node_modules, no? Should we detect this case and bailout? Maybe it isn't needed, I don't think nobody is using it in this way.

Yes, this PR makes import.meta.glob('node_modules/framework/**/*.page.js') ignore node_modules/framework/node_modules/some-library/pages/some.page.js. I also don't think anyone uses it: I don't see a use case for it, and Fran's PR is quite recent.

Added to discuss with the team.

As always, I can join the discussion as a guest, if there is interest. (I think, in general, inviting guests that the Vite team trusts to join team meetings for specific PRs could make sense.)

@patak-cat
Copy link
Member

We talked about the PR today @brillout.

Since we now have added { as: 'raw' } to import.meta.glob, we think that it is better to avoid adding a new caveat. We can directly add an option to let the user ignore certain patterns:

import.meta.glob('node_modules/framework/**/*.page.js', { 
  ignore: 'node_modules/framework/**/node_modules/**' 
})

We didn't discuss this, but given that we support fast glob arrays in options like optimizeDeps.entries where we can pass ignored patterns using !. We could also support this as:

import.meta.glob([
  'node_modules/framework/**/*.page.js',
  '!node_modules/framework/**/node_modules' 
])

I don't know if we should support both, so we probably need to keep iterating. But I think we will end up with one of these two options.

@brillout
Copy link
Contributor Author

Makes sense, I like it, and PR is done: #7482.

Closing this in favor of the new PR.

@brillout brillout closed this Mar 26, 2022
@brillout brillout deleted the feat/perf-glob branch September 18, 2024 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

p2-nice-to-have Not breaking anything but nice to have (priority)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

2 participants