feat(core): targeted file scanner#6614
Conversation
🦋 Changeset detectedLatest commit: 90d8516 The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
I think the second issue you linked is incorrect |
|
Good catch, fixed. |
ematipico
left a comment
There was a problem hiding this comment.
I checked the tests and the implementation, and overall I think that want, good job! I left a comment regarding the --config-path implementation, where I have some doubts.
It would be great if we could add new tests in the monorepo cases, where we use --config-path.
There was a problem hiding this comment.
Should we start documenting the scanner's behaviour on our website? Imho, we should. Not just for the users, but for us too, so we document the business logic of Biome, and when PRs are raised against it, we can verify if issue is in line with the docs or not.
If we shouldn't, then we should at least document it internally somewhere. At the moment, the business logic is scattered across files/modules, so having a central place to document the business logic might help us in the future
There was a problem hiding this comment.
Yes, agreed. I think we can document it on the website, we just need to be careful to be not too specific to the point that people may start to rely on implementation details where it would become a breaking change if we improve the behaviour.
| // directory and configuration directory, we use whichever one is higher | ||
| // up in the file system. |
There was a problem hiding this comment.
Are you sure it's the correct expectations? Let's say we call the CLI from /Users/app with biome check --config-path=webapp/biome.json, it means we have two directories
/Users/appwhich isfs.working_directory/Users/app/webappwhich is the directory from--config-path
Based on this logic, /Users/app is chosen as project_dir (based on the comment), however this folder isn't the real biome project. So I would expect /Users/app/webapp to be my working directory where we scan the project and where the root configuration is.
Now, I may have misunderstood the comment or the new functionality, so could you please clarify it for me?
EDIT: I know it's kind of an edge case, however, we shouldn't forget that Biome can be installed globally, which means that via CLI a user could run biome from the root of the project, even though the main configuration is in a nested folder.
There was a problem hiding this comment.
I agree it’s questionable, but there was an existing test case that said /Users/app must be used in such a case, so I stuck to it. It means users are not forced to have biome.json in the root of their repository, which I suppose some people may rely upon.
But indeed, there is a downside that potentially we register the wrong folder in such a case. But indeed in terms of behaviour I think nothing should change.
| return Some(ScanKind::NoScanner); | ||
| } else { | ||
| return Some(ScanKind::TargetedKnownFiles { | ||
| target_paths: vec![BiomePath::new(working_dir.join(path))], |
There was a problem hiding this comment.
This may be a change in functionality. Until now, we haven't tinkered with the paths passed via stdin, so if this change may be what we want, we should document it (update the docs) and possibly the doc comments of --stdin-file-path
There was a problem hiding this comment.
Yes, it is. Pretty sure it is what we want though, as users who provide --stdin-file-path expect the path to have meaning. This allows us to apply the configuration that would be applied if the file at the given path were checked normally (ie. without using stdin).
| if self | ||
| .workspace | ||
| .is_path_ignored(IsPathIgnoredParams { | ||
| project_key: self.project_key, | ||
| path: path.clone(), | ||
| // The scanner only cares about the top-level | ||
| // `files.includes` | ||
| features: FeaturesBuilder::new().build(), | ||
| }) | ||
| .unwrap_or_default() |
There was a problem hiding this comment.
FYI, not for this PR, but I intend to create a new method inside the workspace that queries only files.includes, and it doesn't check for paths ignored via VCS.
That should increase performance for this particular case.
Co-authored-by: Emanuele Stoppa <my.burning@gmail.com>
Summary
Implemented a more targeted version of the scanner, which ensures that if you provide file paths to handle on the CLI, the scanner will exclude directories that are not relevant to those paths.
Note that for many commands, such as
biome checkandbiome format, the file paths to handle are implicitly set to the current working directory if you do not provide any path explicitly. The targeted scanner also works with such implicit paths, which means that if you run Biome from a subfolder, other folders that are part of the project are automatically exempted.Use cases where you invoke Biome from the root of the project without providing a path, as well as those where project rules are enabled, are not expected to see performance benefits from this.
Fixed #6234.
Fixed #6483.
Fixed #6563.
Test Plan
I have added a bunch of automated tests, but could use a little bit of extra verification from others as well.
Even though we could technically push this as a patch release, there may be subtle behavioral changes. I don't expect any issues, but just in case I would prefer to push this as a minor instead.