Improve performance of excluded files filter#5157
Conversation
Generated by 🚫 Danger |
e99f0a9 to
a253a0d
Compare
4094558 to
30d4908
Compare
|
Periphery had a similar performance issue not long ago, and I noticed there wasn't a solid glob to regex implementation, so I ported Python's fnmatch to Swift: https://github.com/ileitch/swift-filename-matcher. It might be useful here too. |
|
Last time we tried to speed this up, it caused some slight differences in the result of what was matched vs not, so please be super careful here. |
|
At the moment, I'm rather concerned here that normal runs without any excludes and includes seem to become much slower sometimes.
This is a very helpful tip. I don't want to invent a half-backed version myself. Thanks! |
bfd249c to
6a0585a
Compare
a4e724b to
a95044b
Compare
a95044b to
ca66b0c
Compare
|
Any chance this can land? 🙏 |
ca66b0c to
e5989f1
Compare
This is a critical change that needs thorough testing. Unfortunately, I'm lacking own projects with nifty @JaviSoto: In case this change took effect in your projects, I'd appreciate your feedback. |
e5989f1 to
dad51eb
Compare
21c0add to
61acd6d
Compare
61acd6d to
6ec4d52
Compare
28d212f to
b60d907
Compare
|
@SimplyDanny, thanks for drastically improving SwfitLint's speed 💪. On one of our projects, I noticed that Project with ~400 files: Project with ~2000 files: |
You're saying this is a regression?
Down to ~10% of linting time is impressive. However, one sample with a single |
b60d907 to
1d38106
Compare
Yes, a little.
It's ~15 excludes with 7-8 globs. Also, I found https://github.com/davbeck/swift-glob which probably will be even faster with async file matching. |
|
Running Here are the results in comparison between => => PR #5157 (3.5 seconds) |
6f5d61b to
8358e63
Compare
|
I appreciate your performance reports! This really helps a lot to evaluate this critical change in real-world settings.
That's now fixed with the most recent update. Thanks @nekrich and @ileitch for your efforts!
This concerns me. @CraigSiemens: I assume the project you ran SwiftLint on is not open source, is it? I'm really interested in what causes this slow down. I don't want to make the tool slightly slower in most common settings only to improve performance in edge cases. |
|
2.1 million Swift LOC codebase with 19 excluded patterns.
This PR |
6f63dd8 to
9c3791f
Compare
|
In summary, according to the reports, we see performance improvements down to 6% of However, there are also constellations where the new approach of file matching is slightly slower. In my observations, this happens when the file system (or the file system API) has cached enough information. In this case, collecting all files is very fast, even faster than the regular expression matching. This is usually the case in performance tests and the reason why regex matching appears slower. In real word linting, these optimal conditions are not too common. Seems like we can give this a try in the next release. |
97c70cb to
c79a650
Compare
The current algorithm is like "collect all included files and subtract all excluded files". Collecting all included and all excluded files relies on the file system. This can become slow when the patterns used to exclude files resolve to a large number of files. The new approach only collects all lintable files and checks them against the exclude patterns. This can be done by in-memory string-regex-match and does therefore not require file system accesses.
c79a650 to
2ca5e3c
Compare
|
@SimplyDanny Can we cut a new release? This is a nice fix |
|
Unfortunately the new exclusion algorithm does not work if used with an Xcode build tool plugin that relies on a configuration file. I guess the path matching doesn't work correctly in that case. Please note that I'm using my own build tool plugin variant (with the official It does still work with the I'm not sure if you consider this a supported use case, so let me know if I should create an issue or not. |
This reverts commit 152355e. # Conflicts: # tools/oss-check
This reverts commit 152355e. # Conflicts: # tools/oss-check
Everything should work as before with >=0.58.1, as I've reverted this change. Unfortunately, there are much more things to think thoroughly about before this optimization can be re-included in a future release. |
The current algorithm is like "collect all included files and subtract all excluded files". Collecting all included and all excluded files relies on the file system. This can become slow when the patterns used to exclude files resolve to a large number of files.
The new approach only collects all lintable files and checks them against the exclude patterns. This can be done by in-memory string-regex-match and does therefore not require file system accesses.
The most critical part is the conversion of glob patterns to regular expressions. I might have missed cases.
Fixes #5018.