Initialize caches for packages and standalone files#5237
Conversation
crates/ruff_cli/src/commands/run.rs
Outdated
| &settings.lib, | ||
| ); | ||
| (&**package_root, cache) | ||
| (*cache_root, cache) |
There was a problem hiding this comment.
I didn't spend any time optimizing this, feel free!
There was a problem hiding this comment.
(Not sure if worth calling .unique() above, if it's worth the .collect() allocation in order to allow the .par_iter(), if it's worth using the .par_iter() at all given far fewer entries, etc.)
1ea2fc7 to
e1d1af9
Compare
PR Check ResultsEcosystem✅ ecosystem check detected no changes. BenchmarkLinuxWindows |
|
388K is great for the cache size BTW -- that's > 10x smaller than when we started, I think? |
MichaReiser
left a comment
There was a problem hiding this comment.
I don't have much context on how packages work. I'll leave it to @Thomasdezeeuw review.
I would be in favor of removing the par_iter if it doesn't impact performance much.
crates/ruff_cli/src/commands/run.rs
Outdated
| let cache = caches.as_ref().and_then(|caches| caches.get(&package_root)); | ||
|
|
||
| let cache_root = package.unwrap_or_else(|| path.parent().unwrap_or(path)); | ||
| let cache = caches.as_ref().and_then(|caches| caches.get(&cache_root)); |
There was a problem hiding this comment.
(maybe for another) I understand the need to remove the unwrap as it was causing panics, but it does mean something went wrong before this state (as the caches should be filled if we reach this point). Maybe we should emit a warning or something?
crates/ruff_cli/src/commands/run.rs
Outdated
| .collect::<Vec<_>>(); | ||
| cache_roots | ||
| .par_iter() |
There was a problem hiding this comment.
Why are we collecting only to iterate over it again?
There was a problem hiding this comment.
None of those structs implement .par_iter().
e1d1af9 to
24677a1
Compare
Summary
While fixing #5233, I noticed that in FastAPI, 343 out of 823 files weren't hitting the cache. It turns out these are standalone files in the documentation that lack a "package root". Later, when looking up the cache entries, we fallback to the package directory.
This PR ensures that we initialize the cache for both kinds of files: those that are in a package, and those that aren't.
The total size of the FastAPI cache for me is now 388K. I also suspect that this approach is much faster than as initially written, since before, we were probably initializing one cache per directory.
Test Plan
Ran
cargo run -p ruff_cli -- check ../fastapi --verbose; verified that, on second execution, there were no "Checking" entries in the logs.