Use parallel for_each to load all ports#694
Use parallel for_each to load all ports#694Thomas1664 wants to merge 48 commits intomicrosoft:mainfrom
Conversation
BillyONeal
left a comment
There was a problem hiding this comment.
In addition to the data race I made a comment about, I highly doubt that update_version_db_file and update_baseline_version are safe to call concurrently.
I am additionally concerned about adding additional runtime dependencies on Linux (OpenMP). We are a bootstrapping tool in a lot of situations, so we have to be careful there. If that dependency would ever be a separate apt package, for example, we probably do not want to do that.
That isn't saying that I dislike the parallel algorithms library, or that I dislike what this is doing. On the contrary, I think this concept in general is a good improvement, and certainly on Windows where it doesn't introduce any additional deployment problems, I'm all for it. (If anything I'm biased to say use more parallel algos as the original author of most of MSVC++'s parallel algorithms implementation :)) It's just that I think we need to value broad compatibility over performance improvements like this given the position that we are in. We ripped out the std::filesystem dependency for similar reasons.
Paths forward here if you want to keep contributing in a similar area:
- Parallel for_each isn't that complex an algorithm to do for this, particularly when we know the 'elements' are going to be relatively expensive / do disk I/O like this. You could build such a thing on top of pthreads which would eliminate the extra dependencies concern on other platforms.
- Our installation process currently could benefit a lot from being pipelined. For example, building the next thing doesn't have to wait for building binary cache packages and/or copying installed bits for the current thing.
Thanks for your contribution and I hope it works out!
A quick note to make the implicit explicit: we do have potential race conditions between multiple installs (manipulations within the downloads folder). The obvious improvement here is to start the next build while {packing up, uploading to binary caches, installing files}, but we currently cannot start another build while the current build is in progress. |
We can only do binary caching in parallel because post build lint needs to happen before the next build starts. Parallel downloads are impossible because downloads are triggered from inside cmake and there might be file conflicts. If we were to install ports in parallel we have to keep the console output consistent maybe by adding In general, this is does not require parallel algorithms but is more a problem of |
I'd prefer |
src/vcpkg/commands.add-version.cpp
Outdated
| { | ||
| std::mutex mtx; | ||
|
|
||
| auto work = [&](const std::string& port_name) { |
There was a problem hiding this comment.
From this large block, I believe there are two operations worth parallelizing: Paragraphs::try_load_port and the formatting check. Everything else done in this function is a rounding error -- and worse, is user-visible I/O (printing messages, terminating the program).
I'd prefer to see these two operations individually parallelized.
- They would be safer & simpler to use
- They could be reused elsewhere in the program
- Achieves 99.9% of the performance of wrapping the rest of this logic
- Avoids printing messages from multiple threads, which makes the final output significantly more correct (all messages will be deterministically printed in the correct order, they will all be printed, and they will never be "chopped off" due to early terminates)
There was a problem hiding this comment.
@ras0219-msft The thing is that try_load_port() itself may call Checks::check_exit(). Given the fact that we sprinkle Checks::check_exit everywhere in our code, there is nearly nothing that we can parallelize!
IMO we should replace this by throwing an exception instead and print the error message properly by wrapping everything in main() inside a try ... catch and call Checks::check_exit from catch.
There was a problem hiding this comment.
Then as part of this, try_load_port() should be fixed to never call Checks::check_exit() and always return its errors via the existing return type of Expected<>.
Interleaving console output is still not ok -- users can and will get torn messages and nondeterministic behavior.
There was a problem hiding this comment.
@ras0219-msft The formatting check can't be parallelized because it calls too many functions that may exit. Please also have a look at microsoft/vcpkg#27152 .
|
Hi @BillyONeal, @ras0219-msft, As mentioned above, we can't use parallel algorithms if the functions that are called from different threads may exit. Unfortunately, exiting from a function is widespread across the codebase and even happens in base things like |
|
It seems unlikely to me that we are going to invest in this direction anytime soon. |
This PR adds the ability to use parallel algorithms to
CMakeLists.txt.GCC needs OpenMP to use this feature (docs). Fortunately, OpenMP is preinstalled and we can easily find it with cmake.I implemented the parallel version of
std::for_eachfor UNIX by myself because Clang doesn't support parallel algorithms at all and GCC requires run-time dependencies.x-add-version --allI was able to get 2x performance improvement [12 s before vs. 6 s after; GCC Debug]. I had similar results on Windows release and debug builds. I also tested if there are any downsides when using this command without
--alland found out thatfor_eachis a little bit faster than the range based for loop that was used previously.