
Use parallel for_each to load all ports #694

Closed
Thomas1664 wants to merge 48 commits into microsoft:main from Thomas1664:load-all-ports


Conversation

@Thomas1664 (Contributor) commented Sep 3, 2022

This PR adds the ability to use parallel algorithms by enabling them in CMakeLists.txt. GCC needs OpenMP to use this feature (docs). Fortunately, OpenMP is preinstalled, and we can easily find it with CMake.

I implemented the parallel version of std::for_each for UNIX myself, because Clang doesn't support parallel algorithms at all and GCC requires run-time dependencies.

x-add-version --all

I was able to get a 2x performance improvement [12 s before vs. 6 s after; GCC Debug]. I had similar results on Windows release and debug builds. I also tested whether there are any downsides when using this command without --all, and found that for_each is slightly faster than the range-based for loop that was used previously.

@Thomas1664 Thomas1664 marked this pull request as ready for review September 3, 2022 19:33
@Thomas1664 Thomas1664 marked this pull request as draft September 7, 2022 22:33
@BillyONeal (Member) left a comment


In addition to the data race I made a comment about, I highly doubt that update_version_db_file and update_baseline_version are safe to call concurrently.

I am additionally concerned about adding runtime dependencies on Linux (OpenMP). We are a bootstrapping tool in a lot of situations, so we have to be careful there. If that dependency were ever a separate apt package, for example, we probably would not want to take it.

That isn't to say that I dislike the parallel algorithms library, or that I dislike what this is doing. On the contrary, I think this concept is a good improvement in general, and certainly on Windows, where it doesn't introduce any additional deployment problems, I'm all for it. (If anything, I'm biased to say use more parallel algos, as the original author of most of MSVC++'s parallel algorithms implementation. :)) It's just that I think we need to value broad compatibility over performance improvements like this, given the position that we are in. We ripped out the std::filesystem dependency for similar reasons.

Paths forward here if you want to keep contributing in a similar area:

  • Parallel for_each isn't that complex an algorithm to do for this, particularly when we know the 'elements' are going to be relatively expensive / do disk I/O like this. You could build such a thing on top of pthreads which would eliminate the extra dependencies concern on other platforms.
  • Our installation process currently could benefit a lot from being pipelined. For example, building the next thing doesn't have to wait for building binary cache packages and/or copying installed bits for the current thing.

Thanks for your contribution and I hope it works out!

@ras0219-msft (Collaborator)

Our installation process currently could benefit a lot from being pipelined. For example, building the next thing doesn't have to wait for building binary cache packages and/or copying installed bits for the current thing.

A quick note to make the implicit explicit: we do have potential race conditions between multiple installs (manipulations within the downloads folder). The obvious improvement here is to start the next build while {packing up, uploading to binary caches, installing files}, but we currently cannot start another build while the current build is in progress.

@Thomas1664 (Contributor, Author)

A quick note to make the implicit explicit: we do have potential race conditions between multiple installs (manipulations within the downloads folder). The obvious improvement here is to start the next build while {packing up, uploading to binary caches, installing files}, but we currently cannot start another build while the current build is in progress.

We can only do binary caching in parallel, because post-build lint needs to happen before the next build starts. Parallel downloads are impossible because downloads are triggered from inside CMake, and there might be file conflicts.

If we were to install ports in parallel, we would have to keep the console output consistent, maybe by adding [port-name] in front of each line.

In general, this does not require parallel algorithms but is more a problem for std::async.

@Thomas1664 (Contributor, Author)

  • Parallel for_each isn't that complex an algorithm to do for this, particularly when we know the 'elements' are going to be relatively expensive / do disk I/O like this. You could build such a thing on top of pthreads which would eliminate the extra dependencies concern on other platforms.

I'd prefer std::async because it's C++ and doesn't require additional dependencies.

{
std::mutex mtx;

auto work = [&](const std::string& port_name) {
Collaborator

From this large block, I believe there are two operations worth parallelizing: Paragraphs::try_load_port and the formatting check. Everything else done in this function is a rounding error -- and worse, is user-visible I/O (printing messages, terminating the program).

I'd prefer to see these two operations individually parallelized.

  1. They would be safer & simpler to use
  2. They could be reused elsewhere in the program
  3. Achieves 99.9% of the performance of wrapping the rest of this logic
  4. Avoids printing messages from multiple threads, which makes the final output significantly more correct (all messages will be deterministically printed in the correct order, they will all be printed, and they will never be "chopped off" due to early termination)

Contributor (Author)

@ras0219-msft The thing is that try_load_port() itself may call Checks::check_exit(). Given that we sprinkle Checks::check_exit everywhere in our code, there is nearly nothing that we can parallelize!

IMO we should replace this with throwing an exception instead, print the error message properly by wrapping everything in main() inside a try ... catch, and call Checks::check_exit from the catch block.

@ras0219-msft (Collaborator) commented Sep 28, 2022

Then as part of this, try_load_port() should be fixed to never call Checks::check_exit() and always return its errors via the existing return type of Expected<>.

Interleaving console output is still not ok -- users can and will get torn messages and nondeterministic behavior.

Contributor (Author)

@ras0219-msft The formatting check can't be parallelized because it calls too many functions that may exit. Please also have a look at microsoft/vcpkg#27152 .

@Thomas1664 Thomas1664 marked this pull request as draft October 3, 2022 13:05
@Thomas1664 (Contributor, Author)

Hi @BillyONeal, @ras0219-msft,

As mentioned above, we can't use parallel algorithms if the functions that are called from different threads may exit. Unfortunately, exiting from a function is widespread across the codebase and even happens in basic components like Filesystem or Json. Even if I replaced all of those exits with ExpectedS, future changes further down the call stack might add calls to exit. Therefore, I don't think it's safe to use any kind of parallelization until we have a policy on when it is OK to exit the program.

@BillyONeal (Member)

It seems unlikely to me that we are going to invest in this direction anytime soon.

4 participants