regular terminating and respawning of processes#3722
regular terminating and respawning of processes#3722TomasVotruba merged 10 commits intorectorphp:mainfrom
Conversation
|
/cc @yguedidi |
|
@Flyingmana looks like we worked on improving parallel runs on big projects in parallel 😅 my PR is ready for review now :) |
|
@Flyingmana what if you try #3563 with the proposed configuration in #3563 (comment) (or a similar one adapted to your project)? |
|
I hope this PR is better suited for a Review and merge by the limited amount of changes compared to #3563. |
| break; | ||
| } | ||
| $workerCommandLineFactory = $this->workerCommandLineFactory; | ||
| $this->processSpawner = function( |
There was a problem hiding this comment.
any need to have it as a property?
There was a problem hiding this comment.
you can make it a local variable, and pass it by reference in the use list to use it inside
There was a problem hiding this comment.
maybe naming it $startProcessCallable would make it clearer?
There was a problem hiding this comment.
I actively wanted to avoid having it as a local variable, because there are to many of them which makes reading and changing the code a lot more complex. I would have preferred to make this an actual method in the class, but this would have needed more rewrites
There was a problem hiding this comment.
if a private method is too expensive, to me a local variable is a better fit here than a property, as it's a callable used only locally there
| } | ||
| if ($this->processRunCounter[$processIdentifier] >= $this->maxProcessIterations) { | ||
| $this->processPool->quitProcess($processIdentifier); | ||
| if (is_callable($this->processSpawner)){ |
There was a problem hiding this comment.
how this can be false at this stage?
There was a problem hiding this comment.
phpstan complained, I dont think it can be false, but the call structure is complex
There was a problem hiding this comment.
by following the suggestion in #3722 (comment) it shouldn't complain, as the local variable will be assigned directly, never null
|
@raymondschouten progress bar is probably overlapped on large project, even on current dev-main, that 100% is printed later after " [OK] 1592 files have been changed by Rector" message: so probably "just fine". Do you have progress bar issue on mini project, eg: 100 files? @Flyingmana that would be awesome if you can catch a possible bug on progress bar not 100% but done. |
I debugged it a bit, its quite easy to reproduce with Obvious issue is probably, that But actually I think there is a place which does count wrong, and as the counting happens in the same place, where we collect the fileDiffs, there might be a bug which actually loses these Diffs. You say the progress bar issue happens independently of my changes? Then I would like to check it in a separate Issue/PR |
|
Any remaining changes which are still required for this? |
samsonasik
left a comment
There was a problem hiding this comment.
Looks good for me, for progress bar, if you have a chance to check, it should still apply to 100% even overlapped after "Rector is done" message so we have same behavior as before, better if that really resolved ( 100% show before "Rector is done" message)
samsonasik
left a comment
There was a problem hiding this comment.
@Flyingmana I tried your branch, it seems when show the diff, it doesn't show 100% progress bar,
so it maybe overlapped. could you check run to see the progress bar to actually show 100% ?
time bin/rector process src tests rules-tests packages packages-tests
|
@TomasVotruba the progress bar may need to be fixed to show 100% on many files |
|
|
||
| for ($i = 0; $i < $numberOfProcesses; ++$i) { | ||
| // nothing else to process, stop now | ||
| if ($jobs === []) { |
There was a problem hiding this comment.
is force progress bar to 100% here possible/correct?
There was a problem hiding this comment.
I tried apply progress bar to 100% here, but seems not working, it only show 97%:
➜ rector-src git:(df-parallel) time bin/rector process src tests rules-tests packages packages-tests --dry-run --clear-cache
1490/1530 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░] 97%
1 file with changes
===================
1) packages/Parallel/Application/ParallelFileProcessor.php:3
There was a problem hiding this comment.
it seems it can be applied on $inDecoder->on() process on jobs empty:
if ($jobs === []) {
$this->processPool->quitProcess($processIdentifier);
+ $this->rectorOutputStyle->progressFinish();
return;
}There was a problem hiding this comment.
The issue with it is it show different total when applied, when on progress, it shows 1530:
➜ rector-src git:(df-parallel) ✗ time bin/rector process src tests rules-tests packages packages-tests --dry-run --clear-cache
0/1530 [░░░░░░░░░░░░░░░░░░░░░░░░░░░░]
when done, it shows 1550:
➜ rector-src git:(df-parallel) ✗ time bin/rector process src tests rules-tests packages packages-tests --dry-run --clear-cache
1550/1550 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%
2 files with changes
====================
1) packages/Parallel/Application/ParallelFileProcessor.php:3
There was a problem hiding this comment.
I see, it can actually be forced on reporting on ConsoleOutputFormatter:
public function report(ProcessResult $processResult, Configuration $configuration): void
{
+ if ($configuration->shouldShowProgressBar()) {
+ $this->rectorOutputStyle->progressFinish();
+ }By this, this can show 100%:
➜ rector-src git:(df-parallel) ✗ time bin/rector process src tests rules-tests packages packages-tests --dry-run --clear-cache
1530/1530 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%
1 file with changes
===================
1) packages/Parallel/Application/ParallelFileProcessor.php:3
There was a problem hiding this comment.
@Flyingmana I create a commit to your branch 88a755b 👍 to handle progress bar 100% issue.
and move $processRunCounter into class property
|
All checks have passed 🎉 @TomasVotruba I think it is ready to give a try 👍 |
|
Thank you all for your hard work. Let's give this a try :) I'll tag a new version, so we have clean plate for testing this one 🙏 |
|
@TomasVotruba @Flyingmana it seems caused error on downgrade so we need to revert |
This reverts commit ff05d32.
|
Reverted at #4489 |
|
@samsonasik Got any reproducer we can use here as a test case? |
|
I will try 👍 , some process probably forced terminated before it actually complete |




Context
As described here rectorphp/rector#7891
I had some issues with memory usage, which caused some issues on a very large project.
I tried out this PR #3563 but as described in #3563 (comment) it was not enough for my usecase.
What is this PR doing
This PR introduces the ability to "restart" spawned workers to reset their memory usage.
In the current Draft it is doing this hardcoded after 10 Iterations. Which should be high enough to not affect the normal usage, but low enough to improve the situation for big projects.
And its abstracted into a property, so adding a setting later is relatively easy. (later, because it opens up questions on how to configure the parallelization properly, especially the conditions for the restarting)
What I changed
I basically just added the lines 186-195 and moved the process spawning into an own callable, so it can get called repeatedly.
I would like to move more into class properties and methods, but thats not so easy and might make the Review harder
Performance comparison
files per restart = jobsize * (onData iterations till restart)
First one comparison extended with the previously linked PR
#3563 (comment)
But now also one Run for the Full project (19.943 files), not only a subset
With 1.000 Files per restart, it still exceeded to much my available memory, but with 500 it did fit into memory and I was able to use 100% of my available CPUs, drastically cutting down the needed Real Time.
updated performance test
Setup:
WSL2, Linux, php 8.1
cpu: physical cores 6, logical 12
Ram: 32 GB (25GB allowed for WSL2)
used config:
On a current symfony/symfony checkout, 9629 files
last release: rector 1.6.0
and this one is after merging in the changes from the last release