Skip to content

Split unit test over multiple jobs in GitHub Actions, lowering the total run time#4985

Merged
muglug merged 2 commits intovimeo:masterfrom
bendavies:speed-up-tests
Jan 11, 2021
Merged

Split unit test over multiple jobs in GitHub Actions, lowering the total run time#4985
muglug merged 2 commits intovimeo:masterfrom
bendavies:speed-up-tests

Conversation

@bendavies
Copy link
Copy Markdown
Contributor

@bendavies bendavies commented Jan 11, 2021

Hi Matt,

I noticed that the unit test took quite a while to run in GitHub Actions, with the last few runs taking over 20 minutes to complete.

This is quite a long feedback loop, so I've tried to speed up it up.

What I've done is:

  1. List all the tests and randomize them with gnu shuf
  2. consistently split these tests into a N configurable number of groups with gnu split
  3. run these groups in N separate jobs, in M parallel processes, using gnu parallel instead of paratest.
  4. the output uses groups so renders nicely with collapsible sections for each test.
  5. similarly, we use ::error:: to make failing tests easy to spot.

Started with 5 jobs with 5 parallel phpunit processes each, but can tweak this easily over time.

Runtime should be down to around 5 minutes, which is a much better feedback loop.

Thanks.

@bendavies bendavies marked this pull request as ready for review January 11, 2021 22:00
@muglug muglug merged commit 53afd28 into vimeo:master Jan 11, 2021
@muglug
Copy link
Copy Markdown
Collaborator

muglug commented Jan 11, 2021

Thanks! CircleCI is faster (and I was thinking of ditching phpunit in Github Actions) but this is a great solution

@bendavies
Copy link
Copy Markdown
Contributor Author

Yeah, the current GA runners have low resources, while circle's have lots.
It's on GAs roadmap to provide more powerful runners, although no idea if OS projects will be able to use them: github/roadmap#95

danog pushed a commit to danog/psalm that referenced this pull request Jan 29, 2021
…tal run time (vimeo#4985)

* Split unit test over multiple jobs in GitHub Actions, lowering the total run time

* rename FileManipulationTest to FileManipulationTestCase so it does not run as a standalone Test
@bendavies bendavies mentioned this pull request Jan 5, 2022

mkdir -p build/parallel/ build/phpunit/logs/

find tests -name '*Test.php' | shuf --random-source=<(get_seeded_random) > build/tests_all
Copy link
Copy Markdown
Contributor

@Slamdunk Slamdunk Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't sound right to me: this line shuffles the tests based on a seed that is unique for each spawned jobs.tests, because get_seeded_random gets called indeed inside each of them.

So every job shuffles them differently and this lead to some test being run twice or worse, not being run at all.

The random-source should be generated only once in the chuck-matrix job.

/cc @weirdan @danog

Copy link
Copy Markdown
Contributor Author

@bendavies bendavies Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the random source is consistent. the seed is always vimeo/psalm
you can verfy this yourself by running this multiple times

find tests -name '*Test.php' | shuf --random-source=<(openssl enc -aes-256-ctr -pass pass:"vimeo/psalm" -nosalt </dev/zero 2>/dev/null)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I confirm I always saw error in jobs to be consistent from one run to another

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the random source is consistent.

This is good news indeed.

I wonder why shuf is needed in the first place then 🤔
Maybe you know you have some long-running tests and other quick tests, and you want to spread them evenly across the jobs? It would be good to write a comment about the why, just to clear the things out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants