Skip to content

Add pytest-xdist to speed up the CI#6170

Merged
nabenabe0928 merged 10 commits intooptuna:masterfrom
gen740:speedup-ci-test
Jul 2, 2025
Merged

Add pytest-xdist to speed up the CI#6170
nabenabe0928 merged 10 commits intooptuna:masterfrom
gen740:speedup-ci-test

Conversation

@gen740
Copy link
Copy Markdown
Member

@gen740 gen740 commented Jun 20, 2025

Motivation

Current ci test takes around 15 minutes. To improve the speed of ci tests, I added the pytest-xdist to parallelize.

Description of the changes

  • Add pytest-xdist to pyproject.toml
  • Add -n 8 to every pytest execution.
  • Add lock file for getting free port to avoid race-condition of GRPC server.

Additional context

To determine best worker number for the ci test, I ran a lot of test on my repository, and I found it seems the 8 workers are enough.

Following figures are the mean and stddev of 4 ci-tests varying the worker number from 1 to 10. The measurement was done by my repository. The elapsed time is just pytest execution. So this graph DOES NOT contain the pip install or other setup time.
Figure_1

@nabenabe0928
Copy link
Copy Markdown
Contributor

@kAIto47802
Could you review this PR?

@nabenabe0928
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@nabenabe0928 nabenabe0928 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR, I left some comments:)

pyproject.toml Outdated
markers = [
"skip_coverage: marks tests are skipped when calculating the coverage",
"slow: marks tests as slow (deselect with '-m \"not slow\"')",
"serial: marks tests as fragile when executed in parallel.",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this line?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to delete this line.

Comment on lines +1663 to +1664
if storage_mode == "grpc_journal_file":
pytest.skip("gRPC journal file storage does not support multi-threading")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not confident if it is fine to skip this test.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I referred to in #6172, this test with grpc_journal_file is potentially fragile, so I think it's better to skip it.
This test with thread sizes more than 20 tends to fail on grpc_journal_file, and with xdist exceeding this thread size.

@kAIto47802
Copy link
Copy Markdown
Collaborator

kAIto47802 commented Jun 27, 2025

How about using -n 4 instead of -n 8 because the default CPU count in GitHub Actions is 4?

Regarding the -n option, how about using -n auto, which uses as many processes as the number of physical CPU cores1.

Footnotes

  1. https://pytest-xdist.readthedocs.io/en/stable/distribution.html

fi

pytest tests -m "$target"
pytest tests -m "$target" -n 8
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, you wanted to do something like this?

pytest tests -m "not serial" -n auto
pytest tests -m "serial"

gen740 and others added 2 commits July 1, 2025 13:15
Co-authored-by: Shuhei Watanabe <47781922+nabenabe0928@users.noreply.github.com>
Co-authored-by: kAIto47802 <115693559+kAIto47802@users.noreply.github.com>
@gen740
Copy link
Copy Markdown
Member Author

gen740 commented Jul 1, 2025

Regarding the -n option, how about using -n auto, which uses as many processes as the number of physical CPU cores1.

I think the reason the CI test is so slow isn’t the computational load but the tests that include sleep (such as heartbeat tests). In this case, increasing parallelism beyond the number of CPU cores can be effective.

@gen740 gen740 force-pushed the speedup-ci-test branch from f9138b5 to 959a008 Compare July 1, 2025 04:29
@gen740 gen740 force-pushed the speedup-ci-test branch from 480d931 to f649d55 Compare July 1, 2025 06:29
Copy link
Copy Markdown
Contributor

@nabenabe0928 nabenabe0928 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@nabenabe0928 nabenabe0928 added the CI Continuous integration. label Jul 2, 2025
@nabenabe0928 nabenabe0928 added this to the v4.5.0 milestone Jul 2, 2025
Copy link
Copy Markdown
Collaborator

@kAIto47802 kAIto47802 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. LGTM!

@nabenabe0928 nabenabe0928 merged commit 7ad8214 into optuna:master Jul 2, 2025
25 checks passed
@gen740 gen740 deleted the speedup-ci-test branch July 11, 2025 04:24
@nabenabe0928 nabenabe0928 changed the title Add pytest-xdist to speedup the ci test Add pytest-xdist to speed up the CI Jul 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI Continuous integration.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants