Skip to content

Performance optimization for test_study.py by removing redundancy#6120

Merged
contramundum53 merged 15 commits intooptuna:masterfrom
himkt:study-test-speedup
Jun 4, 2025
Merged

Performance optimization for test_study.py by removing redundancy#6120
contramundum53 merged 15 commits intooptuna:masterfrom
himkt:study-test-speedup

Conversation

@himkt
Copy link
Copy Markdown
Member

@himkt himkt commented May 31, 2025

  • timeout is unnecessarily long for the very simple objective function abb7817
    • objectives in test_study.py just return constant value, which finishes to run very shortly
  • n_trials is unnecessarily large for scenarios where it just needs multiple trials abb7817
  • Using concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool 2c0bb80
  • Introduce a helper function to check if stderr prosibly contain progressbar c252476

Before

=========== 595 passed, 5 skipped, 91 warnings in 131.25s (0:02:11) ============

After

============ 595 passed, 5 skipped, 91 warnings in 75.25s (0:01:15) ============

Motivation

Description of the changes

@himkt himkt self-assigned this May 31, 2025
Comment on lines +1641 to +1658
with ProcessPoolExecutor() as pool:
# Test normal behaviour.
pool.submit(_process_tell, study=study, trial=trial0, values=1.2).result()
assert len(study.trials) == 1
assert study.best_trial.state == TrialState.COMPLETE
assert study.best_value == 1.2

# Test study.tell using trial number.
trial = study.ask()
pool.submit(_process_tell, study=study, trial=trial.number, values=1.5).result()
assert len(study.trials) == 2
assert study.best_trial.state == TrialState.COMPLETE
assert study.best_value == 1.2

# Should fail because the trial0 is already finished.
with pytest.raises(ValueError):
fut = pool.submit(_process_tell, study=study, trial=trial0, values=1.2)
fut.result()
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ProcessPoolExecutor and submit is consistently faster than multiprocessing.Pool and starmap.

Before

> git checkout abb781787
HEAD is now at abb781787 chore: introduce consts for minimal values of n_trials and timeout
> uv run python -m pytest tests/study_tests/test_study.py -k test_tell_from_another_process
=============================================================================== test session starts ===============================================================================
platform darwin -- Python 3.12.0, pytest-8.3.5, pluggy-1.6.0
rootdir: /Users/himkt/work/github.com/himkt/optuna
configfile: pyproject.toml
plugins: profiling-1.8.1
collected 600 items / 599 deselected / 1 selected

tests/study_tests/test_study.py .                                                                                                                                           [100%]

======================================================================== 1 passed, 599 deselected in 4.43s ========================================================================

After

> git checkout 2c0bb80c4
Previous HEAD position was abb781787 chore: introduce consts for minimal values of n_trials and timeout
HEAD is now at 2c0bb80c4 chore: use ProcessPoolExecutor instead of multiprocessing.Pool
>uv run python -m pytest tests/study_tests/test_study.py -k test_tell_from_another_process
=============================================================================== test session starts ===============================================================================
platform darwin -- Python 3.12.0, pytest-8.3.5, pluggy-1.6.0
rootdir: /Users/himkt/work/github.com/himkt/optuna
configfile: pyproject.toml
plugins: profiling-1.8.1
collected 600 items / 599 deselected / 1 selected

tests/study_tests/test_study.py .                                                                                                                                           [100%]

======================================================================== 1 passed, 599 deselected in 2.26s ========================================================================

@himkt himkt requested a review from Copilot May 31, 2025 08:25
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reduces the number of trials and the timeout values for trivial objective functions to speed up test execution and switches from multiprocessing.Pool to using ProcessPoolExecutor. The changes update several tests to use defined constants and adjust assertions accordingly.

  • Reduced n_trials from 10 to a constant value for faster tests.
  • Updated timeout values for short-running objectives.
  • Switched from multiprocessing.Pool to ProcessPoolExecutor for cross-process communication.
Comments suppressed due to low confidence (1)

tests/study_tests/test_study.py:50

  • The constant 'NUM_MINMAL_TRIALS' appears to have a typo. Consider renaming it to 'NUM_MINIMAL_TRIALS' for clarity.
NUM_MINMAL_TRIALS = 2

@y0z y0z added the sprint-20250531 PR from the online sprint event May 31, 2025. label May 31, 2025
@himkt himkt requested a review from Copilot May 31, 2025 08:33
@himkt himkt changed the title (wip) Performance optimization for test_study.py by removing redundancy May 31, 2025
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reduces test execution time by introducing minimal trial/time constants, switches to ProcessPoolExecutor for cross-process testing, and centralizes progress bar validation.

  • Define NUM_MINMAL_TRIALS and MINIMUM_TIMEOUT_SEC and replace hard-coded values
  • Add ProcessPoolExecutor for testing tell() across processes
  • Introduce check_progressbar helper and refactor progress bar assertions
Comments suppressed due to low confidence (2)

tests/study_tests/test_study.py:50

  • There’s a typo in the constant name NUM_MINMAL_TRIALS; consider renaming it to NUM_MINIMAL_TRIALS for clarity.
NUM_MINMAL_TRIALS = 2

tests/study_tests/test_study.py:8

  • The multiprocessing import is no longer used after switching to ProcessPoolExecutor; it can be removed to clean up unused imports.
import multiprocessing

_, err = capsys.readouterr()

assert "Best trial: 0" in err
assert "Best trial: 0"
Copy link

Copilot AI May 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion always passes since it doesn’t check the error output. It should be assert "Best trial: 0" in err.

Suggested change
assert "Best trial: 0"
assert "Best trial: 0" in err

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


CallbackFuncType = TypingCallable[[Study, FrozenTrial], None]

NUM_MINMAL_TRIALS = 2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NUM_MINMAL_TRIALS = 2
NUM_MINIMAL_TRIALS = 2

Copilot is actually right and there's a typo.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 41664ac. 🙏

from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import ThreadPoolExecutor
import copy
import multiprocessing
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import multiprocessing

multiprocessing is no longer needed (again, by copilot)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, multiprocessing is used in another test... 😄

n_jobs_actual = n_jobs if n_jobs != -1 else multiprocessing.cpu_count()

@himkt himkt marked this pull request as draft May 31, 2025 14:18
@himkt himkt marked this pull request as ready for review June 2, 2025 06:12
@himkt
Copy link
Copy Markdown
Member Author

himkt commented Jun 2, 2025

The PR is now ready for review again.

sampler = study.sampler
with patch.object(sampler, "reseed_rng", wraps=sampler.reseed_rng) as mock_object:
study.optimize(f, n_trials=1, n_jobs=2)
study.optimize(f, n_trials=1, n_jobs=n_jobs)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident but I think test_optimize_with_reseeding unexpectedly runs tests with the fixed configuration (n_jobs=2) even with parametrization.

@not522
Copy link
Copy Markdown
Member

not522 commented Jun 2, 2025

Could you check the CI fail?

@himkt
Copy link
Copy Markdown
Member Author

himkt commented Jun 2, 2025

Sorry I just thought I fixed all failures... resolved in 79b9807.

Copy link
Copy Markdown
Member

@contramundum53 contramundum53 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@contramundum53 contramundum53 merged commit ca18359 into optuna:master Jun 4, 2025
14 checks passed
@y0z y0z added this to the v4.4.0 milestone Jun 4, 2025
@HideakiImamura HideakiImamura added the test Unit test. label Jun 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sprint-20250531 PR from the online sprint event May 31, 2025. test Unit test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants