Sync owned trials when calling `study.ask` and `study.get_trials` by c-bata · Pull Request #4431 · optuna/optuna

c-bata · 2023-02-17T04:18:00Z

Motivation

read_trials_from_remote_storage method in _CachedStorage does not sync trials owned by itself. This behavior causes problems when another worker (e.g. Optuna Dashboard) make the trial completed as follows:
https://gist.github.com/c-bata/69e81a34c395d3ea368e24b0d5e0d367

Description of the changes

This PR changes excluded_trial_ids from owned_or_finished_trial_ids to finished_trial_ids only.

c-bata · 2023-02-17T05:52:37Z

tests/study_tests/test_study.py


        # Should fail because the trial0 is already finished.
-        with pytest.raises(RuntimeError):
+        with pytest.raises(ValueError):


FYI: Previously, check_trial_is_updatable() method in optuna/storages/_rdb/storage.py raises RuntimeError. However, from this change, optuna/study/_tell.py" raises ValueError like the following CI output.
https://github.com/optuna/optuna/actions/runs/4200669694/jobs/7286925478

c-bata · 2023-02-17T06:11:01Z

@amylase @contramundum53 Could you review this PR?

codecov-commenter · 2023-02-17T07:21:00Z

Codecov Report

Merging #4431 (b7dcb02) into master (cc7f873) will increase coverage by 0.37%.
The diff coverage is 100.00%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##           master    #4431      +/-   ##
==========================================
+ Coverage   90.37%   90.74%   +0.37%     
==========================================
  Files         173      185      +12     
  Lines       13583    15037    +1454     
==========================================
+ Hits        12275    13645    +1370     
- Misses       1308     1392      +84

Impacted Files	Coverage Δ
optuna/storages/_cached_storage.py	`98.75% <100.00%> (+0.03%)`	⬆️
optuna/study/study.py	`95.46% <100.00%> (+0.50%)`	⬆️

... and 35 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

amylase

Also, could you update the section of _CachedStorage's docstring that mentions read_trials_from_remote_storage?

https://github.com/optuna/optuna/pull/4431/files#diff-8ce5c3176c6b5fa3a21ed11da3f19d07c7c0291b7b7b6b67221b057855b7ac50R55

amylase · 2023-02-20T07:52:03Z

optuna/storages/_cached_storage.py

            return copy.deepcopy(trials) if deepcopy else trials

-    def read_trials_from_remote_storage(self, study_id: int) -> None:
+    def read_trials_from_remote_storage(self, study_id: int, sync_owned_trials: bool) -> None:


Did you consider to implement two separate methods instead of having a boolean parameter?

Would you mind motivating this change a bit?

amylase · 2023-02-20T07:55:12Z

optuna/storages/_cached_storage.py

    ) -> List[FrozenTrial]:
        if study_id not in self._studies:
-            self.read_trials_from_remote_storage(study_id)
+            self.read_trials_from_remote_storage(study_id, sync_owned_trials=False)


Why can we skip syncing owned trials here? If external processes can access & update them, I think we should always sync them.

Because it may cause a performance degradation for users who use study.optimize() api. However, on second thought, Optuna samplers call study.get_trials() so there seems to be a performance degradation either way. It may be better to remove the sync_owned_trials option and always sync them.

github-actions · 2023-03-06T23:07:22Z

This pull request has not seen any recent activity.

github-actions · 2023-03-21T23:08:52Z

This pull request was closed automatically because it had not seen any recent activity. If you want to discuss it, you can reopen it freely.

optuna/storages/_cached_storage.py

c-bata · 2023-03-24T07:06:26Z

optuna/storages/_cached_storage.py

-        with self._lock:
-            cached_trial = self._get_cached_trial(trial_id)
-            if cached_trial is not None:
-                self._check_trial_is_updatable(cached_trial)
-
-                study_id, _ = self._trial_id_to_study_id_and_number[trial_id]
-                cached_dist = self._studies[study_id].param_distribution.get(param_name, None)
-                if cached_dist:
-                    distributions.check_distribution_compatibility(cached_dist, distribution)
-                else:
-                    # On cache miss, check compatibility against previous trials in the database
-                    # and INSERT immediately to prevent other processes from creating incompatible
-                    # ones. By INSERT, it is assumed that no previous entry has been persisted
-                    # already.
-                    self._backend._check_and_set_param_distribution(
-                        study_id, trial_id, param_name, param_value_internal, distribution
-                    )
-                    self._studies[study_id].param_distribution[param_name] = distribution
-
-                params = copy.copy(cached_trial.params)
-                params[param_name] = distribution.to_external_repr(param_value_internal)
-                cached_trial.params = params
-
-                dists = copy.copy(cached_trial.distributions)
-                dists[param_name] = distribution
-                cached_trial.distributions = dists
-
-                if cached_dist:  # Already persisted in case of cache miss so no need to update.
-                    self._backend.set_trial_param(
-                        trial_id, param_name, param_value_internal, distribution
-                    )
-                return


This logic updates the cached owned trials. However, from this PR, owned trials are always synced, so this logic is no longer needed.

c-bata · 2023-03-24T07:08:46Z

tests/storages_tests/test_cached_storage.py

-    trial_id = storage.create_new_trial(study_id)
-    with patch.object(
-        base_storage, "_check_and_set_param_distribution", return_value=True
-    ) as set_mock:
-        storage.set_trial_param(
-            trial_id, "paramA", 1.2, optuna.distributions.FloatDistribution(-0.2, 2.3)
-        )
-        assert set_mock.call_count == 1


This test case checks that the owned trial cache is updated. However, as explained in https://github.com/optuna/optuna/pull/4431/files#r1147191207, the owned trials always synced.

c-bata · 2023-03-24T07:21:39Z

FYI: I ran the following benchmark to compare how CachedStorage slows down with this change.

https://gist.github.com/c-bata/16d654471393f59e064482ffb8b52bf8

	master	This PR
RDBStorage	47.9438s	51.0639s
CachedStorage(RDBStorage)	36.6144s	54.9168s

c-bata · 2023-03-30T06:42:53Z

Let me unassign reviewers since the performance become slower than I expected. I will mark this ready for review after fixing the performance issue.

github-actions · 2023-04-11T23:08:05Z

This pull request has not seen any recent activity.

github-actions bot added the optuna.storages Related to the `optuna.storages` submodule. This is automatically labeled by github-actions. label Feb 17, 2023

Sync owned trials when calling study.ask and study.get_trials

94dcd58

c-bata force-pushed the cached-storage-sync-owned-trials branch from d6af03f to 94dcd58 Compare February 17, 2023 04:23

c-bata added the enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. label Feb 17, 2023

Fix a broken test

b7dcb02

c-bata commented Feb 17, 2023

View reviewed changes

c-bata marked this pull request as ready for review February 17, 2023 05:55

c-bata assigned amylase and contramundum53 Feb 17, 2023

amylase reviewed Feb 20, 2023

View reviewed changes

toshihikoyanase mentioned this pull request Mar 3, 2023

Support DDP in PyTorch-Lightning #4384

Merged

github-actions bot added the stale Exempt from stale bot labeling. label Mar 6, 2023

github-actions bot closed this Mar 21, 2023

c-bata reopened this Mar 22, 2023

c-bata removed the stale Exempt from stale bot labeling. label Mar 22, 2023

c-bata commented Mar 23, 2023

View reviewed changes

optuna/storages/_cached_storage.py Outdated Show resolved Hide resolved

c-bata marked this pull request as draft March 23, 2023 05:44

c-bata added 4 commits March 24, 2023 14:59

Remove sync_owned_trials option and always sync them

ea25ad6

Merge branch 'master' into cached-storage-sync-owned-trials

18acff2

Remove unused attributes from CachedStorage

ad01311

Remove the logic to update owned trials cache

9aca86c

c-bata commented Mar 24, 2023

View reviewed changes

c-bata unassigned amylase and contramundum53 Mar 30, 2023

github-actions bot added the stale Exempt from stale bot labeling. label Apr 11, 2023

c-bata closed this Apr 12, 2023

not522 mentioned this pull request Apr 24, 2023

Sync owned trials when calling study.ask and study.get_trials #4631

Merged

Uh oh!

Conversation

c-bata commented Feb 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Description of the changes

Uh oh!

c-bata Feb 17, 2023

Choose a reason for hiding this comment

Uh oh!

c-bata commented Feb 17, 2023

Uh oh!

codecov-commenter commented Feb 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

amylase left a comment

Choose a reason for hiding this comment

Uh oh!

amylase Feb 20, 2023

Choose a reason for hiding this comment

Uh oh!

c-bata Feb 21, 2023

Choose a reason for hiding this comment

Uh oh!

amylase Feb 20, 2023

Choose a reason for hiding this comment

Uh oh!

c-bata Feb 24, 2023

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 6, 2023

Uh oh!

github-actions bot commented Mar 21, 2023

Uh oh!

Uh oh!

c-bata Mar 24, 2023

Choose a reason for hiding this comment

Uh oh!

c-bata Mar 24, 2023

Choose a reason for hiding this comment

Uh oh!

c-bata commented Mar 24, 2023

Uh oh!

c-bata commented Mar 30, 2023

Uh oh!

github-actions bot commented Apr 11, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

c-bata commented Feb 17, 2023 •

edited

Loading

codecov-commenter commented Feb 17, 2023 •

edited

Loading