Sync owned trials when calling `study.ask` and `study.get_trials` by not522 · Pull Request #4631 · optuna/optuna

not522 · 2023-04-24T04:51:21Z

Motivation

This PR takes over #4431 and #4603.

#4431

read_trials_from_remote_storage method in _CachedStorage does not sync trials owned by itself. This behavior causes problems when another worker (e.g. Optuna Dashboard) make the trial completed as follows:
https://gist.github.com/c-bata/69e81a34c395d3ea368e24b0d5e0d367

#4603

This is part of the work to deprecate _CachedStorage.

When directly using RDBStorage, _get_latest_trial is a bottleneck because it reads system_attrs from the remote storage.

Description of the changes

#4431

This PR changes excluded_trial_ids from owned_or_finished_trial_ids to finished_trial_ids only.

#4603

Getting system_attrs lazily. This change eliminates access to remote storage for system_attrs by samplers and pruners except GridSampler, SuccessiveHalvingPruner, and HyperbandPruner.

…nc-owned-trials

…-trials

codecov-commenter · 2023-04-24T05:48:47Z

Codecov Report

Merging #4631 (82f4211) into master (792754e) will increase coverage by 0.03%.
The diff coverage is 100.00%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##           master    #4631      +/-   ##
==========================================
+ Coverage   90.89%   90.93%   +0.03%     
==========================================
  Files         184      184              
  Lines       13959    13910      -49     
==========================================
- Hits        12688    12649      -39     
+ Misses       1271     1261      -10

Impacted Files	Coverage Δ
optuna/storages/_cached_storage.py	`98.25% <100.00%> (-0.47%)`	⬇️
optuna/study/study.py	`94.17% <100.00%> (+0.04%)`	⬆️
optuna/trial/_trial.py	`95.67% <100.00%> (+0.29%)`	⬆️

... and 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

not522 · 2023-04-24T07:08:09Z

This PR slows down the optimization by about 15 seconds for 1,000 trials with TPESampler.
https://github.com/not522/optuna/actions/runs/4783015010/jobs/8502860477

master

  ============================= =========
    storage, sampler, n_trials           
  ----------------------------- ---------
      inmemory, random, 1000     697±0ms 
     inmemory, random, 10000     14.3±0s 
       inmemory, tpe, 1000       28.8±0s 
      inmemory, cmaes, 1000      2.07±0s 
       sqlite, random, 1000      35.6±0s 
        sqlite, tpe, 1000        1.21±0m 
       sqlite, cmaes, 1000       53.7±0s 
   cached_sqlite, random, 1000   36.3±0s 
     cached_sqlite, tpe, 1000    1.18±0m 
    cached_sqlite, cmaes, 1000   49.5±0s 
      journal, random, 1000      6.62±0s 
        journal, tpe, 1000       36.3±0s 
       journal, cmaes, 1000      11.6±0s 
  ============================= =========

PR

  ============================= =========
    storage, sampler, n_trials           
  ----------------------------- ---------
      inmemory, random, 1000     762±0ms 
     inmemory, random, 10000     15.0±0s 
       inmemory, tpe, 1000       28.7±0s 
      inmemory, cmaes, 1000      2.15±0s 
       sqlite, random, 1000      46.2±0s 
        sqlite, tpe, 1000        1.38±0m 
       sqlite, cmaes, 1000       1.20±0m 
   cached_sqlite, random, 1000   45.2±0s 
     cached_sqlite, tpe, 1000    1.42±0m
    cached_sqlite, cmaes, 1000   1.24±0m 
      journal, random, 1000      6.70±0s 
        journal, tpe, 1000       35.8±0s 
       journal, cmaes, 1000      11.5±0s 
  ============================= =========

not522 · 2023-04-24T07:08:47Z

@c-bata @gen740 Could you review this PR?

optuna/trial/_trial.py

github-actions · 2023-05-07T23:09:40Z

This pull request has not seen any recent activity.

Co-authored-by: Gen <54583542+gen740@users.noreply.github.com>

…nc-owned-trials

gen740

Almost LGTM! I left a minor comment.

gen740 · 2023-05-08T09:03:50Z

tests/trial_tests/test_trial.py

+        # _LazyTrialSystemAttrs gets attrs the first time it is needed.
+        # Then, we create the instance for each method, and test the first and second use.
+
+        system_attrs = optuna.trial._trial._LazyTrialSystemAttrs(trial._trial_id, storage)


My pyright language server raised the issue that "_trial" is not a known member of module "optuna.trial". This is because pyright recognizes _trial as a private module, and it could not resolve it. Importing directly (from optuna.trial._trial import _LazyTrialSystemAttrs) could resolve this issue.

Although this is not a fatal issue, do you think it should be fixed?

OK. I updated it.

Thank you very much!

gen740

LGTM!

c-bata

@not522 Thank you for your pull request!
Looks almost good to me. I left one question though.

Just for your information, I found a bug in delete_study method that does not related to this PR and fixed it in #4670. I will make it ready for review after merging this PR.

optuna/study/study.py

c-bata · 2023-05-12T01:44:47Z

optuna/storages/_cached_storage.py

-                self._add_trials_to_cache(study_id, [self._backend.get_trial(trial_id)])
-                self._studies[study_id].owned_or_finished_trial_ids.add(trial_id)
-        return ret
+        return self._backend.set_trial_state_values(trial_id, state=state, values=values)


Just for the record, I compared the speed when applying the following change, but found no performance improvement. So the current logic seems to be reasonable.

Changes:

ret = self._backend.set_trial_state_values(trial_id, state=state, values=values) if state.is_finished() and trial_id in self._trial_id_to_study_id_and_number: backend_trial = self._backend.get_trial(trial_id) study_id, trial_number = self._trial_id_to_study_id_and_number[trial_id] with self._lock: study = self._studies[study_id] study.trials[trial_number] = backend_trial study.finished_trial_ids.add(trial_id) return ret

Benchmark script:
https://gist.github.com/c-bata/fa8efc075e86a1f7e6dd6d83b85c1017

Result:

Branch n_params=5 (mean ± std) (s) n_params=30 (mean ± std) (s)

PR 4631 10.1026 ± 0.1046 35.8279 ± 0.1306

cache-set-trial-state-values 9.6567 ± 0.1974 35.8301 ± 0.1761

main 7.6167 ± 0.0959 32.8131 ± 0.1010

v3.1.1 9.0216 ± 0.1590 42.7193 ± 0.1940

(Repeated 5 times per each)

c-bata and others added 11 commits February 17, 2023 13:23

Sync owned trials when calling study.ask and study.get_trials

94dcd58

Fix a broken test

b7dcb02

Remove sync_owned_trials option and always sync them

ea25ad6

Merge branch 'master' into cached-storage-sync-owned-trials

18acff2

Remove unused attributes from CachedStorage

ad01311

Remove the logic to update owned trials cache

9aca86c

Get system_attrs lazily

41e69da

Add test for _LazyTrialSystemAttrs

1c97089

Merge remote-tracking branch 'upstream/master' into cached-storage-sy…

c41da10

…nc-owned-trials

Merge branch 'lazy-trial-system-attrs' into cached-storage-sync-owned…

52920a2

…-trials

Fix to sync with remote storage

ae2d8a0

not522 assigned c-bata and gen740 Apr 24, 2023

github-actions bot added optuna.storages Related to the `optuna.storages` submodule. This is automatically labeled by github-actions. optuna.trial Related to the `optuna.trial` submodule. This is automatically labeled by github-actions. labels Apr 24, 2023

not522 mentioned this pull request Apr 24, 2023

Get system_attrs lazily for samplers and pruners #4603

Closed

Remove dead code

7c52f74

not522 marked this pull request as ready for review April 24, 2023 07:09

gen740 reviewed Apr 28, 2023

View reviewed changes

optuna/trial/_trial.py Outdated Show resolved Hide resolved

github-actions bot added the stale Exempt from stale bot labeling. label May 7, 2023

not522 and others added 2 commits May 8, 2023 13:02

Update optuna/trial/_trial.py

48e5ad6

Co-authored-by: Gen <54583542+gen740@users.noreply.github.com>

Merge remote-tracking branch 'upstream/master' into cached-storage-sy…

82f4211

…nc-owned-trials

gen740 reviewed May 8, 2023

View reviewed changes

Import _LazyTrialSystemAttrs directly

098d634

not522 added the enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. label May 8, 2023

github-actions bot removed the stale Exempt from stale bot labeling. label May 8, 2023

gen740 approved these changes May 10, 2023

View reviewed changes

c-bata mentioned this pull request May 12, 2023

Fix a logic for invalidating the cache in CachedStorage #4670

Merged

c-bata approved these changes May 12, 2023

View reviewed changes

c-bata unassigned gen740 May 12, 2023

c-bata added this to the v3.2.0 milestone May 12, 2023

c-bata merged commit 466cddf into optuna:master May 12, 2023

not522 deleted the cached-storage-sync-owned-trials branch May 12, 2023 07:46

not522 mentioned this pull request May 12, 2023

Read trials from remote storage whenever get_all_trials of _CachedStorage is called #4672

Merged

not522 mentioned this pull request Oct 14, 2025

Cache distributions to skip consistency check #6301

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Sync owned trials when calling `study.ask` and `study.get_trials`#4631

Sync owned trials when calling `study.ask` and `study.get_trials`#4631
c-bata merged 15 commits intooptuna:masterfrom
not522:cached-storage-sync-owned-trials

not522 commented Apr 24, 2023

Uh oh!

codecov-commenter commented Apr 24, 2023 •

edited

Loading

Uh oh!

not522 commented Apr 24, 2023 •

edited

Loading

Uh oh!

not522 commented Apr 24, 2023

Uh oh!

Uh oh!

github-actions bot commented May 7, 2023

Uh oh!

gen740 left a comment

Uh oh!

gen740 May 8, 2023

Uh oh!

not522 May 8, 2023

Uh oh!

gen740 May 10, 2023

Uh oh!

gen740 left a comment

Uh oh!

c-bata left a comment

Uh oh!

Uh oh!

c-bata May 12, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Branch	`n_params=5` (mean ± std) (s)	`n_params=30` (mean ± std) (s)
PR 4631	10.1026 ± 0.1046	35.8279 ± 0.1306
cache-set-trial-state-values	9.6567 ± 0.1974	35.8301 ± 0.1761
main	7.6167 ± 0.0959	32.8131 ± 0.1010
v3.1.1	9.0216 ± 0.1590	42.7193 ± 0.1940

Uh oh!

Conversation

not522 commented Apr 24, 2023

Motivation

Description of the changes

Uh oh!

codecov-commenter commented Apr 24, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

not522 commented Apr 24, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

not522 commented Apr 24, 2023

Uh oh!

Uh oh!

github-actions bot commented May 7, 2023

Uh oh!

gen740 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gen740 left a comment

Choose a reason for hiding this comment

Uh oh!

c-bata left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Apr 24, 2023 •

edited

Loading

not522 commented Apr 24, 2023 •

edited

Loading