Add storage cache by ytsmiling · Pull Request #1140 · optuna/optuna

ytsmiling · 2020-04-20T05:11:04Z

Motivation

Flushing trial updates to persistent storage on every update (including each parameter suggestion) can be a significant bottleneck. This PR mitigates the bottleneck by introducing a wrapper class that wraps the `RDBStorage` class and provides a lazy-sync capability.

Description of the changes

Add CachedStorage, a wrapper around RDBStorage that provides the lazy-sync capability.
- While CachedStorage currently supports only RDBStorage, it would be straightforward to support other classes such as RedisStorage. Thus, the CachedStorage file is placed directly under the optuna/storages directory.
Introduce two additional public methods to RDBStorage.
- One returns a FrozenTrial when creating a new trial. Note that create_new_trial in BaseStorage returns only the trial ID.
- The other receives a FrozenTrial and flushes its updates to the persistent storage.
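
The lazy-sync idea described above can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `ToyBackend` and `ToyCachedStorage` are hypothetical names, and flushing once when the trial finishes is an assumption made only to keep the example small.

```python
from typing import Any, Dict


class ToyBackend:
    """Hypothetical stand-in for a persistent storage; counts round trips."""

    def __init__(self) -> None:
        self.rows: Dict[int, Dict[str, Any]] = {}
        self.n_writes = 0

    def create_trial(self) -> int:
        trial_id = len(self.rows)
        self.rows[trial_id] = {}
        self.n_writes += 1
        return trial_id

    def write_params(self, trial_id: int, params: Dict[str, Any]) -> None:
        self.rows[trial_id].update(params)
        self.n_writes += 1


class ToyCachedStorage:
    """Buffers parameter updates in memory and flushes them in one write."""

    def __init__(self, backend: ToyBackend) -> None:
        self._backend = backend
        self._pending: Dict[int, Dict[str, Any]] = {}

    def create_new_trial(self) -> int:
        trial_id = self._backend.create_trial()
        self._pending[trial_id] = {}
        return trial_id

    def set_trial_param(self, trial_id: int, name: str, value: Any) -> None:
        # Only the in-memory buffer is touched; the backend is not accessed.
        self._pending[trial_id][name] = value

    def finish_trial(self, trial_id: int) -> None:
        # Assumption of this sketch: all buffered updates are flushed at once.
        updates = self._pending.pop(trial_id)
        if updates:
            self._backend.write_params(trial_id, updates)


backend = ToyBackend()
storage = ToyCachedStorage(backend)
tid = storage.create_new_trial()
for i in range(30):
    storage.set_trial_param(tid, "param-{}".format(i), float(i))
storage.finish_trial(tid)
print(backend.n_writes)  # one write to create the trial, one to flush
```

With 30 parameters, the toy backend is written to twice instead of once per suggested parameter, which is the kind of saving the wrapper is aiming for.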

Current Status

Implement CachedStorage class.
Add tests.
Add docs where necessary.

Microbench

(DB: official mysql docker image (tag:8.0.19), storage: SSD, CPU: Intel Core i7-6700K, python: 3.5.2)

perf:optimize-study: time taken for a single optimization

This PR

{'n_study': 1, 'n_trial': 100, 'n_param': 3}
perf:optimize-study                               0:00:10.169289
{'n_study': 1, 'n_trial': 100, 'n_param': 30}
perf:optimize-study                               0:00:47.566834

Current master

{'n_study': 1, 'n_trial': 100, 'n_param': 3}
perf:optimize-study                               0:00:22.499407
{'n_study': 1, 'n_trial': 100, 'n_param': 30}
perf:optimize-study                               0:02:33.151298
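
As a quick reading aid, the ratios between the two sets of timings above can be computed as follows. The snippet uses only the numbers printed in this comment; the `parse` helper and the `timings` layout are illustrative choices, not part of the benchmark script.

```python
from datetime import timedelta


def parse(text):
    """Convert an 'H:MM:SS.ffffff' string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# n_param -> (current master, this PR), copied from the results above.
timings = {
    3: ("0:00:22.499407", "0:00:10.169289"),
    30: ("0:02:33.151298", "0:00:47.566834"),
}

# Dividing two timedeltas yields a plain float ratio.
speedups = {n: parse(master) / parse(pr) for n, (master, pr) in timings.items()}

for n_param in sorted(speedups):
    print("n_param={}: {:.1f}x faster".format(n_param, speedups[n_param]))
```

On these measurements the PR is roughly 2.2x faster with 3 parameters and roughly 3.2x faster with 30 parameters.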

Benchmark code:

import argparse
from collections import defaultdict
from datetime import datetime
import math

import sqlalchemy
import optuna

profile_result = defaultdict(lambda: datetime.now() - datetime.now())


class Profile:
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        self.start = datetime.now()

    def __exit__(self, exc_type, exc_val, exc_tb):
        profile_result[self.name] += datetime.now() - self.start


def print_profile():
    for key, value in sorted(profile_result.items(), key=lambda i: i[1],
                             reverse=True):
        print(key.ljust(50) + '{}'.format(value))


def build_objective_fun(n_param):
    def objective(trial):
        return sum([
            math.sin(trial.suggest_float('param-{}'.format(i), 0, math.pi * 2))
            for i in range(n_param)
        ])

    return objective


def define_flags(parser):
    parser.add_argument('mysql_user', type=str)
    parser.add_argument('mysql_password', type=str)
    parser.add_argument('mysql_host', type=str)
    parser.add_argument('n_study', type=int)
    parser.add_argument('n_trial', type=int)
    parser.add_argument('n_param', type=int)
    return parser


if __name__ == '__main__':
    parser = define_flags(argparse.ArgumentParser())
    args = parser.parse_args()
    print(vars(args))

    storage_str = 'mysql+pymysql://{}:{}@{}/'.format(
        args.mysql_user,
        args.mysql_password,
        args.mysql_host,
    )
    engine = sqlalchemy.create_engine(storage_str)
    sampler = optuna.samplers.TPESampler()
    conn = engine.connect()
    conn.execute("commit")
    database_str = 'profile_storage_s{}_t{}_p{}'.format(
        args.n_study, args.n_trial, args.n_param)

    try:
        conn.execute(
            "drop database {}".format(database_str))
    except Exception:
        # The database may not exist on the first run; ignore the failure.
        pass
    conn.execute("create database {}".format(
        database_str
    ))
    conn.close()

    for i in range(args.n_study):
        storage = optuna.storages.get_storage(storage_str + database_str)
        study = optuna.create_study(sampler=sampler, storage=storage)
        study_id = study.study_id
        with Profile('perf:optimize-study'):
            study.optimize(build_objective_fun(args.n_param), n_trials=args.n_trial, gc_after_trial=False)

    print_profile()

ytsmiling · 2020-04-20T09:55:15Z

After fixing some bugs, I updated the microbenchmark.

{'n_study': 1, 'n_trial': 100, 'n_param': 30}
perf:optimize-study                               0:00:45.771852
{'n_study': 1, 'n_trial': 100, 'n_param': 3}
perf:optimize-study                               0:00:13.391903

sile

Thank you for the PR! I took a first look and left some review comments.

optuna/storages/cached_storage.py

optuna/storages/__init__.py

optuna/storages/rdb/storage.py

optuna/storages/cached_storage.py

codecov-io · 2020-04-27T06:52:20Z

Codecov Report

Merging #1140 into master will increase coverage by 0.62%.
The diff coverage is 95.61%.

@@            Coverage Diff             @@
##           master    #1140      +/-   ##
==========================================
+ Coverage   90.50%   91.13%   +0.62%     
==========================================
  Files         131      144      +13     
  Lines       11396    12628    +1232     
==========================================
+ Hits        10314    11508    +1194     
- Misses       1082     1120      +38

Impacted Files	Coverage Δ
optuna/storages/rdb/storage.py	`95.35% <93.22%> (-0.87%)`	⬇️
optuna/storages/cached_storage.py	`93.51% <93.51%> (ø)`
tests/storages_tests/rdb_tests/test_storage.py	`97.41% <96.42%> (+0.34%)`	⬆️
optuna/storages/__init__.py	`93.75% <100.00%> (+1.44%)`	⬆️
optuna/testing/storage.py	`84.61% <100.00%> (+1.75%)`	⬆️
tests/storages_tests/test_cached_storage.py	`100.00% <100.00%> (ø)`
tests/storages_tests/test_storages.py	`98.89% <100.00%> (+<0.01%)`	⬆️
optuna/integration/chainermn.py	`75.70% <0.00%> (-1.70%)`	⬇️
optuna/samplers/random.py	`84.44% <0.00%> (-1.56%)`	⬇️
... and 33 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4f1b4e3...28008cc.

sile

Thank you for addressing my review comments. This PR seems almost okay, but I left a comment about following the storage specification change.

optuna/storages/cached_storage.py

grafi-tt · 2020-05-11T05:41:46Z

optuna/storages/cached_storage.py

+                    param_value_internal
+                )
+                cached_trial.distributions[param_name] = distribution
+                if cached_dist:


I guess the line

self._dirty_trials.add(trial_id)

should be moved under `if cached_dist:`.

The logic here is a bit difficult too. Some comments, like "If cached_dist isn't available, we have to access the storage in order to check distribution compatibility." would help readers.

ytsmiling · 2020-05-11

Thank you for your comment. Indeed, the line should be moved, and the whole logic needs to be more readable. I'll address both points.
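
The control flow discussed in this thread can be sketched roughly as below. This is a simplified illustration, not the code under review: `ToyBackend`, `ToyCache`, and the tuple-based distribution are hypothetical, and only `_dirty_trials` and `cached_dist` echo names from the diff. The point is that a trial is marked dirty only when the compatibility check can be done against the cache; otherwise the call goes straight to the storage.

```python
from typing import Any, Dict, Set, Tuple

Distribution = Tuple[float, float]  # hypothetical (low, high) stand-in


class ToyBackend:
    """Hypothetical persistent storage that checks distribution compatibility."""

    def __init__(self) -> None:
        self.n_calls = 0
        self.distributions: Dict[str, Distribution] = {}

    def set_trial_param(self, trial_id: int, name: str, value: Any,
                        dist: Distribution) -> None:
        self.n_calls += 1
        known = self.distributions.setdefault(name, dist)
        if known != dist:
            raise ValueError("incompatible distribution for {}".format(name))


class ToyCache:
    def __init__(self, backend: ToyBackend) -> None:
        self._backend = backend
        self._param_distributions: Dict[str, Distribution] = {}
        self._params: Dict[int, Dict[str, Any]] = {}
        self._dirty_trials: Set[int] = set()

    def set_trial_param(self, trial_id: int, name: str, value: Any,
                        dist: Distribution) -> None:
        cached_dist = self._param_distributions.get(name)
        if cached_dist is not None:
            # Compatibility can be checked locally, so the write is deferred.
            if cached_dist != dist:
                raise ValueError("incompatible distribution for {}".format(name))
            self._params.setdefault(trial_id, {})[name] = value
            self._dirty_trials.add(trial_id)
        else:
            # If cached_dist isn't available, we have to access the storage
            # in order to check distribution compatibility.
            self._backend.set_trial_param(trial_id, name, value, dist)
            self._param_distributions[name] = dist
            self._params.setdefault(trial_id, {})[name] = value


backend = ToyBackend()
cache = ToyCache(backend)
cache.set_trial_param(0, "x", 0.1, (0.0, 1.0))  # cache miss: written through
cache.set_trial_param(1, "x", 0.7, (0.0, 1.0))  # cache hit: deferred, dirty
print(backend.n_calls, cache._dirty_trials)
```

In this sketch the first call is written through and leaves nothing to flush, so only the second trial ends up in `_dirty_trials`.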

ytsmiling · 2020-05-11T10:48:42Z

This PR should not be merged before #1191.

optuna/storages/cached_storage.py

codecov-commenter · 2020-05-19T21:55:07Z

Codecov Report

Merging #1140 into master will increase coverage by 0.40%.
The diff coverage is 95.34%.

@@            Coverage Diff             @@
##           master    #1140      +/-   ##
==========================================
+ Coverage   86.61%   87.01%   +0.40%     
==========================================
  Files          93       94       +1     
  Lines        6924     7233     +309     
==========================================
+ Hits         5997     6294     +297     
- Misses        927      939      +12

Impacted Files	Coverage Δ
optuna/storages/rdb/storage.py	`95.83% <93.15%> (-0.24%)`	⬇️
optuna/storages/cached_storage.py	`95.83% <95.83%> (ø)`
optuna/storages/__init__.py	`100.00% <100.00%> (ø)`
optuna/testing/storage.py	`82.35% <100.00%> (+2.35%)`	⬆️
optuna/integration/chainermn.py	`75.70% <0.00%> (-1.70%)`	⬇️
optuna/storages/base.py	`68.22% <0.00%> (+11.90%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1c9d868...c926774.

ytsmiling · 2020-05-19T22:00:17Z

I rearranged the commits, and this PR now passes the updated tests from #1191.
I addressed the review comments and would like to request a re-review.

Part of the implementation has changed since the last review:

CachedStorage uses _StudyInfo, which is similar to the current implementation of InMemoryStorage.

@sile @toshihikoyanase Please review this PR when you have time.

optuna/storages/cached_storage.py

sile

@ytsmiling Thanks! Now this PR seems great, but I left a few very minor comments.

optuna/storages/cached_storage.py

Co-authored-by: Takeru Ohta <phjgt308@gmail.com>

sile

LGTM! Thank you for your swift fix.

toshihikoyanase

Thank you for your great PR. I'm still in the middle of reviewing the logic, but let me share some comments about style.

This is just a comment, but I noticed two things:

- FrozenTrial is not frozen anymore. The attribute values are updated to keep the cache up-to-date. Not now, but we may rename it in the future.
- This PR directly manipulates SQLAlchemy queries while the previous implementation tried to encapsulate them in optuna/rdb/models.py. It may be inevitable to speed up storages.

optuna/storages/cached_storage.py

optuna/storages/rdb/storage.py

optuna/storages/cached_storage.py

optuna/storages/rdb/storage.py

tests/storages_tests/test_cached_storage.py

tests/storages_tests/rdb_tests/test_storage.py

toshihikoyanase

The changes regarding the logic seem good to me.
I'll approve the PR after you check my comments about style.

Co-authored-by: Toshihiko Yanase <toshihiko.yanase@gmail.com>

ytsmiling · 2020-05-20T19:36:00Z

@toshihikoyanase Thank you for reviewing this PR.

FrozenTrial is not frozen anymore. The attribute values are updated to keep the cache up-to-date. Not now, but we may rename it in the future.

Actually, the mutation of FrozenTrial was not introduced in this PR; it has a long history in the InMemoryStorage class.

This PR directly manipulates SQLAlchemy queries while the previous implementation tried to encapsulate them in optuna/rdb/models.py. It may be inevitable to speed up storages.

Yes, I think we should remove the methods in the models package and manipulate SQLAlchemy directly. It can result in a significant performance gain (e.g. in the get_all_study_summaries method). From a pure performance perspective, it would be even better to stop using the ORM and use SQL directly, but that requires many more code changes, and I think we do not need to switch to SQL for now.

toshihikoyanase

Thank you for your swift actions. LGTM!

ytsmiling changed the title ~~Add storage cache~~ [WIP] Add storage cache Apr 20, 2020

ytsmiling self-assigned this Apr 20, 2020

ytsmiling marked this pull request as ready for review April 21, 2020 04:38

ytsmiling changed the title ~~[WIP] Add storage cache~~ Add storage cache Apr 21, 2020

ytsmiling mentioned this pull request Apr 23, 2020

[RFC] Define specs of storage classes, make implementations consistent, and add tests. #1155

Closed

sile requested changes Apr 27, 2020

ytsmiling mentioned this pull request Apr 27, 2020

Major refactoring of storage classes. #1170

Closed

20 tasks

ytsmiling requested a review from sile April 28, 2020 03:41

sile requested changes Apr 28, 2020

optuna/storages/cached_storage.py

grafi-tt reviewed May 11, 2020

toshihikoyanase reviewed May 13, 2020

optuna/storages/cached_storage.py

toshihikoyanase mentioned this pull request May 15, 2020

Release Tasks for v1.5.0. #1245

Closed

2 tasks

Support storage cache.

9ba2cf6

ytsmiling force-pushed the add-storage-cache branch from 28008cc to 9ba2cf6 Compare May 19, 2020 19:49

ytsmiling added 5 commits May 20, 2020 04:54

Fix import order.

93c6a58

Add locks for thread safety.

dd46614

Use copy-on-write for thread-safety.

db284af

Reformat code.

117aa3b

Fix type error.

e17476d

ytsmiling requested review from sile and toshihikoyanase May 19, 2020 22:00

toshihikoyanase reviewed May 20, 2020

optuna/storages/cached_storage.py (outdated)

sile reviewed May 20, 2020

optuna/storages/cached_storage.py (outdated)

optuna/storages/cached_storage.py (outdated)

optuna/storages/cached_storage.py (outdated)

optuna/storages/cached_storage.py

ytsmiling and others added 2 commits May 20, 2020 15:14

Fix type annotations.

fc633ba

Co-authored-by: Takeru Ohta <phjgt308@gmail.com>

Rename _TrialUpdates to _TrialUpdate in storage cache.

1cdf089

Remove unused method.

434ff12

sile approved these changes May 20, 2020

sile added the enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. label May 20, 2020

sile added this to the v1.5.0 milestone May 20, 2020

toshihikoyanase reviewed May 20, 2020

tests/storages_tests/test_cached_storage.py (outdated)

toshihikoyanase reviewed May 20, 2020

tests/storages_tests/rdb_tests/test_storage.py (outdated)

toshihikoyanase reviewed May 20, 2020

ytsmiling and others added 2 commits May 21, 2020 04:21

Add sphinx domains to method docs.

45710d2

Co-authored-by: Toshihiko Yanase <toshihiko.yanase@gmail.com>

Clarify _create_new_trial method doc.

4c8d2d9

Co-authored-by: Toshihiko Yanase <toshihiko.yanase@gmail.com>

ytsmiling added 6 commits May 21, 2020 06:29

Concatenate with-statements.

920ad38

Add Sphinx domain annotation to method doc.

5ae3904

Rename methods and arguments with clearer names.

5518514

Use pytest.mark.parametrize to simplify tests.

740c596

Change return type of RDBStorage._create_new_trial.

09fbf93

Remove unused import.

c926774

toshihikoyanase approved these changes May 21, 2020

toshihikoyanase merged commit a95aac0 into optuna:master May 21, 2020

This was referenced May 21, 2020

Move caching mechanism from RDBStorage to _CachedStorage. #1263

Merged

Cache study-related info in _CachedStorage. #1264

Merged

Add read_trials_from_remote_storage method to Storage implementations. #1298

Merged

not522 mentioned this pull request Oct 14, 2025

Cache distributions to skip consistency check #6301

Merged
