Skip to content

Support multiple studies in InMemoryStorage.#1228

Merged
hvy merged 18 commits intooptuna:masterfrom
ytsmiling:support-multi-studies-in-inmemory-db
May 12, 2020
Merged

Support multiple studies in InMemoryStorage.#1228
hvy merged 18 commits intooptuna:masterfrom
ytsmiling:support-multi-studies-in-inmemory-db

Conversation

@ytsmiling
Copy link
Copy Markdown
Member

@ytsmiling ytsmiling commented May 11, 2020

Motivation

This PR adds multi-study support to in-memory storage.
This partially addresses #1170 and relates to #1191.

Description of the changes

Add multi-study support to in-memory db.

Other notes

I checked that this passes tests in #1191 in python3.5.

Update:

Comment on lines 428 to +429
if isinstance(storage, InMemoryStorage):
with pytest.raises(ValueError):
# InMemoryStorage shares the same study if create_new_study is additionally invoked.
# Thus, the following line should fail due to distribution incompatibility.
storage.set_trial_param(trial_id_3, "y", 1, distribution_y_2)
pass
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "tests" will be completely removed by #1191.


with pytest.raises(ValueError):
ChainerMNStudy(study, comm)
pass
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@ytsmiling ytsmiling added bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. optuna.storages Related to the `optuna.storages` submodule. This is automatically labeled by github-actions. labels May 11, 2020
@sile
Copy link
Copy Markdown
Member

sile commented May 12, 2020

@ytsmiling Thank you for your PR. I'll review this PR today.
By the way, could you run a benchmark to measure the execution speed of the new implementation?

@sile sile added compatibility Change that breaks compatibility. and removed compatibility Change that breaks compatibility. labels May 12, 2020
@ytsmiling
Copy link
Copy Markdown
Member Author

ytsmiling commented May 12, 2020

Result of the same benchmark with #1135: (1000 trials 30 params)
Current master:
214310942 function calls (192531182 primitive calls) in 130.340 seconds
This PR:
81599936 function calls (81348176 primitive calls) in 82.710 seconds

So, rewriting the in-memory db codes resulted in over 30% speedup.

@sile
Copy link
Copy Markdown
Member

sile commented May 12, 2020

Oh, great! Honestly, I was a bit concern that this PR might introduce some overhead.

@ytsmiling
Copy link
Copy Markdown
Member Author

I guess that the speed up comes from rewriting the get_trials method and other methods have increased latency.
But anyway, the overhead seems not that problem.

Copy link
Copy Markdown
Member

@hvy hvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I'm also concerned about the locking but left some minor comments and questions from my side.

self._check_study_id(study_id)
with self._lock:
return copy.deepcopy(self.study_user_attrs)
return self._studies[study_id].user_attrs
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct me if I'm wrong but changing the copy behavior seems to go a bit beyond the scope of this PR (ditto for get_study_system_attrs and get_trial).

Copy link
Copy Markdown
Member Author

@ytsmiling ytsmiling May 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it can be addressed in another PR.
However, the current implementation of this PR relies on this copying behavior of get_trial, and modifying the line requires a large change of codes.

I'll create another PR to change the copying behavior and I'd like to merge it before this PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After examining the code, I decided not to create separate PR and keep the original copying behavior in this PR.

@ytsmiling
Copy link
Copy Markdown
Member Author

ytsmiling commented May 12, 2020

#1228 (comment)
This probably is a reason for the speed gain by this PR.

@ytsmiling
Copy link
Copy Markdown
Member Author

After reverting the deep-copy behavior in #1228 (comment), the benchmark became as follows:

         209306939 function calls (188385179 primitive calls) in 129.212 seconds

This is almost the same as the current master implementation.

Copy link
Copy Markdown
Member

@hvy hvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your quick fixes. I left some additional comments related to your latest changes.

Copy link
Copy Markdown
Member

@sile sile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
I'm looking forward to merging this PR as the inconsistent behavior of InMemoryStrage was a concern for a long time.

@sile sile added this to the v1.5.0 milestone May 12, 2020
Copy link
Copy Markdown
Member

@hvy hvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. optuna.storages Related to the `optuna.storages` submodule. This is automatically labeled by github-actions.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants