Skip to content

Store CMAES optimizer after splitting into substrings.#1833

Merged
hvy merged 8 commits intooptuna:masterfrom
c-bata:split-optimizer-string
Sep 24, 2020
Merged

Store CMAES optimizer after splitting into substrings.#1833
hvy merged 8 commits intooptuna:masterfrom
c-bata:split-optimizer-string

Conversation

@c-bata
Copy link
Copy Markdown
Member

@c-bata c-bata commented Sep 15, 2020

Motivation

fix #1775

Description of the changes

Split an optimizer string, puts these substrings into system_attrs.

@c-bata c-bata added the bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. label Sep 15, 2020
@github-actions github-actions bot added the optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions. label Sep 15, 2020
@c-bata c-bata marked this pull request as ready for review September 15, 2020 07:11
@c-bata
Copy link
Copy Markdown
Member Author

c-bata commented Sep 15, 2020

I prepare a script to reproduce the bug and confirmed that this PR fixed it.
https://gist.github.com/c-bata/28a6334c37f340a2ce8e54c4dee1e2bf

Before:

$ python examples/quadratic_simple.py
...
sqlalchemy.exc.DataError: (MySQLdb._exceptions.DataError) (1406, "Data too long for column 'value_json' at row 1")
[SQL: INSERT INTO trial_system_attributes (trial_id, `key`, value_json) VALUES (%s, %s, %s)]
[parameters: (38, 'cma:optimizer', '"80049528050000000000008c0a636d6165732e5f636d61948c03434d419493942981947d94288c065f6e5f64696d944b028c085f706f7073697a65944b068c035f6d75944b038c075f6d ... (2366 characters truncated) ... 00990201000000f0379026010000003037902601000000702a291301000000c0fe962601000000b025291301001400947494628c085f657073696c6f6e94473e45798ee2308c3a75622e"')]
(Background on this error at: http://sqlalche.me/e/9h9h)
...
optuna.exceptions.StorageInternalError: An exception is raised during the commit. This typically happens due to invalid data in the commit, e.g. exceeding max length.

After:

$ python examples/quadratic_simple.py 
...
Best value: 4.09659 (params: {'x': -6.36791, 'y': -36.4537})

And I also confirmed that this change has no effect to optimization results. See https://gist.github.com/c-bata/28a6334c37f340a2ce8e54c4dee1e2bf#gistcomment-3454630 for details.

@c-bata c-bata changed the title Split CMAES optimizer with 2048 characters Store CMAES optimizer after splitting into substrings. Sep 15, 2020
@HideakiImamura HideakiImamura self-assigned this Sep 17, 2020
Copy link
Copy Markdown
Member

@HideakiImamura HideakiImamura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This is an amazing alternative to PR #1776. The changes look great to me.

@hvy hvy self-assigned this Sep 18, 2020
Copy link
Copy Markdown
Member

@hvy hvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the interesting PR. Basically LGTM! I left some questions rather than request if you don't mind.

@hvy
Copy link
Copy Markdown
Member

hvy commented Sep 18, 2020

By the way, I for a moment though about providing this splitting under hood of the Trial's and Study's attr manipulation methods to avoid the storage error being propagated to the user. But that might be controversial. On the other hand, we can probably catch and reraise the storage error with a more friendly one.

@c-bata
Copy link
Copy Markdown
Member Author

c-bata commented Sep 18, 2020

FYI, 2048 character limits is just an implementation detail on RDBStorage, not a specification of BaseStorage. So I think we have two options:

  1. Update a specification of BaseStorage, then exporting warning messages if the length of value exceeds 2048 characters limit.
  2. Add Trial's and Study's attr manipulation methods in RDBStorage without updating a specification.

I prefer the former approach, but it contains a breaking change (though it's an undefined behavior).

Copy link
Copy Markdown
Member

@hvy hvy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@hvy hvy merged commit 59f11f3 into optuna:master Sep 24, 2020
@hvy hvy added this to the v2.2.0 milestone Sep 24, 2020
@hvy
Copy link
Copy Markdown
Member

hvy commented Sep 24, 2020

Let's discuss #1833 (comment) further. I'll create a separate issue for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. optuna.samplers Related to the `optuna.samplers` submodule. This is automatically labeled by github-actions.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot store CMA object on PostgreSQL and MySQL.

3 participants