Handle -inf and inf values in RDBStorage #3238

Merged
nzw0301 merged 2 commits into optuna:master from xadrianzetx:fix-rdb-handle-inf on Jan 30, 2022

@xadrianzetx
Collaborator

@xadrianzetx xadrianzetx commented Jan 18, 2022

Motivation

Previously, registering -inf or inf in RDB storage could raise an error, because support for these special values depends on the SQL dialect of the backend. This PR works around the limitation by storing -inf and inf as the minimum and maximum values of a 32-bit float and adding a thin translation layer between stored and true values on database reads and writes. The effective range for trial values becomes (-3.4028235e+38, 3.4028235e+38); anything at or beyond those limits is interpreted as infinity. Closes #3206.
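
The translation layer described above could look roughly like the following minimal sketch. The helper names `to_stored` and `from_stored`, the constant `FLOAT32_MAX`, and the use of `math.isclose` with its default tolerance are assumptions made for illustration; they are not taken from the PR's diff.

```python
import math

# Largest finite 32-bit float, (2 - 2**-23) * 2**127, about 3.4028235e+38.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127
FLOAT32_MIN = -FLOAT32_MAX


def to_stored(value: float) -> float:
    """Clamp a trial value into the float32 range before writing it."""
    if value >= FLOAT32_MAX:  # includes +inf
        return FLOAT32_MAX
    if value <= FLOAT32_MIN:  # includes -inf
        return FLOAT32_MIN
    return value


def from_stored(stored: float) -> float:
    """Map the sentinel limits back to infinities after reading."""
    if math.isclose(stored, FLOAT32_MAX):
        return math.inf
    if math.isclose(stored, FLOAT32_MIN):
        return -math.inf
    return stored
```

With these hypothetical helpers, `from_stored(to_stored(float("inf")))` gives back `inf`, while an ordinary value such as `1.5` passes through unchanged.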

Description of the changes

  • Implement inf handling in RDBStorage
  • Test

This patch lets `RDBStorage` work with backends such as MySQL that may reject infinite values, by representing them as the limits of a signed 32-bit float.
@xadrianzetx xadrianzetx changed the title from "Handle -inf and 'inf' values in RDBStorage" to "Handle -inf and inf values in RDBStorage" on Jan 18, 2022
@github-actions github-actions Bot added the optuna.storages Related to the `optuna.storages` submodule. This is automatically labeled by github-actions. label Jan 18, 2022
@xadrianzetx
Collaborator Author

xadrianzetx commented Jan 18, 2022

Note: this is a simple fix that avoids modifying the data models. The downside is that, when translating a stored value back to the actual value, we can't compare the two floats for exact equality and have to use an approximate comparison. As a result, trial values close enough to the numerical limits may be translated back as infinite. Since this only affects values around 3.4e+38, in practice it is probably a non-issue.
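
To illustrate the trade-off, here is a small sketch. It assumes the column holds single-precision floats and that read-side matching uses `math.isclose` with a relative tolerance of 1e-6. The tolerance and the helper names are hypothetical, chosen only to make the effect visible, and the actual source of imprecision may differ between backends.

```python
import math
import struct

FLOAT32_MAX = (2 - 2**-23) * 2.0**127  # largest finite 32-bit float


def roundtrip_float32(value: float) -> float:
    """Simulate writing a Python float to a single-precision column."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def reads_as_infinite(stored: float, rel_tol: float = 1e-6) -> bool:
    """Approximate match against the +/- float32 limit sentinels."""
    return math.isclose(abs(stored), FLOAT32_MAX, rel_tol=rel_tol)


# In general, a value may not survive a single-precision round trip exactly.
exact_match_survives = roundtrip_float32(0.1) == 0.1

# A finite value just below the limit cannot be told apart from the sentinel.
near_limit = FLOAT32_MAX * (1 - 1e-7)
near_limit_is_inf = reads_as_infinite(near_limit)

# Values well inside the range are unaffected.
ordinary_is_inf = reads_as_infinite(1e38)
```

Under these assumptions, only values within a relative distance of about 1e-6 from the limit are misread as infinite. This is the narrow band the note refers to.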

An alternative solution is to modify TrialValueModel and TrialIntermediateValueModel to include an integer column that indicates whether the value is infinite. No approximation would be needed then. It would also give us a chance to specify double precision for Float in those two models, so a reported value could lie in the (-1.7976931348623157e+308, 1.7976931348623157e+308) range, in line with the Python float range (a C double in CPython).
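
A minimal sketch of this alternative follows, using the standard-library `sqlite3` module rather than Optuna's actual SQLAlchemy models. The table name, column names, and integer codes are all hypothetical.

```python
import math
import sqlite3
from typing import Optional, Tuple

# Hypothetical codes for the extra integer column.
FINITE, INF_POS, INF_NEG = 0, 1, 2


def encode(value: float) -> Tuple[Optional[float], int]:
    """Split a value into (stored float or NULL, type code)."""
    if value == math.inf:
        return None, INF_POS
    if value == -math.inf:
        return None, INF_NEG
    return value, FINITE


def decode(stored: Optional[float], value_type: int) -> float:
    """Rebuild the original value from the two columns; no tolerance needed."""
    if value_type == INF_POS:
        return math.inf
    if value_type == INF_NEG:
        return -math.inf
    assert stored is not None
    return stored


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trial_values ("
    "trial_id INTEGER PRIMARY KEY, value REAL, value_type INTEGER NOT NULL)"
)
originals = [1.5, math.inf, -math.inf, 1.7976931348623157e+308]
for trial_id, value in enumerate(originals):
    conn.execute(
        "INSERT INTO trial_values VALUES (?, ?, ?)", (trial_id, *encode(value))
    )
rows = conn.execute(
    "SELECT value, value_type FROM trial_values ORDER BY trial_id"
).fetchall()
restored = [decode(stored, value_type) for stored, value_type in rows]
conn.close()
```

Because the infinity flag lives in its own column, the float column never has to hold a sentinel, so finite values up to the double-precision limit round-trip exactly in this sketch.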

@toshihikoyanase
Member

@nzw0301 @HideakiImamura Could you review this PR, please?

@toshihikoyanase toshihikoyanase added enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. v3 Issue/PR for Optuna version 3. and removed enhancement Change that does not break compatibility and not affect public interfaces, but improves performance. labels Jan 19, 2022
@nzw0301
Member

nzw0301 commented Jan 19, 2022

Sure. I'll review this PR within the next week.

Member

@HideakiImamura HideakiImamura left a comment

Thanks for the PR. Basically, LGTM. Let me add a comment for the test.

Comment thread tests/storages_tests/rdb_tests/test_with_server.py
@codecov-commenter

Codecov Report

Merging #3238 (cf8a45c) into master (acdf4b7) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3238      +/-   ##
==========================================
+ Coverage   91.51%   91.58%   +0.06%     
==========================================
  Files         146      145       -1     
  Lines       12011    12011              
==========================================
+ Hits        10992    11000       +8     
+ Misses       1019     1011       -8     
| Impacted Files | Coverage Δ |
|---|---|
| optuna/storages/_rdb/storage.py | 92.94% <100.00%> (+0.12%) ⬆️ |
| optuna/integration/chainermn.py | 95.15% <0.00%> (-0.03%) ⬇️ |
| optuna/study/_optimize.py | 98.31% <0.00%> (-0.02%) ⬇️ |
| optuna/multi_objective/samplers/_nsga2.py | 97.14% <0.00%> (-0.02%) ⬇️ |
| optuna/samplers/_nsga2/sampler.py | 96.61% <0.00%> (-0.02%) ⬇️ |
| optuna/samplers/_nsga2/crossover.py | 97.83% <0.00%> (-0.01%) ⬇️ |
| optuna/__init__.py | 100.00% <0.00%> (ø) |
| optuna/_callbacks.py | 100.00% <0.00%> (ø) |
| optuna/pruners/_successive_halving.py | 100.00% <0.00%> (ø) |
| ... and 2 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@HideakiImamura HideakiImamura left a comment

Thanks for the update! LGTM.

@HideakiImamura
Member

I confirmed that all problems in #3206 are resolved by this PR.

(venv) mamu@HideakinoMacBook-puro 3205-3206 % python inf-objective.py
[I 2022-01-28 12:56:37,375] A new study created in memory with name: no-name-ff107e9e-301c-460b-b9cc-81df43d3968d
[I 2022-01-28 12:56:37,376] Trial 0 finished with value: inf and parameters: {}. Best is trial 0 with value: inf.
In-memory : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:37,441] A new study created in RDB with name: no-name-969966ff-fb18-494a-932c-ecddc8a4017b
[I 2022-01-28 12:56:37,502] Trial 0 finished with value: inf and parameters: {}. Best is trial 0 with value: inf.
SQLite    : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:37,620] A new study created in RDB with name: no-name-07b13b5f-85bc-46a2-ba72-8b95dbb7eb5f
[I 2022-01-28 12:56:37,763] Trial 0 finished with value: inf and parameters: {}. Best is trial 0 with value: inf.
MySQL     : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:37,903] A new study created in RDB with name: no-name-90b645ef-80ff-4730-a1e0-01e2c51011c5
[I 2022-01-28 12:56:38,012] Trial 0 finished with value: inf and parameters: {}. Best is trial 0 with value: inf.
PostgreSQL: Trial state is TrialState.COMPLETE
(venv) mamu@HideakinoMacBook-puro 3205-3206 % python inf-report.py   
[I 2022-01-28 12:56:41,181] A new study created in memory with name: no-name-d9311317-b070-46ea-a22d-39c280b29046
[I 2022-01-28 12:56:41,182] Trial 0 finished with value: 1.0 and parameters: {}. Best is trial 0 with value: 1.0.
In-memory : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:41,247] A new study created in RDB with name: no-name-13547701-5097-4ca9-b34a-a7dc1810e732
[I 2022-01-28 12:56:41,315] Trial 0 finished with value: 1.0 and parameters: {}. Best is trial 0 with value: 1.0.
SQLite    : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:41,435] A new study created in RDB with name: no-name-c383b025-e13b-4122-9f23-0e1c2b4680ae
[I 2022-01-28 12:56:41,575] Trial 0 finished with value: 1.0 and parameters: {}. Best is trial 0 with value: 1.0.
MySQL     : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:41,712] A new study created in RDB with name: no-name-21630f5e-9773-4e9a-bbf9-8011f1c81a53
[I 2022-01-28 12:56:41,829] Trial 0 finished with value: 1.0 and parameters: {}. Best is trial 0 with value: 1.0.
PostgreSQL: Trial state is TrialState.COMPLETE
(venv) mamu@HideakinoMacBook-puro 3205-3206 % python inf-tell.py 
[I 2022-01-28 12:56:47,353] A new study created in memory with name: no-name-66267843-18bf-4c88-8701-2643a782c1d6
In-memory : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:47,420] A new study created in RDB with name: no-name-b8cd0fce-6807-4174-8606-b352d214b19d
SQLite    : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:47,583] A new study created in RDB with name: no-name-db63b7aa-e06f-4d1d-b67c-1bdc7abe8268
MySQL     : Trial state is TrialState.COMPLETE
[I 2022-01-28 12:56:47,806] A new study created in RDB with name: no-name-121048c0-3d33-408b-b56f-103e83c06e06
PostgreSQL: Trial state is TrialState.COMPLETE

Member

@nzw0301 nzw0301 left a comment

Thank you for your PR! LGTM.

I locally confirmed the code works with

  1. a single objective (inf) with intermediate values (inf)
  2. two objective values (inf, 1.0).

I'm wondering if we could add an explanation of this transformation to the Optuna documentation, in either the storage or the study section. But that can be addressed in a follow-up.

So I'll merge this PR for now.

Development

Successfully merging this pull request may close these issues.

inf handling in Optuna

5 participants