
Cache split in TPE for high-dimensional optimization#5464

Closed
nabenabe0928 wants to merge 11 commits into optuna:master from nabenabe0928:enhance/cache-split-for-pr
Conversation

@nabenabe0928 (Contributor) commented May 29, 2024

Motivation

As TPE significantly slows down in high-dimensional optimization, this PR introduces a caching mechanism for the TPE split.

Description of the changes

  • Cache the split information in the sampler so that the result can be re-used

For n_trials=1000 and dim=10, the runtimes are the following:

| This PR | Master |
|---------|--------|
| 29      | 47     |
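The caching idea can be sketched as follows. This is a hypothetical simplification, not the PR's actual code; the names `SplitCachingSampler` and `_split_trials` are invented for illustration. The point is that the below/above split is keyed by trial number, so sampling each parameter within one trial reuses the split instead of recomputing it per parameter.

```python
def _split_trials(loss_values, gamma):
    """Return indices of the `gamma` best trials ("below") and the rest ("above")."""
    order = sorted(range(len(loss_values)), key=lambda i: loss_values[i])
    return order[:gamma], order[gamma:]


class SplitCachingSampler:
    """Hypothetical sketch of caching the TPE split per trial number."""

    def __init__(self, gamma=2):
        self._gamma = gamma
        self._split_cache = {}  # trial_number -> (below, above)
        self.n_split_calls = 0  # instrumentation for this sketch only

    def _cached_split(self, trial_number, loss_values):
        # Compute the split only on a cache miss; hits re-use the stored result.
        if trial_number not in self._split_cache:
            self.n_split_calls += 1
            self._split_cache[trial_number] = _split_trials(loss_values, self._gamma)
        return self._split_cache[trial_number]

    def sample_parameters(self, trial_number, loss_values, param_names):
        # Without the cache, the split would be recomputed once per parameter,
        # which is what makes high-dimensional (many-parameter) sampling slow.
        return {name: self._cached_split(trial_number, loss_values) for name in param_names}
```

With three parameters, the split is computed once rather than three times, which matches the roughly 40% runtime reduction reported above for dim=10.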

@not522 (Member) commented May 29, 2024

What is the relationship between this PR and #5454? Should we review it after #5454?

@nabenabe0928 (Contributor, Author) commented
@not522
These two PRs are orthogonal, so we can work on them separately!
This PR aims to share the split information within a trial.
The other PR aims to share information across multiple trials that pass the same set of arguments to the HSSP solver.

@not522 (Member) commented May 30, 2024

@eukaryo @gen740 Could you review this PR?

@@ -1,3 +1,5 @@
from __future__ import annotations
Member:
I believe this change is not relevant to this PR.
It would be good to include this change when features from `__future__` are actually used.

@nabenabe0928 (Contributor, Author) replied:
Done!

@eukaryo (Collaborator) commented Jun 4, 2024

Sorry, I am temporarily busy because my primary computer is not working, and I suppose @HideakiImamura -san is the appropriate reviewer. @HideakiImamura, could you review this PR?

codecov bot commented Jun 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.74%. Comparing base (181d65f) to head (3b61edc).
Report is 705 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5464      +/-   ##
==========================================
+ Coverage   89.52%   89.74%   +0.22%     
==========================================
  Files         194      195       +1     
  Lines       12626    12592      -34     
==========================================
- Hits        11303    11301       -2     
+ Misses       1323     1291      -32     


@HideakiImamura (Member) left a comment
Holding the cache on the sampler instance causes data inconsistency across different processes. Could you give me your opinion on that?

@nabenabe0928 (Contributor, Author) commented Jun 6, 2024

We discussed internally and decided to close this PR.
This issue can be more or less avoided by specifying `multivariate=True`.

@nabenabe0928 (Contributor, Author) commented
Just as a future reminder, I will leave some comments:

This PR does not cause any issues between processes

As each trial is sampled in a single thread, and the sampling of a specific trial will not be scattered across multiple processes or threads, we do not have to be concerned about cached data being missing in another process.

However, when using multiple threads, we need to guard against another thread overwriting the cache before one trial completes its sampling.
For this reason, I added a buffer that stores the split data for up to the latest 64 trials in a thread.

Anyway, a missing cache does not cause any issues, because we simply re-calculate the split, which only incurs some extra computational cost.
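The bounded-buffer idea above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code; `SplitBuffer` is an invented name, and the buffer here evicts in FIFO order. Keeping the latest 64 entries (rather than a single slot) means a concurrently sampled trial is unlikely to evict another trial's split before that trial finishes sampling, and an eviction only forces a recomputation, never an error.

```python
from collections import OrderedDict


class SplitBuffer:
    """Hypothetical bounded buffer for split results, keyed by trial number."""

    def __init__(self, maxsize=64):
        self._maxsize = maxsize
        self._buffer = OrderedDict()  # trial_number -> split result

    def get_or_compute(self, trial_number, compute):
        # A miss simply recomputes the split: it costs time, not correctness.
        if trial_number not in self._buffer:
            if len(self._buffer) >= self._maxsize:
                # Drop the oldest inserted entry (FIFO eviction).
                self._buffer.popitem(last=False)
            self._buffer[trial_number] = compute()
        return self._buffer[trial_number]
```

A toy run: with `maxsize=2`, repeated lookups for the same trial compute the split once, and inserting a third trial evicts the oldest one.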

@nabenabe0928 nabenabe0928 deleted the enhance/cache-split-for-pr branch June 5, 2025 06:46

5 participants