Fix TPESampler with multivariate and constant_liar #6189

nabenabe0928 merged 12 commits into optuna:master
@nabenabe0928 @sawa3030 Could you review this PR?
```python
distribution = trial.distributions[param_name]
params = self._get_params(trial)
if all((param_name in params) for param_name in search_space):
for param_name, distribution in search_space.items():
```
Note
`distribution` is used only for `to_internal_repr`, and the return value of `to_internal_repr` is not affected by any dynamic search space.
Please also note that `CategoricalDistribution` cannot be dynamic.
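To illustrate the point, here is a minimal sketch with made-up class names (not the actual `optuna.distributions` code): a float parameter's internal representation is the value itself, so a shifted range between trials is harmless, while a categorical's representation is an index into its choices.

```python
# Sketch (illustrative names, not the real optuna classes): why a dynamic
# numeric range does not affect the internal representation, while a
# dynamic choices list would.

class FloatDistributionSketch:
    def __init__(self, low: float, high: float) -> None:
        self.low, self.high = low, high

    def to_internal_repr(self, value: float) -> float:
        # The value is its own internal repr; low/high play no role here,
        # so changing the range between trials does not change the repr.
        return float(value)

class CategoricalDistributionSketch:
    def __init__(self, choices: list) -> None:
        self.choices = list(choices)

    def to_internal_repr(self, value) -> int:
        # The internal repr is the index into `choices`, so it depends on
        # the choices list -- which is why categories must stay static.
        return self.choices.index(value)

print(FloatDistributionSketch(0, 10).to_internal_repr(5.0))   # 5.0
print(FloatDistributionSketch(10, 20).to_internal_repr(5.0))  # 5.0
print(CategoricalDistributionSketch(["a", "b"]).to_internal_repr("b"))  # 1
```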
optuna/samplers/_tpe/sampler.py
Outdated
```python
params = json.loads(params_str)
params.update(trial.params)
```
In my understanding, `trial.params` is a subset of `params`, but are there any scenarios where this is not the case?
I mean, doesn't `params.update(trial.params)` do nothing here?
> `trial.params` is a subset of `params`

This is incorrect when the objective function changes.
By the way, using `trial.params` might be more appropriate in this case. 🤔
```python
import optuna

def objective1(trial):
    x = trial.suggest_float("x", 0, 10)
    y = trial.suggest_float("y", 0, 10)
    return x ** 2 + y ** 2

def objective2(trial):
    x = trial.suggest_float("x", 10, 20)
    y = trial.suggest_float("y", 10, 20)
    return x ** 2 + y ** 2

sampler = optuna.samplers.TPESampler(multivariate=True, constant_liar=True)
study = optuna.create_study(sampler=sampler)
study.optimize(objective1, n_trials=10)
study.optimize(objective2, n_trials=1)
```
Thank you for the correction, I confirmed it:
```
[I 2025-07-08 05:37:34,237] Trial 9 finished with value: 65.10574953954168 and parameters: {'x': 2.3991078898053453, 'y': 7.703897122405998}. Best is trial 1 with value: 13.028020613050282.
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
[W 2025-07-08 05:37:34,239] The parameter `x` in Trial#10 is sampled independently using `RandomSampler` instead of `TPESampler`, potentially degrading the optimization performance. This fallback happend because dynamic search space is not supported for `multivariate=True`. You can suppress this warning by setting `warn_independent_sampling` to `False` in the constructor of `TPESampler` if this independent sampling is intended behavior.
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
[W 2025-07-08 05:37:34,241] The parameter `y` in Trial#10 is sampled independently using `RandomSampler` instead of `TPESampler`, potentially degrading the optimization performance. This fallback happend because dynamic search space is not supported for `multivariate=True`. You can suppress this warning by setting `warn_independent_sampling` to `False` in the constructor of `TPESampler` if this independent sampling is intended behavior.
trial.params={'x': 13.801593636509187, 'y': 13.520713187030058}
[I 2025-07-08 05:37:34,242] Trial 10 finished with value: 373.2936719932594 and parameters: {'x': 13.801593636509187, 'y': 13.520713187030058}. Best is trial 1 with value: 13.028020613050282.
```
> By the way, using `trial.params` might be more appropriate in this case. 🤔
You mean something like this?

```python
relative_params = ...
trial_params = ...
trial_params.update(relative_params)
```

If so, we should probably use `search_space` to limit the parameters to update:

```python
trial_params.update(
    {param_name: relative_params[param_name] for param_name in search_space if param_name in relative_params}
)
```

For example, I am not sure of the exact and appropriate behavior in the following case:

```python
relative_params = {"a": 11.0, "b": 12.0, "c": 100}
trial_params = {"a": 1.0, "b": 2.0}  # Assume `c` is suggested only if a > 10.
```
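A runnable sketch of the restricted merge for the case above (variable names are illustrative, not the sampler's actual code): the liar ("relative") params win for names in the current search space, while the stale conditional key `c` is dropped rather than resurrected.

```python
# Illustrative sketch: merge liar params into the trial's params, but only
# for names present in the current search space.
relative_params = {"a": 11.0, "b": 12.0, "c": 100}
trial_params = {"a": 1.0, "b": 2.0}  # Assume `c` is suggested only if a > 10.
search_space = ("a", "b")  # distributions elided; only the names matter here

merged = dict(trial_params)
merged.update(
    {name: relative_params[name] for name in search_space if name in relative_params}
)
print(merged)  # {'a': 11.0, 'b': 12.0}
```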
I mean that we just use trial.params when we detect the search space is changed.
Hm, that is indeed a very good point.
What about adding an inline comment about this?
The PR looks mostly good to me :)
Co-authored-by: Shuhei Watanabe <47781922+nabenabe0928@users.noreply.github.com>
sawa3030
left a comment
I'm sorry for the delay in reviewing. I've left a few comments. PTAL.
optuna/samplers/_tpe/sampler.py
Outdated
```python
return self._sample(study, trial, {param_name: param_distribution})[param_name]

def _get_params(self, trial: FrozenTrial) -> dict[str, Any]:
    if trial.state.is_finished():
```
```diff
-        if trial.state.is_finished():
+        if trial.state.is_finished() or not self._constant_liar:
```

I was wondering if this version might help avoid an unnecessary call to `trial.system_attrs.get`.
When `self._constant_liar` is `False`, the trial is always finished. Meanwhile, when `self._multivariate` is `False`, there was an unnecessary access. I fixed it.
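A hedged sketch of the short-circuit being discussed (the attribute key, function name, and JSON layout below are made up for illustration, not Optuna's actual implementation):

```python
import json

# Hypothetical sketch of the discussed short-circuit.

def get_params(trial, constant_liar: bool) -> dict:
    # When constant_liar is False, only finished trials reach this point,
    # so trial.params is always correct and system_attrs is never touched.
    if trial.state_is_finished or not constant_liar:
        return trial.params
    # Under constant_liar, a RUNNING trial may carry parameters sampled in
    # another process and stashed in system_attrs as JSON (key is made up).
    params_str = trial.system_attrs.get("fake:relative_params")
    if params_str is None:
        return trial.params
    params = json.loads(params_str)
    # If the search space changed, prefer what the trial actually suggested.
    params.update(trial.params)
    return params
```

Here `trial` only needs `state_is_finished`, `params`, and `system_attrs` attributes; in the real sampler these come from a `FrozenTrial`.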
Thank you for the correction!
sawa3030
left a comment
Thank you for all the explanation. LGTM
Sorry for the late response...:(
This PR looks almost good to me:)
Let me approve this PR once I get the response from @not522 to each of my comments!
I noticed

@not522
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           master    #6189      +/-   ##
==========================================
- Coverage   88.38%   88.21%   -0.17%
==========================================
  Files         207      207
  Lines       14030    14065      +35
==========================================
+ Hits        12400    12408       +8
- Misses       1630     1657      +27
```

☔ View full report in Codecov by Sentry.
I found that the previous code may use the stale sampling result for the following case:

```python
import optuna

def objective1(trial):
    x = trial.suggest_float("x", 0, 10)
    y = trial.suggest_float("y", 0, 10)
    return x ** 2 + y ** 2

def objective2(trial):
    x = trial.suggest_float("x", 10, 20)
    y = trial.suggest_float("y", 10, 20)
    return x ** 2 + y ** 2

sampler = optuna.samplers.TPESampler(multivariate=True, constant_liar=True)
study = optuna.create_study(sampler=sampler)
study.optimize(objective1, n_trials=10)
study.optimize(objective2, n_trials=1)
```
nabenabe0928
left a comment
Sorry for the delay, LGTM!
I will merge this PR once @sawa3030 confirms the change!
sawa3030
left a comment
Thank you for the correction. LGTM
Motivation

Combining `TPESampler`'s `multivariate` and `constant_liar` can sometimes cause the `constant_liar` to function improperly during batch optimization.

Running the following script demonstrates that the result remains unchanged whether `constant_liar` is `True` or `False` in the current master branch. After applying this PR, we can see that sampling becomes more dispersed when `constant_liar` is set to `True`. (Same colors indicate samples in the same batch.)

[Figures: sampling results on master and on this PR, each shown with `multivariate=True, constant_liar=False` and `multivariate=True, constant_liar=True`]

Description of the changes
- Save the sampled parameters in `system_attrs` to make them available to other processes.
- `RUNNING` trials do actual sampling.

Benchmarks
Speed

It takes slightly longer to save parameters in `system_attrs`, but the difference isn't significant.
Optimization performance

The difference in optimization performance becomes identical to simply having or not having `constant_liar`.

The following figures show the Mann-Whitney U tests. If the line is close to 0, `constant_liar` would result in better performance. Conversely, if it's close to 1, the effect would be the opposite. Overall, the effect isn't clear, but some settings clearly show an advantage with `constant_liar`.