Fix TPESampler with multivariate and constant_liar #6189

Merged: nabenabe0928 merged 12 commits into optuna:master from not522:store-relative-params on Aug 4, 2025
Conversation

not522 (Member) commented on Jul 2, 2025

Motivation

Combining TPESampler's multivariate and constant_liar options can cause constant_liar to function improperly during batch optimization.

Running the following script on the current master branch demonstrates that the result remains unchanged whether constant_liar is True or False.

import matplotlib.pyplot as plt
import optuna

N_TRIAL = 50
N_BATCH = 10

for multivariate in (False, True):
    for constant_liar in (False, True):
        sampler = optuna.samplers.TPESampler(
            seed=42,
            multivariate=multivariate,
            constant_liar=constant_liar,
        )
        study = optuna.create_study(sampler=sampler)

        for i in range(0, N_TRIAL, N_BATCH):
            trials = []
            for j in range(N_BATCH):
                trials.append(study.ask())
            X = [trial.suggest_float("x", -10, 10) for trial in trials]
            Y = [trial.suggest_float("y", -10, 10) for trial in trials]
            for j in range(N_BATCH):
                study.tell(trials[j], X[j] ** 2 + Y[j] ** 2)

            # Skip first random sampling.
            if i > 0:
                plt.plot(X, Y, ".")
        plt.xlim(-10, 10)
        plt.ylim(-10, 10)
        plt.savefig(f"{multivariate}-{constant_liar}.png")
        plt.clf()

After applying this PR, we can see that sampling becomes more dispersed when constant_liar is set to True. (Same colors indicate samples in the same batch.)

  • master

    • multivariate=True, constant_liar=False
      True-False
    • multivariate=True, constant_liar=True
      True-True
  • PR

    • multivariate=True, constant_liar=False
      True-False
    • multivariate=True, constant_liar=True
      True-True

Description of the changes

  • Store the results of relative sampling in system_attrs so that they are available to other processes.
  • Use the stored results for RUNNING trials before falling back to actual sampling.

Benchmarks

Speed

It takes slightly longer to save parameters in system_attrs, but the difference isn't significant.

Details
import time
import optuna

N_TRIAL = 1000
N_BATCH = 10

for multivariate in (False, True):
    for constant_liar in (False, True):
        start = time.time()
        sampler = optuna.samplers.TPESampler(
            seed=42,
            multivariate=multivariate,
            constant_liar=constant_liar,
        )
        study = optuna.create_study(sampler=sampler, storage="sqlite:///tmp.db")

        for i in range(0, N_TRIAL, N_BATCH):
            trials = []
            for j in range(N_BATCH):
                trials.append(study.ask())
            X = [trial.suggest_float("x", -10, 10) for trial in trials]
            Y = [trial.suggest_float("y", -10, 10) for trial in trials]
            for j in range(N_BATCH):
                study.tell(trials[j], X[j] ** 2 + Y[j] ** 2)
        print(f"{multivariate=} {constant_liar=} {time.time()-start}")
  • master
multivariate=False constant_liar=False 9.37236499786377
multivariate=False constant_liar=True 11.942036867141724
multivariate=True constant_liar=False 10.05676007270813
multivariate=True constant_liar=True 11.465260982513428
  • PR
multivariate=False constant_liar=False 9.201738119125366
multivariate=False constant_liar=True 12.185273170471191
multivariate=True constant_liar=False 9.935747146606445
multivariate=True constant_liar=True 12.39444088935852

Optimization performance

With this PR, the difference in optimization performance becomes identical to the difference between simply enabling or disabling constant_liar.

Details
from multiprocessing import Pool
import optuna
import optunahub
import matplotlib.pyplot as plt
from tqdm import tqdm
import numpy as np
from scipy.stats import mannwhitneyu

optuna.logging.set_verbosity(optuna.logging.WARNING)

N_STUDY = 100
N_TRIAL = 1000
N_BATCH = 50
function_id = 3

bbob = optunahub.load_module("benchmarks/bbob")


def run_study(args):
    n_dim, constant_liar, seed = args
    objective = bbob.Problem(function_id=function_id, dimension=n_dim, instance_id=1)
    assert objective.directions == [optuna.study.StudyDirection.MINIMIZE]

    study_name = f"{n_dim}_{constant_liar}_{seed:02d}"
    sampler = optuna.samplers.TPESampler(
        seed=seed,
        multivariate=True,
        constant_liar=constant_liar,
    )
    study = optuna.create_study(study_name=study_name, sampler=sampler)

    for i in range(0, N_TRIAL, N_BATCH):
        trials = [study.ask() for _ in range(N_BATCH)]
        for d in range(n_dim):
            for trial in trials:
                trial.suggest_float(f"x{d}", -5, 5)
        values = [objective(trial) for trial in trials]
        for trial, value in zip(trials, values):
            study.tell(trial, value)

    storage = optuna.storages.JournalStorage(
        optuna.storages.journal.JournalFileBackend(f"./benchmark_{function_id:02d}/{study_name}.log")
    )
    optuna.copy_study(
        from_study_name=study_name,
        from_storage=study._storage,
        to_storage=storage,
    )


def main():
    for n_dim in [2, 3, 5, 10, 20, 40]:
        args = []
        for constant_liar in (True, False):
            for seed in range(N_STUDY):
                args.append((n_dim, constant_liar, seed))
        with Pool(processes=10) as pool:
            with tqdm(total=len(args)) as t:
                for _ in pool.imap_unordered(run_study, args):
                    t.update(1)
        all_best_values = [[], []]
        for constant_liar in (True, False):
            for seed in range(N_STUDY):
                study_name = f"{n_dim}_{constant_liar}_{seed:02d}"
                storage = optuna.storages.JournalStorage(
                    optuna.storages.journal.JournalFileBackend(f"./benchmark_{function_id:02d}/{study_name}.log")
                )
                study = optuna.load_study(storage=storage, study_name=study_name)
                best_values = []
                best_value = float("inf")
                for trial in study.trials:
                    if best_value > trial.value:
                        best_value = trial.value
                    best_values.append(best_value)
                all_best_values[int(constant_liar)].append(best_values)
        all_best_values = np.asarray(all_best_values)
        p_values = []
        for i in range(N_TRIAL):
            p_value = mannwhitneyu(
                all_best_values[1, :, i],
                all_best_values[0, :, i],
                alternative='less',
            ).pvalue
            p_values.append(p_value)
        plt.plot(range(N_TRIAL), p_values, label=f"D={n_dim}")
    plt.ylim(0, 1)
    plt.legend(loc=1)
    plt.savefig(f"benchmark_{function_id:02}.png")


if __name__ == "__main__":
    main()

The following figures show the p-values of the Mann-Whitney U tests. If the line is close to 0, constant_liar results in better performance; conversely, if it is close to 1, the effect is the opposite. Overall, the effect isn't clear-cut, but some settings clearly show an advantage with constant_liar.

  • function_id=3 (Rastrigin Function)

benchmark_03

  • function_id=19 (Composite Griewank-Rosenbrock Function F8F2)

benchmark_19

  • function_id=22 (Gallagher’s Gaussian 21-hi Peaks Function)

benchmark_22

@not522 not522 added the bug Issue/PR about behavior that is broken. Not for typos/examples/CI/test but for Optuna itself. label Jul 2, 2025
y0z (Member) commented on Jul 2, 2025

@nabenabe0928 @sawa3030 Could you review this PR?

distribution = trial.distributions[param_name]
params = self._get_params(trial)
if all((param_name in params) for param_name in search_space):
for param_name, distribution in search_space.items():
A contributor commented:

Note

distribution is used only for to_internal_repr, and the return value of to_internal_repr is not affected by any dynamic search space.
Please also note that CategoricalDistribution cannot be dynamic.

Comment on lines +488 to +489
params = json.loads(params_str)
params.update(trial.params)
nabenabe0928 (Contributor) commented on Jul 3, 2025:
In my understanding, trial.params is a subset of params, but are there any scenarios where this is not the case?
I mean, doesn't params.update(trial.params) do nothing here?

not522 (Member, Author) replied:

> trial.params is a subset of params

This is incorrect when the objective function changes.
By the way, using trial.params might be more appropriate in this case. 🤔

import optuna

def objective1(trial):
    x = trial.suggest_float("x", 0, 10)
    y = trial.suggest_float("y", 0, 10)
    return x ** 2 + y ** 2

def objective2(trial):
    x = trial.suggest_float("x", 10, 20)
    y = trial.suggest_float("y", 10, 20)
    return x ** 2 + y ** 2

sampler = optuna.samplers.TPESampler(multivariate=True, constant_liar=True)
study = optuna.create_study(sampler=sampler)
study.optimize(objective1, n_trials=10)
study.optimize(objective2, n_trials=1)

nabenabe0928 (Contributor) replied:

Thank you for the correction, I confirmed it:

[I 2025-07-08 05:37:34,237] Trial 9 finished with value: 65.10574953954168 and parameters: {'x': 2.3991078898053453, 'y': 7.703897122405998}. Best is trial 1 with value: 13.028020613050282.
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
[W 2025-07-08 05:37:34,239] The parameter `x` in Trial#10 is sampled independently using `RandomSampler` instead of `TPESampler`, potentially degrading the optimization performance. This fallback happend because dynamic search space is not supported for `multivariate=True`. You can suppress this warning by setting `warn_independent_sampling` to `False` in the constructor of `TPESampler` if this independent sampling is intended behavior.
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
self._relative_params={'x': 1.391003910457944, 'y': 1.5331091061768467}
[W 2025-07-08 05:37:34,241] The parameter `y` in Trial#10 is sampled independently using `RandomSampler` instead of `TPESampler`, potentially degrading the optimization performance. This fallback happend because dynamic search space is not supported for `multivariate=True`. You can suppress this warning by setting `warn_independent_sampling` to `False` in the constructor of `TPESampler` if this independent sampling is intended behavior.
trial.params={'x': 13.801593636509187, 'y': 13.520713187030058}
[I 2025-07-08 05:37:34,242] Trial 10 finished with value: 373.2936719932594 and parameters: {'x': 13.801593636509187, 'y': 13.520713187030058}. Best is trial 1 with value: 13.028020613050282.

nabenabe0928 (Contributor) replied:

> By the way, using trial.params might be more appropriate in this case. 🤔

You mean something like this?

relative_params = ...
trial_params = ...

trial_params.update(relative_params)

If so, we should probably use search_space to limit the parameters to update:

trial_params.update({param_name: relative_params[param_name] for param_name in search_space if param_name in relative_params})

For example, I am not sure the exact and appropriate behavior for the following case:

relative_params = {"a": 11.0, "b": 12.0, "c": 100}
trial_params = {"a": 1.0, "b": 2.0}  # Assume `c` is suggested only if a > 10.

not522 (Member, Author) replied:

I mean that we should just use trial.params when we detect that the search space has changed.

nabenabe0928 (Contributor) replied:

Hm, that is indeed a very good point.
What about adding an inline comment about this?

nabenabe0928 (Contributor) commented:

The PR looks mostly good to me:)
Let me benchmark this PR once I find the time!

Co-authored-by: Shuhei Watanabe <47781922+nabenabe0928@users.noreply.github.com>
sawa3030 (Collaborator) left a comment:

I'm sorry for the delay in reviewing. I've left a few comments. PTAL.

return self._sample(study, trial, {param_name: param_distribution})[param_name]

def _get_params(self, trial: FrozenTrial) -> dict[str, Any]:
if trial.state.is_finished():
sawa3030 (Collaborator) commented:

Suggested change
if trial.state.is_finished():
if trial.state.is_finished() or not self._constant_liar:

I was wondering if this version might help avoid an unnecessary call to trial.system_attrs.get.

not522 (Member, Author) replied:

When self._constant_liar is False, the trial is always finished. Meanwhile, when self._multivariate is False, there was an unnecessary access. I fixed it.

sawa3030 (Collaborator) replied:

Thank you for the correction!

not522 force-pushed the store-relative-params branch from 17b6bdc to 3262145 on July 15, 2025 08:56
sawa3030 (Collaborator) left a comment:

Thank you for all the explanation. LGTM

nabenabe0928 (Contributor) left a comment:

Sorry for the late response...:(
This PR looks almost good to me:)
Let me approve this PR once I get the response from @not522 to each of my comments!

not522 and others added 2 commits July 23, 2025 17:34
Co-authored-by: Shuhei Watanabe <47781922+nabenabe0928@users.noreply.github.com>
Co-authored-by: Shuhei Watanabe <47781922+nabenabe0928@users.noreply.github.com>
nabenabe0928 (Contributor) left a comment:

LGTM!

@nabenabe0928 nabenabe0928 enabled auto-merge July 23, 2025 09:16
not522 (Member, Author) commented on Jul 23, 2025:

I noticed _get_params isn't working correctly in edge cases. I'm working on figuring out a fix.

@not522 not522 disabled auto-merge July 23, 2025 09:19
nabenabe0928 (Contributor) commented:

@not522
Sure, thank you for the report... 😭

codecov bot commented on Jul 23, 2025:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.21%. Comparing base (7cefd09) to head (da3ee61).
Report is 133 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6189      +/-   ##
==========================================
- Coverage   88.38%   88.21%   -0.17%     
==========================================
  Files         207      207              
  Lines       14030    14065      +35     
==========================================
+ Hits        12400    12408       +8     
- Misses       1630     1657      +27     


not522 force-pushed the store-relative-params branch from 033d4d5 to fe3acff on July 28, 2025 06:51
not522 (Member, Author) commented on Jul 28, 2025:

> I noticed _get_params isn't working correctly in edge cases. I'm working on figuring out a fix.

I found that the previous code could use a stale sampling result for sample_independent (see trial #10 in the following example).
I fixed it to filter out the current trial. PTAL.

import optuna

def objective1(trial):
    x = trial.suggest_float("x", 0, 10)
    y = trial.suggest_float("y", 0, 10)
    return x ** 2 + y ** 2

def objective2(trial):
    x = trial.suggest_float("x", 10, 20)
    y = trial.suggest_float("y", 10, 20)
    return x ** 2 + y ** 2

sampler = optuna.samplers.TPESampler(multivariate=True, constant_liar=True)
study = optuna.create_study(sampler=sampler)
study.optimize(objective1, n_trials=10)
study.optimize(objective2, n_trials=1)
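The "filter out the current trial" fix can be sketched in isolation as follows. FakeTrial and trials_for_constant_liar are hypothetical names for illustration, not Optuna internals.

```python
from dataclasses import dataclass

@dataclass
class FakeTrial:
    number: int

def trials_for_constant_liar(trials: list, current_number: int) -> list:
    # Exclude the trial being sampled right now so that its own stale
    # stored sample is never fed back into its independent sampling.
    return [t for t in trials if t.number != current_number]

trials = [FakeTrial(0), FakeTrial(1), FakeTrial(2)]
print([t.number for t in trials_for_constant_liar(trials, 2)])  # [0, 1]
```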

@nabenabe0928 nabenabe0928 added this to the v4.5.0 milestone Jul 30, 2025
nabenabe0928 (Contributor) left a comment:

Sorry for the delay, LGTM!

@nabenabe0928 nabenabe0928 removed their assignment Aug 4, 2025
nabenabe0928 (Contributor) commented:

I will merge this PR once @sawa3030 confirms the change!

sawa3030 (Collaborator) left a comment:

Thank you for the correction. LGTM

@nabenabe0928 nabenabe0928 merged commit 40dc7f7 into optuna:master Aug 4, 2025
14 checks passed
@not522 not522 deleted the store-relative-params branch August 4, 2025 06:09