Use the Newton method instead of bisect in `ndtri_exp` by nabenabe0928 · Pull Request #6194 · optuna/optuna

nabenabe0928 · 2025-07-04T10:17:32Z

Motivation

Since ndtri_exp is one of the bottlenecks in TPESampler, I sped up the ndtri_exp implementation.
ndtri_exp(y) essentially finds the root of f(x) = log_ndtr(x) - y.
Our current implementation uses binary search, which is much slower than the Newton method, so this PR replaces the binary search with the Newton method.
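For reference, the Newton update for this root-finding problem can be written out as below. The symbols $\Phi$ and $\varphi$ (standard normal CDF and PDF) are notation introduced here for illustration and are not used in the PR text. The last form is the log-space update that appears in the evaluation script posted later in this thread.

```latex
\begin{aligned}
f(x) &= \log \Phi(x) - y, \qquad f'(x) = \frac{\varphi(x)}{\Phi(x)}, \\
x_{n+1} &= x_n - \frac{f(x_n)}{f'(x_n)}
         = x_n - \bigl(\log \Phi(x_n) - y\bigr)\,
           \exp\bigl(\log \Phi(x_n) - \log \varphi(x_n)\bigr).
\end{aligned}
```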

Description of the changes

Replace the binary search with the Newton method
Introduce a good initial guess for the Newton method
Describe the algorithms in the docstring
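As a reading aid, the initial guess can be summarized by the three branches used in the evaluation script posted later in this thread (thresholds there: y > -1e-2, y < -5, and everything in between). The derivations for Cases 2 and 3 below are a sketch of how the formulas can be obtained; $\Phi$ and $x_0$ are notation assumed here. The derivation of Case 1 is not given in the thread, so it is quoted as-is.

```latex
\begin{aligned}
&\text{Case 2 } (y \ll 0):\quad
\log \Phi(x) \approx -\tfrac{1}{2}x^2 - \log(-x) - \tfrac{1}{2}\log(2\pi)
\;\overset{\text{drop } \log(-x)}{\Longrightarrow}\;
x_0 = -\sqrt{-2\left(y + \tfrac{1}{2}\log(2\pi)\right)} \\[4pt]
&\text{Case 3 (moderate } y):\quad
\Phi(x) \approx \frac{1}{1 + e^{-\pi x / \sqrt{3}}}
\;\Longrightarrow\;
x_0 = -\frac{\sqrt{3}}{\pi}\,\log\left(e^{-y} - 1\right) \\[4pt]
&\text{Case 1 } (y \to 0^-):\quad
u = -2\log(-y), \qquad x_0 = \sqrt{u - \log u}
\end{aligned}
```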

The landscape of the initial guess is available below:

Benchmarking Results

Important

27% speedup 🎉

Note

With our initial guess, the number of Newton iterations (i.e., the number of log_ndtr_single calls) is 28% lower than when starting from x=0 😄
Compared to the binary search, the reduction is 92% 😎
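The effect on the number of function evaluations can be reproduced in miniature with the sketch below. It is not Optuna's code: `log_ndtr` here is a simplified stand-in built on `math.erfc` that is only adequate for moderate x, the bisection bracket [-30, 30] is an assumption chosen so that this stand-in never underflows, and `initial_guess` implements only the logistic (moderate y) branch. The counts it prints will therefore differ from the percentages quoted above.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)
LOGISTIC_C = math.sqrt(3.0) / math.pi


def log_ndtr(x):
    # Simplified log of the standard normal CDF (assumption: moderate x only).
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))


def bisect(y, lo=-30.0, hi=30.0):
    # Bisection on log_ndtr(x) = y. Returns (root, number of log_ndtr calls).
    calls = 0
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid == lo or mid == hi:  # Bracket cannot shrink any further.
            break
        calls += 1
        if log_ndtr(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, calls


def newton(y, x0):
    # Newton iteration on f(x) = log_ndtr(x) - y. Returns (root, number of calls).
    x, calls = x0, 0
    for _ in range(100):
        log_cdf = log_ndtr(x)
        calls += 1
        log_pdf = -0.5 * x * x - LOG_SQRT_2PI
        # f / f' evaluated in log space, as in the script later in this thread.
        dx = (log_cdf - y) * math.exp(log_cdf - log_pdf)
        x -= dx
        if abs(dx) < 1e-8 * abs(x):
            break
    return x, calls


def initial_guess(y):
    # Logistic-approximation branch only (moderate y).
    return -LOGISTIC_C * math.log(math.expm1(-y))


for y in (-5.0, -1.0, -0.1):
    _, n_bisect = bisect(y)
    _, n_zero = newton(y, 0.0)
    _, n_guess = newton(y, initial_guess(y))
    print(f"{y=}: bisect={n_bisect}, newton(x0=0)={n_zero}, newton(guess)={n_guess}")
```

Bisection gains roughly one bit of the root per call, whereas Newton converges quadratically once it is near the root, which is why the call counts differ by an order of magnitude.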

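Runtime in seconds (mean $\pm$ standard error over the 10 runs listed below):

| This PR | Master |
| --- | --- |
| 6.04 $\pm$ 0.117 | 8.27 $\pm$ 0.057 |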

Code

import time
import optuna


optuna.logging.set_verbosity(optuna.logging.CRITICAL)

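for seed in range(10):
    print(f"Start with {seed=}")
    # NOTE: The sampler seed is fixed to 42, so `seed` only labels the repetition.
    sampler = optuna.samplers.TPESampler(seed=42)
    study = optuna.create_study(sampler=sampler)
    start = time.time()
    # Minimize a 10-dimensional sphere function for 500 trials.
    study.optimize(
        lambda t: sum(t.suggest_float(f"x{i}", -5, 5) ** 2 for i in range(10)),
        n_trials=500,
    )
    print(time.time() - start)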

Results by Master

Start with seed=0
7.831433057785034
Start with seed=1
8.079424858093262
Start with seed=2
8.192095518112183
Start with seed=3
8.294391393661499
Start with seed=4
8.307274580001831
Start with seed=5
8.376489162445068
Start with seed=6
8.367579698562622
Start with seed=7
8.417609453201294
Start with seed=8
8.40910816192627
Start with seed=9
8.42704153060913

Results by this PR

Start with seed=0
5.634321689605713
Start with seed=1
5.7635557651519775
Start with seed=2
5.904412508010864
Start with seed=3
6.616066932678223
Start with seed=4
6.8322083950042725
Start with seed=5
6.148069143295288
Start with seed=6
6.042054891586304
Start with seed=7
5.861287593841553
Start with seed=8
5.825707912445068
Start with seed=9
5.786619663238525

y0z · 2025-07-04T10:44:35Z

@not522 Could you review this PR?

nabenabe0928 · 2025-07-04T10:56:20Z

This PR passes the following test:

import math
import sys

from optuna.samplers._tpe._truncnorm import _ndtri_exp_single
from scipy.special import ndtri_exp


EPS = sys.float_info.min
for y in [-EPS] + [-10 ** i for i in range(-300, 10)]:
    x = _ndtri_exp_single(y)
    ans = ndtri_exp(y).item()
    diff = abs(x - ans)
    assert math.isclose(x, ans), f"{x=}, {ans=}"
    print(f"{diff=:.2e}, {y=}, {x=}, {ans=}")

contramundum53

LGTM.

not522 · 2025-07-07T06:37:25Z

optuna/samplers/_tpe/_truncnorm.py

+        --> x = sqrt(-2 * (y + 1/2 * log(2pi))
+
+    For the moderate y, we use Eq. (13), i.e., standard logistic CDF, in the following paper:
+        - Approximating the Cumulative Distribution Function of the Normal Distribution.


Could you change it to a standard citation format?

nabenabe0928

@not522 Thank you for bringing this up! I applied your suggestion!

optuna/samplers/_tpe/_truncnorm.py

Co-authored-by: Naoto Mizuno <naotomizuno@preferred.jp>

not522

LGTM!
I evaluated the error using the following code.

Details

import math
import sys
import numpy as np
import scipy.special
import mpmath
import matplotlib.pyplot as plt


_norm_pdf_C = math.sqrt(2 * math.pi)
_norm_pdf_logC = math.log(_norm_pdf_C)
_ndtri_exp_approx_C = math.sqrt(3) / math.pi


def _ndtr_single(a):
    x = a / 2**0.5

    if x < -1 / 2**0.5:
        y = 0.5 * math.erfc(-x)
    elif x < 1 / 2**0.5:
        y = 0.5 + 0.5 * math.erf(x)
    else:
        y = 1.0 - 0.5 * math.erfc(x)

    return y


def _log_ndtr_single(a):
    if a > 6:
        return -_ndtr_single(-a)
    if a > -20:
        return math.log(_ndtr_single(a))

    log_LHS = -0.5 * a**2 - math.log(-a) - 0.5 * math.log(2 * math.pi)
    last_total = 0.0
    right_hand_side = 1.0
    numerator = 1.0
    denom_factor = 1.0
    denom_cons = 1 / a**2
    sign = 1
    i = 0

    while abs(last_total - right_hand_side) > sys.float_info.epsilon:
        i += 1
        last_total = right_hand_side
        sign = -sign
        denom_factor *= denom_cons
        numerator *= 2 * i - 1
        right_hand_side += sign * numerator * denom_factor

    return log_LHS + math.log(right_hand_side)


def _bisect(f, a, b, c):
    if f(a) > c:
        a, b = b, a
    # In the algorithm, it is assumed that all of (a + b), (a * 2), and (b * 2) are finite.
    for _ in range(100):
        m = (a + b) / 2
        if a == m or b == m:
            return m
        if f(m) < c:
            a = m
        else:
            b = m
    return (a + b) / 2


def _ndtri_exp_single_master(y):
    # TODO(amylase): Justify this constant
    return _bisect(_log_ndtr_single, -100, +100, y)


def _ndtri_exp_single_pr(y):
    if y > -sys.float_info.min:
        return math.inf if y <= 0 else math.nan

    if y > -1e-2:  # Case 1. abs(y) << 1.
        u = -2.0 * math.log(-y)
        x = math.sqrt(u - math.log(u))
    elif y < -5:  # Case 2. abs(y) >> 1.
        x = -math.sqrt(-2.0 * (y + _norm_pdf_logC))
    else:  # Case 3. Moderate y.
        x = -_ndtri_exp_approx_C * math.log(math.exp(-y) - 1)

    log_ndtr_x = math.nan
    for _ in range(100):
        log_ndtr_x = _log_ndtr_single(x)
        log_norm_pdf_x = -0.5 * x**2 - _norm_pdf_logC
        # NOTE(nabenabe): Use exp(log_ndtr_x - log_norm_pdf_x) instead of ndtr_x / norm_pdf_x for
        # numerical stability.
        dx = (log_ndtr_x - y) * math.exp(log_ndtr_x - log_norm_pdf_x)
        x -= dx
        if abs(dx) < 1e-8 * abs(x):  # Equivalent to np.isclose with atol=0.0 and rtol=1e-8.
            break

    return x


def _ndtri_exp_single_mp(y):
    a = -1e9
    b = +1e9
    for _ in range(1000):
        m = (a + b) / 2
        if mpmath.log(mpmath.ncdf(m)) < y:
            a = m
        else:
            b = m
    return (a + b) / 2


mpmath.mp.dps = 100

clips = [(-1000, 0), (-10, 0), (-0.1, 0), (-0.001, 0)]

fig, axes = plt.subplots(2, len(clips)//2, figsize=(9, 5), constrained_layout=True)

for (a, b), axis in zip(clips, axes.ravel()):
    x = np.linspace(a, b, 100, endpoint=False)
    y_scipy = scipy.special.ndtri_exp(x)
    y_master = np.array([_ndtri_exp_single_master(t) for t in x])
    y_pr = np.array([_ndtri_exp_single_pr(t) for t in x])
    y_mp = np.array([_ndtri_exp_single_mp(t) for t in x])
    err_scipy = y_scipy - y_mp
    err_master = y_master - y_mp
    err_pr = y_pr - y_mp

    axis.plot(x, err_scipy, label="SciPy")
    axis.plot(x, err_master, label="master")
    axis.plot(x, err_pr, label="PR")
    axis.grid()
    axis.legend(loc='lower left')

plt.savefig("ndtri_exp.png")
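The NOTE in `_ndtri_exp_single_pr` about using `exp(log_ndtr_x - log_norm_pdf_x)` rather than `ndtr_x / norm_pdf_x` can be illustrated with a small standalone sketch. The helper names below are hypothetical, and `log_ndtr_left_tail` is a truncated asymptotic expansion assumed here for illustration; it is not the series loop from `_log_ndtr_single`.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def naive_ratio(x):
    # ndtr(x) / norm_pdf(x) computed directly.
    # For very negative x both terms underflow to 0.0.
    cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
    pdf = math.exp(-0.5 * x * x - LOG_SQRT_2PI)
    return cdf / pdf


def log_ndtr_left_tail(x):
    # Truncated asymptotic expansion of log(ndtr(x)) for x << 0 (illustrative).
    inv_x2 = 1.0 / (x * x)
    series = math.log1p(-inv_x2 + 3.0 * inv_x2 * inv_x2)
    return -0.5 * x * x - math.log(-x) - LOG_SQRT_2PI + series


def log_space_ratio(x):
    # Same ratio, but the huge -x**2 / 2 terms cancel before exponentiation.
    log_pdf = -0.5 * x * x - LOG_SQRT_2PI
    return math.exp(log_ndtr_left_tail(x) - log_pdf)


x = -40.0
try:
    print("naive:", naive_ratio(x))
except ZeroDivisionError:
    print("naive: 0.0 / 0.0 (both terms underflowed)")
print("log-space:", log_space_ratio(x))
```

At x = -40 the direct quotient is 0.0 / 0.0, while the log-space form stays finite and close to 1/|x|, the leading term of the tail ratio.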

Use the Newton method instead of bisect in ndtri_exp

8fbd162

nabenabe0928 added the enhancement label (Change that does not break compatibility and not affect public interfaces, but improves performance.) Jul 4, 2025

Apply formatter

fe3e37a

y0z assigned not522 Jul 4, 2025

nabenabe0928 added 2 commits July 4, 2025 13:05

Add unit test

820c636

Apply formatter

91a7028

contramundum53 approved these changes Jul 6, 2025

not522 reviewed Jul 7, 2025

Fix the citation format

cd7965c

nabenabe0928 added this to the v4.5.0 milestone Jul 7, 2025

not522 reviewed Jul 8, 2025

Update optuna/samplers/_tpe/_truncnorm.py

933b78a

Co-authored-by: Naoto Mizuno <naotomizuno@preferred.jp>

not522 approved these changes Jul 8, 2025

not522 merged commit 5b54028 into optuna:master Jul 8, 2025
14 checks passed

nabenabe0928 unassigned not522 Jul 8, 2025

not522 mentioned this pull request Jul 9, 2025

TPESampler warns numerical error #5953

Closed

nabenabe0928 mentioned this pull request Jul 9, 2025

Speed up TPESampler using approximation in standard normal related computation #5478

Open

nabenabe0928 mentioned this pull request Oct 16, 2025

Speedup of TPESampler in Optuna nabenabe0928/my-skills#4

Open

not522 merged 6 commits into optuna:master from nabenabe0928:enhance/use-newton-not-bisect-in-ndtri-exp
