Use the Newton method instead of bisect in ndtri_exp#6194
Merged
not522 merged 6 commits intooptuna:masterfrom Jul 8, 2025
Merged
Use the Newton method instead of bisect in ndtri_exp#6194not522 merged 6 commits intooptuna:masterfrom
ndtri_exp#6194not522 merged 6 commits intooptuna:masterfrom
Conversation
Member
|
@not522 Could you review this PR? |
Contributor
Author
|
This PR passes the following test: import math
import sys
from optuna.samplers._tpe._truncnorm import _ndtri_exp_single
from scipy.special import ndtri_exp
EPS = sys.float_info.min
for y in [-EPS] + [-10 ** i for i in range(-300, 10)]:
x = _ndtri_exp_single(y)
ans = ndtri_exp(y).item()
diff = abs(x - ans)
assert math.isclose(x, ans), f"{x=}, {ans=}"
print(f"{diff=:.2e}, {y=}, {x=}, {ans=}") |
not522
reviewed
Jul 7, 2025
optuna/samplers/_tpe/_truncnorm.py
Outdated
| --> x = sqrt(-2 * (y + 1/2 * log(2pi)) | ||
|
|
||
| For the moderate y, we use Eq. (13), i.e., standard logistic CDF, in the following paper: | ||
| - Approximating the Cumulative Distribution Function of the Normal Distribution. |
Member
There was a problem hiding this comment.
Could you change it to a standard citation format?
Contributor
Author
There was a problem hiding this comment.
@not522 Thank you for bringing this up! I applied your suggestion!
not522
reviewed
Jul 8, 2025
Co-authored-by: Naoto Mizuno <naotomizuno@preferred.jp>
not522
approved these changes
Jul 8, 2025
Member
not522
left a comment
There was a problem hiding this comment.
LGTM!
I evaluated the error using the following code.
Details
import math
import sys
import numpy as np
import scipy.special
import mpmath
import matplotlib.pyplot as plt
_norm_pdf_C = math.sqrt(2 * math.pi)
_norm_pdf_logC = math.log(_norm_pdf_C)
_ndtri_exp_approx_C = math.sqrt(3) / math.pi
def _ndtr_single(a):
x = a / 2**0.5
if x < -1 / 2**0.5:
y = 0.5 * math.erfc(-x)
elif x < 1 / 2**0.5:
y = 0.5 + 0.5 * math.erf(x)
else:
y = 1.0 - 0.5 * math.erfc(x)
return y
def _log_ndtr_single(a):
if a > 6:
return -_ndtr_single(-a)
if a > -20:
return math.log(_ndtr_single(a))
log_LHS = -0.5 * a**2 - math.log(-a) - 0.5 * math.log(2 * math.pi)
last_total = 0.0
right_hand_side = 1.0
numerator = 1.0
denom_factor = 1.0
denom_cons = 1 / a**2
sign = 1
i = 0
while abs(last_total - right_hand_side) > sys.float_info.epsilon:
i += 1
last_total = right_hand_side
sign = -sign
denom_factor *= denom_cons
numerator *= 2 * i - 1
right_hand_side += sign * numerator * denom_factor
return log_LHS + math.log(right_hand_side)
def _bisect(f, a, b, c):
if f(a) > c:
a, b = b, a
# In the algorithm, it is assumed that all of (a + b), (a * 2), and (b * 2) are finite.
for _ in range(100):
m = (a + b) / 2
if a == m or b == m:
return m
if f(m) < c:
a = m
else:
b = m
return (a + b) / 2
def _ndtri_exp_single_master(y):
# TODO(amylase): Justify this constant
return _bisect(_log_ndtr_single, -100, +100, y)
def _ndtri_exp_single_pr(y):
if y > -sys.float_info.min:
return math.inf if y <= 0 else math.nan
if y > -1e-2: # Case 1. abs(y) << 1.
u = -2.0 * math.log(-y)
x = math.sqrt(u - math.log(u))
elif y < -5: # Case 2. abs(y) >> 1.
x = -math.sqrt(-2.0 * (y + _norm_pdf_logC))
else: # Case 3. Moderate y.
x = -_ndtri_exp_approx_C * math.log(math.exp(-y) - 1)
log_ndtr_x = math.nan
for _ in range(100):
log_ndtr_x = _log_ndtr_single(x)
log_norm_pdf_x = -0.5 * x**2 - _norm_pdf_logC
# NOTE(nabenabe): Use exp(log_ndtr_x - log_norm_pdf_x) instead of ndtr_x / norm_pdf_x for
# numerical stability.
dx = (log_ndtr_x - y) * math.exp(log_ndtr_x - log_norm_pdf_x)
x -= dx
if abs(dx) < 1e-8 * abs(x): # Equivalent to np.isclose with atol=0.0 and rtol=1e-8.
break
return x
def _ndtri_exp_single_mp(y):
a = -1e9
b = +1e9
for _ in range(1000):
m = (a + b) / 2
if mpmath.log(mpmath.ncdf(m)) < y:
a = m
else:
b = m
return (a + b) / 2
mpmath.mp.dps = 100
clips = [(-1000, 0), (-10, 0), (-0.1, 0), (-0.001, 0)]
fig, axes = plt.subplots(2, len(clips)//2, figsize=(9, 5), constrained_layout=True)
for (a, b), axis in zip(clips, axes.ravel()):
x = np.linspace(a, b, 100, endpoint=False)
y_scipy = scipy.special.ndtri_exp(x)
y_master = np.array([_ndtri_exp_single_master(t) for t in x])
y_pr = np.array([_ndtri_exp_single_pr(t) for t in x])
y_mp = np.array([_ndtri_exp_single_mp(t) for t in x])
err_scipy = y_scipy - y_mp
err_master = y_master - y_mp
err_pr = y_pr - y_mp
axis.plot(x, err_scipy, label="SciPy")
axis.plot(x, err_master, label="master")
axis.plot(x, err_pr, label="PR")
axis.grid()
axis.legend(loc='lower left')
plt.savefig("ndtri_exp.png")
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Motivation
Since
ndtri_expis one of the bottleneck inTPESampler, I speeded up thendtri_expimplementation.ndtri_exp(y)essentially finds the root off(x) = log_ndtr(x) - y = 0.Currently, our implementation uses the binary search, but the binary search is much slower than the Newton method, so I will replace the binary search with the Newton method.
Description of the changes
The landscape of the initial guess is available below:
Benchmarking Results
Important
27% speedup 🎉
Note
When using our initial guess, the iteration (the number of
log_ndtr_singlecalls) reduces by 28% in comparison tox=0with the Newton method 😄When compared to the binary search, the reduction is 92% 😎
Code
Results by Master
Results by this PR