
LogisticRegression with lbfgs solver terminates before convergence #18074

@geraschenko

Description


LogisticRegression with the lbfgs solver terminates early, even when tol is decreased and max_iter has not been reached.

Code to Reproduce

We fit random data twice, changing only the order of the examples. Ideally, example order should not matter; the fitted coefficients should be the same either way. I produced the results below with this code in Colab.

from sklearn.linear_model import LogisticRegression
import numpy as np

n_features = 1000
n_examples = 1500

np.random.seed(0)
x = np.random.random((n_examples, n_features))
y = np.random.randint(2, size=n_examples)
max_iter = 1000
solver = 'lbfgs'

for tol in [1e-2, 1e-3, 1e-4, 1e-5]:
  np.random.seed(0)
  lr1 = LogisticRegression(solver=solver, max_iter=max_iter, tol=tol).fit(x, y)

  np.random.seed(0)
  lr2 = LogisticRegression(solver=solver, max_iter=max_iter, tol=tol).fit(x[::-1], y[::-1])

  print(f'tol={tol}')
  print(f'  Optimizer iterations, forward order: {lr1.n_iter_[0]}, reverse order: {lr2.n_iter_[0]}.')
  print(f'  Mean absolute diff in coefficients: {np.abs(lr1.coef_ - lr2.coef_).mean()}')

Expected Results

As tol is reduced, the difference between the coefficients should continue to decrease, provided max_iter is not reached. When the solver is changed to 'newton-cg', we get the expected behavior:

tol=0.01
  Optimizer iterations, forward order: 12, reverse order: 11.
  Mean absolute diff in coefficients: 0.0004846833304941047
tol=0.001
  Optimizer iterations, forward order: 15, reverse order: 14.
  Mean absolute diff in coefficients: 5.4776672871601846e-05
tol=0.0001
  Optimizer iterations, forward order: 19, reverse order: 16.
  Mean absolute diff in coefficients: 1.6047945654930538e-06
tol=1e-05
  Optimizer iterations, forward order: 19, reverse order: 17.
  Mean absolute diff in coefficients: 2.76826465093659e-07

Actual Results

As tol is reduced below 1e-3, the optimizer takes no additional steps, even though it has not converged:

tol=0.01
  Optimizer iterations, forward order: 362, reverse order: 376.
  Mean absolute diff in coefficients: 0.0007590864459748883
tol=0.001
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
tol=0.0001
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
tol=1e-05
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
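One can also check the returned solution directly against the gradient criterion that tol nominally controls. The loss gradient below is written by hand under the assumption that it matches sklearn's objective for the default C=1.0 with an unpenalized intercept (sum of log-losses plus 0.5 * ||w||^2); if the diagnosis below is right, its max-norm will sit well above tol on affected versions:

```python
# Hedged check (not a sklearn API): evaluate the gradient of the objective
# lbfgs is assumed to minimize,
#   sum_i log(1 + exp(-s_i * (x_i . w + b))) + 0.5 * ||w||^2,   C = 1.0,
# at the fitted solution, where s_i in {-1, +1} and the intercept b is
# unpenalized.
import numpy as np
from scipy.special import expit
from sklearn.linear_model import LogisticRegression

np.random.seed(0)
x = np.random.random((1500, 1000))
y = np.random.randint(2, size=1500)

lr = LogisticRegression(solver='lbfgs', max_iter=1000, tol=1e-5).fit(x, y)

s = 2 * y - 1                          # labels mapped to {-1, +1}
z = x @ lr.coef_[0] + lr.intercept_[0]
p = expit(-s * z)                      # per-sample |d loss / d z|
grad_w = -(s * p) @ x + lr.coef_[0]    # data term + L2 penalty on coef only
grad_b = -(s * p).sum()                # intercept gradient, no penalty

grad_inf = max(np.abs(grad_w).max(), abs(grad_b))
print(f'n_iter={lr.n_iter_[0]}, max|grad| at solution = {grad_inf:.3g}')
```

If the optimizer had actually stopped on the gradient condition, this max-norm would be at or below the requested tol.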

Versions

Output of sklearn.show_versions():

System:
    python: 3.6.9 (default, Jul 17 2020, 12:50:27)  [GCC 8.4.0]
executable: /usr/bin/python3
   machine: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic

Python dependencies:
          pip: 19.3.1
   setuptools: 49.2.0
      sklearn: 0.23.1
        numpy: 1.18.5
        scipy: 1.4.1
       Cython: 0.29.21
       pandas: 1.0.5
   matplotlib: 3.2.2
       joblib: 0.16.0
threadpoolctl: 2.1.0

Built with OpenMP: True

Diagnosis

I'm pretty sure the issue is in the call to scipy.optimize.minimize at this line in linear_model/_logistic.py. The value of tol is passed to minimize as gtol, but ftol and eps are left at their default values. In the example above, I believe the optimizer is hitting the ftol termination condition rather than the gtol one. Possible solutions:

  • Scale down ftol and eps by some multiple of tol.
  • Scale down eps by some multiple of tol and set ftol to zero.
  • Allow the user of LogisticRegression to control ftol and eps through additional kwargs.
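The suspected mechanism can be sketched with scipy.optimize.minimize directly. The ill-conditioned quadratic below is purely illustrative (it is not sklearn's loss), and 2.22e-9 approximates L-BFGS-B's default ftol; with the default, the run stops on tiny per-step function decrease while the gradient is still far above gtol, and setting ftol=0 disables that test:

```python
# Hedged sketch: L-BFGS-B's ftol stopping rule can fire before gtol.
# gtol/ftol are documented options of minimize(method='L-BFGS-B'); the
# objective is an illustrative ill-conditioned quadratic, not sklearn's loss.
import numpy as np
from scipy.optimize import minimize

d = np.logspace(-6, 0, 200)            # eigenvalues spanning 6 orders

def f(w):                              # f(w) = 0.5 * w^T diag(d) w
    return 0.5 * np.dot(w * d, w)

def g(w):                              # gradient: diag(d) w
    return d * w

w0 = np.random.RandomState(0).random(200)

results = {}
for ftol in (2.22e-9, 0.0):            # ~scipy's default ftol vs. disabled
    res = minimize(f, w0, jac=g, method='L-BFGS-B',
                   options={'gtol': 1e-8, 'ftol': ftol, 'maxiter': 10000})
    results[ftol] = res
    print(f'ftol={ftol}: iters={res.nit}, '
          f'max|grad|={np.abs(res.jac).max():.2e}')
```

Because the trajectory is deterministic, the ftol=0 run simply continues past the point where the default run gave up, which is the behavior a reduced sklearn tol ought to produce.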
