Description
LogisticRegression with the lbfgs solver terminates early, even when tol is decreased and max_iter has not been reached.
Code to Reproduce
We fit random data twice, changing only the order of the examples. Ideally, example order should not matter: the fit coefficients should be the same either way. I produced the results below by running this code in Colab.
```python
from sklearn.linear_model import LogisticRegression
import numpy as np

n_features = 1000
n_examples = 1500
np.random.seed(0)
x = np.random.random((n_examples, n_features))
y = np.random.randint(2, size=n_examples)

max_iter = 1000
solver = 'lbfgs'
for tol in [1e-2, 1e-3, 1e-4, 1e-5]:
    np.random.seed(0)
    lr1 = LogisticRegression(solver=solver, max_iter=max_iter, tol=tol).fit(x, y)
    np.random.seed(0)
    lr2 = LogisticRegression(solver=solver, max_iter=max_iter, tol=tol).fit(x[::-1], y[::-1])
    print(f'tol={tol}')
    print(f'  Optimizer iterations, forward order: {lr1.n_iter_[0]}, reverse order: {lr2.n_iter_[0]}.')
    print(f'  Mean absolute diff in coefficients: {np.abs(lr1.coef_ - lr2.coef_).mean()}')
```

Expected Results
As tol is reduced, the difference between coefficients should continue to decrease, provided max_iter is not reached. When solver is changed to 'newton-cg', we get the expected behavior:
```
tol=0.01
  Optimizer iterations, forward order: 12, reverse order: 11.
  Mean absolute diff in coefficients: 0.0004846833304941047
tol=0.001
  Optimizer iterations, forward order: 15, reverse order: 14.
  Mean absolute diff in coefficients: 5.4776672871601846e-05
tol=0.0001
  Optimizer iterations, forward order: 19, reverse order: 16.
  Mean absolute diff in coefficients: 1.6047945654930538e-06
tol=1e-05
  Optimizer iterations, forward order: 19, reverse order: 17.
  Mean absolute diff in coefficients: 2.76826465093659e-07
```
Actual Results
As tol is reduced, the optimizer does not take more steps, despite not having converged:
```
tol=0.01
  Optimizer iterations, forward order: 362, reverse order: 376.
  Mean absolute diff in coefficients: 0.0007590864459748883
tol=0.001
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
tol=0.0001
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
tol=1e-05
  Optimizer iterations, forward order: 373, reverse order: 401.
  Mean absolute diff in coefficients: 0.0006877678611572595
```
Versions
Output of sklearn.show_versions():
```
System:
    python: 3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
    executable: /usr/bin/python3
    machine: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic

Python dependencies:
    pip: 19.3.1
    setuptools: 49.2.0
    sklearn: 0.23.1
    numpy: 1.18.5
    scipy: 1.4.1
    Cython: 0.29.21
    pandas: 1.0.5
    matplotlib: 3.2.2
    joblib: 0.16.0
    threadpoolctl: 2.1.0
Built with OpenMP: True
```
Diagnosis
I'm fairly sure the issue is in the call to scipy.optimize.minimize at this line in linear_model/_logistic.py. The value of tol is passed to minimize as gtol, but ftol and eps are left at their SciPy defaults. In the example above, I believe the optimizer is hitting the ftol termination condition rather than the gtol one. Possible solutions:
- Scale down `ftol` and `eps` by some multiple of `tol`.
- Scale down `eps` by some multiple of `tol` and set `ftol` to zero.
- Allow the user of LogisticRegression to control `ftol` and `eps` through additional kwargs.
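
The gtol/ftol interaction can be illustrated directly with scipy.optimize.minimize, outside of scikit-learn. This is a minimal sketch on a synthetic ill-conditioned quadratic (an assumption for illustration, not scikit-learn's actual objective): tightening gtol alone does not help, because L-BFGS-B can still stop on the ftol condition, which is left at its default; scaling ftol down as well lets the optimizer keep iterating.

```python
import numpy as np
from scipy.optimize import minimize

# Ill-conditioned positive-definite quadratic so L-BFGS-B makes
# slow relative progress in f, triggering the ftol stopping rule.
rng = np.random.RandomState(0)
A = rng.random((50, 50))
H = A.T @ A + 1e-6 * np.eye(50)

def f(w):
    return 0.5 * w @ H @ w

def grad(w):
    return H @ w

w0 = rng.random(50)

# Tight gtol only: ftol keeps its SciPy default, so the run can
# terminate on relative function decrease regardless of gtol.
res_default = minimize(f, w0, jac=grad, method='L-BFGS-B',
                       options={'gtol': 1e-12})
# Tight gtol AND tight ftol: the ftol condition no longer fires
# early, so the optimizer takes at least as many iterations.
res_tight = minimize(f, w0, jac=grad, method='L-BFGS-B',
                     options={'gtol': 1e-12, 'ftol': 1e-16})

print('default ftol:', res_default.nit, 'iterations')
print('tight ftol:  ', res_tight.nit, 'iterations')
```

The iterate sequence is identical until one run terminates, so the run with the tighter ftol never stops earlier than the default one; this mirrors what LogisticRegression would need to do internally if it scaled ftol down along with tol.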