RFC try to warn on iid less often #11613
Conversation
|
Fixes #11559 |
Do you mean a general discussion on what the tolerance should be in this case (uncertainty on a score from CV), or rather running a benchmark on some particular tasks and making a decision based on that? |
|
I don't know how to set the tolerance here. I have no idea what the expected difference is. Have you noticed that a given value gives a strong reduction in warnings? |
|
What's the resolution here? We can be conservative and make the tolerance a little smaller, i.e. smaller than anyone tends to report... |
|
So I made a quick benchmark counting the number of warnings raised by iid in the examples (that's what we're trying to reduce, right?). When warning every time when iid='warn' (different from what is done on master now): With the current changes: Looks like most of the warnings come from

Code:

```python
import importlib
import warnings
import sys

import matplotlib.pyplot as plt

plt.ion()  # interactive mode on, prevents matplotlib from blocking
sys.stderr = open('/dev/null', 'w')  # hide error messages

examples = [  # list of examples directly using ****SearchCV()
    'examples.applications.plot_face_recognition',
    'examples.cluster.plot_feature_agglomeration_vs_univariate_selection',
    'examples.compose.plot_column_transformer_mixed_types',
    'examples.compose.plot_compare_reduction',
    'examples.compose.plot_digits_pipe',
    'examples.compose.plot_feature_union',
    'examples.covariance.plot_covariance_estimation',
    'examples.decomposition.plot_pca_vs_fa_model_selection',
    'examples.exercises.plot_cv_diabetes',
    'examples.gaussian_process.plot_compare_gpr_krr',
    'examples.model_selection.grid_search_text_feature_extraction',
    'examples.model_selection.plot_grid_search_digits',
    'examples.model_selection.plot_multi_metric_evaluation',
    'examples.model_selection.plot_nested_cross_validation_iris',
    'examples.model_selection.plot_randomized_search',
    'examples.neighbors.plot_digits_kde_sampling',
    'examples.neural_networks.plot_rbm_logistic_classification',
    'examples.plot_kernel_ridge_regression',
    'examples.preprocessing.plot_discretization_classification',
    'examples.svm.plot_rbf_parameters',
    'examples.svm.plot_svm_scale_c',
]

n_iid_warnings = 0
for e in examples:
    print('running {}... '.format(e), end='', flush=True)
    with warnings.catch_warnings(record=True) as warnings_:
        # Cause deprecation warnings to always be triggered.
        warnings.simplefilter("always", DeprecationWarning)
        sys.stdout = open('/dev/null', 'w')  # avoid prints from example exec
        importlib.import_module(e)
        sys.stdout = sys.__stdout__  # restore the original stdout
    iid_warnings = [w for w in warnings_ if
                    "The default of the `iid` parameter will change" in
                    str(w.message)]
    print('Got {} iid warnings.'.format(len(iid_warnings)))
    n_iid_warnings += len(iid_warnings)
print('Total number of iid warnings: {}'.format(n_iid_warnings))
```

Note: there may actually be more warnings if some of the examples indirectly call one of the CV tools. The list of examples I use comes from

I can try different tol values and see which one gives the least number of warnings. I just want to make sure this is actually what we want here.

Edit: Increasing both tol to |
sklearn/model_selection/_search.py
Outdated

```python
if not np.allclose(means_weighted, means_unweighted,
                   rtol=1e-4, atol=1e-4):
    warn = True
    continue
```
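A minimal sketch of what this check decides, with made-up fold scores and fold sizes (only the names `means_weighted`, `means_unweighted`, `rtol` and `atol` come from the diff above; everything else is illustrative):

```python
import numpy as np

# Hypothetical per-fold test scores for two parameter candidates (3 folds).
scores = np.array([[0.90, 0.95, 0.70],
                   [0.85, 0.85, 0.85]])
# Unequal test-fold sizes, e.g. from an uneven split of 100 samples.
fold_sizes = np.array([50, 30, 20])

# iid=True averages fold scores weighted by the number of test samples;
# iid=False takes the plain mean across folds.
means_weighted = scores @ fold_sizes / fold_sizes.sum()
means_unweighted = scores.mean(axis=1)

# Warn only when the two aggregations differ beyond the tolerance.
should_warn = not np.allclose(means_weighted, means_unweighted,
                              rtol=1e-4, atol=1e-4)
print(should_warn)  # True: candidate 1 differs by 0.025, well above 1e-4
```

For the second candidate the fold scores are identical, so the two means agree exactly; the first candidate's scores correlate with fold size, which is exactly the case the warning is meant to catch.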
|
Is it good to suppress the deprecation warning? It seems that the common way we handle these problems in this release is to set |
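For illustration, one way a user could silence just this deprecation notice with the standard `warnings` machinery (the stand-in `fit_with_legacy_default` is hypothetical; the message text mirrors the one matched in the benchmark script above):

```python
import warnings

def fit_with_legacy_default():
    """Stand-in for a 0.20-era fit that emits the iid deprecation notice."""
    warnings.warn("The default of the `iid` parameter will change from True "
                  "to False in version 0.22", DeprecationWarning)
    return "fitted"

# Filtering on the message silences just this warning, leaving others intact.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    warnings.filterwarnings(
        "ignore", message="The default of the `iid` parameter will change")
    result = fit_with_legacy_default()

print(result, len(caught))  # fitted 0
```

The `message` argument is a regex matched against the start of the warning text, so the filter stays narrow and does not hide unrelated deprecations.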
|
The thing is that:
* For many uses this change (iid=True/False) makes negligible difference to the
results
* As the parameter is being deprecated, asking users to set an explicit
value for the parameter means their code will break when the parameter
disappears
* Too many warnings lead to them being ignored
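The first point can be illustrated numerically: with near-equal fold sizes, the weighted (iid=True) and unweighted (iid=False) aggregations of fold scores barely differ (the scores and fold sizes below are made up):

```python
import numpy as np

# Five hypothetical fold scores from a 5-fold CV.
scores = np.array([0.912, 0.897, 0.905, 0.921, 0.889])

equal = np.array([200, 200, 200, 200, 200])  # exactly equal folds
print(abs(np.average(scores, weights=equal) - scores.mean()) < 1e-12)  # True

uneven = np.array([201, 200, 200, 200, 199])  # off-by-one fold sizes
diff = abs(np.average(scores, weights=uneven) - scores.mean())
print(diff < 1e-4)  # True: the discrepancy stays below the 1e-4 tolerance
```

Only when fold sizes are markedly unequal and scores correlate with them does the difference become visible, which is the case the remaining warning targets.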
|
|
Maybe to add to @jnothman's excellent explanation: here we're using the examples code as a proxy for the user's code. I want the user's code not to give useless warnings. |
|
So if we're conservative we could do 1e-4 or 1e-5 and we'd get rid of some. If we're more aggressive we can do 1e-3 and get rid of most warnings. |
|
I think 1e-4 is appropriate to what people usually report.
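As a quick sanity check on that intuition: two scores that agree to the roughly four decimal places people typically report pass at 1e-4 but not at 1e-5 (the values are illustrative):

```python
import numpy as np

a, b = 0.90132, 0.90139  # differ by 7e-5: indistinguishable in most reports
for tol in (1e-3, 1e-4, 1e-5):
    # allclose passes when |a - b| <= atol + rtol * |b|
    print(tol, np.allclose(a, b, rtol=tol, atol=tol))
# 0.001 True
# 0.0001 True
# 1e-05 False
```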
|
|
I think I'm persuaded here. |
|
I'd leave them, otherwise we need to change again soon. |
|
@jnothman rtol or atol? So I'll just leave it as is? |
|
merge for RC? |
|
It would be better to mostly rely on rtol, but I haven't stopped to think about the value. |
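To illustrate why relying mostly on rtol is attractive: atol puts an absolute floor under the comparison, so for small-magnitude scores it can hide differences that are relatively large (the values below are illustrative):

```python
import numpy as np

small_a, small_b = 1e-6, 2e-6  # a 100% relative difference near zero

# With atol=1e-4 the absolute floor dominates and the difference passes.
print(np.allclose(small_a, small_b, rtol=1e-4, atol=1e-4))  # True
# With atol=0 only the relative criterion applies and it is flagged.
print(np.allclose(small_a, small_b, rtol=1e-4, atol=0.0))   # False
```

For accuracy-like scores near 1.0 the two criteria behave almost identically, so the choice mainly matters for metrics that can sit close to zero.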
qinhanmin2014
left a comment
LGTM (maybe we can be more aggressive here).
|
Btw @jnothman @amueller How will you handle existing |
|
Well once we deprecate it in 0.22 we'll remove those, right? |
@amueller The issue here is that we don't set |
|
If it's already there I wouldn't worry too much about it. |
* tag '0.20rc1': (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings (scikit-learn#11664)
  ...
This suppresses the iid warning more often. But then the question is what are good tolerances for allclose?
Fixes #11559