[MRG+1] Dropping python 2.6 support by lesteve · Pull Request #7890 · scikit-learn/scikit-learn

lesteve · 2016-11-16T13:23:21Z

0.19 is the first release without Python 2.6 support. Related: #7522.

The git grep I used to find Python 2.6-specific code to remove:

git grep -iP 'py(thon).*2\.6' | grep -v externals

Some details about some slightly orthogonal changes:

* Note about checking safely for nan is likely not valid any more (commit introducing it is c80ca91)
* scipy.linalg.qr econ parameter removed since scipy 0.9 in favour of mode='economic'
* Remove unnecessary libgfortran in conda create command

lesteve

A few comments to help the review

lesteve · 2016-11-16T13:35:29Z

sklearn/gaussian_process/gaussian_process.py

        # Get generalized least squares solution
        Ft = linalg.solve_triangular(C, F, lower=True)
-        try:
-            Q, G = linalg.qr(Ft, econ=True)


The econ=True parameter was removed in scipy 0.9, which matches our minimum scipy version. See
https://github.com/scipy/scipy/blob/master/doc/release/0.9.0-notes.rst#removed-features.
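
For readers unfamiliar with the replacement spelling, here is a minimal sketch of an economy-size QR decomposition using mode='economic'. The matrix `Ft` below is random stand-in data chosen for illustration, not the one computed in gaussian_process.py.

```python
import numpy as np
from scipy import linalg

# Stand-in for the tall matrix Ft from the diff (assumption: any 6x3 matrix).
rng = np.random.RandomState(0)
Ft = rng.rand(6, 3)

# mode='economic' returns the reduced factors: Q is (6, 3), G is (3, 3).
Q, G = linalg.qr(Ft, mode="economic")

# The reduced factors still reconstruct the original matrix.
reconstructed = np.dot(Q, G)
print(Q.shape, G.shape, np.allclose(reconstructed, Ft))
```

With the default mode='full', Q would instead be a square 6x6 matrix, which is more than the least squares solution needs.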

lesteve · 2016-11-16T13:36:47Z

sklearn/metrics/classification.py

-            # the jaccard to 1: lim_{x->0} x/x = 1
-            # Note with py2.6 and np 1.3: we can't check safely for nan.
-            score[pred_or_true == 0.0] = 1.0
+            score[np.isnan(score)] = 1.0


Bold move I know. The commit introducing this change is c80ca91 and no tests were added for this edge case along with the change. Does anyone remember a problem like this?

can't we do the same thing here we do in places like StandardScaler?

Can you elaborate on what we do in StandardScaler ?

Modifying the denominator, so setting pred_or_true[pred_or_true == 0] = 1. Then we don't need to catch the error, either.

Wait, sorry, we want score to be one, not zero. We should really have a standard way of dividing by zero; it comes up so often. What's wrong with the current solution though? I find it cleaner than the change.

Fine, then I'll keep the previous solution and just remove the comment. Given the comment, I thought something in an old Python + numpy combination prevented simply checking for NaNs.
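
To make the two options from this thread concrete, here is a small sketch that compares masking on the zero denominator (the logic that was kept) with masking on NaN (the change that was reverted). The function names and the sample arrays are hypothetical, and the np.errstate context is an assumption about how the warnings would be silenced; this is not the code in sklearn/metrics/classification.py.

```python
import numpy as np


def jaccard_mask_denominator(pred_and_true, pred_or_true):
    """Hypothetical helper: intersection / union, defining 0/0 as 1."""
    pred_and_true = np.asarray(pred_and_true, dtype=float)
    pred_or_true = np.asarray(pred_or_true, dtype=float)
    # Silence the warnings triggered by dividing by zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        score = pred_and_true / pred_or_true
    # Kept in the PR: overwrite every entry whose denominator is zero.
    score[pred_or_true == 0.0] = 1.0
    return score


def jaccard_mask_nan(pred_and_true, pred_or_true):
    """Hypothetical helper: same ratio, but overwriting NaN results."""
    pred_and_true = np.asarray(pred_and_true, dtype=float)
    pred_or_true = np.asarray(pred_or_true, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        score = pred_and_true / pred_or_true
    # Proposed, then reverted: overwrite every entry that came out as NaN.
    score[np.isnan(score)] = 1.0
    return score


# The middle sample has an empty intersection and an empty union (0/0).
print(jaccard_mask_denominator([1, 0, 2], [2, 0, 2]))
print(jaccard_mask_nan([1, 0, 2], [2, 0, 2]))
```

Both variants agree whenever the numerator is zero wherever the denominator is zero, which holds for an intersection over a union. They would differ only for a nonzero numerator over a zero denominator, where the ratio is inf rather than NaN.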

lesteve · 2016-11-16T13:40:17Z

sklearn/neighbors/tests/test_approximate.py

    # exact results on this toy dataset.
-    lsfh = LSHForest(min_hash_match=0, n_candidates=n_points).fit(X)
+    lsfh = LSHForest(min_hash_match=0, n_candidates=n_points,
+                     random_state=42).fit(X)


I don't know why this is needed, but it is: without it you consistently get an error like this when you run the full test suite with nosetests sklearn. The error cannot be reproduced by running only the single test function, or the single file with nosetests sklearn/neighbors/tests/test_approximate.py, even though I tried in a loop 100 times. Still trying to get to the bottom of this ...

The implication being that in removing one of the fixes you changed behaviour?

well this should be here anyhow...
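
The thread does not establish why the failure only appears in the full test run. One plausible mechanism, offered here as an assumption, is that an unseeded estimator draws from a global random number generator whose state depends on which tests ran before it. The sketch below illustrates that idea with the standard library's random module as a stand-in for numpy's global generator; the function names are hypothetical.

```python
import random


def sample_with_global_state(n):
    """Draw from the module-level RNG; the result depends on earlier draws."""
    return [random.random() for _ in range(n)]


def sample_with_fixed_seed(n, seed=42):
    """Draw from a private, seeded RNG; the result ignores earlier draws."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# Running the "test" on its own.
random.seed(0)
alone = sample_with_global_state(3)

# Running it after another "test" has consumed one random number.
random.seed(0)
random.random()
after_other_test = sample_with_global_state(3)

print(alone == after_other_test)  # the global-state results differ
print(sample_with_fixed_seed(3) == sample_with_fixed_seed(3))  # always equal
```

Passing random_state=42, as in the diff above, plays the role of the private seeded generator here: the test result no longer depends on test ordering.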

lesteve · 2016-11-16T13:41:05Z

.travis.yml

      COVERAGE=true
    # This environment tests the oldest supported anaconda env
-    - DISTRIB="conda" PYTHON_VERSION="2.6" INSTALL_MKL="false"
+    - DISTRIB="conda" PYTHON_VERSION="2.7" INSTALL_MKL="false"


Still keeping this build to test oldest versions available through conda.

jnothman · 2016-11-16T22:41:43Z

sklearn/utils/tests/test_estimator_checks.py

    assert_raises_regex(AttributeError, msg, check_estimator, BaseEstimator)
    # check that fit does input validation
-    msg = "TypeError not raised by fit"
+    msg = "TypeError not raised"


Looks like the tests fail on Python 2.7 without this change ... I'll look into it in more detail.

So it looks like unittest.TestCase.assertRaisesRegexp does not include the function name in its error message when no exception is raised, but only on Python 2.7:

import unittest

_dummy = unittest.TestCase('__init__')

def no_raise():
    return

_dummy.assertRaisesRegexp(ValueError, 'message', no_raise)

With Python 2.7:

AssertionError: ValueError not raised

With Python 3:

AssertionError: ValueError not raised by no_raise

The patch we had before was applied to Python 2.7 despite the misleading comment (assertRaisesRegex vs assertRaisesRegexp). Are we good with the change then?
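
The snippet above can be turned into a small check that captures the failure message instead of letting it propagate. This is a sketch for Python 3, where the method is spelled assertRaisesRegex; the helper name `failure_message` is hypothetical.

```python
import unittest

# Throwaway TestCase instance, mirroring the `_dummy` object used above.
_dummy = unittest.TestCase("__init__")


def no_raise():
    """Callable that never raises."""
    return None


def failure_message(func):
    """Return the AssertionError text produced when `func` does not raise."""
    try:
        _dummy.assertRaisesRegex(ValueError, "message", func)
    except AssertionError as exc:
        return str(exc)
    return None


message = failure_message(no_raise)
print(message)

# Matching only on the common prefix works whether or not the message
# ends with the function name, which is what the shortened `msg` relies on.
print(message.startswith("ValueError not raised"))
```

Matching on the shorter prefix is the design choice made in the diff: the expected message "TypeError not raised" is satisfied by both the Python 2.7 and the Python 3 wording.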

jnothman · 2016-11-16T22:42:50Z

Thanks for this, @lesteve!

amueller

LGTM apart from minor comments.

amueller · 2016-11-16T22:43:07Z

build_tools/travis/install.sh

    if [[ "$INSTALL_MKL" == "true" ]]; then
        conda create -n testenv --yes python=$PYTHON_VERSION pip nose \
            numpy=$NUMPY_VERSION scipy=$SCIPY_VERSION numpy scipy \
-            libgfortran mkl flake8 \


what's the reasoning with libgfortran?

I think at one point it was needed explicitly because dependencies were not correctly defined in the conda metadata. This has long been fixed.

amueller · 2016-11-16T22:44:52Z

sklearn/metrics/classification.py

-            # the jaccard to 1: lim_{x->0} x/x = 1
-            # Note with py2.6 and np 1.3: we can't check safely for nan.
-            score[pred_or_true == 0.0] = 1.0
+            score[np.isnan(score)] = 1.0


can't we do the same thing here we do in places like StandardScaler?

amueller · 2016-11-16T22:45:34Z

sklearn/neighbors/tests/test_approximate.py

    # exact results on this toy dataset.
-    lsfh = LSHForest(min_hash_match=0, n_candidates=n_points).fit(X)
+    lsfh = LSHForest(min_hash_match=0, n_candidates=n_points,
+                     random_state=42).fit(X)


well this should be here anyhow...

amueller · 2016-11-16T22:47:37Z

sklearn/utils/tests/test_estimator_checks.py

    assert_raises_regex(AttributeError, msg, check_estimator, BaseEstimator)
    # check that fit does input validation
-    msg = "TypeError not raised by fit"
+    msg = "TypeError not raised"


I second @jnothman's concern ;)

amueller · 2016-11-17T15:46:52Z

Hm do you keep force-pushing? I can't find your comments :-/

amueller · 2016-11-17T15:49:18Z

sklearn/utils/testing.py

-                                  callable_obj.__name__))
-
+    # Python 2.7
+    assert_raises_regex = _dummy.assertRaisesRegexp


Hm or leave the old one in and not change behavior?

I'd rather use the stdlib function than our backport (which was only intended for Python 2.6, by the way, but was kicking in for Python 2.7 too for a bad reason). The only change in behaviour is that the error message no longer contains the function name, which does not add much value given that you can find it in the stack trace.

Error messages from Python 2.7 assertRegexp does not contain the function name, in contrast with Python 3 assertRegex

lesteve · 2016-11-17T18:38:42Z

Hm do you keep force-pushing? I can't find your comments :-/

~~I don't think force pushing will fix it~~ Edit: I read this too fast and understood "can you force push" ... I probably did a few force pushes along the way, sorry. In case you are using Firefox, I have a GreaseMonkey script to open all outdated diff comments (updated recently): https://gist.github.com/lesteve/b4ef29bccd42b354a834#file-github_outdated_diff-user-js. I have set the shortcut to Alt-P because that happens to be one of the shortcuts not taken by my browser.

You can probably change it just a tiny bit and copy and paste into the console or turn it into a bookmarklet if you wish. For some reason I can't remember I never managed to get the latter to work.

jnothman · 2016-11-22T02:31:41Z

LGTM

amueller · 2016-11-23T22:11:09Z

LGTM, thanks :)

jnothman · 2016-11-23T23:47:16Z

Yay!


* Remove Python 2.6 support

Some details about some slightly orthogonal changes:

* Note about checking safely for nan is likely not valid any more (commit introducing it is c80ca91)
* scipy.linalg.qr econ parameter removed since scipy 0.9 in favour of mode='economic'
* Remove unnecessary libgfortran in conda create command
* Putative fix by setting the random seed
* Revert unintended change
* Reinstate previous logic for checking for NaNs
* Reinstate change in error message

Error messages from Python 2.7 assertRegexp does not contain the function name, in contrast with Python 3 assertRegex

lesteve added 2 commits November 16, 2016 13:32

Putative fix by setting the random seed

76e7a45

lesteve commented Nov 16, 2016

raghavrv added Build / CI Waiting for Reviewer labels Nov 16, 2016

jnothman reviewed Nov 16, 2016

amueller approved these changes Nov 16, 2016

Revert unintended change

b152872

lesteve force-pushed the remove-python2.6-support branch from a2a923e to b152872 Compare November 17, 2016 14:08

amueller reviewed Nov 17, 2016

lesteve added 2 commits November 17, 2016 18:21

Reinstate previous logic for checking for NaNs

3135374

Reinstate change in error message

25c474d

Error messages from Python 2.7 assertRegexp does not contain the function name, in contrast with Python 3 assertRegex

lesteve force-pushed the remove-python2.6-support branch from 4be9b87 to 25c474d Compare November 17, 2016 17:23

jnothman changed the title ~~[MRG] Dropping python 2.6 support~~ [MRG+1] Dropping python 2.6 support Nov 22, 2016

amueller merged commit e5bf61e into scikit-learn:master Nov 23, 2016

lesteve deleted the remove-python2.6-support branch November 23, 2016 22:15

aabadie mentioned this pull request Dec 1, 2016

Drop python 2.6 support joblib/joblib#435

Closed

4 tasks
