[MRG+2] Moves Imputation out of Preprocessing#10483
[MRG+2] Moves Imputation out of Preprocessing#10483jnothman merged 23 commits intoscikit-learn:masterfrom
Conversation
|
thanks! could we adopt SimpleImputer or NaiveImputer or ConstantImputer
(choose one) for now and we can change it with sed later?
|
jnothman
left a comment
There was a problem hiding this comment.
You need to add impute.rst to the user guide table of contents
| missing values | ||
| Most Scikit-learn estimators do not work with missing values. When they | ||
| do (e.g. in :class:`preprocessing.Imputer`), NaN is the preferred | ||
| do (e.g. in :class:`Imputer`), NaN is the preferred |
There was a problem hiding this comment.
This still needs to be impute.Imputer
|
|
||
|
|
||
| @deprecated("Imputer was deprecated in version 0.20 and will be " | ||
| "removed in 0.22. Import Imputer from sklearn instead.") |
| isotonic.check_increasing | ||
| isotonic.isotonic_regression | ||
|
|
||
| .. _impute_ref: |
There was a problem hiding this comment.
I am not familiar with reST, I am not sure why there's _ref here. This was with other modules here so I used this :( .
There was a problem hiding this comment.
you need this only if it is referenced somewhere. But I guess if it is not referenced now, it can be in the futur so it can stay here
| @@ -0,0 +1,54 @@ | |||
|
|
|||
There was a problem hiding this comment.
You need to remove this content from the preprocessing docs. You're welcome to leave a "see also" reference to these docs in preprocessing if you think that will help users.
|
|
||
| @deprecated("Imputer was deprecated in version 0.20 and will be " | ||
| "removed in 0.22. Import Imputer from sklearn instead.") | ||
| "removed in 0.22. Import import.Imputer from sklearn" |
|
Will you do the SimpleImputer rename? |
Yeah, Sure. |
jnothman
left a comment
There was a problem hiding this comment.
Thanks!
please use @ignore_warnings on all tests in sklearn.preprocessing.tests.test_imputation.
Please don't change test_cross_validation.py and test_grid_search.py. They're about to be removed and you only make merge conflicts.
|
|
||
|
|
||
| @deprecated("Imputer was deprecated in version 0.20 and will be " | ||
| "removed in 0.22. Import impute.Imputer from sklearn" |
| @@ -0,0 +1,375 @@ | |||
| # Authors: Nicolas Tresegnie <nicolas.tresegnie@gmail.com> | |||
There was a problem hiding this comment.
Please add a short docstring: "Transformers for missing value imputation"
jnothman
left a comment
There was a problem hiding this comment.
You didn't import ignore_warnings
Thanks for the efficient work
|
@jnothman Thanks for the quick review. Just out of curiosity, have you moved to a new time zone? ;) It looks so from your review time. |
|
No, I haven't. I'm just sneaking in an early morning review before disappearing for a few days. Hopeful we can merge this and get on with business. Please add an API changes entry to what's new |
_____________________________ [doctest] impute.rst _____________________________
also allows for different missing values encodings.
The following snippet demonstrates how to replace missing values,
encoded as ``np.nan``, using the mean value of the columns (axis 0)
that contain the missing values::
>>> import numpy as np
>>> from sklearn.impute import SimpleImputer
>>> imp = SimpleImputer(missing_values='NaN', strategy='mean', axis=0)
>>> imp.fit([[1, 2], [np.nan, 3], [7, 6]])
Expected:
SimpleImputer(axis=0, copy=True, missing_values='NaN', strategy='mean', verbose=0)
Got:
SimpleImputer(axis=0, copy=True, missing_values='NaN', strategy='mean',
verbose=0)Why this doctest failed? The two results are same. Also, there's a failure in pep8 which says |
|
Use the normalize whitespace flag for doctest |
|
I think the ambiguous variable name just means some code was unnecessarily repeated in the test. You can fix it |
|
of course, silly me. we wasted a while lot of effort on that parameter
deprecation. let's just drop it from the new version
|
|
I'm OK with a deprecation warning in the old version and directly dropping axis parameter in the new version. |
|
We don't exactly need a deprecation warning in the old version if the
entire module is deprecated...?
…On 14 February 2018 at 14:12, Hanmin Qin ***@***.***> wrote:
I'm OK with a deprecation warning in the old version and directly dropping
axis parameter in the new version.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10483 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz61yXPouOaGvUPBOIZRUnbJ3M48RZks5tUk8kgaJpZM4RgYMJ>
.
|
Not sure, I'll revert the deprecation if you have made the decision, but I think users will expect the new version to behave the same as the old version. It seems strange to remove a parameter without any warning when moving the class, so I would prefer a deprecation warning at least in the old version (maybe also in the new version from my side). |
|
Why not just having what's new and warning say:
"sklearn.preprocessing.Imputer has been renamed to
sklearn.impute.SimpleImputer, and its axis parameter is no longer
available. The deprecated Imputer will be removed in version 0.22." and if
the user uses Imputer(axis=1), can add "Imputation on axis=1 can be
implemented with FunctionTransformer."
…On 14 February 2018 at 14:25, Hanmin Qin ***@***.***> wrote:
We don't exactly need a deprecation warning in the old version if the
entire module is deprecated...?
Not sure, I'll revert the deprecation if you have made the decision, but I
think users will expect the new version to behave the same as the old
version. It seems strange to remove a parameter without any warning when
moving the class, so I would prefer a deprecation warning at least in the
old version (maybe also in the new version from my side).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10483 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6102tUyZvG62iD-yclUTsrAQ8NNZks5tUlIhgaJpZM4RgYMJ>
.
|
|
Yes, let's go ahead when this is ready to merge.
…On 14 February 2018 at 19:49, Hanmin Qin ***@***.***> wrote:
Why not just having what's new and warning say:
"sklearn.preprocessing.Imputer has been renamed to
sklearn.impute.SimpleImputer, and its axis parameter is no longer
available. The deprecated Imputer will be removed in version 0.22."
I think it's a better way. @jnothman <https://github.com/jnothman> revert
#10558 <#10558>?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#10483 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz62pt7aiOG8hfVm_XmnP3kZRIhRWQks5tUp4qgaJpZM4RgYMJ>
.
|
|
Okay. Let's do it that way. |
|
Thanks @thechargedneutron |
|
Do you want to remove axis and update the what's new, @qinhanmin2014, or should we open an issue? |
|
@jnothman Please open a new issue. I'll try to take care of it after couple of days if no one takes it (it's Chinese new year :) ) |
|
of course
|
|
This pull request introduces 1 alert - view on lgtm.com new alerts:
Comment posted by lgtm.com |
qinhanmin2014
left a comment
There was a problem hiding this comment.
LGTM. I've resolved the conflicts and checked the PR. merge? @jnothman @glemaitre
|
Ah, I thought I'd already merged this. I must have been blocked by a merge conflict and not seen. |
|
Thanks a lot @qinhanmin2014 for pointing out our folly
|
|
We failed to remove references to Imputer from sklearn/modules/preprocessing.rst :( Needs to be done. |
|
I'll do this. :) Are you referring to doc/modules/preprocessing.rst ? |
|
Yes |
Reference Issues/PRs
Fixes #9726
What does this implement/fix? Explain your changes.
Changes done as per suggestions
Tasks
sklearn/preprocessing/imputation.pytosklearn/impute.py, as well as the corresponding tests.Imputerinsklearn/preprocessing/imputation.pyto be removed in v0.22.sklearn.imputesection indoc/modules/classes.rstdoc/modules/classes.rstsklearn/__init__.py's __all__doc/modules/preprocessing.rsttodoc/modules/impute.rst