[MRG + 1] Remove "warn_on_dtype" from check_array#13382
[MRG + 1] Remove "warn_on_dtype" from check_array#13382NicolasHug merged 54 commits intoscikit-learn:masterfrom
Conversation
merge fork
…https://github.com/praths007/scikit-learn into remove_warn_on_dtype_from_check_array
jnothman
left a comment
There was a problem hiding this comment.
It's very hard to focus on the substance of your pr with all the other changes
|
@jnothman shouldn't there be a The |
|
Overall I'm not yet committed to this change. We had problems with the
other use of warn_on_dtype and I'm not sure we gain from removing it here.
But do you mean that we should issue a ChangedBehaviorWarning in cases
where we used to issue a conversion warning?
|
…to remove_warn_on_dtype_from_check_array
|
There are places where test cases break because of raising a DeprecationWarning:
So if there is a DeprecationWarning raised it will still break the test case as assert will be wrong. Is there any way around this? |
Or even more conservative, go through a deprecation cycle? We're changing the behaviour of a public function, unless it's a bugfix we need a deprecation cycle right? (I'm not suggesting we should do that, overall I'm just curious about when deprecation cycles are relevant and when they're not) |
|
We're changing a side-effect rather than a behaviour of a public
function... I don't think there's any real need to deprecate.
|
|
I believe that my limited understanding of this issue causes this PR to not be efficacious. Should I go ahead and close it? Perhaps someone else can provide a better solution. |
NicolasHug
left a comment
There was a problem hiding this comment.
Some comments.
Overall I'm not yet committed to this change
So you've changed your mind since #13324 (comment)?
I believe that my limited understanding of this issue causes this PR to not be efficacious
You're almost there @praths007 !
What this PR needs to do is:
- deprecate
warn_on_dtypefromcheck_arrayand keep the current behaviour (i.e. warning on dtype conversion). - remove its use from
check_pariwise_arrays. No need for a deprecation cycle here.
You've already done most of it (please see my comments below).
Though I would suggest to wait for Joel's feedback before going back to it to avoid unnecessary work...
|
|
||
| def test_check_array_warn_on_dtype_deprecation(): | ||
| X = np.asarray([[0.0], [1.0]]) | ||
| assert_warns(DeprecationWarning, check_array, X, warn_on_dtype=True) |
There was a problem hiding this comment.
Please use pytest.wars with the match parameter as shown here, instead of assert_warns
NicolasHug
left a comment
There was a problem hiding this comment.
Some comments.
Overall I'm not yet committed to this change
So you've changed your mind since #13324 (comment)?
I believe that my limited understanding of this issue causes this PR to not be efficacious
You're almost there @praths007 !
What this PR needs to do is:
- deprecate
warn_on_dtypefromcheck_arrayand keep the current behaviour (i.e. warning on dtype conversion). - remove its use from
check_pariwise_arrays. No need for a deprecation cycle here.
You've already done most of it (please see my comments below).
Though I would suggest to wait for Joel's feedback before going back to it to avoid unnecessary work...
|
I think my opinion then was fairly ambivalent, not a strong vote for its
removal. Sorry to be ambiguous.
|
|
@praths007 , as discussed here #13324 (comment), this PR would need an update to implement the following point:
LMK if you need help or if you'd rather someone else take it over |
|
Thank You @NicolasHug, Majority of the tests cases pass now except documentation (which I am assuming fails because the docstring has now changed from False to None). Is there someplace this change needs to be reflected . |
| assert_raises(ValueError, pairwise_distances, X, Y, metric="blah") | ||
|
|
||
| # Test boolean metric raises dataconversion warning | ||
| assert_warns_message(DataConversionWarning, 'boolean', pairwise_distances, |
There was a problem hiding this comment.
please use pytest here, and a longer match for the message
| assert_warns(DataConversionWarning, check_array, df, | ||
| dtype='numeric', warn_on_dtype=True) | ||
| assert_no_warnings(check_array, df, dtype='object', warn_on_dtype=True) | ||
| with pytest.warns(DeprecationWarning, |
There was a problem hiding this comment.
use pytest.warn(None) and put this in a separate test if you need to ignore the deprecation warning
circle ci failure fix
…to remove_warn_on_dtype_from_check_array
NicolasHug
left a comment
There was a problem hiding this comment.
Some more comments but this is looking good
Please also add an entry to the change log at doc/whats_new/v*.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself with :user:.
sklearn/utils/validation.py
Outdated
| if warn_on_dtype is not None: | ||
| warnings.warn( | ||
| "'warn_on_dtype' is deprecated in " | ||
| "version 0.21 and will be removed in 0.23 " |
There was a problem hiding this comment.
| "version 0.21 and will be removed in 0.23 " | |
| "version 0.21 and will be removed in 0.23. " |
NicolasHug
left a comment
There was a problem hiding this comment.
last comments but this LGTM otherwise
Thanks @praths007 for sticking to it!!
|
Also please update this PR's header since it's a bit outdated now ;)
|
ogrisel
left a comment
There was a problem hiding this comment.
A bunch of mostly cosmetic comments to address before merging.
sklearn/metrics/pairwise.py
Outdated
|
|
||
| if dtype == bool and (X.dtype != bool or Y.dtype != bool): | ||
| msg = ("Data was converted to boolean " | ||
| "for metric %s" % (metric)) |
There was a problem hiding this comment.
Cosmetics: this fits on one line and there is no need to wrap the metric variable with parenthesis:
msg = "Data was converted to boolean for metric %s" % metric| # non-boolean arrays are converted to boolean for boolean | ||
| # distance metrics with a data conversion warning | ||
| msg = ("Data was converted to boolean " | ||
| "for metric %s" % (metric)) |
There was a problem hiding this comment.
Same remark:
msg = "Data was converted to boolean for metric %s" % metric|
|
||
| def test_no_data_conversion_warning(): | ||
| # No warnings issued if metric is not a | ||
| # boolean distance function |
There was a problem hiding this comment.
Again, no need to split such comments on multiple lines before the 79 chars limit of PEP8.
| assert_raises(ValueError, assert_all_finite, X) | ||
|
|
||
|
|
||
| @pytest.mark.filterwarnings("ignore: 'warn_on_dtype' is deprecated") # 0.23 |
There was a problem hiding this comment.
Instead of ignoring the warning, we should update the test to stop passing the warn_on_dtype argument as test_check_array_series is actually not concerned with the warn_on_dtype thing.
sklearn/utils/validation.py
Outdated
| warnings.warn( | ||
| "'warn_on_dtype' is deprecated in " | ||
| "version 0.21 and will be removed in 0.23. " | ||
| "Use 'warn_on_dtype=None' to remove this warning.", |
There was a problem hiding this comment.
This phrasing is problematic: we cannot tell people to explicitly set a parameter that will be removed at a release. This will break their code when they upgrade.
Instead we should tell them to not set the warn_on_dtype parameter.
There was a problem hiding this comment.
Also please do not use that many line breaks to format this string either.
sklearn/utils/validation.py
Outdated
|
|
||
| .. deprecated:: 0.21 | ||
| ``warn_on_dtype`` is deprecated in | ||
| version 0.21 and will be removed in 0.23. |
There was a problem hiding this comment.
PEP8 says the limit is at 79 so there is no need to break that early:
.. deprecated:: 0.21
``warn_on_dtype`` is deprecated in version 0.21 and will be
removed in 0.23.
sklearn/utils/validation.py
Outdated
|
|
||
| .. deprecated:: 0.21 | ||
| ``warn_on_dtype`` is deprecated in | ||
| version 0.21 and will be removed in 0.23. |
merge fork
merge sklearn-master
|
Comments have been addressed, merging Thanks @praths007 ! |
This reverts commit ac23178.
This reverts commit ac23178.
Reference Issues/PRs:
Remove "warn_on_dtype" from check_array #13324
Implemented changes:
Changes made: