MNT make files private for cluster module#14948
MNT make files private for cluster module#14948NicolasHug merged 20 commits intoscikit-learn:masterfrom
Conversation
adrinjalali
left a comment
There was a problem hiding this comment.
Not a thorough review yet.
| ('sklearn.cluster.k_means_', 'KMeans'), | ||
| ('sklearn.cluster.mean_shift_', 'MeanShift'), | ||
| ('sklearn.cluster.optics_', 'OPTICS'), | ||
| ('sklearn.cluster.spectral', 'SpectralClustering'), |
There was a problem hiding this comment.
Haven't checked if this list is complete.
There was a problem hiding this comment.
This seems to be leading to some confusion.
Can we simplify this check by only doing a module import:
import {deprecated_path}
So we do not need the importee?
There was a problem hiding this comment.
That would only check that
import sklearn.neural_network.rbm
raises a warning, but not that we can still import the stuff we used to from rbm. I think it's important to still ensure that
There was a problem hiding this comment.
Then we need a comment somewhere saying "We only need to import one thing from each file"
|
tests failing, and I just realized an issue with not raising a deprecation warning in pytest sessions, in contrast to ignoring those specific files: now we can't actually check if our examples and other places in the code are still using the deprecated path, we won't be issuing the warning during a pytest session. |
Right but examples don't use pytest so you should see them? For everything else, if it's imported at top level you would have seen it at collection time only once, anyway? Feel free to use the other approach is you think it's really preferable. |
adrinjalali
left a comment
There was a problem hiding this comment.
Tests seem to be certificate issue. LGTM :)
|
(I still need to check the examples as you suggested, I'll do that soon) |
|
If you could also check if there is any base classes |
|
I've removed |
|
|
||
| from .spectral import spectral_clustering, SpectralClustering | ||
| from .mean_shift_ import (mean_shift, MeanShift, | ||
| from ._spectral import spectral_clustering, SpectralClustering |
There was a problem hiding this comment.
I think the question is, do we want to include BaseSpectral here.
Or generally, any base class. I think we haven't decided on that, have we?
|
I think that we should. They are part of the public API, otherwise, it
should be named _BaseSpectral (as _BaseComposition)
…On Sat, 14 Sep 2019 at 12:36, Adrin Jalali ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/cluster/__init__.py
<#14948 (comment)>
:
> @@ -3,18 +3,18 @@
algorithms.
"""
-from .spectral import spectral_clustering, SpectralClustering
-from .mean_shift_ import (mean_shift, MeanShift,
+from ._spectral import spectral_clustering, SpectralClustering
I think the question is, do we want to include BaseSpectral here.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#14948?email_source=notifications&email_token=ABY32P3MCRB6WBZDTTRMWXDQJS5BPA5CNFSM4IVMRQQ2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCEXVYGA#pullrequestreview-288316440>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABY32P232UQKWXHVVORDAP3QJS5BPANCNFSM4IVMRQQQ>
.
--
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/
|
|
I am strongly against making base classes public. It's pretty much equivalent to making all the helpers public. They are often very tightly coupled to the child classes, and having them public would make any refactoring or bug fix near impossible. I'm pretty sure we've been consistently breaking backward compatibility of base classes. Sort of related: that's another level of API contract, but any method that is not |
We have tons of things that are public but should have been private in the first place, this is precisely the point of these PRs: clean the public API space. |
|
I am fine making the base classes private as well. However, this not the
case now
and we should follow the deprecation path as well, is that correct?
…On Sun, 15 Sep 2019 at 14:34, Nicolas Hug ***@***.***> wrote:
They are part of the public API, otherwise, it
should be named _BaseSpectral (as _BaseComposition)
We have tons of things that are public but should have been private in the
first place, this is precisely the point of these PRs: clean the public API
space.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#14948?email_source=notifications&email_token=ABY32P7ERZOM5GFIKL7Z7TDQJYTW7A5CNFSM4IVMRQQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6XP3YA#issuecomment-531561952>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABY32P23YYWTVWLB2RWU5XTQJYTW7ANCNFSM4IVMRQQQ>
.
--
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/
|
We are properly deprecating all the base classes. All the tools that aren't in |
…uster_dep_underscore
|
merge conflict and tests are failing. |
|
I'm leaving as-is for now, I think we need to discuss the conflicts issues #12927 (comment) on Monday during the meeting |
|
I kinda didn't understand what the conclusion on this topic was during the meeting. Do we continue and comment on PRs telling people how to handle merge conflicts? |
|
@thomasjpfan is thinking of some solution that could avoid the conflicts... waiting for him to submit the idea |
|
Just updated the PR. @thomasjpfan wanna give it a look? |
|
Where do you do the generation here? |
@jnothman It is done in sklearn/_build_utils/deprecated_modules.py by adding to the |
|
Are you OK with this @thomasjpfan ? @adrinjalali wanna give it a second round since method has changed? |
jnothman
left a comment
There was a problem hiding this comment.
Otherwise I think this lgtm but I've not been rigorous
| @@ -20,10 +20,6 @@ | |||
| from ..utils.validation import assert_all_finite, check_array | |||
|
|
|||
|
|
|||
| __all__ = ['SpectralCoclustering', | |||
There was a problem hiding this comment.
This will technically mean that people who previously imported * from this module will now get check_array in their locals... Not sure that matters
There was a problem hiding this comment.
True. Can't we just keep this? What's the point of removing it?
There was a problem hiding this comment.
What's the point of removing it?
It suggests that users might be allowed to import from this. And there's no reason to protect users that want to import * IMO.
But OK I'll put it back.
|
There is a linting error. |
It's going to happen in every deprecation PR. It's because new files are being created. I don't think we should fix them here (increases the possibilities of a painful merge) |
…uster_dep_underscore
…-learn; branch 'master' of github.com:scikit-learn/scikit-learn into cluster_dep_underscore
|
Since @jnothman seems to be +1 as well I'll merge tomorrow unless we have additional comments |
rth
left a comment
There was a problem hiding this comment.
Haven't checked very thoroughly but LGTM, aside from one minor comment
| @@ -20,10 +20,6 @@ | |||
| from ..utils.validation import assert_all_finite, check_array | |||
|
|
|||
|
|
|||
| __all__ = ['SpectralCoclustering', | |||
There was a problem hiding this comment.
True. Can't we just keep this? What's the point of removing it?
| from ..utils.validation import FLOAT_DTYPES | ||
| from ..exceptions import ConvergenceWarning | ||
| from . import _k_means | ||
| from . import _k_means_fast as _k_means |
There was a problem hiding this comment.
I also find the naming *_fast.pyx files more meaningful than with underscores as it was done before.
|
Thanks for the review @rth! Comments are addressed so I'll merge |
|
My colleague, @namoshizun, has had trouble upgrading his nightly build installation due to this PR.
He had an installation of the scikit-learn nightly build prior to this PR, then upgraded that environment with I'm not sure why the old |
|
I tried install the stable release and then Does uninstalling and then reinstall the dev version resolve @namoshizun's issue? |
|
I don't know what |
Addresses part of #12927 and part of #9250
This PR renames
file.pyinto_file.pyin thesklearn.clustermodule.In particular, please note that:
I rename
dbscan_.pyinto_dbscan.py. Same foroptics_andmean_shift_I renamed
k_means_.pyinto_k_means.py. Since_k_means.pyxalready existed, I had to rename it too. I chose_k_means_fast.pyxbut no strong opinion. Same forhierarchical.Ping @adrinjalali @thomasjpfan @rth @glemaitre.
Again, lint issue can be ignored and are triggered by new files being created.