MNT Completes position arg deprecation #17272

Merged
adrinjalali merged 11 commits into scikit-learn:master from thomasjpfan:complete_position_arg_deprecation
May 19, 2020

Conversation

@thomasjpfan (Member) commented May 18, 2020

After this PR here are the public functions that have 2 or more arguments that do not have a * for keyword only arguments. In this case, "public function" means it is listed in classes.rst.

These functions are not updated with this PR:

[<function sklearn._config.set_config(assume_finite=None, working_memory=None, print_changed_only=None, display=None)>,
 <function sklearn.compose._column_transformer.make_column_transformer(*transformers, **kwargs)>,
 <function sklearn.covariance._shrunk_covariance.shrunk_covariance(emp_cov, shrinkage=0.1)>,
 <function sklearn.feature_extraction.image.reconstruct_from_patches_2d(patches, image_size)>,
 <function sklearn.feature_extraction.image.extract_patches(arr, patch_shape=8, extraction_step=1)>,
 <function sklearn.feature_selection._univariate_selection.f_classif(X, y)>,
 <function sklearn.feature_selection._univariate_selection.chi2(X, y)>,
 <function sklearn.isotonic.check_increasing(x, y)>,
 <function sklearn.metrics._ranking.auc(x, y)>,
 <function sklearn.metrics._regression.max_error(y_true, y_pred)>,
 <function sklearn.metrics.cluster._supervised.adjusted_rand_score(labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.homogeneity_score(labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.completeness_score(labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._unsupervised.calinski_harabasz_score(X, labels)>,
 <function sklearn.metrics.cluster._unsupervised.davies_bouldin_score(X, labels)>,
 <function sklearn.metrics.pairwise.laplacian_kernel(X, Y=None, gamma=None)>,
 <function sklearn.metrics.pairwise.chi2_kernel(X, Y=None, gamma=1.0)>,
 <function sklearn.metrics.pairwise.linear_kernel(X, Y=None, dense_output=True)>,
 <function sklearn.metrics.pairwise.polynomial_kernel(X, Y=None, degree=3, gamma=None, coef0=1)>,
 <function sklearn.metrics.pairwise.haversine_distances(X, Y=None)>,
 <function sklearn.metrics.pairwise.additive_chi2_kernel(X, Y=None)>,
 <function sklearn.metrics.pairwise.paired_cosine_distances(X, Y)>,
 <function sklearn.metrics.pairwise.cosine_similarity(X, Y=None, dense_output=True)>,
 <function sklearn.metrics.pairwise.rbf_kernel(X, Y=None, gamma=None)>,
 <function sklearn.metrics.pairwise.paired_euclidean_distances(X, Y)>,
 <function sklearn.metrics.pairwise.sigmoid_kernel(X, Y=None, gamma=None, coef0=1)>,
 <function sklearn.metrics.pairwise.paired_manhattan_distances(X, Y)>,
 <function sklearn.metrics.pairwise.cosine_distances(X, Y=None)>,
 <function sklearn.model_selection._search.fit_grid_point(X, y, estimator, parameters, train, test, scorer, verbose, error_score=nan, **fit_params)>,
 <function sklearn.neural_network._base.log_loss(y_true, y_prob)>,
 <function sklearn.pipeline.make_pipeline(*steps, **kwargs)>,
 <function sklearn.pipeline.make_union(*transformers, **kwargs)>,
 <function sklearn.preprocessing._data.add_dummy_feature(X, value=1.0)>,
 <function sklearn.utils.safe_mask(X, mask)>,
 <function sklearn.utils.resample(*arrays, **options)>,
 <function sklearn.utils.shuffle(*arrays, **options)>,
 <function sklearn.utils.safe_indexing(X, indices, axis=0)>,
 <function sklearn.utils.class_weight.compute_class_weight(class_weight, classes, y)>,
 <function sklearn.utils.extmath.safe_sparse_dot(a, b, dense_output=False)>,
 <function sklearn.utils.extmath.density(w, **kwargs)>,
 <function sklearn.utils.extmath.randomized_svd(M, n_components, n_oversamples=10, n_iter='auto', power_iteration_normalizer='auto', transpose='auto', flip_sign=True, random_state=0)>,
 <function sklearn.utils.extmath.randomized_range_finder(A, size, n_iter, power_iteration_normalizer='auto', random_state=None)>,
 <function sklearn.utils.extmath.weighted_mode(a, w, axis=0)>,
 <function sklearn.utils.graph.single_source_shortest_path_length(graph, source, cutoff=None)>,
 <function sklearn.utils.sparsefuncs.inplace_csr_column_scale(X, scale)>,
 <function sklearn.utils.sparsefuncs.inplace_swap_column(X, m, n)>,
 <function sklearn.utils.sparsefuncs.inplace_swap_row(X, m, n)>,
 <function sklearn.utils.sparsefuncs.incr_mean_variance_axis(X, axis, last_mean, last_var, last_n)>,
 <function sklearn.utils.sparsefuncs.mean_variance_axis(X, axis)>,
 <function sklearn.utils.sparsefuncs.inplace_row_scale(X, scale)>,
 <function sklearn.utils.sparsefuncs.inplace_column_scale(X, scale)>,
 <function sklearn.utils.validation.has_fit_parameter(estimator, parameter)>]

The utils and pairwise functions are the largest group that do not have *.

Are there any other functions we want to add * to?
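For reference, an audit like the one above can be reproduced with `inspect` (a minimal sketch using stand-in functions; the real check also filters on the functions listed in classes.rst):

```python
import inspect

def lacks_keyword_only(func):
    """True if func takes 2+ named parameters but marks none as keyword-only."""
    params = list(inspect.signature(func).parameters.values())
    named = [p for p in params
             if p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)]
    has_kwonly = any(p.kind == p.KEYWORD_ONLY for p in params)
    return len(named) >= 2 and not has_kwonly

# Stand-ins for the signatures listed above (not imported from sklearn here).
def shrunk_covariance(emp_cov, shrinkage=0.1):
    pass

def rbf_kernel(X, Y=None, *, gamma=None):  # already has the * marker
    pass

print(lacks_keyword_only(shrunk_covariance))  # True
print(lacks_keyword_only(rbf_kernel))         # False
```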

@thomasjpfan (Member Author)

CC @adrinjalali @NicolasHug

@thomasjpfan thomasjpfan added this to the 0.23.1 milestone May 18, 2020
@NicolasHug (Member) left a comment

LGTM if green

@adrinjalali (Member)

Thanks Thomas. I'd make the arguments keyword-only in these functions as well:

[<function sklearn._config.set_config(*, assume_finite=None, working_memory=None, print_changed_only=None, display=None)>,
 <function sklearn.feature_extraction.image.extract_patches(arr, *, patch_shape=8, extraction_step=1)>,
 <function sklearn.metrics._regression.max_error(*, y_true, y_pred)>,
 <function sklearn.metrics.cluster._supervised.adjusted_rand_score(*, labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.homogeneity_score(*, labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.completeness_score(*, labels_true, labels_pred)>,
 <function sklearn.metrics.pairwise.laplacian_kernel(X, Y=None, *, gamma=None)>,
 <function sklearn.metrics.pairwise.chi2_kernel(X, Y=None, *, gamma=1.0)>,
 <function sklearn.metrics.pairwise.linear_kernel(X, Y=None, *, dense_output=True)>,
 <function sklearn.metrics.pairwise.polynomial_kernel(X, Y=None, *, degree=3, gamma=None, coef0=1)>,
 <function sklearn.metrics.pairwise.cosine_similarity(X, Y=None, *, dense_output=True)>,
 <function sklearn.metrics.pairwise.rbf_kernel(X, Y=None, *, gamma=None)>,
 <function sklearn.metrics.pairwise.sigmoid_kernel(X, Y=None, *, gamma=None, coef0=1)>,
 <function sklearn.model_selection._search.fit_grid_point(X, y, estimator, parameters, *, train, test, scorer, verbose, error_score=nan, **fit_params)>,
 <function sklearn.neural_network._base.log_loss(*, y_true, y_prob)>,
 <function sklearn.utils.safe_indexing(X, indices, *, axis=0)>,
 <function sklearn.utils.class_weight.compute_class_weight(*, class_weight, classes, y)>,
 <function sklearn.utils.extmath.safe_sparse_dot(a, b, *, dense_output=False)>,
 <function sklearn.utils.extmath.randomized_svd(M, n_components, *, n_oversamples=10, n_iter='auto', power_iteration_normalizer='auto', transpose='auto', flip_sign=True, random_state=0)>,
 <function sklearn.utils.extmath.randomized_range_finder(A, *, size, n_iter, power_iteration_normalizer='auto', random_state=None)>,
 <function sklearn.utils.extmath.weighted_mode(a, w, *, axis=0)>,
 <function sklearn.utils.graph.single_source_shortest_path_length(graph, source, *, cutoff=None)>,
 <function sklearn.utils.sparsefuncs.incr_mean_variance_axis(X, *, axis, last_mean, last_var, last_n)>,
 <function sklearn.utils.sparsefuncs.mean_variance_axis(X, *, axis)>,
 <function sklearn.utils.sparsefuncs.inplace_row_scale(X, *, scale)>,
 <function sklearn.utils.sparsefuncs.inplace_column_scale(X, *, scale)>]

WDYT? I always find *_true, *_pred very confusing, and never remember which one comes first. So I can imagine people making mistakes there with positional calls.
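For illustration, this is what the keyword-only form buys here: a positional (and possibly swapped) call fails immediately instead of silently computing on swapped inputs (a minimal stand-in, not the real sklearn.metrics implementation):

```python
def max_error(*, y_true, y_pred):
    """Stand-in for the proposed keyword-only signature."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))

print(max_error(y_true=[1, 2, 3], y_pred=[1, 4, 3]))  # 2

try:
    max_error([1, 2, 3], [1, 4, 3])   # positional call is now a TypeError
except TypeError as exc:
    print("rejected:", exc)
```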

@adrinjalali (Member)

Tests are also failing.

@thomasjpfan (Member Author)

WDYT? I always find *_true, *_pred very confusing, and never remember which one comes first. So I can imagine people making mistakes there with positional calls.

I am +0.25 on this, because I think this change would generate too many warnings. We already released version 0.23, in which we left y_true and y_pred positional.

@adrinjalali (Member)

WDYT of adding a "version" parameter to _deprecate_positional_args and deprecating those in 0.24?

@thomasjpfan (Member Author)

WDYT of adding a "version" parameter to _deprecate_positional_args and deprecating those in 0.24?

What do you mean by "those"?

I think the deprecations in this PR should still have a 2 version deprecation cycle.

@adrinjalali (Member)

Sorry, by "those" I meant the *_true, *_pred ones, and I meant adding a parameter to the decorator function so that they're deprecated in 0.24 and fixed in 0.26. Right now the versions are hard-coded in the decorator. And I meant to do that deprecation in a separate PR, which wouldn't be released in 0.23.

@thomasjpfan (Member Author)

Sorry, by "those" I meant *_true, *_pred ones, and I meant to add a parameter to the decorator function so that they're added as deprecated in 0.24, and fixed in 0.26.

Ah, I see. I can add a version parameter to the decorator. We can open an issue for making *_true and *_pred keyword-only and discuss it in our next call.
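Roughly, a version-aware decorator could look like the sketch below (an illustration only, not the actual `_deprecate_positional_args` implementation in scikit-learn; the parameter name `version` and the warning wording are assumptions):

```python
import functools
import inspect
import warnings

def deprecate_positional_args(version="0.25"):
    """Warn when arguments declared after `*` in the decorated function's
    signature are passed positionally, naming the release in which that
    becomes an error."""
    def decorator(f):
        sig = inspect.signature(f)
        kwonly = [name for name, p in sig.parameters.items()
                  if p.kind == p.KEYWORD_ONLY]
        # Parameters that may legitimately be passed positionally.
        n_positional = sum(
            p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
            for p in sig.parameters.values())

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            n_extra = len(args) - n_positional
            if n_extra > 0:
                names = ", ".join(kwonly[:n_extra])
                warnings.warn(
                    f"Passing {names} as positional argument(s) is deprecated "
                    f"and will raise a TypeError in version {version}.",
                    FutureWarning)
                # Remap the extra positional values onto their keyword names.
                kwargs.update(zip(kwonly, args[n_positional:]))
                args = args[:n_positional]
            return f(*args, **kwargs)
        return wrapper
    return decorator

@deprecate_positional_args(version="0.26")
def rbf_kernel(X, Y=None, *, gamma=None):
    return X, Y, gamma

print(rbf_kernel([1.0], [2.0], gamma=0.5))  # keyword call: no warning
```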

@NicolasHug (Member)

 <function sklearn.metrics.cluster._supervised.adjusted_rand_score(*, labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.homogeneity_score(*, labels_true, labels_pred)>,
 <function sklearn.metrics.cluster._supervised.completeness_score(*, labels_true, labels_pred)>,

IMO we shouldn't make these kwonly considering we kept y_true and y_pred positional for all the other metrics.

I always find *_true, *_pred very confusing

I sort of agree, but the good news is that we're consistent on this. In the end, it's just a convention users need to get used to, just like they get used to passing X before y.

@adrinjalali (Member)

I thought it'd be an easy sell, but apparently @NicolasHug doesn't think so :P So I'm +1 on the issue and talking about it in the call.

@adrinjalali (Member)

WDYT of the other suggestions I have in #17272 (comment)? (kinda have a feeling we may have discussed some of them in other PRs)

@thomasjpfan (Member Author)

Here are the ones I left out from #17272 (comment)

  • set_config: circular imports
  • extract_patches: will be removed in 0.24
  • fit_grid_point: will be removed in 0.24
  • compute_class_weight: Internally, we use this with no keyword arguments.
  • neural_network._base.log_loss: False positive - not public
  • mean_variance_axis: args implied by the name of the function
  • inplace_row_scale: args implied by the name of the function
  • inplace_column_scale: args implied by the name of the function

@thomasjpfan (Member Author)

I guess compute_class_weight is the one I am unsure about. I would leave it as is, or do compute_class_weight(class_weight, *, classes, y).

@adrinjalali (Member)

I agree with you. But it seems there are too many places internally where we call it positionally :/
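As an illustration of the call-site churn: with the keyword-only signature, each internal positional call has to be rewritten to keyword form. A stand-in sketch (the simplified "balanced" logic here mirrors n_samples / (n_classes * count) and is not the real sklearn.utils.class_weight implementation):

```python
def compute_class_weight(class_weight, *, classes, y):
    """Stand-in with the keyword-only signature adopted in this PR."""
    if class_weight != "balanced":
        raise ValueError("only 'balanced' is supported in this sketch")
    return [len(y) / (len(classes) * y.count(c)) for c in classes]

y = [0, 0, 0, 1]

# Old internal call sites looked like:
#   compute_class_weight("balanced", classes, y)   # now a TypeError
# and have to become:
print(compute_class_weight("balanced", classes=[0, 1], y=y))  # [0.666..., 2.0]
```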

    algorithm='auto', leaf_size=30, p=2, sample_weight=None,
    n_jobs=None):
@_deprecate_positional_args
def dbscan(X, *, eps=0.5, min_samples=5, metric='minkowski',

Member:
DBSCAN accepts positional eps


-def graphical_lasso(emp_cov, alpha, cov_init=None, mode='cd', tol=1e-4,
+@_deprecate_positional_args
+def graphical_lasso(emp_cov, *, alpha, cov_init=None, mode='cd', tol=1e-4,

Member:
GraphicalLasso accepts positional alpha

 # Kernels
-def linear_kernel(X, Y=None, dense_output=True):
+@_deprecate_positional_args
+def linear_kernel(X, Y=None, *, dense_output=True):

Member:
I think in the PR where I touched this file we decided not to touch the kernels, but I'm happy with this.

@thomasjpfan (Member Author), May 18, 2020:

#16982 (comment)

I'll leave it alone.

 "0.22 and will be removed in version 0.24.")
-def safe_indexing(X, indices, axis=0):
+@_deprecate_positional_args
+def safe_indexing(X, indices, *, axis=0):

Member:
we're deprecating this func anyway.

Member:
yeah, I already have a PR that removes it.

@thomasjpfan (Member Author)

I went with compute_class_weight(class_weight, *, classes, y) and made all the needed changes.

@adrinjalali (Member) left a comment

LGTM when green. Are you fine with the changes @NicolasHug ?

 X = sparse.lil_matrix((1000, 1000))
 msg = "A sparse matrix was passed, but dense data is required."
-assert_raise_message(TypeError, msg, estimate_bandwidth, X, 200)
+assert_raise_message(TypeError, msg, estimate_bandwidth, X)

@thomasjpfan (Member Author):
This was setting quantile to 200, which does not make sense for this check.

@adrinjalali adrinjalali mentioned this pull request May 18, 2020
@NicolasHug (Member)

yes

@adrinjalali adrinjalali merged commit e770715 into scikit-learn:master May 19, 2020
adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request May 19, 2020
* ENH Adds public functions with position args warning

* BUG Circular import

* STY

* BUG Add keywords

* BUG Fixes warnings

* ENH Adds other public functions

* CLN Suggestion

* CLN Suggestion

* REV Revert pairwise

* ENH Positional args for compute_class_weight
adrinjalali pushed a commit that referenced this pull request May 19, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020