Fix documentation of default values in base, birch by brigitteunger · Pull Request #16195 · scikit-learn/scikit-learn

brigitteunger · 2020-01-24T19:06:40Z

Fixes parts of #15761.

glemaitre

I would apply the following changes at the same time.

diff --git a/sklearn/base.py b/sklearn/base.py
index e438ef534..ff0d2818b 100644
--- a/sklearn/base.py
+++ b/sklearn/base.py
@@ -45,10 +45,10 @@ def clone(estimator, safe=True):
 
     Parameters
     ----------
-    estimator : estimator object, or list, tuple or set of objects
-        The estimator or group of estimators to be cloned
+    estimator : {list, tuple, set} of estimator object or estimator object
+        The estimator or group of estimators to be cloned.
 
-    safe : boolean, default=True
+    safe : bool, default=True
         If safe is false, clone will fall back to a deep copy on objects
         that are not estimators.
 
@@ -429,7 +429,8 @@ class ClusterMixin:
 
         Parameters
         ----------
-        X : ndarray, shape (n_samples, n_features)
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -437,7 +438,7 @@ class ClusterMixin:
 
         Returns
         -------
-        labels : ndarray, shape (n_samples,)
+        labels : ndarray of shape (n_samples,)
             Cluster labels.
         """
         # non-optimized default implementation; override when a better
@@ -469,9 +470,9 @@ class BiclusterMixin:
 
         Returns
         -------
-        row_ind : np.array, dtype=np.intp
+        row_ind : ndarray, dtype=np.intp
             Indices of rows in the dataset that belong to the bicluster.
-        col_ind : np.array, dtype=np.intp
+        col_ind : ndarray, dtype=np.intp
             Indices of columns in the dataset that belong to the bicluster.
 
         """
@@ -489,7 +490,7 @@ class BiclusterMixin:
 
         Returns
         -------
-        shape : (int, int)
+        shape : tuple (int, int)
             Number of rows and columns (resp.) in the bicluster.
         """
         indices = self.get_indices(i)
@@ -533,10 +534,11 @@ class TransformerMixin:
 
         Parameters
         ----------
-        X : numpy array of shape [n_samples, n_features]
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Training set.
 
-        y : numpy array of shape [n_samples], default=None
+        y : array-like of shape (n_samples,), default=None
             Target values.
 
         **fit_params : dict
@@ -544,7 +546,7 @@ class TransformerMixin:
 
         Returns
         -------
-        X_new : numpy array of shape [n_samples, n_features_new]
+        X_new : ndarray of shape (n_samples, n_features_new)
             Transformed array.
         """
         # non-optimized default implementation; override when a better
@@ -586,7 +588,8 @@ class OutlierMixin:
 
         Parameters
         ----------
-        X : ndarray, shape (n_samples, n_features)
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -594,7 +597,7 @@ class OutlierMixin:
 
         Returns
         -------
-        y : ndarray, shape (n_samples,)
+        y : ndarray of shape (n_samples,)
             1 for inliers, -1 for outliers.
         """
         # override for transductive outlier detectors like LocalOulierFactor
diff --git a/sklearn/cluster/_birch.py b/sklearn/cluster/_birch.py
index d160d581e..48266a70c 100644
--- a/sklearn/cluster/_birch.py
+++ b/sklearn/cluster/_birch.py
@@ -108,28 +108,28 @@ class _CFNode:
 
     Attributes
     ----------
-    subclusters_ : array-like
-        list of subclusters for a particular CFNode.
+    subclusters_ : list
+        List of subclusters for a particular CFNode.
 
     prev_leaf_ : _CFNode
-        prev_leaf. Useful only if is_leaf is True.
+        Useful only if is_leaf is True.
 
     next_leaf_ : _CFNode
-        next_leaf. Useful only if is_leaf is True.
+        Useful only if is_leaf is True.
         the final subclusters.
 
-    init_centroids_ : ndarray, shape (branching_factor + 1, n_features)
-        manipulate ``init_centroids_`` throughout rather than centroids_ since
+    init_centroids_ : ndarray of shape (branching_factor + 1, n_features)
+        Manipulate ``init_centroids_`` throughout rather than centroids_ since
         the centroids are just a view of the ``init_centroids_`` .
 
-    init_sq_norm_ : ndarray, shape (branching_factor + 1,)
+    init_sq_norm_ : ndarray of shape (branching_factor + 1,)
         manipulate init_sq_norm_ throughout. similar to ``init_centroids_``.
 
-    centroids_ : ndarray
-        view of ``init_centroids_``.
+    centroids_ : ndarray of shape (branching_factor + 1, n_features)
+        View of ``init_centroids_``.
 
-    squared_norm_ : ndarray
-        view of ``init_sq_norm_``.
+    squared_norm_ : ndarray of shape (branching_factor + 1,)
+        View of ``init_sq_norm_``.
 
     """
     def __init__(self, threshold, branching_factor, is_leaf, n_features):
@@ -245,7 +245,7 @@ class _CFSubcluster:
 
     Parameters
     ----------
-    linear_sum : ndarray, shape (n_features,), default=None
+    linear_sum : ndarray of shape (n_features,), default=None
         Sample. This is kept optional to allow initialization of empty
         subclusters.
 
@@ -261,7 +261,7 @@ class _CFSubcluster:
     squared_sum_ : float
         Sum of the squared l2 norms of all samples belonging to a subcluster.
 
-    centroid_ : ndarray
+    centroid_ : ndarray of shape (branching_factor + 1, n_features)
         Centroid of the subcluster. Prevent recomputing of centroids when
         ``CFNode.centroids_`` is called.
 
@@ -269,7 +269,7 @@ class _CFSubcluster:
         Child Node of the subcluster. Once a given _CFNode is set as the child
         of the _CFNode, it is set to ``self.child_``.
 
-    sq_norm_ : ndarray
+    sq_norm_ : ndarray of shape (branching_factor + 1,)
         Squared norm of the subcluster. Used to prevent recomputing when
         pairwise minimum distances are computed.
     """
@@ -376,14 +376,14 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
     dummy_leaf_ : _CFNode
         Start pointer to all the leaves.
 
-    subcluster_centers_ : ndarray,
+    subcluster_centers_ : ndarray
         Centroids of all subclusters read directly from the leaves.
 
-    subcluster_labels_ : ndarray,
+    subcluster_labels_ : ndarray
         Labels assigned to the centroids of the subclusters after
         they are clustered globally.
 
-    labels_ : ndarray, shape (n_samples,)
+    labels_ : ndarray of shape (n_samples,)
         Array of labels assigned to the input data.
         if partial_fit is used instead of fit, they are assigned to the
         last batch of data.
@@ -444,7 +444,7 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -515,7 +515,7 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Returns
         -------
-        leaves : array-like
+        leaves : list of shape (n_leaves,)
             List of the leaf nodes.
         """
         leaf_ptr = self.dummy_leaf_.next_leaf_
@@ -531,7 +531,8 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features), None
+        X : {array-like, sparse matrix} of shape (n_samples, n_features), \
+                default=None
             Input data. If X is not provided, only the global clustering
             step is done.
 
@@ -569,12 +570,12 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         Returns
         -------
-        labels : ndarray, shape(n_samples)
+        labels : ndarray of shape (n_samples,)
             Labelled data.
         """
         X = check_array(X, accept_sparse='csr')
@@ -593,12 +594,12 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         Returns
         -------
-        X_trans : {array-like, sparse matrix}, shape (n_samples, n_clusters)
+        X_trans : {array-like, sparse matrix} of shape (n_samples, n_clusters)
             Transformed data.
         """
         check_is_fitted(self)
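
As a rough illustration of the convention this diff applies (`bool` rather than `boolean`, `of shape (...)` rather than `, shape (...)`, `ndarray` rather than `np.array` or `numpy array`, parentheses rather than brackets for shapes), the sketch below flags the legacy spellings in a docstring. The helper `find_legacy_type_specs` and the `LEGACY_PATTERNS` table are hypothetical names made up for this example; they are not part of scikit-learn or numpydoc, and the patterns only cover the cases visible in the diff above.

```python
import re

# Hypothetical patterns for the legacy type spellings replaced in the diff.
# Illustrative only: not a scikit-learn or numpydoc API.
LEGACY_PATTERNS = {
    "use 'bool' instead of 'boolean'": re.compile(r":\s*boolean\b"),
    "use 'of shape (...)' instead of ', shape (...)'": re.compile(r",\s*shape\s*\("),
    "use 'ndarray' instead of 'np.array' or 'numpy array'": re.compile(
        r":\s*(np\.array|numpy array)\b"
    ),
    "use parentheses, not brackets, for shapes": re.compile(r"of shape \["),
}


def find_legacy_type_specs(docstring):
    """Return (line_number, message) pairs for legacy type spellings.

    Parameters
    ----------
    docstring : str
        Text of a numpydoc-style docstring.

    Returns
    -------
    findings : list of tuple (int, str)
        1-based line number and a hint for each legacy spelling found.
    """
    findings = []
    for lineno, line in enumerate(docstring.splitlines(), start=1):
        for message, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, message))
    return findings


# Spellings removed by the diff ("-" lines) ...
old = """\
X : ndarray, shape (n_samples, n_features)
safe : boolean, default=True
row_ind : np.array, dtype=np.intp
y : numpy array of shape [n_samples], default=None
"""

# ... and their replacements ("+" lines).
new = """\
X : ndarray of shape (n_samples, n_features)
safe : bool, default=True
row_ind : ndarray, dtype=np.intp
y : array-like of shape (n_samples,), default=None
"""

for lineno, message in find_legacy_type_specs(old):
    print(f"line {lineno}: {message}")
```

Running the helper on `new` returns an empty list, while each line of `old` is flagged at least once. A check like this would only be a rough aid; it does not validate that the documented shapes are actually correct.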

brigitteunger · 2020-04-28T09:38:37Z

@glemaitre: I fixed input and output description as you suggested. Do you have time for another review?

brigitteunger · 2020-05-09T13:16:39Z

@adrinjalali: I fixed the input and output descriptions as @glemaitre suggested. Do you have time for a review?

thomasjpfan · 2020-05-09T17:37:57Z

sklearn/base.py

    ----------
-    estimator : estimator object, or list, tuple or set of objects
-        The estimator or group of estimators to be cloned
+    estimator : {list, tuple, set} of estimator object or estimator object


Nit:

Suggested change

-    estimator : {list, tuple, set} of estimator object or estimator object
+    estimator : {list, tuple, set} of estimator objects or estimator object

brigitteunger · 2020-05-09

Done. Thank you very much for your review!

thomasjpfan · 2020-05-10T16:14:27Z

Thank you @brigitteunger !

* Fix documentation of default values in birch
* fix input and output description as suggested
* fix typo

Co-authored-by: Brigitte@home <unger@nue.tu-berlin.de>

brigitteunger requested a review from adrinjalali January 24, 2020 19:18

glemaitre assigned glemaitre and unassigned glemaitre Jan 26, 2020

glemaitre requested changes Jan 26, 2020

brigitteunger force-pushed the docu branch from 2b8d97e to 12e73ea Compare February 9, 2020 16:49

github-actions bot added the module:cluster label Mar 2, 2020

brigitteunger force-pushed the docu branch from 12e73ea to 18cb5ef Compare April 28, 2020 08:18

brigitteunger requested a review from glemaitre April 28, 2020 09:36

thomasjpfan approved these changes May 9, 2020

brigitteunger and others added 3 commits May 9, 2020 20:09

Fix documentation of default values in birch

88a8d0f

fix input and output description as suggested

44cca34

fix typo

31f5e3e

brigitteunger force-pushed the docu branch from 3db674d to 31f5e3e Compare May 9, 2020 18:13

brigitteunger requested a review from thomasjpfan May 10, 2020 13:42

thomasjpfan approved these changes May 10, 2020

thomasjpfan merged commit 865069c into scikit-learn:master May 10, 2020

brigitteunger deleted the docu branch May 11, 2020 07:47

thomasjpfan merged 3 commits into scikit-learn:master from brigitteunger:docu
