Fix documentation of default values in base, birch #16195

Merged
thomasjpfan merged 3 commits into scikit-learn:master from brigitteunger:docu
May 10, 2020

Conversation

@brigitteunger
Contributor

Fixes parts of #15761.

Member

@glemaitre glemaitre left a comment

I would apply the following changes at the same time.

diff --git a/sklearn/base.py b/sklearn/base.py
index e438ef534..ff0d2818b 100644
--- a/sklearn/base.py
+++ b/sklearn/base.py
@@ -45,10 +45,10 @@ def clone(estimator, safe=True):
 
     Parameters
     ----------
-    estimator : estimator object, or list, tuple or set of objects
-        The estimator or group of estimators to be cloned
+    estimator : {list, tuple, set} of estimator object or estimator object
+        The estimator or group of estimators to be cloned.
 
-    safe : boolean, default=True
+    safe : bool, default=True
         If safe is false, clone will fall back to a deep copy on objects
         that are not estimators.
 
@@ -429,7 +429,8 @@ class ClusterMixin:
 
         Parameters
         ----------
-        X : ndarray, shape (n_samples, n_features)
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -437,7 +438,7 @@ class ClusterMixin:
 
         Returns
         -------
-        labels : ndarray, shape (n_samples,)
+        labels : ndarray of shape (n_samples,)
             Cluster labels.
         """
         # non-optimized default implementation; override when a better
@@ -469,9 +470,9 @@ class BiclusterMixin:
 
         Returns
         -------
-        row_ind : np.array, dtype=np.intp
+        row_ind : ndarray, dtype=np.intp
             Indices of rows in the dataset that belong to the bicluster.
-        col_ind : np.array, dtype=np.intp
+        col_ind : ndarray, dtype=np.intp
             Indices of columns in the dataset that belong to the bicluster.
 
         """
@@ -489,7 +490,7 @@ class BiclusterMixin:
 
         Returns
         -------
-        shape : (int, int)
+        shape : tuple (int, int)
             Number of rows and columns (resp.) in the bicluster.
         """
         indices = self.get_indices(i)
@@ -533,10 +534,11 @@ class TransformerMixin:
 
         Parameters
         ----------
-        X : numpy array of shape [n_samples, n_features]
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Training set.
 
-        y : numpy array of shape [n_samples], default=None
+        y : array-like of shape (n_samples,), default=None
             Target values.
 
         **fit_params : dict
@@ -544,7 +546,7 @@ class TransformerMixin:
 
         Returns
         -------
-        X_new : numpy array of shape [n_samples, n_features_new]
+        X_new : ndarray of shape (n_samples, n_features_new)
             Transformed array.
         """
         # non-optimized default implementation; override when a better
@@ -586,7 +588,8 @@ class OutlierMixin:
 
         Parameters
         ----------
-        X : ndarray, shape (n_samples, n_features)
+        X : {array-like, sparse matrix, dataframe} of shape \
+                (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -594,7 +597,7 @@ class OutlierMixin:
 
         Returns
         -------
-        y : ndarray, shape (n_samples,)
+        y : ndarray of shape (n_samples,)
             1 for inliers, -1 for outliers.
         """
         # override for transductive outlier detectors like LocalOulierFactor
diff --git a/sklearn/cluster/_birch.py b/sklearn/cluster/_birch.py
index d160d581e..48266a70c 100644
--- a/sklearn/cluster/_birch.py
+++ b/sklearn/cluster/_birch.py
@@ -108,28 +108,28 @@ class _CFNode:
 
     Attributes
     ----------
-    subclusters_ : array-like
-        list of subclusters for a particular CFNode.
+    subclusters_ : list
+        List of subclusters for a particular CFNode.
 
     prev_leaf_ : _CFNode
-        prev_leaf. Useful only if is_leaf is True.
+        Useful only if is_leaf is True.
 
     next_leaf_ : _CFNode
-        next_leaf. Useful only if is_leaf is True.
+        Useful only if is_leaf is True.
         the final subclusters.
 
-    init_centroids_ : ndarray, shape (branching_factor + 1, n_features)
-        manipulate ``init_centroids_`` throughout rather than centroids_ since
+    init_centroids_ : ndarray of shape (branching_factor + 1, n_features)
+        Manipulate ``init_centroids_`` throughout rather than centroids_ since
         the centroids are just a view of the ``init_centroids_`` .
 
-    init_sq_norm_ : ndarray, shape (branching_factor + 1,)
+    init_sq_norm_ : ndarray of shape (branching_factor + 1,)
         manipulate init_sq_norm_ throughout. similar to ``init_centroids_``.
 
-    centroids_ : ndarray
-        view of ``init_centroids_``.
+    centroids_ : ndarray of shape (branching_factor + 1, n_features)
+        View of ``init_centroids_``.
 
-    squared_norm_ : ndarray
-        view of ``init_sq_norm_``.
+    squared_norm_ : ndarray of shape (branching_factor + 1,)
+        View of ``init_sq_norm_``.
 
     """
     def __init__(self, threshold, branching_factor, is_leaf, n_features):
@@ -245,7 +245,7 @@ class _CFSubcluster:
 
     Parameters
     ----------
-    linear_sum : ndarray, shape (n_features,), default=None
+    linear_sum : ndarray of shape (n_features,), default=None
         Sample. This is kept optional to allow initialization of empty
         subclusters.
 
@@ -261,7 +261,7 @@ class _CFSubcluster:
     squared_sum_ : float
         Sum of the squared l2 norms of all samples belonging to a subcluster.
 
-    centroid_ : ndarray
+    centroid_ : ndarray of shape (branching_factor + 1, n_features)
         Centroid of the subcluster. Prevent recomputing of centroids when
         ``CFNode.centroids_`` is called.
 
@@ -269,7 +269,7 @@ class _CFSubcluster:
         Child Node of the subcluster. Once a given _CFNode is set as the child
         of the _CFNode, it is set to ``self.child_``.
 
-    sq_norm_ : ndarray
+    sq_norm_ : ndarray of shape (branching_factor + 1,)
         Squared norm of the subcluster. Used to prevent recomputing when
         pairwise minimum distances are computed.
     """
@@ -376,14 +376,14 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
     dummy_leaf_ : _CFNode
         Start pointer to all the leaves.
 
-    subcluster_centers_ : ndarray,
+    subcluster_centers_ : ndarray
         Centroids of all subclusters read directly from the leaves.
 
-    subcluster_labels_ : ndarray,
+    subcluster_labels_ : ndarray
         Labels assigned to the centroids of the subclusters after
         they are clustered globally.
 
-    labels_ : ndarray, shape (n_samples,)
+    labels_ : ndarray of shape (n_samples,)
         Array of labels assigned to the input data.
         if partial_fit is used instead of fit, they are assigned to the
         last batch of data.
@@ -444,7 +444,7 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         y : Ignored
@@ -515,7 +515,7 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Returns
         -------
-        leaves : array-like
+        leaves : list of shape (n_leaves,)
             List of the leaf nodes.
         """
         leaf_ptr = self.dummy_leaf_.next_leaf_
@@ -531,7 +531,8 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features), None
+        X : {array-like, sparse matrix} of shape (n_samples, n_features), \
+                default=None
             Input data. If X is not provided, only the global clustering
             step is done.
 
@@ -569,12 +570,12 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         Returns
         -------
-        labels : ndarray, shape(n_samples)
+        labels : ndarray of shape (n_samples,)
             Labelled data.
         """
         X = check_array(X, accept_sparse='csr')
@@ -593,12 +594,12 @@ class Birch(ClusterMixin, TransformerMixin, BaseEstimator):
 
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape (n_samples, n_features)
+        X : {array-like, sparse matrix} of shape (n_samples, n_features)
             Input data.
 
         Returns
         -------
-        X_trans : {array-like, sparse matrix}, shape (n_samples, n_clusters)
+        X_trans : {array-like, sparse matrix} of shape (n_samples, n_clusters)
             Transformed data.
         """
         check_is_fitted(self)
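
The patch above moves the type descriptions toward one spelling: `ndarray of shape (...)` rather than `ndarray, shape (...)`, `bool` rather than `boolean`, and parentheses rather than square brackets for shapes. A minimal sketch of a checker for those three patterns follows. The function name `find_legacy_type_lines` and the regular expressions are assumptions made for illustration; they are not part of scikit-learn or numpydoc, and they only cover the spellings that appear in this diff.

```python
import re

# Hypothetical patterns for the legacy spellings this patch replaces.
LEGACY_PATTERNS = {
    "use 'of shape (...)'": re.compile(r":.*,\s*shape\s*\("),
    "use 'bool'": re.compile(r":\s*boolean\b"),
    "use '(...)' for shapes": re.compile(r":.*\bshape\s*\["),
}


def find_legacy_type_lines(docstring):
    """Return (line_number, hint) pairs for legacy type descriptions."""
    findings = []
    for number, line in enumerate(docstring.splitlines(), start=1):
        for hint, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, hint))
    return findings


# Spellings taken from the removed ("-") lines of the diff.
old_doc = """
Parameters
----------
X : ndarray, shape (n_samples, n_features)
    Input data.
safe : boolean, default=True
    Whether to be safe.
y : numpy array of shape [n_samples], default=None
    Target values.
"""

# Spellings taken from the added ("+") lines of the diff.
new_doc = """
Parameters
----------
X : ndarray of shape (n_samples, n_features)
    Input data.
safe : bool, default=True
    Whether to be safe.
y : array-like of shape (n_samples,), default=None
    Target values.
"""

old_findings = find_legacy_type_lines(old_doc)
new_findings = find_legacy_type_lines(new_doc)
print(old_findings)
print(new_findings)
```

A line-based regex is deliberately crude: it does not parse the docstring sections, so it is only suitable as a quick review aid, not as a replacement for a docstring linter.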

@brigitteunger
Contributor Author

@glemaitre: I fixed the input and output descriptions as you suggested. Do you have time for another review?

@brigitteunger
Contributor Author

@adrinjalali: I fixed the input and output descriptions as @glemaitre suggested. Do you have time for a review?

sklearn/base.py Outdated
     ----------
-    estimator : estimator object, or list, tuple or set of objects
-        The estimator or group of estimators to be cloned
+    estimator : {list, tuple, set} of estimator object or estimator object
Member

Nit:

Suggested change
-    estimator : {list, tuple, set} of estimator object or estimator object
+    estimator : {list, tuple, set} of estimator objects or estimator object

Contributor Author

Done. Thank you very much for your review!
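
The `estimator` description settled in this thread says that `clone` accepts either a single estimator or a list, tuple, or set of them, and the `safe` entry in the same docstring says that objects that are not estimators are deep-copied when `safe` is false. A simplified sketch of that dispatch is shown below. `toy_clone` and `ToyEstimator` are hypothetical names, the code is not scikit-learn's implementation, and it assumes that an estimator is any object exposing `get_params`.

```python
import copy


class ToyEstimator:
    """Hypothetical stand-in for an estimator; not a scikit-learn class."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self):
        return {"alpha": self.alpha}


def toy_clone(estimator, safe=True):
    """Simplified illustration of the behaviour the docstring describes."""
    # Collections are rebuilt with the same container type, element by element.
    if isinstance(estimator, (list, tuple, set)):
        return type(estimator)(toy_clone(e, safe=safe) for e in estimator)
    # Assumption: anything without get_params is "not an estimator".
    if not hasattr(estimator, "get_params"):
        if safe:
            raise TypeError(f"Cannot clone {estimator!r}: not an estimator")
        return copy.deepcopy(estimator)
    # Build a fresh, unfitted object from the constructor parameters only.
    return type(estimator)(**copy.deepcopy(estimator.get_params()))


original = ToyEstimator(alpha=0.5)
original.fitted_ = True  # fitted state that a clone should not carry over

single = toy_clone(original)
pair = toy_clone((original, ToyEstimator(alpha=2.0)))
plain = toy_clone({"not": "an estimator"}, safe=False)
```

With `safe=True`, passing a non-estimator such as the dictionary above raises `TypeError` in this sketch instead of returning a copy.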

@thomasjpfan thomasjpfan merged commit 865069c into scikit-learn:master May 10, 2020
@thomasjpfan
Member

Thank you @brigitteunger !

@brigitteunger brigitteunger deleted the docu branch May 11, 2020 07:47
adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request May 11, 2020
…16195)

* Fix documentation of default values in birch

* fix input and output description as suggested

* fix typo

Co-authored-by: Brigitte@home <unger@nue.tu-berlin.de>
adrinjalali pushed a commit that referenced this pull request May 12, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
3 participants