Make random_state descriptions more informative and refer to Glossary #10548
Description
We recently added a Glossary to our documentation, which describes common parameters, among other things. We should now rewrite the descriptions of random_state parameters so that they are more concise and informative (see #10415). For example, instead of
random_state : int, RandomState instance or None, optional, default: None
    If int, random_state is the seed used by the random number generator;
    If RandomState instance, random_state is the random number generator;
    If None, the random number generator is the RandomState instance used
    by `np.random`.
in both KMeans and MiniBatchKMeans, we might have:
KMeans:
random_state : int, RandomState instance, default=None
    Determines random number generation for centroid initialization.
    Pass an int for reproducible results across multiple function calls.
    See :term:`Glossary <random_state>`.
MiniBatchKMeans:
random_state : int, RandomState instance, default=None
    Determines random number generation for centroid initialization and
    random reassignment.
    Pass an int for reproducible results across multiple function calls.
    See :term:`Glossary <random_state>`.
The description should therefore focus on how random_state affects the algorithm.
Contributors interested in making this change should initially take on one module at a time.
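As a rough illustration of the three cases in the old description, and of the "pass an int for reproducible results" advice in the new one, here is a minimal NumPy-only sketch. `resolve_random_state` and `init_centroids` are hypothetical names invented for this example, not scikit-learn functions (scikit-learn's own helper for this job is `sklearn.utils.check_random_state`). The centroid selection is a deliberate simplification and is not how KMeans actually initializes.

```python
import numbers

import numpy as np


def resolve_random_state(random_state=None):
    """Turn a ``random_state`` argument into a RandomState instance.

    Hypothetical helper for illustration only; it mirrors the three cases
    listed in the old docstring.
    """
    if random_state is None:
        # None: fall back to the global RandomState instance used by np.random
        # (a private NumPy attribute, used here only for the sketch).
        return np.random.mtrand._rand
    if isinstance(random_state, numbers.Integral):
        # int: used as the seed of a new random number generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # RandomState instance: used as the generator directly.
        return random_state
    raise ValueError(f"{random_state!r} cannot be used to seed a RandomState")


def init_centroids(X, n_clusters, random_state=None):
    """Pick ``n_clusters`` distinct rows of ``X`` as initial centroids.

    Parameters
    ----------
    X : ndarray of shape (n_samples, n_features)
    n_clusters : int
    random_state : int, RandomState instance, default=None
        Determines random number generation for centroid initialization.
        Pass an int for reproducible results across multiple function calls.
        See :term:`Glossary <random_state>`.
    """
    rng = resolve_random_state(random_state)
    indices = rng.choice(X.shape[0], size=n_clusters, replace=False)
    return X[indices]


X = np.arange(20, dtype=float).reshape(10, 2)
first = init_centroids(X, n_clusters=3, random_state=0)
second = init_centroids(X, n_clusters=3, random_state=0)
# The same int seed gives the same centroids on every call.
same_with_int = np.array_equal(first, second)
```

Note that the docstring of `init_centroids` says what the randomness is used for (centroid initialization) instead of restating the int / RandomState / None cases, which is the point of the proposed change.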
The list of estimators to be modified is the following:
List of files to modify (using kwinata's script):
- sklearn/utils/random.py - 39 Open PR
- sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py - 736, 918
- sklearn/ensemble/_hist_gradient_boosting/binning.py - 37, 112
- sklearn/ensemble/_weight_boosting.py - 188, 324, 479, 900, 1022
- sklearn/decomposition/_dict_learning.py - 364, 485, 692, 1135, 1325 Open PR
- sklearn/decomposition/_fastica.py - 205, 344 Open PR
- sklearn/decomposition/_pca.py - 192 Open PR
- sklearn/decomposition/_sparse_pca.py - 82, 285 Open PR
- sklearn/decomposition/_lda.py - 60, 79, 225 Open PR
- sklearn/cluster/_kmeans.py - 56, 241, 380, 583, 700, 1150, 1370
- sklearn/linear_model/_ransac.py - 152 Open PR
- sklearn/linear_model/_coordinate_descent.py - 580, 860, 1313, 1487, 1665, 1851, 2016, 2192 Open PR
- sklearn/linear_model/_sag.py - 154 Open PR
- sklearn/linear_model/_passive_aggressive.py - 76, 322 Open PR
- sklearn/linear_model/_logistic.py - 587, 924, 1100, 1658 Open PR
- sklearn/linear_model/_stochastic_gradient.py - 369, 811, 1419 Open PR
- sklearn/linear_model/_ridge.py - 325, 693, 853 Open PR
- sklearn/svm/_base.py - 853 Open PR
- sklearn/datasets/_samples_generator.py - 127, 323, 440, 531, 618, 688, 767, 904, 965, 1030, 1106, 1159, 1218, 1258, 1307, 1368, 1420, 1483, 1571, 1662
- sklearn/model_selection/_split.py - 382, 588, 1091, 1196, 1250, 1390, 1492, 1605, 2049 Open PR
- sklearn/neural_network/_multilayer_perceptron.py - 782, 1174
- sklearn/covariance/_robust_covariance.py - 63, 233, 328, 545
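The numbers above will drift as files change, so contributors may want to re-locate the entries themselves. The following is a hypothetical sketch of one way to do that. It is not the script referenced above, whose contents are not shown here, and it assumes the numbers in the list are line numbers of numpydoc-style `random_state :` parameter entries. The names `find_random_state_lines` and `scan_package` are made up for this example.

```python
import re
from pathlib import Path

# Matches a numpydoc-style parameter entry such as "random_state : int, ...".
# Assumption: entries start the line (after indentation) with "random_state :".
PARAM_PATTERN = re.compile(r"^\s*random_state\s*:")


def find_random_state_lines(text):
    """Return 1-based line numbers of ``random_state :`` entries in ``text``."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if PARAM_PATTERN.match(line)
    ]


def scan_package(root):
    """Map each .py file under ``root`` to its matching line numbers."""
    results = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = find_random_state_lines(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results


# Small in-memory example so the sketch can be tried without a checkout.
sample = '''def f(random_state=None):
    """Example.

    Parameters
    ----------
    random_state : int, RandomState instance, default=None
        Determines random number generation.
    """
'''
sample_hits = find_random_state_lines(sample)
```

From the root of a scikit-learn checkout, `scan_package("sklearn")` would return a mapping from file path to line numbers, which could then be compared against the list above.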