[MRG + 1] move custom error/warning classes into sklearn.exceptions (and move deprecated away from utils.__init__.py) #4826
74a6ef6 to b9faf05
|
@larsmans Thanks for the review!! I assumed Also I agree that |
|
Why did you rename the NonBlasDot warning? |
|
Lars felt Can I keep it as |
|
I didn't see his comment. it's fine then. Should we deprecate? I have no strong opinion and I'd be fine with merging as-is. |
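To make the rename question concrete, here is a minimal sketch of one backward-compatible option: keep the old name importable as a subclass of the new one, so that code filtering on the new category also matches warnings raised under the old name. The class names below are hypothetical stand-ins for illustration and are not claimed to be the code in this PR.

```python
import warnings


class ExampleEfficiencyWarning(UserWarning):
    """Hypothetical new, more general warning category."""


class ExampleNonBLASDotWarning(ExampleEfficiencyWarning):
    """Hypothetical old name, kept as a subclass of the new category."""


# Record every warning so the category can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("falling back to a slower routine", ExampleNonBLASDotWarning)

# A check against the new category still matches the old name.
matches_new_category = issubclass(caught[0].category, ExampleEfficiencyWarning)
```

Whether the old name should additionally emit a deprecation notice is a separate decision; this sketch only shows that the subclass relationship keeps existing filters working.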
|
Thanks for the review! :) @larsmans One final look at this? |
|
@jnothman @agramfort Could I trouble you for a review? :)
|
sklearn/exceptions.py
When taken out of context of a particular module, these need docstrings to explain when they should be used or expected. It may even be appropriate to add these to doc/modules/classes.rst as part of the public API.
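As an illustration of the kind of docstring being asked for, here is a minimal sketch of an exception class that documents when it is raised, together with a toy estimator that raises it. `ExampleNotFittedError` and `ExampleEstimator` are hypothetical names, and the choice of base classes is an assumption made for this example only.

```python
class ExampleNotFittedError(ValueError, AttributeError):
    """Raised when an estimator is used before it has been fitted.

    Hypothetical class for illustration. Inheriting from both
    ValueError and AttributeError lets callers that already catch
    either of those keep working.
    """


class ExampleEstimator:
    """Toy estimator that only remembers how many samples it saw."""

    def fit(self, X):
        self.n_samples_ = len(X)
        return self

    def predict(self, X):
        # Fitted attributes carry a trailing underscore by convention.
        if not hasattr(self, "n_samples_"):
            raise ExampleNotFittedError("Call fit before predict.")
        return [0] * len(X)
```

A caller can then write `except ExampleNotFittedError:` without knowing which module originally defined the check, which is the point of giving such classes a single documented home.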
d62316b to 66cedb6
|
Can I skip the doc tests for the added examples? |
sklearn/exceptions.py
|
The current doctests fail not because a warning is raised but because of errors in the doctests themselves. |
|
@amueller Thanks for the review! :) Also, do you feel the Examples are fine? Or should they be written from a dev's perspective (as an example of where (s)he'd use the Error/Warning...)? |
9156de2 to e2dd2a1
|
Please let me know if there is anything else to be done! Also do we need a user guide for the |
sklearn/exceptions.py
ENH NonBLASDotWarning -> EfficiencyWarning; Improve error message
DOC Add exceptions module to modules/classes.rst
MAINT Move ConvergenceWarning, UndefinedMetricWarning et al into exceptions
MAINT Remove ChangedBehaviorWarning from base
DOC/FIX Improve DataConversionWarning's docstring
|
Done! @GaelVaroquaux and @pletelli could you please review? :) |
|
Travis failed! |
|
Hmm Interesting failure... Looks like deprecating them has side effects that propagate back to the |
|
I think the last commit should fix it... It's not an ugly hack, I feel... |
|
Still failing. Are you running the tests on your computer? This might save you some time. |
|
Apologies! I got a bit overconfident and assumed it would pass :/ (This version passes on my machine!) |
|
and on travis too :) |
Maybe this guy should have a description.
|
👍 Merging. Good job! |
[MRG + 1] move custom error/warning classes into sklearn.exceptions (and move `deprecated` away from `utils.__init__.py`)
|
Wait.. Description for undefined metric warning? |
|
Anyway thanks for the review and merge :) |
|
Wait.. Description for undefined metric warning?
Oh, yeah. Can you send a new PR. I'll merge it ASAP.
|
|
Done and merged at #5478 |
Squashed commit messages - (For reference)
Major
-----
* ENH p --> n_labels
* FIX *ShuffleSplit: all float/invalid type errors at init and int error at split
* FIX make PredefinedSplit accept test_folds in constructor; Cleanup docstrings
* ENH+TST KFold: make rng to be generated at every split call for reproducibility
* FIX/MAINT KFold: make shuffle a public attr
* FIX Make CVIterableWrapper private.
* FIX reuse len_cv instead of recalculating it
* FIX Prevent adding *SearchCV estimators from the old grid_search module
* re-FIX In all_estimators: the sorting to use only the 1st item (name)
To avoid collision between the old and the new GridSearch classes.
* FIX test_validate.py: Use 2D X (1D X is being detected as a single sample)
* MAINT validate.py --> validation.py
* MAINT make the submodules private
* MAINT Support old cv/gs/lc until 0.19
* FIX/MAINT n_splits --> get_n_splits
* FIX/TST test_logistic.py/test_ovr_multinomial_iris:
pass predefined folds as an iterable
* MAINT expose BaseCrossValidator
* Update the model_selection module with changes from master
- From scikit-learn#5161
- - MAINT remove redundant p variable
- - Add check for sparse prediction in cross_val_predict
- From scikit-learn#5201 - DOC improve random_state param doc
- From scikit-learn#5190 - LabelKFold and test
- From scikit-learn#4583 - LabelShuffleSplit and tests
- From scikit-learn#5300 - shuffle the `labels` not the `indxs` in LabelKFold + tests
- From scikit-learn#5378 - Make the GridSearchCV docs more accurate.
- From scikit-learn#5458 - Remove shuffle from LabelKFold
- From scikit-learn#5466(scikit-learn#4270) - Gaussian Process by Jan Metzen
- From scikit-learn#4826 - Move custom error / warnings into sklearn.exceptions
Minor
-----
* ENH Make the KFold shuffling test stronger
* FIX/DOC Use the higher level model_selection module as ref
* DOC in check_cv "y : array-like, optional"
* DOC a supervised learning problem --> supervised learning problems
* DOC cross-validators --> cross-validation strategies
* DOC Correct Olivier Grisel's name ;)
* MINOR/FIX cv_indices --> kfold
* FIX/DOC Align the 'See also' section of the new KFold, LeaveOneOut
* TST/FIX imports on separate lines
* FIX use __class__ instead of classmethod
* TST/FIX import directly from model_selection
* COSMIT Relocate the random_state documentation
* COSMIT remove pass
* MAINT Remove deprecation warnings from old tests
* FIX correct import at test_split
* FIX/MAINT Move P_sparse, X, y defns to top; rm unused W_sparse, X_sparse
* FIX random state to avoid doctest failure
* TST n_splits and split wrapping of _CVIterableWrapper
* FIX/MAINT Use multilabel indicator matrix directly
* TST/DOC clarify why we conflate classes 0 and 1
* DOC add comment that this was taken from BaseEstimator
* FIX use of labels is not needed in stratified k fold
* Fix cross_validation reference
* Fix the labels param doc
--------------------
* ENH Reorganize classes/fn from grid_search into search.py
* ENH Reorganize classes/fn from cross_validation into split.py
* ENH Reorganize cls/fn from cross_validation/learning_curve into validate.py
* MAINT Merge _check_cv into check_cv inside the model_selection module
* MAINT Update all the imports to point to the model_selection module
* FIX use iter_cv to iterate through the new style/old style cv objs
* TST Add tests for the new model_selection members
* ENH Wrap the old-style cv obj/iterables instead of using iter_cv
* ENH Use scipy's binomial coefficient function comb for calculation of nCk
* ENH Few enhancements to the split module
* ENH Improve check_cv input validation and docstring
* MAINT _get_test_folds(X, y, labels) --> _get_test_folds(labels)
* TST if 1d arrays for X introduce any errors
* ENH use 1d X arrays for all tests;
* ENH X_10 --> X (global var)
Minor
-----
* ENH _PartitionIterator --> _BaseCrossValidator;
* ENH CVIterator --> CVIterableWrapper
* TST Import the old SKF locally
* FIX/TST Clean up the split module's tests.
* DOC Improve documentation of the cv parameter
* COSMIT consistently hyphenate cross-validation/cross-validator
* TST Calculate n_samples from X
* COSMIT Use separate lines for each import.
* COSMIT cross_validation_generator --> cross_validator
Commits merged manually
-----------------------
* FIX Document the random_state attribute in RandomSearchCV
* MAINT Use check_cv instead of _check_cv
* ENH refactor OVO decision function, use it in SVC for sklearn-like
decision_function shape
* FIX avoid memory cost when sampling from large parameter grids
ENH Major to Minor incremental enhancements to the model_selection
Squashed commit messages - (For reference)
Major
-----
* ENH p --> n_labels
* FIX *ShuffleSplit: all float/invalid type errors at init and int error at split
* FIX make PredefinedSplit accept test_folds in constructor; Cleanup docstrings
* ENH+TST KFold: make rng to be generated at every split call for reproducibility
* FIX/MAINT KFold: make shuffle a public attr
* FIX Make CVIterableWrapper private.
* FIX reuse len_cv instead of recalculating it
* FIX Prevent adding *SearchCV estimators from the old grid_search module
* re-FIX In all_estimators: the sorting to use only the 1st item (name)
To avoid collision between the old and the new GridSearch classes.
* FIX test_validate.py: Use 2D X (1D X is being detected as a single sample)
* MAINT validate.py --> validation.py
* MAINT make the submodules private
* MAINT Support old cv/gs/lc until 0.19
* FIX/MAINT n_splits --> get_n_splits
* FIX/TST test_logistic.py/test_ovr_multinomial_iris:
pass predefined folds as an iterable
* MAINT expose BaseCrossValidator
* Update the model_selection module with changes from master
- From #5161
- - MAINT remove redundant p variable
- - Add check for sparse prediction in cross_val_predict
- From #5201 - DOC improve random_state param doc
- From #5190 - LabelKFold and test
- From #4583 - LabelShuffleSplit and tests
- From #5300 - shuffle the `labels` not the `indxs` in LabelKFold + tests
- From #5378 - Make the GridSearchCV docs more accurate.
- From #5458 - Remove shuffle from LabelKFold
- From #5466(#4270) - Gaussian Process by Jan Metzen
- From #4826 - Move custom error / warnings into sklearn.exceptions
Minor
-----
* ENH Make the KFold shuffling test stronger
* FIX/DOC Use the higher level model_selection module as ref
* DOC in check_cv "y : array-like, optional"
* DOC a supervised learning problem --> supervised learning problems
* DOC cross-validators --> cross-validation strategies
* DOC Correct Olivier Grisel's name ;)
* MINOR/FIX cv_indices --> kfold
* FIX/DOC Align the 'See also' section of the new KFold, LeaveOneOut
* TST/FIX imports on separate lines
* FIX use __class__ instead of classmethod
* TST/FIX import directly from model_selection
* COSMIT Relocate the random_state documentation
* COSMIT remove pass
* MAINT Remove deprecation warnings from old tests
* FIX correct import at test_split
* FIX/MAINT Move P_sparse, X, y defns to top; rm unused W_sparse, X_sparse
* FIX random state to avoid doctest failure
* TST n_splits and split wrapping of _CVIterableWrapper
* FIX/MAINT Use multilabel indicator matrix directly
* TST/DOC clarify why we conflate classes 0 and 1
* DOC add comment that this was taken from BaseEstimator
* FIX use of labels is not needed in stratified k fold
* Fix cross_validation reference
* Fix the labels param doc
FIX/DOC/MAINT Addressing the review comments by Arnaud and Andy
COSMIT Sort the members alphabetically
COSMIT len_cv --> n_splits
COSMIT Merge 2 if; FIX Use kwargs
DOC Add my name to the authors :D
DOC make labels parameter consistent
FIX Remove hack for boolean indices; + COSMIT idx --> indices; DOC Add Returns
COSMIT preds --> predictions
DOC Add Returns and neatly arrange X, y, labels
FIX idx(s)/ind(s)--> indice(s)
COSMIT Merge if and else to elif
COSMIT n --> n_samples
COSMIT Use bincount only once
COSMIT cls --> class_i / class_i (ith class indices) -->
perm_indices_class_i
FIX/ENH/TST Addressing the final reviews
COSMIT c --> count
FIX/TST make check_cv raise ValueError for string cv value
TST nested cv (gs inside cross_val_score) works for diff cvs
FIX/ENH Raise ValueError when labels is None for label based cvs;
TST if labels is being passed correctly to the cv and that the
ValueError is being propagated to the cross_val_score/predict and grid
search
FIX pass labels to cross_val_score
FIX use make_classification
DOC Add Returns; COSMIT Remove scaffolding
TST add a test to check the _build_repr helper
REVERT the old GS/RS should also be tested by the common tests.
ENH Add a tuple of all/label based CVS
FIX raise VE even at get_n_splits if labels is None
FIX Fabian's comments
PEP8
Hijacks into @larsmans' #4309
(@larsmans apologies for proceeding without your reply... I thought this would be nice to have while fixing #2904 since 2 / 5 classes were in cross_validation.py and grid_search.py)
BTW I left out the `utils.arpack.ArpackError` and `utils.tests.test_estimator_checks.CorrectNotFittedError`.
Please review @amueller @larsmans @ogrisel @agramfort :)