[MRG + 1] Add fowlkes-mallows and other supervised cluster metrics to SCORERS dict so it can be used in hyper-param search #8117
Conversation
Currently no clustering scores are listed there, are they? That's because clusterers don't generally implement
Ah, so no clustering metric is added :/

This doesn't work:

    grid_search = GridSearchCV(km, param_grid=dict(n_clusters=[2, 3, 4]),
                               scoring='fowlkes_mallows_score')
    grid_search.fit(X, y)
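Until the cluster metrics are in the SCORERS dict, a workaround that should work is wrapping the metric with make_scorer (a sketch; the estimator and data here are made up for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import fowlkes_mallows_score, make_scorer
from sklearn.model_selection import GridSearchCV

X, y = make_blobs(n_samples=60, centers=3, random_state=0)
km = KMeans(random_state=0)

# make_scorer wraps the metric so GridSearchCV can evaluate it as
# fowlkes_mallows_score(y, estimator.predict(X)) on each CV split.
fm_scorer = make_scorer(fowlkes_mallows_score)
grid_search = GridSearchCV(km, param_grid=dict(n_clusters=[2, 3, 4]),
                           scoring=fm_scorer)
grid_search.fit(X, y)
```

This sidesteps the scoring-string lookup entirely, so it works whether or not the SCORERS entry exists.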
Should we add all the clustering metrics?

(All cluster metrics that use supervised evaluation, i.e. compare true and predicted labels like a classification metric?)
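For reference, these supervised cluster metrics take (labels_true, labels_pred) like a classification metric, but unlike one they are invariant to permutations of the cluster ids; a quick illustration:

```python
from sklearn.metrics import (accuracy_score, adjusted_rand_score,
                             fowlkes_mallows_score, v_measure_score)

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [1, 1, 0, 0, 2, 2]  # same partition, cluster ids permuted

# A classification metric penalizes the relabeling...
print(accuracy_score(labels_true, labels_pred))         # 0.333...
# ...while supervised cluster metrics see a perfect match.
print(adjusted_rand_score(labels_true, labels_pred))    # 1.0
print(fowlkes_mallows_score(labels_true, labels_pred))  # 1.0
print(v_measure_score(labels_true, labels_pred))        # 1.0
```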
Oh, strange. I don't mind other supervised measures being there, but really we need to deal with the scoring framework for clusterers. (Might be an interesting thing to shape up as a GSoC project??)
…On 27 December 2016 at 09:25, (Venkat) Raghav (Rajagopalan) < ***@***.***> wrote:
(All cluster metrics that compare true and predicted labels like a
classification metric?)
+1. Maybe we should start a dedicated issue or wiki page well before the GSoC timeline to sketch out the design, so the student can spend less time on API design and more time on implementation...
And this is ready for a review... All supervised cluster metrics have been added and there is a test for
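With this change, the snippet from earlier in the thread should work via the plain scoring string; roughly the shape of the new test (the toy data here is my own, not the one in the PR):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV

X, y = make_blobs(n_samples=60, centers=3, random_state=0)

# 'fowlkes_mallows_score' now resolves through the SCORERS dict, so an
# unsupervised estimator can be tuned against the true labels in y.
grid_search = GridSearchCV(KMeans(random_state=0),
                           param_grid=dict(n_clusters=[2, 3, 4]),
                           scoring='fowlkes_mallows_score')
grid_search.fit(X, y)
```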
please update
Done! :)
(Apart from another look by Joel for the 1st review) can I have the 2nd review too from @TomDLT @tguillemot @lesteve in parallel?
Thanks for the review @tguillemot :)
Another review from @amueller maybe? |
    from sklearn.metrics import (f1_score, r2_score, roc_auc_score, fbeta_score,
                                 log_loss, precision_score, recall_score)
    from sklearn.metrics.cluster import adjusted_rand_score
    from sklearn.metrics import cluster as cluster_module
I'd prefer you import all the necessary metrics directly, as is done on the lines before.
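In other words, a sketch of the suggested style (not the merged diff): import the cluster metrics by name alongside the classification ones, instead of pulling in the whole cluster module.

```python
# All of these are re-exported at the sklearn.metrics top level.
from sklearn.metrics import (adjusted_mutual_info_score, adjusted_rand_score,
                             completeness_score, fowlkes_mallows_score,
                             homogeneity_score, mutual_info_score,
                             normalized_mutual_info_score, v_measure_score)
```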
thx @raghavrv !
…o SCORERS dict so it can be used in hyper-param search (scikit-learn#8117)

* Add supervised cluster metrics to metrics.scorers
* Add all the supervised cluster metrics to the tests
* Add test for fowlkes_mallows_score in unsupervised grid search
* COSMIT: Clarify comment on CLUSTER_SCORERS
* Fix doctest
…ate on multiple metrics (#7388) * ENH cross_val_score now supports multiple metrics * DOCFIX permutation_test_score * ENH validate multiple metric scorers * ENH Move validation of multimetric scoring param out * ENH GridSearchCV and RandomizedSearchCV now support multiple metrics * EXA Add an example demonstrating the multiple metric in GridSearchCV * ENH Let check_multimetric_scoring tell if its multimetric or not * FIX For single metric name of scorer should remain 'score' * ENH validation_curve and learning_curve now support multiple metrics * MNT move _aggregate_score_dicts helper into _validation.py * TST More testing/ Fixing scores to the correct values * EXA Add cross_val_score to multimetric example * Rename to multiple_metric_evaluation.py * MNT Remove scaffolding * FIX doctest imports * FIX wrap the scorer and unwrap the score when using _score() in rfe * TST Cleanup the tests. Test for is_multimetric too * TST Make sure it registers as single metric when scoring is of that type * PEP8 * Don't use dict comprehension to make it work in python2.6 * ENH/FIX/TST grid_scores_ should not be available for multimetric evaluation * FIX+TST delegated methods NA when multimetric is enabled... 
TST Add general tests to GridSearchCV and RandomizedSearchCV * ENH add option to disable delegation on multimetric scoring * Remove old function from __all__ * flake8 * FIX revert disable_on_multimetric * stash * Fix incorrect rebase * [ci skip] * Make sure refit works as expected and remove irrelevant tests * Allow passing standard scorers by name in multimetric scorers * Fix example * flake8 * Address reviews * Fix indentation * Ensure {'acc': 'accuracy'} and ['precision'] are valid inputs * Test that for single metric, 'score' is a key * Typos * Fix incorrect rebase * Compare multimetric grid search with multiple single metric searches * Test X, y list and pandas input; Test multimetric for unsupervised grid search * Fix tests; Unsupervised multimetric gs will not pass until #8117 is merged * Make a plot of Precision vs ROC AUC for RandomForest varying the n_estimators * Add example to grid_search.rst * Use the classic tuning of C param in SVM instead of estimators in RF * FIX Remove scoring arg in deafult scorer test * flake8 * Search for min_samples_split in DTC; Also show f-score * REVIEW Make check_multimetric_scoring private * FIX Add more samples to see if 3% mismatch on 32 bit systems gets fixed * REVIEW Plot best score; Shorten legends * REVIEW/COSMIT multimetric --> multi-metric * REVIEW Mark the best scores of P/R scores too * Revert "FIX Add more samples to see if 3% mismatch on 32 bit systems gets fixed" This reverts commit ba766d9. * ENH Use looping for iid testing * FIX use param grid as scipy's stats dist in 0.12 do not accept seed * ENH more looping less code; Use small non-noisy dataset * FIX Use named arg after expanded args * TST More testing of the refit parameter * Test that in multimetric search refit to single metric, the delegated methods work as expected. 
* Test that setting probability=False works with multimetric too * Test refit=False gives sensible error * COSMIT multimetric --> multi-metric * REV Correct example doc * COSMIT * REVIEW Make tests stronger; Fix bugs in _check_multimetric_scorer * REVIEW refit param: Raise for empty strings * TST Invalid refit params * REVIEW Use <scorer_name> alone; recall --> Recall * REV specify when we expect scorers to not be None * FLAKE8 * REVERT multimetrics in learning_curve and validation_curve * REVIEW Simpler coding style * COSMIT * COSMIT * REV Compress example a bit. Move comment to top * FIX fit_grid_point's previous API must be preserved * Flake8 * TST Use loop; Compare with single-metric * REVIEW Use dict-comprehension instead of helper * REVIEW Remove redundant test * Fix tests incorrect braces * COSMIT * REVIEW Use regexp * REV Simplify aggregation of score dicts * FIX precision and accuracy test * FIX doctest and flake8 * TST the best_* attributes multimetric with single metric * Address @jnothman's review * Address more comments \o/ * DOCFIXES * Fix use the validated fit_param from fit's arguments * Revert alpha to a lower value as before * Using def instead of lambda * Address @jnothman's review batch 1: Fix tests / Doc fixes * Remove superfluous tests * Remove more superfluous testing * TST/FIX loop over refit and check found n_clusters * Cosmetic touches * Use zip instead of manually listing the keys * Fix inverse_transform * FIX bug in fit_grid_point; Allow only single score TST if fit_grid_point works as intended * ENH Use only ROC-AUC and F1-score * Fix typos and flake8; Address Andy's reviews MNT Add a comment on why we do such a transpose + some fixes * ENH Better error messages for incorrect multimetric scoring values +... ENH Avoid exception traceback while using incorrect scoring string * Dict keys must be of string type only * 1. Better error message for invalid scoring 2... 
Internal functions return single score for single metric scoring * Fix test failures and shuffle tests * Avoid wrapping scorer as dict in learning_curve * Remove doc example as asked for * Some leftover ones * Don't wrap scorer in validation_curve either * Add a doc example and skip it as dict order fails doctest * Import zip from six for python2.7 compat * Make cross_val_score return a cv_results-like dict * Add relevant sections to userguide * Flake8 fixes * Add whatsnew and fix broken links * Use AUC and accuracy instead of f1 * Fix failing doctests cross_validation.rst * DOC add the wrapper example for metrics that return multiple return values * Address andy's comments * Be less weird * Address more of andy's comments * Make a separate cross_validate function to return dict and a cross_val_score * Update the docs to reflect the new cross_validate function * Add cross_validate to toc-tree * Add more tests on type of cross_validate return and time limits * FIX failing doctests * FIX ensure keys are not plural * DOC fix * Address some pending comments * Remove the comment as it is irrelevant now * Remove excess blank line * Fix flake8 inconsistencies * Allow fit_times to be 0 to conform with windows precision * DOC specify how refit param is to be set in multiple metric case * TST ensure cross_validate works for string single metrics + address @jnothman's reviews * Doc fixes * Remove the shape and transform parameter of _aggregate_score_dicts * Address Joel's doc comments * Fix broken doctest * Fix the spurious file * Address Andy's comments * MNT Remove erroneous entry * Address Andy's comments * FIX broken links * Update whats_new.rst missing newline
…o SCORERS dict so it can be used in hyper-param search (scikit-learn#8117) * Add supervised cluster metrics to metrics.scorers * Add all the supervised cluster metrics to the tests * Add test for fowlkes_mallows_score in unsupervised grid search * COSMIT: Clarify comment on CLUSTER_SCORERS * Fix doctest
This adds to the SCORERS dict all cluster metrics that use supervised evaluation (comparing true and predicted labels), such as fowlkes_mallows_score.
Code to reproduce
At master
In this branch
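With this branch, the snippet from the discussion above works: a supervised cluster metric can be requested by name in hyper-parameter search. A minimal, self-contained sketch (the dataset and parameter values are illustrative assumptions, not from the PR):

```python
# Sketch: tune n_clusters with a supervised cluster metric by name.
# Dataset and grid values are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV

X, y = make_blobs(n_samples=100, centers=3, random_state=0)
km = KMeans(random_state=0)
grid_search = GridSearchCV(km, param_grid=dict(n_clusters=[2, 3, 4]),
                           scoring='fowlkes_mallows_score')
# The scorer compares predicted cluster labels against y on each
# validation split, so y must be passed to fit.
grid_search.fit(X, y)
print(grid_search.best_params_)
```

On master this raises a `ValueError` for the unknown scoring string; in this branch the search completes and `best_params_` reports the `n_clusters` value that maximizes the Fowlkes-Mallows index on the held-out splits.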
@tguillemot @jnothman