[WIP] ENH Multilabel confusion matrix #10628
Closed
jnothman wants to merge 6 commits into scikit-learn:master
This pull request fixes 2 alerts - view on lgtm.com fixed alerts:
Comment posted by lgtm.com
This pull request fixes 2 alerts when merging 542ec86 into e78263f - view on lgtm.com fixed alerts:
Comment posted by lgtm.com
Contributor
Hi @jnothman, I am continuing your work on this. I am not familiar with Codecov, though, and that check seems to be failing. Do I need to fix it?
Member
Author
Benchmarking means checking whether this is as fast as, or faster than, the existing
precision/recall implementation.
Codecov tells you whether there are tests that run every line of new code. There
should be.
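A benchmark of the kind described above could be sketched with the standard-library `timeit` module. Both functions below are hypothetical stand-ins written for this illustration (they are not scikit-learn code): one plays the role of the existing implementation, the other the candidate. In the real PR the two callables would be the metric on master and the metric on this branch.

```python
import random
import timeit

N_LABELS = 20


def tp_membership_scan(y_true, y_pred, n_labels):
    """Hypothetical baseline: test every label against every sample."""
    counts = [0] * n_labels
    for true_set, pred_set in zip(y_true, y_pred):
        for label in range(n_labels):
            if label in true_set and label in pred_set:
                counts[label] += 1
    return counts


def tp_set_intersection(y_true, y_pred, n_labels):
    """Hypothetical candidate: only visit labels present in both sets."""
    counts = [0] * n_labels
    for true_set, pred_set in zip(y_true, y_pred):
        for label in true_set & pred_set:
            counts[label] += 1
    return counts


rng = random.Random(0)  # fixed seed so both functions see the same data


def random_label_set():
    # Sparse multilabel target: each label is present with probability 0.1.
    return {label for label in range(N_LABELS) if rng.random() < 0.1}


y_true = [random_label_set() for _ in range(2000)]
y_pred = [random_label_set() for _ in range(2000)]

# A benchmark is only meaningful if both versions give the same answer.
assert tp_membership_scan(y_true, y_pred, N_LABELS) == tp_set_intersection(
    y_true, y_pred, N_LABELS
)

timings = {}
for func in (tp_membership_scan, tp_set_intersection):
    # Take the best of three repeats to reduce noise from other processes.
    runs = timeit.repeat(lambda: func(y_true, y_pred, N_LABELS), number=5, repeat=3)
    timings[func.__name__] = min(runs)
    print(f"{func.__name__}: {timings[func.__name__]:.4f} s for 5 calls")
```

Timing several input shapes (few versus many labels, dense versus sparse targets) is worthwhile, since an implementation can be faster in one regime and slower in another.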
Contributor
Member
Author
Yes, it looks a lot slower, at least in some cases. Can you profile and
work out where it's much slower?
sample_weight should be 1d. 2d should raise an exception.
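One way to find where the time goes is the standard-library `cProfile` module. The sketch below profiles a deliberately naive, hypothetical workload (`binarize_labels` and `true_positives_per_label` are invented for this illustration and are not part of scikit-learn); the same pattern would wrap a call to the metric under investigation.

```python
import cProfile
import io
import pstats


def binarize_labels(label_sets, n_labels):
    """Hypothetical helper: turn sets of labels into rows of 0/1 indicators."""
    return [[1 if label in s else 0 for label in range(n_labels)] for s in label_sets]


def true_positives_per_label(y_true, y_pred, n_labels):
    """Hypothetical metric step: count true positives for each label."""
    true_rows = binarize_labels(y_true, n_labels)
    pred_rows = binarize_labels(y_pred, n_labels)
    return [
        sum(t[label] * p[label] for t, p in zip(true_rows, pred_rows))
        for label in range(n_labels)
    ]


y_true = [{i % 5, (i + 1) % 5} for i in range(5000)]
y_pred = [{i % 5} for i in range(5000)]

profiler = cProfile.Profile()
profiler.enable()
tp = true_positives_per_label(y_true, y_pred, 5)
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show only the five most expensive entries
report = stream.getvalue()
print(report)
```

Comparing the report for the old and new code paths on the same input usually shows whether the extra time is spent in input validation, binarisation, or the counting itself.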

This PR considers a helper for multilabel/set-wise evaluation metrics such as precision, recall, fbeta, jaccard (#10083), fall-out, miss rate and specificity (#5516). It also incorporates suggestions from #8126 regarding efficiency of multilabel true positives calculation (but does not optimise for micro-average, perhaps unfortunately). Unlike `confusion_matrix` it is optimised for the multilabel case, but also handles multiclass problems like they are handled in `precision_recall_fscore_support`: as binarised OvR problems.

It benefits us by simplifying the `precision_recall_fscore_support` and future jaccard implementations greatly, and allows for further refactors between them. It also benefits us by making a clear calculation of sufficient statistics (although perhaps more statistics than necessary) from which standard metrics are a simple calculation: it makes the code less mystifying. In that sense, this is mostly a cosmetic change, but it provides users with the ability to easily generalise the P/R/F/S implementation to related metrics.

TODO:
- `multilabel_confusion_matrix` and use it in `precision_recall_fscore_support` as an indirect form of testing
- `multilabel_confusion_matrix`
- `multilabel_confusion_matrix`

If another contributor would like to take this on, I would welcome it. I have marked this as Easy because the code and technical knowledge involved is not hard, but it will take a bit of work, and clarity of understanding.
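To make the idea of "sufficient statistics" concrete, here is a minimal pure-Python sketch of per-label confusion counts and the metrics that follow from them. It is an illustration only, not the PR's implementation: the function names `multilabel_confusion_counts` and `derived_metrics`, the `[[tn, fp], [fn, tp]]` layout, and the choice to return 0.0 on a zero denominator are all assumptions made for this example.

```python
def multilabel_confusion_counts(y_true, y_pred, n_labels):
    """Return one [[tn, fp], [fn, tp]] matrix per label (assumed layout).

    y_true and y_pred are lists of sets of label indices, one set per sample.
    """
    matrices = []
    for label in range(n_labels):
        tn = fp = fn = tp = 0
        for true_set, pred_set in zip(y_true, y_pred):
            actual = label in true_set
            predicted = label in pred_set
            if actual and predicted:
                tp += 1
            elif actual:
                fn += 1
            elif predicted:
                fp += 1
            else:
                tn += 1
        matrices.append([[tn, fp], [fn, tp]])
    return matrices


def safe_div(numerator, denominator):
    # Assumption for this sketch: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def derived_metrics(matrix):
    """Compute standard set-wise metrics from one label's 2x2 counts."""
    (tn, fp), (fn, tp) = matrix
    return {
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "fall_out": safe_div(fp, fp + tn),
        "miss_rate": safe_div(fn, fn + tp),
        "jaccard": safe_div(tp, tp + fp + fn),
    }


# Multilabel case: each sample carries a set of labels.
y_true = [{0, 1}, {1}, {0, 2}, set()]
y_pred = [{0}, {1, 2}, {0, 2}, {1}]
mcm = multilabel_confusion_counts(y_true, y_pred, n_labels=3)

# Multiclass case treated as binarised one-vs-rest problems:
# every sample becomes a singleton label set.
y_true_mc = [0, 1, 2, 2]
y_pred_mc = [0, 2, 2, 1]
mcm_mc = multilabel_confusion_counts(
    [{c} for c in y_true_mc], [{c} for c in y_pred_mc], n_labels=3
)

for label, matrix in enumerate(mcm):
    print(label, matrix, derived_metrics(matrix))
```

Once the four counts per label are available, each of the metrics named in the description is a one-line ratio, which is the simplification the description argues for.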