[MRG] EHN Provide a pos_label parameter in plot_roc_curve#17651
[MRG] EHN Provide a pos_label parameter in plot_roc_curve#17651glemaitre merged 20 commits intoscikit-learn:masterfrom
Conversation
glemaitre
left a comment
There was a problem hiding this comment.
Having the refactoring is nice. I would like to have the other 2 PRs in first thought.
Could you add en entry in whats new in 0.24
thomasjpfan
left a comment
There was a problem hiding this comment.
I think the refactor of _get_target_scores is nice.
I am +0 on the refactor into a new base class. Is there another visualization we would like to add that would use this new base class?
sklearn/metrics/_plot/base.py
Outdated
| estimator_name : str, default=None | ||
| Name of estimator. If None, then the estimator name is not shown. | ||
|
|
||
| pos_label : str or int, default=None |
There was a problem hiding this comment.
Having pos_label here makes this base class less generic. We would only be able to use this to plot curves related to classification.
There was a problem hiding this comment.
are there any plot classes you believe should use the CurveDisplay class?
There was a problem hiding this comment.
@thomasjpfan Do we have curves for regression?
I assume that we could make a trick to find what type of data we handle since we have y and to know if this is a classification problem. If this is not, we can discard pos_label
There was a problem hiding this comment.
I'll be happy with renaming it to ClassificationCurveDisplay
I forgot I had this comment still here.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
|
@claramatos The 2 other PRs have been merged so we can go forward here. You should resolve the conflict and once done, I will look at the tests. |
|
Sorry I had to revert #17594 that I merged too quickly (see the discussion at the end for the details). |
|
Since #17569 is closed can this be closed to? (I've already merged the most recent master version) @glemaitre |
thomasjpfan
left a comment
There was a problem hiding this comment.
I think PR is doing three things.
- Refactor into
CurveDisplay, which I am concerned with. I am thinking of how this plays out when we want to support multiclass with many curves. I didn't do this type of refactor at first because I wanted the ROC and PR curve to evolve independently of each other. - Refactor into
_get_target_scores, which I am happy with. - Add
pos_labelintoplot_roc_curve, which I am happy with.
@claramatos Can we make this PR only about 2 and 3 and have 1 for a follow up PR?
sklearn/metrics/_plot/base.py
Outdated
| estimator_name : str, default=None | ||
| Name of estimator. If None, then the estimator name is not shown. | ||
|
|
||
| pos_label : str or int, default=None |
There was a problem hiding this comment.
I'll be happy with renaming it to ClassificationCurveDisplay
I forgot I had this comment still here.
|
So your suggestion would be to remove the |
|
Sorry for being unclear. I am suggesting to remove |
|
@thomasjpfan I followed your suggestion and removed |
sklearn/metrics/_plot/base.py
Outdated
| return prediction_method | ||
|
|
||
|
|
||
| def _get_target_scores(X, estimator, response_method, pos_label=None): |
There was a problem hiding this comment.
Using the word "scores" here may get confused with metrics and scorers. Maybe _get_response is more clear?
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
thomasjpfan
left a comment
There was a problem hiding this comment.
LGTM Thank you @claramatos !
|
Actually this is fine since we don't compute the ROC-AUC via Thanks @claramatos |
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
closes #15573
relates with #17569
Add a pos_label parameter to specify which class to be the positive class when
estimator.classes_[1]is not the right choice when usingplot_roc_curve.