ENH add from_cv_results in PrecisionRecallDisplay (single Display)#30508
ENH add from_cv_results in PrecisionRecallDisplay (single Display)#30508glemaitre merged 61 commits intoscikit-learn:mainfrom
from_cv_results in PrecisionRecallDisplay (single Display)#30508Conversation
lucyleeow
left a comment
There was a problem hiding this comment.
I forgot to mention, I think I would like to decide on the order parameters for these display classes and their methods. They seem to have a lot of overlap and it would be great if they could be consistent.
I know that this would not matter when using the methods but it would be nice for the documentation API page if they were consistent?
| name_ = [name_] * n_multi if name_ is None else name_ | ||
| average_precision_ = ( | ||
| [None] * n_multi if self.average_precision is None else self.average_precision |
There was a problem hiding this comment.
I don't like this, but could not immediately think of a better way to do it
| precision_all.append(precision) | ||
| recall_all.append(recall) | ||
| ap_all.append(average_precision) |
There was a problem hiding this comment.
Don't like this but not sure on the zip suggested in #30399 (comment) as you've got to unpack at the end 🤔
lucyleeow
left a comment
There was a problem hiding this comment.
Some notes on review suggestions. Namely to make all the multi class params (precisions, recalls etc) list of ndarrays.
Also realised we did not need separate plot_single_curve function, as most of the complexity was in _get_line_kwargs
|
Just wanted to document here that we discussed a potential enhancement for comparing between estimators, where you have cv results from several estimators (so several fold curves for each estimator). Potentially this could be added as a separate function, where you pass the display object, and estimators desired. Not planned, just a potential additional in future. |
|
Hey, I think that you can revive this PR now that the roc curve is merged. Let's try to reuse code from the other PR if possible :) |
|
Thanks @jeremiedbb ! I think @glemaitre mentioned there was some discussion about what to do with the 'chance' level (average precision). In the current PR I have calculated a single average precision (AP) for all the data. I think others suggested that we should calculate average precision for each fold, which I can see is more accurate but I am concerned about the visualization appearance. Here I have used 5 cv splits and plotting chance for each, and colouring each pair of precision-recall curve/chance line the same colour: Some concerns about the visualization:
I will have more of a think of a better solution for this. |
|
So for the "Chance level" I would consider all lines to have the same color (and a lower alpha) and in the legend to have a single entry showing the mean + std. I would think it is enough. Also it is easy to link a chance level line with its PR curve because they meet when the recall is 1.0 |
There was a problem hiding this comment.
The test_precision_recall_display.pychanges should mostly be adding from_cv_results to the parametrization of tests. Aside from that there are 2 extra from_cv_results tests.
There is an extra test_display_kwargs_deprecation in test_common_curve_display.py (which lets us delete the same test in test_roc_curve_display.py) - but I think this is okay to leave without needed a new PR. The previous merge conflicts have been enough of a battle.
|
@jeremiedbb hopefully this is ready for last review now 🙏 |
jeremiedbb
left a comment
There was a problem hiding this comment.
A new round of comments. Mostly nitpicks at this stage.
|
Done, thank you @jeremiedbb ! |
|
Any more changes @jeremiedbb ? Thank you for review 🙏 @AnneBeyer maybe you may be interested in reviewing this? |
|
This #30508 (comment) hasn't been addressed yet |
7f12b40 to
6123938
Compare
glemaitre
left a comment
There was a problem hiding this comment.
LGTM as well.
Thanks @lucyleeow.
|
Thanks all! |
|
Looks like this is breaking the build doc on |

Reference Issues/PRs
Follows on from #30399
What does this implement/fix? Explain your changes.
Proof of concept of adding multi displays to
PrecisionRecallDisplayfrom_cv_resultsinRocCurveDisplay(singleRocCurveDisplay) #30399, so we can definitely factorize out, though small intricacies may make it complexplotmethod is complex due to handling both single and multi curve and doing a lot more checking, as user is able to use it outside of thefrom_estimatorandfrom_predictionsmethods.Detailed discussions of problems in review comments.
Any other comments?
cc @glemaitre @jeremiedbb