
ENH add from_cv_results in PrecisionRecallDisplay (single Display)#30508

Merged
glemaitre merged 61 commits into scikit-learn:main from lucyleeow:from_cv_precisionrecall
Feb 23, 2026

Conversation

@lucyleeow
Member

Reference Issues/PRs

Follows on from #30399

What does this implement/fix? Explain your changes.

Proof of concept of adding multi displays to PrecisionRecallDisplay

  • A lot of the code is similar to that in ENH add from_cv_results in RocCurveDisplay (single RocCurveDisplay) #30399, so we can definitely factor it out, though small intricacies may make it complex
  • The plot method is complex due to handling both single and multiple curves and doing a lot more checking, as the user is able to call it outside of the from_estimator and from_predictions methods.

Detailed discussions of problems in review comments.

Any other comments?

cc @glemaitre @jeremiedbb

@github-actions

github-actions bot commented Dec 19, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9aded6c. Link to the linter CI: here

Member Author

@lucyleeow lucyleeow left a comment

I forgot to mention: I think I would like us to decide on the order of parameters for these display classes and their methods. They have a lot of overlap and it would be great if they were consistent.

I know that this would not matter when using the methods, but it would be nice for the API documentation page if they were consistent?

Comment on lines +272 to +274
name_ = [name_] * n_multi if name_ is None else name_
average_precision_ = (
[None] * n_multi if self.average_precision is None else self.average_precision
Member Author

I don't like this, but could not immediately think of a better way to do it

Comment on lines +836 to +838
precision_all.append(precision)
recall_all.append(recall)
ap_all.append(average_precision)
Member Author

I don't like this, but I'm not sure about the zip suggested in #30399 (comment), as you've got to unpack at the end 🤔
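For reference, a toy sketch of the two accumulation styles under discussion (fold_results and all names here are illustrative, not from the PR):

```python
# Illustrative per-fold results: (precision, recall, average_precision).
fold_results = [
    ([0.9, 0.8], [0.1, 1.0], 0.85),
    ([0.95, 0.7], [0.2, 1.0], 0.80),
]

# Style 1 (as in the diff): three parallel lists, one append per metric.
precision_all, recall_all, ap_all = [], [], []
for precision, recall, average_precision in fold_results:
    precision_all.append(precision)
    recall_all.append(recall)
    ap_all.append(average_precision)

# Style 2 (the zip suggestion): accumulate tuples, unpack once at the end.
precision_all2, recall_all2, ap_all2 = (list(x) for x in zip(*fold_results))

assert (precision_all, recall_all, ap_all) == (
    precision_all2,
    recall_all2,
    ap_all2,
)
```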

Member Author

@lucyleeow lucyleeow left a comment

Some notes on review suggestions, namely to make all the multi-curve params (precisions, recalls, etc.) lists of ndarrays.

Also realised we did not need a separate plot_single_curve function, as most of the complexity was in _get_line_kwargs.

@lucyleeow
Member Author

Just wanted to document here that we discussed a potential enhancement for comparing between estimators, where you have cv results from several estimators (so several fold curves for each estimator). Potentially this could be added as a separate function, where you pass the display object and the estimators desired. Not planned, just a potential addition in future.

@jeremiedbb
Member

Hey, I think that you can revive this PR now that the roc curve is merged. Let's try to reuse code from the other PR if possible :)

@lucyleeow
Member Author

Thanks @jeremiedbb !

I think @glemaitre mentioned there was some discussion about what to do with the 'chance' level (average precision). In the current PR I have calculated a single average precision (AP) for all the data. Others suggested that we should calculate average precision for each fold, which I can see is more accurate, but I am concerned about the visualization appearance.

Here I have used 5 cv splits, plotted the chance level for each fold, and coloured each pair of precision-recall curve and chance line the same colour:

[Figure: precision-recall curves for 5 CV splits, with a chance line per fold, each curve/chance pair sharing a colour]

Some concerns about the visualization:

  • it would not be unusual for APs to be the same, as has occurred above: the orange and blue lines have the same AP, and we can see that results in a single brown-ish line.
  • in the ROC curve display, we decided to colour all CV folds the same colour by default, as we thought this would be most appropriate:
    • as the number of folds increases, it would be hard to distinguish each line individually
    • we assume we are interested in comparing the results between estimators, thus it makes sense to colour all the cv results from one estimator the same colour

I will have more of a think about a better solution for this.

@glemaitre
Member

glemaitre commented Jun 13, 2025

So for the "Chance level" I would give all the lines the same colour (and a lower alpha) and have a single legend entry showing the mean + std. I would think that is enough. Also, it is easy to link a chance level line with its PR curve because they meet when the recall is 1.0.
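A minimal matplotlib sketch of that suggestion (not the PR's implementation; the per-fold prevalences below are made up): all chance lines share one colour and a low alpha, with a single legend entry showing mean +/- std.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for the sketch
import matplotlib.pyplot as plt
import numpy as np

# Fake per-fold chance levels (for PR curves, the AP of a random
# classifier equals the positive class prevalence in that fold).
fold_prevalence = np.array([0.28, 0.31, 0.30, 0.27, 0.33])

fig, ax = plt.subplots()
label = (
    f"Chance level (AP = {fold_prevalence.mean():.2f} "
    f"+/- {fold_prevalence.std():.2f})"
)
for i, prevalence in enumerate(fold_prevalence):
    ax.axhline(
        prevalence,
        color="k",
        alpha=0.3,
        linestyle="--",
        # Label only the first line so the legend gets a single entry.
        label=label if i == 0 else None,
    )
ax.set(xlabel="Recall", ylabel="Precision")
ax.legend()
```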

@auguste-probabl auguste-probabl moved this from In progress - High Priority to In progress in Labs Feb 9, 2026
Member Author

@lucyleeow lucyleeow left a comment

The test_precision_recall_display.py changes should mostly be adding from_cv_results to the parametrization of tests. Aside from that, there are 2 extra from_cv_results tests.

There is an extra test_display_kwargs_deprecation in test_common_curve_display.py (which lets us delete the same test in test_roc_curve_display.py), but I think this is okay to leave without needing a new PR. The previous merge conflicts have been enough of a battle.

@lucyleeow
Member Author

lucyleeow commented Feb 11, 2026

@jeremiedbb hopefully this is ready for last review now 🙏

Member

@jeremiedbb jeremiedbb left a comment

A new round of comments. Mostly nitpicks at this stage.

@lucyleeow
Member Author

Done, thank you @jeremiedbb !

@lucyleeow
Member Author

Any more changes @jeremiedbb ? Thank you for review 🙏

@AnneBeyer maybe you would be interested in reviewing this?

Member

@jeremiedbb jeremiedbb left a comment

LGTM !

@jeremiedbb jeremiedbb added the Waiting for Second Reviewer First reviewer is done, need a second one! label Feb 13, 2026
@jeremiedbb jeremiedbb added this to the 1.9 milestone Feb 13, 2026
@jeremiedbb
Member

This #30508 (comment) hasn't been addressed yet

@lucyleeow lucyleeow force-pushed the from_cv_precisionrecall branch from 7f12b40 to 6123938 Compare February 17, 2026 05:59
Member

@glemaitre glemaitre left a comment

LGTM as well.

Thanks @lucyleeow.

@glemaitre glemaitre merged commit 7ea267f into scikit-learn:main Feb 23, 2026
37 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Visualization and displays Feb 23, 2026
@github-project-automation github-project-automation bot moved this from In progress to Done in Labs Feb 23, 2026
@lucyleeow lucyleeow deleted the from_cv_precisionrecall branch February 24, 2026 01:32
@lucyleeow
Member Author

Thanks all!

@lesteve
Member

lesteve commented Feb 24, 2026

Looks like this is breaking the doc build on main because examples/model_selection/plot_cost_sensitive_learning.py uses deprecated kwargs; see the build log:

  Unexpected failing examples (1):
    
        ../examples/model_selection/plot_cost_sensitive_learning.py failed leaving traceback:
    
        Traceback (most recent call last):
          File "/home/circleci/project/examples/model_selection/plot_cost_sensitive_learning.py", line 366, in <module>
            plot_roc_pr_curves(model, tuned_model, title=title)
          File "/home/circleci/project/examples/model_selection/plot_cost_sensitive_learning.py", line 302, in plot_roc_pr_curves
            PrecisionRecallDisplay.from_estimator(
          File "/home/circleci/project/sklearn/metrics/_plot/precision_recall_curve.py", line 550, in from_estimator
            return cls.from_predictions(
                   ^^^^^^^^^^^^^^^^^^^^^
          File "/home/circleci/project/sklearn/metrics/_plot/precision_recall_curve.py", line 728, in from_predictions
            return viz.plot(
                   ^^^^^^^^^
          File "/home/circleci/project/sklearn/metrics/_plot/precision_recall_curve.py", line 308, in plot
            curve_kwargs = self._validate_curve_kwargs(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "/home/circleci/project/sklearn/utils/_plotting.py", line 193, in _validate_curve_kwargs
            warnings.warn(
        FutureWarning: `**kwargs` is deprecated and will be removed in 1.11. Pass all matplotlib arguments to `curve_kwargs` as a dictionary instead.
    
    -------------------------------------------------------------------------------
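The fix implied by the warning is to stop spreading matplotlib arguments as **kwargs and pass them in a curve_kwargs dict instead. A toy re-creation of the deprecation logic (not scikit-learn's actual code; the function name and arguments here are illustrative):

```python
import warnings


def validate_curve_kwargs(curve_kwargs=None, **kwargs):
    """Toy stand-in for the folding done in sklearn/utils/_plotting.py."""
    if kwargs:
        warnings.warn(
            "`**kwargs` is deprecated and will be removed in 1.11. Pass all "
            "matplotlib arguments to `curve_kwargs` as a dictionary instead.",
            FutureWarning,
        )
        curve_kwargs = {**(curve_kwargs or {}), **kwargs}
    return curve_kwargs or {}


# Old call style (what the example did; emits the FutureWarning):
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = validate_curve_kwargs(linestyle="--")

# New call style (the fix; silent):
fixed = validate_curve_kwargs(curve_kwargs={"linestyle": "--"})

assert legacy == fixed == {"linestyle": "--"}
assert caught[0].category is FutureWarning
```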


Labels

module:metrics module:utils Waiting for Second Reviewer First reviewer is done, need a second one!
