FEA add confusion_matrix_at_thresholds #30134
Conversation
adrinjalali
left a comment
You'd also need to add this in api_reference.py under the right section to have it rendered in the docs properly.
@glemaitre you happy with the name?
I think I'm fine with the name. I was looking if we can have the word
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
@adrinjalali There is an issue with numpydoc validation in the
When you look at the CI log, this is the error, not the return: [gw0] linux -- Python 3.12.7 /usr/share/miniconda/envs/testvenv/bin/python
810 Decreasing score values.
811
812 Examples
813 -------
814 >>> import numpy as np
815 >>> from sklearn.metrics import binary_classification_curve
816 >>> y_true = np.array([0, 0, 1, 1])
817 >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
818 >>> fps, tps, thresholds = binary_classification_curve(y_true, y_scores)
819 >>> fps
Expected:
array([0, 1, 1, 2])
Got:
array([0., 1., 1., 2.])

You just need to fix the output to floats.
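For context, here is a minimal numpy-only sketch of the kind of computation the doctest above exercises (the cumulative-sum approach mirrors the doctest's expected values; the actual scikit-learn implementation also supports sample weights, which is why its counts come out as floats):

```python
import numpy as np

def binary_clf_curve_sketch(y_true, y_score):
    # Sketch only: illustrates per-threshold fp/tp counting, not the real sklearn code.
    # Sort samples by decreasing score.
    desc = np.argsort(y_score, kind="stable")[::-1]
    y_score = y_score[desc]
    y_true = y_true[desc]
    # Candidate thresholds sit at the last index of each run of equal scores.
    distinct = np.where(np.diff(y_score))[0]
    threshold_idxs = np.r_[distinct, y_true.size - 1]
    # Cumulative true positives at each threshold; false positives are the rest.
    tps = np.cumsum(y_true)[threshold_idxs]
    fps = 1 + threshold_idxs - tps
    return fps, tps, y_score[threshold_idxs]

fps, tps, thresholds = binary_clf_curve_sketch(
    np.array([0, 0, 1, 1]), np.array([0.1, 0.4, 0.35, 0.8])
)
print(fps)  # [0 1 1 2]
```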
jeremiedbb
left a comment
Thanks for the PR @SuccessMoses. I directly pushed the requested change to return negatives as well. I also added a small smoke test; we don't need more since it's heavily tested through the other curve functions.
LGTM.
I wonder if
After some discussion with @glemaitre and @ogrisel, we converged toward
I tested with both and I find the second option a bit too long since it makes all calls to this function multi-line, hence hurting the readability a bit. Since a confusion matrix per threshold doesn't make sense for multiclass (with a single threshold, as we used to do in sklearn), it didn't feel that necessary. So in the end I went for
# For binary problems, :func:`sklearn.metrics.confusion_matrix` has the ``ravel`` method
# we can use to get counts of true negatives, false positives, false negatives and
# true positives.
should we remove this paragraph? It's not clear to me here.
I've added a line, just to link the two paragraphs, but not sure if it is ideal either.
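For reference, a quick example of the ``ravel`` idiom the quoted paragraph describes (the data here is made up for illustration):

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1]
y_pred = [0, 1, 0, 1]
# For binary problems the 2x2 matrix flattens row-major to (tn, fp, fn, tp).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 1 1 1 1
```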
Docs are failing @jeremiedbb
Just need to update the import in
confusion_matrix_at_thresholds
lucyleeow
left a comment
Do we want to amend the tests names (e.g., test_binary_clf_curve_multiclass_error) to use the new name?
# we can use to get counts of true negatives, false positives, false negatives and
# true positives.
#
# :func:`sklearn.metrics.binary_classification_curve`
(2, 1, 2, 3)
With :func:`confusion_matrix_at_thresholds` we can get true negatives, false positives,
false negatives and true positives for different thresholds::
For the new user, should we expand on how we determine the thresholds used?
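As a rough answer to that question: in sklearn's curve functions, the candidate thresholds are the distinct predicted scores, taken in decreasing order. A numpy-only sketch of per-threshold counts (the function name and signature here are illustrative, not the actual PR API):

```python
import numpy as np

def confusion_matrix_at_thresholds_sketch(y_true, y_score):
    # Sketch only: one (tn, fp, fn, tp) row per distinct score value.
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Candidate thresholds: distinct scores, in decreasing order.
    thresholds = np.unique(y_score)[::-1]
    rows = []
    for t in thresholds:
        y_pred = y_score >= t
        tp = np.sum(y_pred & (y_true == 1))
        fp = np.sum(y_pred & (y_true == 0))
        fn = np.sum(~y_pred & (y_true == 1))
        tn = np.sum(~y_pred & (y_true == 0))
        rows.append((tn, fp, fn, tp))
    return np.asarray(rows), thresholds

rows, thresholds = confusion_matrix_at_thresholds_sketch(
    [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]
)
print(rows[0])  # counts at the highest threshold: [2 0 1 1]
```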
lucyleeow
left a comment
I've directly pushed my suggested changes, to help move this along, and fixed the doc failure.
Also checked the docs render okay.
Actually, still one last comment we may want to address here:
@adrinjalali or @jeremiedbb feel free to merge if you're happy!
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Fixes #16470
Any other comments?
In `sklearn/metrics/_ranking.py`, I changed the name of the function `_binary_clf_curve` to `binary_classification_curve` without changing the body. I also renamed test functions like `test_binary_clf_curve_multiclass_error` without changing their bodies. `det_curve`, `roc_curve` and `precision_recall_curve` call this function, so I updated its name in their bodies.