
FEA Add array API support to coverage_error#32626

Open
jaffourt wants to merge 11 commits into scikit-learn:main from jaffourt:aapi_coverage_error

Conversation

@jaffourt
Contributor

Reference Issues/PRs

Towards #26024

What does this implement/fix? Explain your changes.

Add array API support to coverage_error

Any other comments?

@github-actions

github-actions bot commented Oct 31, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 28e6c5a. Link to the linter CI: here

@jaffourt jaffourt changed the title from "Add array API support to coverage_error" to "FEA Add array API support to coverage_error" on Oct 31, 2025
I think numpy converts a mixed int/float array (used in the test_ranking.py tests for `coverage_error`) to np.float64, which is causing errors on MPS.
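For illustration, NumPy's type promotion turns a mixed int/float nested list into float64; the MPS connection is my reading of the report, since the torch MPS backend does not support float64, so such arrays fail on that device:

```python
# NumPy promotes a mixed int/float nested list to a single dtype;
# the presence of any Python float makes the result float64.
import numpy as np

pure_int = np.asarray([[1, 0, 0], [0, 1, 1]])
mixed = np.asarray([[1, 0, 0.5], [0, 1, 1]])

print(pure_int.dtype)  # an integer dtype
print(mixed.dtype)     # float64
```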
@jaffourt
Contributor Author

jaffourt commented Nov 5, 2025

Unit tests are passing, but the following are not finishing successfully:

  • Azure Pipelines / scikit-learn.scikit-learn
  • scikit-learn.scikit-learn (Linux pylatest_pip_openblas_pandas)
  • scikit-learn.scikit-learn (macOS pylatest_conda_forge_mkl_no_openmp)

Azure pipeline is showing the following errors:

| Action | Error |
| --- | --- |
| Linux pylatest_pip_openblas_pandas | The job running on agent Azure Pipelines 5 ran longer than the maximum time of 120 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134. |
| Linux pylatest_pip_openblas_pandas • Install | The Operation will be canceled. The next steps may not contain expected logs. |
| Linux pylatest_pip_openblas_pandas • Install | The operation was canceled. |
| macOS pylatest_conda_forge_mkl_no_openmp | This is a scheduled macos-13 brownout. The macOS-13 based runner images are being deprecated. For more details, see actions/runner-images#13046. |
| macOS pylatest_conda_forge_mkl_no_openmp | This is a scheduled macos-13 brownout. The macOS-13 based runner images are being deprecated. For more details, see actions/runner-images#13046. |
| macOS pylatest_conda_forge_mkl_no_openmp | The remote provider was unable to process the request. |

```diff
-coverage = (y_score >= y_min_relevant).sum(axis=1)
-coverage = coverage.filled(0)
+y_true_bool = xp.astype(y_true, xp.bool, device=device_, copy=False)
+y_score_masked = xp.where(y_true_bool, y_score, xp.inf)
```
Contributor Author


We are replacing the masked values with inf values, since the masked-array (y_score_mask) was only used for finding the minimum value in each row.

Nothing is larger than xp.inf, so functionally this is the same.
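A minimal NumPy check of that equivalence (NumPy standing in for the `xp` namespace; variable names follow the snippet above):

```python
# Row-wise minimum over relevant scores: masking out irrelevant entries
# and replacing them with inf give the same result, since inf can never
# be the minimum of a row that has at least one relevant entry.
import numpy as np
import numpy.ma as ma

y_true = np.array([[1, 0, 1], [0, 1, 1]])
y_score = np.array([[0.3, 0.9, 0.5], [0.1, 0.8, 0.2]])

masked_min = ma.masked_array(y_score, mask=(y_true == 0)).min(axis=1)
inf_min = np.where(y_true.astype(bool), y_score, np.inf).min(axis=1)

print(np.asarray(masked_min))  # [0.3 0.2]
print(inf_min)                 # [0.3 0.2]
```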

Contributor


Are we confident that y_score will not have any xp.inf?

Member


Are we confident that y_score will not have any xp.inf?

I've been thinking about this too. For y_true as the function input, check_array will catch any inf values and raise a ValueError, so that part is covered.

On the other hand, y_score_masked can contain inf, which happens when y_true contains a zero row. However, since we do:

```python
y_min_relevant = xp.reshape(xp.min(y_score_masked, axis=1), (-1, 1))
coverage = xp.count_nonzero(y_score >= y_min_relevant, axis=1)
```

coverage will only contain finite values, so we should be fine here too.
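That reasoning can be checked with a small NumPy sketch (NumPy standing in for `xp`): a zero row in y_true yields an all-inf masked row and an inf row minimum, but the resulting coverage count is still finite.

```python
import numpy as np

# Second sample has no relevant labels (all-zero row in y_true).
y_true = np.array([[1, 0, 1], [0, 0, 0]])
y_score = np.array([[0.4, 0.9, 0.3], [0.2, 0.6, 0.1]])

y_score_masked = np.where(y_true.astype(bool), y_score, np.inf)
y_min_relevant = np.reshape(np.min(y_score_masked, axis=1), (-1, 1))
coverage = np.count_nonzero(y_score >= y_min_relevant, axis=1)

print(coverage)  # [3 0] -- finite even though y_min_relevant[1] is inf
```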

@jaffourt
Contributor Author

@OmarManzoor @lucyleeow This is ready for review when either of you have a chance to take a look :)

@lucyleeow lucyleeow moved this to In Progress in Array API Nov 18, 2025
Contributor

@OmarManzoor OmarManzoor left a comment


LGTM. Thank you @jaffourt


@OmarManzoor
Contributor

CC: @lucyleeow @virchan for a second review

Member

@virchan virchan left a comment


Thanks for the PR, @jaffourt.

I have some comments. Otherwise, LGTM.


@lucyleeow
Member

Haven't had time to do a proper review, but tests for continuous metrics were added in #32422, which introduced several continuous metrics, e.g. brier_score_loss. I've factorized those tests into test_common.py in #32793, and I think it may be better to use them here - they add tests for string y_true and the multiclass and multilabel cases.

resolve @virchan PR comment

Co-authored-by: Virgil Chan <virchan.math@gmail.com>
@jaffourt
Contributor Author

jaffourt commented Dec 1, 2025

@OmarManzoor @virchan @lucyleeow Thank you for your thoughtful reviews!

Regarding the xp.inf values in y_score discussion, I think this implementation handles that case correctly as @virchan noted. However, should we add an explicit test for this somewhere? Maybe we could add an np.inf value to one of the input arrays in test_common.py?

Haven't had time to do a proper review, but tests for continuous metrics were added in #32422, which introduced several continuous metrics, e.g. brier_score_loss. I've factorized those tests into test_common.py in #32793, and I think it may be better to use them here - they add tests for string y_true and the multiclass and multilabel cases.

This makes sense to me; I can use the new tests in test_common.py once #32793 is merged. I'm slightly confused about whether #32755 is also relevant to this, though?

@lucyleeow
Member

I think this implementation handles that case correctly as @virchan noted. However, should we add an explicit test for this somewhere?

We have test_continuous_inf_nan_input, which I think should do it. I think tests that check the functionality of the metric can be numpy-only, so I don't think we need a separate array API version of the test.

I can use the new tests in test_common.py once #32793 is merged. Slightly confused if #32755 is also relevant to this though?

#32793 sort of depends on #32755, so sorry, this PR is 3rd in the order of PRs to be merged! The original tests that I wanted to factorise out in #32793 included the mixed-array-input tests (as well as the general array API compliance test), so I didn't want to delete the original tests without first adding both the tests in #32755 and #32793.


Labels

Array API · module:metrics · Waiting for Second Reviewer (First reviewer is done, need a second one!)

Projects

Status: In Progress


5 participants