FEA Add array API support to coverage_error#32626
FEA Add array API support to coverage_error#32626jaffourt wants to merge 11 commits intoscikit-learn:mainfrom
coverage_error#32626Conversation
coverage_errorcoverage_error
I think numpy converts a mixed int/float array (used in the test_ranking.py tests for `coverage_error`) to np.float64. Which is causing errors on MPS
|
Unit tests are passing, but the following are not finishing successfully:
Azure pipeline is showing the following errors:
|
| coverage = (y_score >= y_min_relevant).sum(axis=1) | ||
| coverage = coverage.filled(0) | ||
| y_true_bool = xp.astype(y_true, xp.bool, device=device_, copy=False) | ||
| y_score_masked = xp.where(y_true_bool, y_score, xp.inf) |
There was a problem hiding this comment.
We are replacing the masked values with inf values, since the masked-array (y_score_mask) was only used for finding the minimum value in each row.
Nothing is larger than xp.inf so functionally this is the same.
There was a problem hiding this comment.
Are we confident that y_score will not have any xp.inf?
There was a problem hiding this comment.
Are we confident that
y_scorewill not have anyxp.inf?
I've been thinking about this too. Just in case, for y_true as the function input, then check_array will catch any inf values and raise a ValueError, so that part is covered.
On the other hand, y_score_masked can contain inf, which happens when y_true contains a zero row. However, since we do:
y_min_relevant = xp.reshape(xp.min(y_score_masked, axis=1), (-1, 1))
coverage = xp.count_nonzero(y_score >= y_min_relevant, axis=1)coverage will only contain finite values, so we should be fine here too.
|
@OmarManzoor @lucyleeow This is ready for review when either of you have a chance to take a look :) |
OmarManzoor
left a comment
There was a problem hiding this comment.
LGTM. Thank you @jaffourt
| coverage = (y_score >= y_min_relevant).sum(axis=1) | ||
| coverage = coverage.filled(0) | ||
| y_true_bool = xp.astype(y_true, xp.bool, device=device_, copy=False) | ||
| y_score_masked = xp.where(y_true_bool, y_score, xp.inf) |
There was a problem hiding this comment.
Are we confident that y_score will not have any xp.inf?
|
CC: @lucyleeow @virchan for a second review |
| coverage = (y_score >= y_min_relevant).sum(axis=1) | ||
| coverage = coverage.filled(0) | ||
| y_true_bool = xp.astype(y_true, xp.bool, device=device_, copy=False) | ||
| y_score_masked = xp.where(y_true_bool, y_score, xp.inf) |
There was a problem hiding this comment.
Are we confident that
y_scorewill not have anyxp.inf?
I've been thinking about this too. Just in case, for y_true as the function input, then check_array will catch any inf values and raise a ValueError, so that part is covered.
On the other hand, y_score_masked can contain inf, which happens when y_true contains a zero row. However, since we do:
y_min_relevant = xp.reshape(xp.min(y_score_masked, axis=1), (-1, 1))
coverage = xp.count_nonzero(y_score >= y_min_relevant, axis=1)coverage will only contain finite values, so we should be fine here too.
|
Haven't had time to do a proper review, but tests for continuous metrics had been added in #32422, which added several continuous metrics e.g., |
resolve @virchan PR comment Co-authored-by: Virgil Chan <virchan.math@gmail.com>
|
@OmarManzoor @virchan @lucyleeow Thank you for your thoughtful reviews! Regarding the
This makes sense to me, I can use the new tests in |
We have
#32793 sort of depends on #32755, so sorry this PR is 3rd in the order of PRs to be merged! The original tests that I wanted to factorise out in #32793 included the mixed array input tests (as well as general array API compliance test), so I didn't want to delete the original tests without adding both the tests in #32755 and #32793 |
Reference Issues/PRs
Towards #26024
What does this implement/fix? Explain your changes.
Add array API support to
coverage_errorAny other comments?