Add `move_to` function to convert array namespace and device to namespace and device by lucyleeow · Pull Request #31829 · scikit-learn/scikit-learn

lucyleeow · 2025-07-24T05:54:11Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Adds a function that converts arrays to the namespace and device of the reference array.

Tries DLPack first, and if either array does not support it, tries to convert manually.

Any other comments?

This is an initial attempt, and what it would look like in a simple metric. Feedback welcome. (Tests to come)

I thought about also outputting the namespace and device of the reference array, to avoid the second call to get_namespace_and_device, but I thought it would make the outputs too messy.

cc @ogrisel @betatim @StefanieSenger @virchan @lesteve

github-actions · 2025-07-24T05:55:10Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

_{Generated for commit: 71aaad2. Link to the linter CI: here}

sklearn/metrics/_classification.py

lucyleeow · 2025-07-24T12:04:09Z

sklearn/utils/_array_api.py

+            try:
+                # Note will copy if required
+                array_converted = xp_ref.from_dlpack(array, device=device_ref)
+            except AttributeError:


I decided to only except AttributeError, which I think occurs if input or output namespace does not support dlpack.

from_dlpack can give 2 (or more) other errors

BufferError - The dlpack and dlpack_device methods on the input array may raise BufferError when the data cannot be exported as DLPack (e.g., incompatible dtype, strides, or device). It may also raise other errors when export fails for other reasons (e.g., not enough memory available to materialize the data). from_dlpack must propagate such exceptions.

I thought that if dlpack fails to convert due to one of the above errors, it would not make sense to try ourselves manually.

ValueError - If data exchange is possible via an explicit copy but copy is set to False.

I've left copy=None, allowing it to copy if need be, so this error is not relevant. I am not sure about the copy=None setting though, it is a lot of memory usage.

Talking to Evgeni about what could possibly cause a BufferError - here are some ideas, but but I don't know if any of these will cause an error:

numpy supports negative strides but torch does not (https://discuss.pytorch.org/t/negative-strides-in-tensor-error/134287) (I don't think torch supports DLPack atm, so hard to figure out)

numpy structured array?

numpy memmapped array

virchan

I suspect we could simplify _convert_to_reference a bit when xp_ref is NumPy:

if _is_numpy_namespace(xp_ref):
   return tuple([_convert_to_numpy(array, get_namespace(array)) for array in arrays])

However, I'm not sure this offers much benefit beyond readability, since in most cases there are only two arrays to convert: y_true and sample_weight.

I'll have to give it some more thought.

sklearn/utils/_array_api.py

lucyleeow · 2025-07-25T09:48:08Z

I suspect we could simplify _convert_to_reference a bit when xp_ref is NumPy:

At the moment that would be simpler, but I think _convert_to_numpy is considered a bit of a 'hack'. It's got numpy conversions for specific array namespaces hard coded (needs maintenance, though I do doubt any of the APIs will change) and it doesn't necessarily work for all array API arrays. I think DLPack is more future proof and it has a copy parameter (we could specify to avoid copying if we wanted to) - at least this was the thinking when I decided to just try DLPack first...

since in most cases there are only two arrays to convert: y_true and sample_weight.

For metrics I think this would mostly be it, but for estimators there could be other arrays to convert.

sklearn/utils/_array_api.py

sklearn/metrics/_classification.py

sklearn/utils/_array_api.py

jeremiedbb · 2025-07-25T12:54:14Z

What's the difference with sklearn.utils._array_api.ensure_common_namespace_device ? Looks like both are trying to do the same thing.

betatim · 2025-07-25T15:07:25Z

What's the difference with sklearn.utils._array_api.ensure_common_namespace_device ? Looks like both are trying to do the same thing.

I think the goal of both is the same. I also think that ensure_common_namespace_device is broken right now, at least for the application that Lucy has in mind. Maybe the thing to do is to replace the code of ensure_common_namespace_device with that of the new function?

ogrisel

I think I would be in favor of renaming this method to move_to(arrays, namespace, device) since the device and namespace will already be inspected in the caller most of the time for other purposes.

sklearn/utils/_array_api.py

betatim · 2025-08-27T13:34:49Z

What's the difference with sklearn.utils._array_api.ensure_common_namespace_device ? Looks like both are trying to do the same thing.

I think we should remove/edit ensure_common_namespace_device as part of this PR and use move_to instead.

betatim · 2025-08-28T07:08:45Z

We should probably create a common test that checks that "everything follows X/y_pred" works for estimators and functions. But it will require quite a bit of renovation work in existing code. I think this means we should tackle this in a new PR. And even there I wonder if we should add the test and make all the changes or if we should have several smaller PRs that make the change and then add the test. WDYT?

lucyleeow · 2025-09-03T11:08:45Z

We should probably create a common test that checks that "everything follows X/y_pred" works for estimators and functions.

Why don't I add a unit test for move_to in this PR and add a common "everything follows X/y_pred" for estimators and metrics in a separate PR. I feel like that will take some iterating and thought...

lucyleeow · 2025-09-04T05:22:05Z

Looking into adding a unit test to test different array combinations, I have a stupid question - is it worth testing arrays on different devices but in the same namespace? Namely torch. I think it would be testing torch cpu -> torch cuda but something like torch cuda -> torch mps seems unlikely to occur in real life?

StefanieSenger

Thank you for your work, @lucyleeow. I only have two nit comments and a question: Why don't we test for all possible combinations here? I have read the whole discussion and have seen how the test evolved from being joined, then split, then joined again. This is a central function of array API and I feel it is necessary to test for all possible combinations.

This is only a partial review and I only intend to comment here.

sklearn/utils/tests/test_array_api.py

sklearn/utils/_array_api.py

sklearn/utils/tests/test_array_api.py

sklearn/metrics/_classification.py

betatim · 2025-11-14T14:58:56Z

sklearn/utils/_array_api.py

+                    # methods are not present on the input array
+                    # `TypeError` and `NotImplementedError` for packages that do not
+                    # yet support dlpack 1.0
+                    # (i.e. the `device`/`copy` kwargs, e.g., torch <= 2.8.0)


This line and L518 talk about copy= as a keyword argument. But I can't see it being used anywhere. So I'm confused. Is the comment out of date? Or is it on purpose that the comment mentions it but it isn't in the code?

Ah yes, sorry this assumes knowledge!

The two relevant PRs are data-apis/array-api#741 and dmlc/dlpack#136 (which references the first PR). DLPack v1 added both the copy and device kwargs at the same time;

Explicitly enable/disable data copies (through the new copy argument). The default is still None ("may copy") to preserve backward compatibility and align with other constructors (ex: asarray).

Allow copying to CPU (through the new device argument, plus setting copy=True)

thus I speak of them together, as when array libraries (e.g., pytorch) add support for DLPack v1, they (generally) add support for both kwargs together.

The default value for copy=None, which means only copy if needed. (this kwarg can also be True or False - https://data-apis.org/array-api/latest/API_specification/generated/array_api.from_dlpack.html#from-dlpack)

Maybe I can add reference to data-apis/array-api#741 ?

betatim · 2025-11-14T15:03:59Z

I'd propose the following in an attempt to wrap up this PR (it has already gone down some rabbit holes and generated a lot of comments :D).

Address the open comments (that have come since Olivier's review), do not add more new review comments (that aren't responses to existing ones), merge #32705 and then merge this. Then find out what we forgot and address it.

WDYT?

ogrisel · 2025-11-14T17:04:52Z

I merged #32705, let me sync with main and retrigger CUDA CI.

ogrisel · 2025-11-14T17:17:04Z

@lucyleeow there are a few suggestions/comments left to address in the above reviews, but otherwise LGTM for merge once addressed.

lucyleeow · 2025-11-15T02:48:48Z

Reviews addressed and I removed the RunTimeError from the except, which will check that #32705 did what it was supposed to! 🤞

sklearn/utils/_array_api.py

ogrisel · 2025-11-17T08:49:29Z

I expanded the inline comment based on review feedback, synced with main and marked as auto-merge. Thanks @lucyleeow and reviewers!

lucyleeow · 2025-11-18T01:50:07Z

Thanks all for your time and your thoughts!

wip

573b9f4

github-actions bot added module:metrics module:utils labels Jul 24, 2025

jeremiedbb reviewed Jul 24, 2025

View reviewed changes

sklearn/metrics/_classification.py Outdated Show resolved Hide resolved

lucyleeow commented Jul 24, 2025

View reviewed changes

virchan reviewed Jul 25, 2025

View reviewed changes