Add move_to function to convert array namespace and device to namespace and device#31829
Conversation
sklearn/utils/_array_api.py
Outdated
| try: | ||
| # Note will copy if required | ||
| array_converted = xp_ref.from_dlpack(array, device=device_ref) | ||
| except AttributeError: |
There was a problem hiding this comment.
I decided to only except AttributeError, which I think occurs if input or output namespace does not support dlpack.
from_dlpack can give 2 (or more) other errors
- BufferError - The dlpack and dlpack_device methods on the input array may raise BufferError when the data cannot be exported as DLPack (e.g., incompatible dtype, strides, or device). It may also raise other errors when export fails for other reasons (e.g., not enough memory available to materialize the data). from_dlpack must propagate such exceptions.
- I thought that if dlpack fails to convert due to one of the above errors, it would not make sense to try ourselves manually.
- ValueError - If data exchange is possible via an explicit copy but copy is set to False.
- I've left
copy=None, allowing it to copy if need be, so this error is not relevant. I am not sure about thecopy=Nonesetting though, it is a lot of memory usage.
- I've left
There was a problem hiding this comment.
Talking to Evgeni about what could possibly cause a BufferError - here are some ideas, but but I don't know if any of these will cause an error:
- numpy supports negative strides but torch does not (https://discuss.pytorch.org/t/negative-strides-in-tensor-error/134287) (I don't think torch supports DLPack atm, so hard to figure out)
- numpy structured array?
- numpy memmapped array
virchan
left a comment
There was a problem hiding this comment.
I suspect we could simplify _convert_to_reference a bit when xp_ref is NumPy:
if _is_numpy_namespace(xp_ref):
return tuple([_convert_to_numpy(array, get_namespace(array)) for array in arrays])However, I'm not sure this offers much benefit beyond readability, since in most cases there are only two arrays to convert: y_true and sample_weight.
I'll have to give it some more thought.
At the moment that would be simpler, but I think
For metrics I think this would mostly be it, but for estimators there could be other arrays to convert. |
|
What's the difference with |
I think the goal of both is the same. I also think that |
ogrisel
left a comment
There was a problem hiding this comment.
I think I would be in favor of renaming this method to move_to(arrays, namespace, device) since the device and namespace will already be inspected in the caller most of the time for other purposes.
I think we should remove/edit |
|
We should probably create a common test that checks that "everything follows X/y_pred" works for estimators and functions. But it will require quite a bit of renovation work in existing code. I think this means we should tackle this in a new PR. And even there I wonder if we should add the test and make all the changes or if we should have several smaller PRs that make the change and then add the test. WDYT? |
Why don't I add a unit test for |
|
Looking into adding a unit test to test different array combinations, I have a stupid question - is it worth testing arrays on different devices but in the same namespace? Namely torch. I think it would be testing torch cpu -> torch cuda but something like torch cuda -> torch mps seems unlikely to occur in real life? |
StefanieSenger
left a comment
There was a problem hiding this comment.
Thank you for your work, @lucyleeow. I only have two nit comments and a question: Why don't we test for all possible combinations here? I have read the whole discussion and have seen how the test evolved from being joined, then split, then joined again. This is a central function of array API and I feel it is necessary to test for all possible combinations.
This is only a partial review and I only intend to comment here.
| # methods are not present on the input array | ||
| # `TypeError` and `NotImplementedError` for packages that do not | ||
| # yet support dlpack 1.0 | ||
| # (i.e. the `device`/`copy` kwargs, e.g., torch <= 2.8.0) |
There was a problem hiding this comment.
This line and L518 talk about copy= as a keyword argument. But I can't see it being used anywhere. So I'm confused. Is the comment out of date? Or is it on purpose that the comment mentions it but it isn't in the code?
There was a problem hiding this comment.
Ah yes, sorry this assumes knowledge!
The two relevant PRs are data-apis/array-api#741 and dmlc/dlpack#136 (which references the first PR). DLPack v1 added both the copy and device kwargs at the same time;
- Explicitly enable/disable data copies (through the new copy argument). The default is still None ("may copy") to preserve backward compatibility and align with other constructors (ex: asarray).
- Allow copying to CPU (through the new device argument, plus setting copy=True)
thus I speak of them together, as when array libraries (e.g., pytorch) add support for DLPack v1, they (generally) add support for both kwargs together.
The default value for copy=None, which means only copy if needed. (this kwarg can also be True or False - https://data-apis.org/array-api/latest/API_specification/generated/array_api.from_dlpack.html#from-dlpack)
There was a problem hiding this comment.
Maybe I can add reference to data-apis/array-api#741 ?
|
I'd propose the following in an attempt to wrap up this PR (it has already gone down some rabbit holes and generated a lot of comments :D). Address the open comments (that have come since Olivier's review), do not add more new review comments (that aren't responses to existing ones), merge #32705 and then merge this. Then find out what we forgot and address it. WDYT? |
|
I merged #32705, let me sync with main and retrigger CUDA CI. |
|
@lucyleeow there are a few suggestions/comments left to address in the above reviews, but otherwise LGTM for merge once addressed. |
|
Reviews addressed and I removed the |
|
I expanded the inline comment based on review feedback, synced with |
|
Thanks all for your time and your thoughts! |
Reference Issues/PRs
Towards #28668 and #31274
What does this implement/fix? Explain your changes.
Adds a function that converts arrays to the namespace and device of the reference array.
Tries DLPack first, and if either array does not support it, tries to convert manually.
Any other comments?
This is an initial attempt, and what it would look like in a simple metric. Feedback welcome. (Tests to come)
I thought about also outputting the namespace and device of the reference array, to avoid the second call to
get_namespace_and_device, but I thought it would make the outputs too messy.cc @ogrisel @betatim @StefanieSenger @virchan @lesteve