Add tests for train_test_split with Array API input by betatim · Pull Request #26855 · scikit-learn/scikit-learn

betatim · 2023-07-18T16:12:10Z

Reference Issues/PRs

(need to find one)

What does this implement/fix? Explain your changes.

This mostly adds tests that call train_test_split with Array API input and compare the results to those obtained with a plain NumPy array as input.

Any other comments?

First attempt at seeing what happens when you feed CuPy/PyTorch/Array API arrays to train_test_split. Need to explore more of the different parameters to see if they all "just work".

add a changelog entry

github-actions · 2023-07-18T16:13:59Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 998eb93. Link to the linter CI: here

ogrisel

Great that it works out of the box.

sklearn/model_selection/tests/test_split.py

betatim · 2023-07-19T16:14:18Z

sklearn/utils/_array_api.py

+    def __eq__(self, other):
+        return self._namespace == other._namespace


Do we want this? It is convenient in the test to be able to compare (wrapped) namespaces for equivalence.

I think I prefer explicit namespace assertions in tests. We could have a helper to assert the same namespace in tests.

What do you mean by explicit? Getting a string representation that we can compare to the array_namespace passed in to the test?

In the test itself I use get_namespace(input)[0] == get_namespace(output)[0] to check that input and output are in the same namespace. This works when the namespace is one from the array compat library, but not for the few namespaces that we wrap in this wrapper.

I am okay with overriding __eq__ like this.
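To make the trade-off in this thread concrete, here is a minimal sketch of why comparing two wrapped namespaces fails without `__eq__`. It assumes the wrapper is created afresh on each lookup, so Python's default identity-based equality reports two wrappers around the same namespace as different. `PlainWrapper` and `ComparableWrapper` are hypothetical stand-ins written for this illustration, not the class in sklearn/utils/_array_api.py, and the standard-library modules `math` and `cmath` stand in for array namespaces.

```python
import cmath
import math


class PlainWrapper:
    """Hypothetical namespace wrapper with no __eq__ (identity comparison)."""

    def __init__(self, namespace):
        self._namespace = namespace

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the wrapped namespace.
        return getattr(self._namespace, name)


class ComparableWrapper(PlainWrapper):
    """Same hypothetical wrapper, plus the __eq__ from the diff above."""

    def __eq__(self, other):
        return self._namespace == other._namespace


# Two distinct wrapper objects around the same module.
plain_equal = PlainWrapper(math) == PlainWrapper(math)
wrapped_equal = ComparableWrapper(math) == ComparableWrapper(math)
wrapped_differs = ComparableWrapper(math) == ComparableWrapper(cmath)

print(plain_equal, wrapped_equal, wrapped_differs)  # False True False
```

In this sketch, comparing a `ComparableWrapper` against an object that has no `_namespace` attribute (for example a bare module) raises `AttributeError`, so the comparison is only meaningful between two wrappers.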

sklearn/preprocessing/tests/test_function_transformer.py

betatim · 2023-07-20T06:11:04Z

@thomasjpfan and @ogrisel - if you want to look at a PR that mostly adds new tests, this is one :D

sklearn/model_selection/tests/test_split.py

ogrisel · 2023-07-20T06:26:01Z

sklearn/utils/_array_api.py

+    def __eq__(self, other):
+        return self._namespace == other._namespace


I think I prefer explicit namespace assertions in tests. We could have a helper to assert the same namespace in tests.
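One possible reading of the "helper to assert same namespace" idea is sketched below. It is an illustration under stated assumptions, not code from this PR: `assert_same_namespace`, `namespace_name`, and `FakeArray` are hypothetical names, the namespaces are dummy modules, and the helper compares namespace names obtained through the Array API standard's `__array_namespace__` method rather than relying on `==` between namespace objects.

```python
import types


class FakeArray:
    """Hypothetical minimal array exposing the __array_namespace__ hook."""

    def __init__(self, namespace):
        self._xp = namespace

    def __array_namespace__(self, api_version=None):
        return self._xp


def namespace_name(array):
    """Return the name of the namespace an array reports for itself."""
    return array.__array_namespace__().__name__


def assert_same_namespace(reference, *others):
    """Fail with an explicit message if any array uses a different namespace."""
    expected = namespace_name(reference)
    for other in others:
        actual = namespace_name(other)
        assert actual == expected, (
            f"expected namespace {expected!r}, got {actual!r}"
        )


# Dummy namespaces standing in for real array libraries.
xp_a = types.ModuleType("fake_xp_a")
xp_b = types.ModuleType("fake_xp_b")

# Passes silently: both arrays report the same namespace.
assert_same_namespace(FakeArray(xp_a), FakeArray(xp_a))

try:
    assert_same_namespace(FakeArray(xp_a), FakeArray(xp_b))
except AssertionError as exc:
    mismatch_message = str(exc)
else:
    mismatch_message = None
```

A helper along these lines gives a readable failure message in tests, at the cost of comparing names instead of the namespace objects themselves.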

thomasjpfan

Thanks for the PR!

sklearn/model_selection/tests/test_split.py

thomasjpfan · 2023-07-31T19:09:40Z

sklearn/utils/_array_api.py

+    def __eq__(self, other):
+        return self._namespace == other._namespace


I am okay with overriding __eq__ like this.

sklearn/preprocessing/tests/test_function_transformer.py

betatim · 2023-08-04T08:51:11Z

Should we list this kind of thing (functions, not estimators) in the "estimators with support" section of doc/modules/array_api.rst? New section?

thomasjpfan · 2023-08-07T18:48:00Z

Should we list this kind of thing (functions, not estimators) in the "estimators with support" section of doc/modules/array_api.rst? New section?

I like a new section. I think it's good to keep track of all the Array API supported estimators & functions in array_api.rst.

betatim · 2023-08-08T07:23:36Z

What do you think of the current patch? I added subsections, one called Estimators and one called Tools.

thomasjpfan · 2023-08-09T17:46:19Z

What do you think of the current patch? I added subsections, one called Estimators and one called Tools.

That is okay with me.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

github-actions bot added module:model_selection module:utils labels Jul 18, 2023

ogrisel reviewed Jul 18, 2023

sklearn/model_selection/tests/test_split.py

ogrisel added the Array API label Jul 18, 2023

betatim commented Jul 19, 2023

sklearn/model_selection/tests/test_split.py (outdated)

betatim marked this pull request as ready for review July 19, 2023 16:13

betatim commented Jul 19, 2023

sklearn/preprocessing/tests/test_function_transformer.py

ogrisel approved these changes Jul 20, 2023

thomasjpfan reviewed Jul 31, 2023

betatim added 5 commits August 3, 2023 13:52

Add tests for train_test_split with Array API input

4fc3d53

Check dtype, device and array namespace of returned values

1818e6d

Remove use of _safe_indexing in test

b4a9fba

Remove debug, use suffixes for variables

06e29dc

Add what's new entry

8a7814d

betatim force-pushed the array_api_train_test_split branch from fcb0edf to 8a7814d August 3, 2023 11:53

List train_test_split as supporting Array API input

1127854

Merge remote-tracking branch 'upstream/main' into pr/26855

998eb93

thomasjpfan enabled auto-merge (squash) August 9, 2023 17:36

thomasjpfan approved these changes Aug 9, 2023

thomasjpfan merged commit 1b0a51b into scikit-learn:main Aug 9, 2023

betatim deleted the array_api_train_test_split branch August 10, 2023 13:04

TamaraAtanasoska pushed a commit to TamaraAtanasoska/scikit-learn that referenced this pull request Aug 21, 2023

Add tests for train_test_split with Array API input (scikit-learn#26855)

c66339b

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023

Add tests for train_test_split with Array API input (scikit-learn#26855)

a6d824b

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

betatim mentioned this pull request Feb 12, 2024

FIX Fix array API train_test_split #28407

Merged
