TST Extend tests for scipy.sparse/*array in sklearn/feature_extraction/tests/test_text #27219

Merged
glemaitre merged 8 commits into scikit-learn:main from Charlie-XIAO:tst_sp_text on Sep 14, 2023

@Charlie-XIAO
Contributor

Towards #27090.

One test case, test_tfidf_transformer_sparse, was originally failing. This is because sparse arrays, when multiplied with anything, raise if the shapes are inconsistent (even if the dimensions match). I'm not sure whether the class does not support sparse arrays (in which case I should remove the test parametrization) or whether something else went wrong; I'm not familiar with sparse arrays and have in fact never used them.

Currently I'm replacing x*y with (y.T*x.T).T, but this is definitely not the final solution. Please let me know how I should deal with this.
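
For readers unfamiliar with the failure described above, here is a minimal sketch of the underlying behavior. It is not the code from this PR: the variable names and the small example data are made up for illustration. It relies on the documented SciPy semantics that `*` on sparse matrices (`csr_matrix`) is matrix multiplication, while `*` on sparse arrays (`csr_array`) is element-wise, and that `@` is matrix multiplication for both.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical example data (not taken from the PR).
dense_x = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]])  # shape (2, 3)
dense_d = np.diag([1.0, 2.0, 3.0])                      # shape (3, 3)

# Sparse matrices: `*` is matrix multiplication, so (2, 3) * (3, 3) works.
x_mat = sp.csr_matrix(dense_x)
d_mat = sp.csr_matrix(dense_d)
prod_matrix = (x_mat * d_mat).toarray()

# Sparse arrays: `*` is element-wise, so (2, 3) * (3, 3) raises even
# though the inner dimensions would be valid for a matrix product.
x_arr = sp.csr_array(dense_x)
d_arr = sp.csr_array(dense_d)
try:
    x_arr * d_arr
    star_raised = False
except ValueError:
    star_raised = True

# `@` means matrix multiplication for both containers.
prod_array = (x_arr @ d_arr).toarray()
```

Under these assumptions, writing the product with `@` gives the same result for both container types, which avoids relying on the operand order as the `(y.T*x.T).T` workaround does. Whether that matches the fix eventually merged here is not stated in this thread.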

@github-actions

github-actions bot commented Aug 30, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0e0c75f. Link to the linter CI: here

@Charlie-XIAO changed the title from "TST Extend tests for scipy.sparse/*array in sklearn/feature_extraction/text" to "TST Extend tests for scipy.sparse/*array in sklearn/feature_extraction/tests/test_text" on Aug 30, 2023
Contributor

@work-mohit work-mohit left a comment

Try these changes. I'm not sure about the changes in the text.py file; you can check with and without them.

@Charlie-XIAO
Contributor Author

> Try these changes. I'm not sure about the changes in the text.py file; you can check with and without them.

I think this test is testing csr results against csc? Or did I miss anything in your suggestion?

@glemaitre glemaitre self-requested a review September 13, 2023 17:32
Member

@glemaitre glemaitre left a comment

LGTM, provided that the CIs pass.
Thanks @Charlie-XIAO

@glemaitre
Member

@OmarManzoor I think this one is correct. The sphinx error should not be a big deal. I assume that it will not happen in main. We did not edit any part of the docstring of the TfidfTransformer.

Contributor

@OmarManzoor OmarManzoor left a comment

Otherwise LGTM.

@glemaitre glemaitre merged commit aee4cc2 into scikit-learn:main Sep 14, 2023
@Charlie-XIAO Charlie-XIAO deleted the tst_sp_text branch September 14, 2023 08:48
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…tion/tests/test_text` (scikit-learn#27219)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>