TST Extend tests for scipy.sparse.*array in sklearn/ensemble/tests/test_weight_boosting.py#27148

Merged
OmarManzoor merged 3 commits into scikit-learn:main from yuanx749:sparse-weight-boosting
Aug 24, 2023

Conversation

@yuanx749
Contributor

Towards #27090.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

@github-actions

github-actions bot commented Aug 24, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: bc7839b. Link to the linter CI: here

Member

@ogrisel ogrisel left a comment

Here are suggestions to improve variable names to make the intentions of the tests easier to grasp.

Otherwise, LGTM.

sparse_results = sparse_classifier.staged_decision_function(X_test_sparse)
dense_results = dense_classifier.staged_decision_function(X_test)
for sprase_res, dense_res in zip(sparse_results, dense_results):
    assert_array_almost_equal(sprase_res, dense_res)
Member

While we are at it, let's fix the typo: sprase => sparse.

Member

Furthermore, the names "sparse_results" and "sparse_res" are confusing. Those are not sparse output data structures but the results of a classifier that fits and predicts on sparse input data structures.

I think we should rename those to dense_clf_results / sparse_clf_results instead (and similarly for the "_res" variables).



@pytest.mark.parametrize(
    "sparse_container, sparse_type",
Member

Same comment for sparse_type.


def test_sparse_classification():
@pytest.mark.parametrize(
    "sparse_container, sparse_type",
Member

Please rename sparse_type to expected_internal_type.

@yuanx749
Contributor Author

As per your suggestions, I changed the variable names to be clearer. @ogrisel

Member

@ogrisel ogrisel left a comment

Lgtm!

# Verify sparsity of data is maintained during training
types = [i.data_type_ for i in sparse_classifier.estimators_]

assert all([t == expected_internal_type for t in types])
Contributor

@OmarManzoor OmarManzoor Aug 24, 2023

Thanks for the PR @yuanx749! I just have a question about fixing the expected type for each parametrized case. Previously we checked that the type was either csc_matrix or csr_matrix; now we expect CSC only for CSC containers and CSR otherwise. I haven't checked the code, so I just want to confirm: do we expect CSR in all the other cases?

Contributor Author

Yes, according to the doc
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier.fit

Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
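
The conversion rule quoted from the docs can be checked directly. The sketch below assumes that the estimator's input validation behaves like `sklearn.utils.check_array` called with `accept_sparse=["csr", "csc"]`; that is an assumption about the internals, not something stated in this thread. Under that assumption, accepted formats pass through unchanged and any other sparse format is converted to the first accepted one, CSR. The toy matrix is made up for illustration.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.utils import check_array

X_dense = np.array([[0.0, 1.0], [2.0, 0.0], [0.0, 3.0]])

containers = [
    sp.csc_matrix,
    sp.csr_matrix,
    sp.coo_matrix,
    sp.dok_matrix,
    sp.lil_matrix,
]

# Map each input format to the format that comes out of validation.
formats = {}
for sparse_container in containers:
    X = sparse_container(X_dense)
    # Formats not listed in accept_sparse are converted to the first entry.
    X_checked = check_array(X, accept_sparse=["csr", "csc"])
    formats[X.format] = X_checked.format

for input_format, output_format in formats.items():
    print(f"{input_format} -> {output_format}")
```

This is why the parametrization pairs CSC containers with a CSC expectation and every other container with a CSR expectation.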

Contributor

Thanks for clarifying

Contributor

@OmarManzoor OmarManzoor left a comment

LGTM

@OmarManzoor OmarManzoor merged commit a9611d0 into scikit-learn:main Aug 24, 2023
@yuanx749 yuanx749 deleted the sparse-weight-boosting branch August 25, 2023 03:15
akaashpatelmns pushed a commit to akaashp2000/scikit-learn that referenced this pull request Aug 25, 2023
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Aug 29, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023