TST Extend tests for scipy.sparse.*array in sklearn/ensemble/tests/test_weight_boosting.py#27148
Conversation
…st_weight_boosting.py
ogrisel
left a comment
There was a problem hiding this comment.
Here are suggestions to improve variable names to make the intentions of the tests easier to grasp.
Otherwise, LGTM.
| sparse_results = sparse_classifier.staged_decision_function(X_test_sparse) | ||
| dense_results = dense_classifier.staged_decision_function(X_test) | ||
| for sprase_res, dense_res in zip(sparse_results, dense_results): | ||
| assert_array_almost_equal(sprase_res, dense_res) |
There was a problem hiding this comment.
While we are at it, let's fix the typo: sprase => sparse.
There was a problem hiding this comment.
Furthermore, the names "sparse_results" and "sparse_res" are confusing. Those are not sparse out datastructures but results of a classifier that fits and predicts on sparse inputs datastructures.
I think we should rename those to dense_clf_results / sparse_clf_results instead (and similarly for the "_res" variables).
|
|
||
|
|
||
| @pytest.mark.parametrize( | ||
| "sparse_container, sparse_type", |
|
|
||
| def test_sparse_classification(): | ||
| @pytest.mark.parametrize( | ||
| "sparse_container, sparse_type", |
There was a problem hiding this comment.
Please rename sparse_type to expected_internal_type.
|
As per your suggestions, I changed the variable names to be more clear. @ogrisel |
| # Verify sparsity of data is maintained during training | ||
| types = [i.data_type_ for i in sparse_classifier.estimators_] | ||
|
|
||
| assert all([t == expected_internal_type for t in types]) |
There was a problem hiding this comment.
Thanks for the PR @yuanx749! I just have a question regarding fixing the expected type for each parametrized case. Previously we were checking whether we have either csc_matrix or csr_matrix, now we only have csc for csc containers and csr matrix otherwise. I haven't checked the code so just want to confirm that do we expect csr array in all the other cases?
There was a problem hiding this comment.
Yes, according to the doc
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier.fit
Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
…/test_weight_boosting.py` (scikit-learn#27148)
…/test_weight_boosting.py` (scikit-learn#27148)
…/test_weight_boosting.py` (scikit-learn#27148)
Towards #27090.
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Any other comments?