Integrate scikit-learn's set_output method into TransactionEncoder#1087
Integrate scikit-learn's set_output method into TransactionEncoder#1087rasbt merged 20 commits intorasbt:masterfrom
set_output method into TransactionEncoder#1087Conversation
- Added two new tests, `test_get_feature_names_out` and `test_set_output`. Passing these tests is a step towards the output of `TransactionEncoder` being formatted as a pandas.DataFramed by default.
- Added `get_feature_names_out` method to `TransactionEncoder` to expose the `set_output` method.
- Updated test to include more checks. It is now back in a failing state.
- Updated test_set_output docstring to be more explicit. - Added numpy assertion to check that the transformed output columns match the original columns_ attribute for test_set_output. - Added numpy assertion to check that the get_feature_names_out output match the original columns_ attribute for test_get_feature_names_out.
- Added logic similar to that in `sklearn.base.ClassNamePrefixFeaturesOutMixin` and `sklearn.base.OneToOneFeatureMixin` for the get_feature_names_out method.
- Updated the user guide to show both the get_feature_names_out method and the set_output method.
- Updated changelog to reflect new features.
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
- Updated issue number.
- Updated issue number (again) to reflect the PR link instead of the issue link.
- Ran isort over imports to fix failing check in PR.
|
Thanks a lot for the PR, really appreciate it and hope to review it in the upcoming days! |
- Increased scikit-learn version to minimum required for set_output to work.
Sure thing! I wasn't sure what to put regarding newer version release dates so I bumped the patch number by one and set the release date to "TBD". Lmk if I should change anything. |
|
Hey @rasbt it looks like the pipeline checks keep failing for files unrelated to this PR. Should I log a separate issue and open a new PR to fix those too? |
|
No worries, it should be fixed now via #1089 (except for the transaction encoder test, but that's something we can address in this PR) |
|
Hm, I think the new failures could be due to the sklearn version bump |
I noticed that scikit-learn 1.1.3 is being installed in the github workflow. Can we bump it to 1.2.2 as that is required for |
|
Yes, please feel free to bump it up. We probably need to fix some other places that have not been adjusted for the most recent version though |
- Bumped scikit-learn version up to 1.2.2 to match requirements.txt.
- Bumped scikit-learn version up to 1.2.2 to match environment.yml and requirements.txt.
|
Running
Both result in the test passing. I'm in favor of the first option as it doesn't require any changes to the |
- Updated `test_inverse_transform` to passing state by removing conversion to numpy array.
|
Turns out |
- Updated scikit-learn version to 1.3.1 to integerate fix from scikit-learn/scikit-learn#27044 modified: environment.yml - Updated scikit-learn version to 1.3.1 to integerate fix from scikit-learn/scikit-learn#27044 modified: requirements.txt - Updated scikit-learn version to 1.3.1 to integerate fix from scikit-learn/scikit-learn#27044
|
Of course bumping the version results in more failed tests 🤦 I'm going to make a separate PR to handle the |
|
Arg sorry. Yeah, a sklearn bump was overdue but I recently didn't have the time to look into it since I wasn't using the affected features. If this is too much work, don't worry about it, I can understand if you want to drop this. I could revisit the version bump in the upcoming weeks then, address this, and then merge your PR once it's addressed. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1087 +/- ##
==========================================
+ Coverage 78.29% 78.32% +0.03%
==========================================
Files 196 196
Lines 11140 11157 +17
Branches 1404 1404
==========================================
+ Hits 8722 8739 +17
Misses 2200 2200
Partials 218 218 ☔ View full report in Codecov by Sentry. |
|
@rasbt looks like all checks passed. Ready to merge 😎 |
rasbt
left a comment
There was a problem hiding this comment.
This looks great. Thanks so much for the efforts! And sorry again about the unit test hassles!


Code of Conduct
Description
This defines the :method:
get_feature_names_outin :class:TransactionEncoderto expose the :method:set_output.Related issues or pull requests
Fixes #1085
Pull Request Checklist
./docs/sources/CHANGELOG.mdfile (if applicable)./mlxtend/*/testsdirectories (if applicable)mlxtend/docs/sources/(if applicable)PYTHONPATH='.' pytest ./mlxtend -svand make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g.,PYTHONPATH='.' pytest ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)flake8 ./mlxtend