ENH Allow SelectFromModel's max_features to accept callables#22356
ENH Allow SelectFromModel's max_features to accept callables#22356adrinjalali merged 43 commits intoscikit-learn:mainfrom
SelectFromModel's max_features to accept callables#22356Conversation
SelectFromModel's max_features to accept floats and callables
thomasjpfan
left a comment
There was a problem hiding this comment.
Having both floats and callables in a single PR may stall it. I think it's best to pick one for the first PR.
Callable is the more interesting one since it will be the first time we will support a callable for a *_features parameter.
thomasjpfan
left a comment
There was a problem hiding this comment.
Thanks for the update! This looks good.
For new code, I prefer using single backticks. (We changed the setting ~2.5 years ago to make single backticks == double backticks)
SelectFromModel's max_features to accept floats and callablesSelectFromModel's max_features to accept callables
|
Updated the code to incorporate your suggestions! I noticed the different styles of back-ticks and was curious about that -- thanks for clearing it up! Right now there's not really any style standardization regarding single vs double back-ticks and quotes right? |
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…4/scikit-learn into sfm_max_features_upgrade
|
@adrinjalali @glemaitre Would either of you be interested in reviewing this PR? |
|
The CI failures are related to the changes in this PR. |
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
|
CI failure is related to our doc changes. I'm not sure how to fix them (has to to with indentations and new lines). You can try and see if adding new lines or changing indentation would fix it: |
|
Should be good to go now! |
…kit-learn#22356) * Initial implementation * Improved error handling and stability * Added unit tests * Updated test to use `max_features_` instead of `max_features` * Added documentation for new private attribute `max_features_` * Improved error handling for callables * Updated whats_new * Removed incorrect term reference to `max_features` * Removed float case and improved testing * Updated test names to more clearly reflect intention * Added a sample callable in `max_features` description * Improved documentation and streamlined error handling * Updated example to include demonstrate using a callable for max_features * Separated out callable demo into separate example * Removed demo from `max_features` docs (now in example) * Updated changelog * Apply suggestions from code review Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> * Trimmed unneeded comments * Updated tests to reflect new error handling * Removed new line at end of docstring * Updated docstring * Fixed example syntax error * Fixed example syntax * Apply suggestions from code review Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> * Reverted irrelevant changes * Update sklearn/feature_selection/_from_model.py Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> * Fixed error message * Improved test coverage * Minor doc improvement -- added a list for `max_features` type * Update sklearn/feature_selection/_from_model.py Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com> * Improved input validation and added test for array-like * Updated doc to use no longer use lambda function * Fixed docstring list * Added missing whitespace for list format in docstring Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Reference Issues/PRs
Fixes #21117
What does this implement/fix? Explain your changes.
This PR improves
SelectFromModel'smax_featuresby allowing it to:To accomplish this, I introduced a new private attribute
max_features_which is calculated infitbased on what type was received formax_features. The calculations for integral, and callable values, e.g.:max_features_=max_featuresmax_features_=round(max_features*X.shape[1])max_features_=max_features(X)I then augmented existing unit tests to account for these new cases
Any other comments?