Better documentation for RFECV #27193
Closed
Description
There is almost no description in the documentation of how RFECV actually works. The user guide simply says:
> RFECV performs RFE in a cross-validation loop to find the optimal number of features.

and the API page simply says:
> Recursive feature elimination with cross-validation to select features.
My best guess for what RFECV is actually doing is the following.
1. Start with all features.
2. Do the following (in either order):
   a) Fit the estimator on all rows of `X` (for the current subset of features). Use `coef_` or `feature_importances_` or a callable to select the feature(s) that will be removed in the next round.
   b) Run cross-validation with the estimator on `X` to estimate the accuracy of the estimator trained on the current subset of features.
3. Remove the features chosen for removal in step 2a.
4. Repeat steps 2 and 3 until the minimum number of features has been reached.
5. Select the set of features that maximizes the CV scores calculated in step 2b. (This set of features is recorded in the `support_` attribute.)
Is that correct? Furthermore, can a detailed explanation of what RFECV is doing be added to the documentation?
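To make the guess concrete, here is a minimal Python sketch of the steps listed above. It only mirrors my reading of the procedure and is not taken from scikit-learn's source, so it may well differ from what `RFECV` really does. The function `guessed_rfecv` and its parameters `min_features`, `step`, and `cv` are hypothetical names chosen for illustration, and ranking features by absolute coefficient is a simplifying assumption.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def guessed_rfecv(estimator, X, y, min_features=1, step=1, cv=5):
    """Sketch of the guessed procedure; NOT scikit-learn's RFECV implementation."""
    n_total = X.shape[1]
    active = np.arange(n_total)  # step 1: start with all features
    history = []  # one (feature indices, mean CV score) pair per round

    while True:
        X_cur = X[:, active]

        # Step 2b: cross-validate the estimator on the current feature subset.
        score = cross_val_score(clone(estimator), X_cur, y, cv=cv).mean()
        history.append((active.copy(), score))

        if len(active) <= min_features:
            break  # step 4: stop once the minimum number of features is reached

        # Step 2a: fit on all rows and rank the current features.
        fitted = clone(estimator).fit(X_cur, y)
        if hasattr(fitted, "coef_"):
            # Assumption: importance = absolute coefficient, summed over outputs.
            importances = np.abs(fitted.coef_)
            if importances.ndim > 1:
                importances = importances.sum(axis=0)
        else:
            importances = fitted.feature_importances_

        # Step 3: remove the least important feature(s).
        n_drop = min(step, len(active) - min_features)
        drop = np.argsort(importances)[:n_drop]
        active = np.delete(active, drop)

    # Step 5: keep the subset with the best mean CV score.
    best_features, best_score = max(history, key=lambda item: item[1])
    support = np.zeros(n_total, dtype=bool)
    support[best_features] = True
    return support, history


X, y = make_regression(
    n_samples=200, n_features=8, n_informative=3, noise=1.0, random_state=0
)
support, history = guessed_rfecv(LinearRegression(), X, y)
print("selected features:", np.flatnonzero(support))
print("subset sizes tried:", [len(features) for features, _ in history])
```

If the real algorithm differs (for example, if the elimination is run separately inside each CV fold rather than once on all rows), that difference is exactly what I would like the documentation to spell out.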