RFECV float (percentage) step does not reduce features in each iteration as described #10368
Description
Hi scikit-learn dev team,
Apologies if I am misunderstanding how the step parameter of RFECV works. With RFECV, when you specify an integer step, verbose mode shows the feature count dropping by exactly that number in each iteration, e.g. step=100:
...
Fitting estimator with 1013 features.
Fitting estimator with 913 features.
Fitting estimator with 813 features.
Fitting estimator with 713 features.
Fitting estimator with 613 features.
Fitting estimator with 513 features.
Fitting estimator with 413 features.
Fitting estimator with 313 features.
Fitting estimator with 213 features.
Fitting estimator with 113 features.
Fitting estimator with 13 features.
But a float (percentage) step doesn't do what you'd expect. If I specify step=0.1, I would expect it to remove 10% of the remaining features in each iteration, so that it removes fewer features and becomes more fine-grained as the number of features decreases. Instead it seems to do the opposite: it removes more than 10% of the remaining features as it gets closer to zero (each iteration below drops 5461 features):
Fitting estimator with 54613 features.
Fitting estimator with 49152 features.
Fitting estimator with 43691 features.
Fitting estimator with 38230 features.
Fitting estimator with 32769 features.
Fitting estimator with 27308 features.
Fitting estimator with 21847 features.
Fitting estimator with 16386 features.
Fitting estimator with 10925 features.
Fitting estimator with 5464 features.
Fitting estimator with 3 features.
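To make the pattern concrete, here is a minimal sketch that reproduces the counts in the log above. It assumes the float step is converted once into a fixed number of features (10% of the initial count, truncated to an integer) and that elimination stops at one feature. The function name and the `n_min` parameter are hypothetical, and this is an inference from the log, not a copy of the scikit-learn source.

```python
def fixed_step_schedule(n_features, step, n_min=1):
    """Feature counts per iteration when a float step is resolved only once.

    Assumption: a step in (0, 1) is turned into a fixed count from the
    initial number of features; an integer step is used as-is.
    """
    if 0.0 < step < 1.0:
        n_remove = max(1, int(step * n_features))
    else:
        n_remove = int(step)

    counts = []
    remaining = n_features
    while remaining > n_min:
        counts.append(remaining)
        # Never remove so many features that fewer than n_min are left.
        remaining -= min(n_remove, remaining - n_min)
    return counts


for n in fixed_step_schedule(54613, 0.1):
    print(f"Fitting estimator with {n} features.")
```

Under that assumption, every iteration removes 5461 features, which is 10% of the original 54613 but almost all of the 5464 features left in the second-to-last iteration.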
This doesn't seem consistent with the integer step specification (and in general I don't think users would want a percentage step to behave this way). The reason I would like the percentage step to work as described above is that there are use cases where one starts with a lot of features and would like each RFECV iteration to be more fine-grained, removing fewer features as the count gets closer to zero.
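As a rough illustration of the behavior I would expect, the sketch below recomputes the number of features to remove from the features remaining in each iteration. The function name, the `n_min` parameter, and the choice to truncate toward zero with a minimum of one feature removed are my own assumptions, not existing scikit-learn API.

```python
def proportional_schedule(n_features, fraction, n_min=1):
    """Feature counts per iteration when `fraction` applies to what is left.

    Hypothetical helper: each iteration removes int(fraction * remaining)
    features, but always at least one, and never goes below n_min.
    """
    counts = []
    remaining = n_features
    while remaining > n_min:
        counts.append(remaining)
        n_remove = max(1, int(fraction * remaining))
        remaining -= min(n_remove, remaining - n_min)
    return counts


schedule = proportional_schedule(54613, 0.1)
print(schedule[:3])    # the first few counts shrink by about 10% each
print(len(schedule))   # many more iterations than the fixed-step schedule
```

With this schedule the steps get smaller as the feature count drops, at the cost of many more fitting iterations than a fixed step of the same initial size.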