Adding estimators_samples_ property to ensemble of tree methods such as RandomForest and ExtraTrees #26716
Description
Describe the workflow you want to enable
BaggingClassifier and BaggingRegressor, which by default build an ensemble of decision trees, expose an estimators_samples_ property that regenerates the sampled indices of each estimator on the fly.
Would it be possible to add this to all the forest-based methods (e.g. ExtraTrees and RandomForest)? One is often interested in analyzing the samples that each tree did not see.
The current forests can compute the OOB decision function during fit(), but the OOB sample indices for each individual tree are not retained.
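For reference, a minimal sketch of how the existing property can be used on BaggingClassifier to recover the samples each estimator did not see. The toy dataset and the variable names (in_bag, oob) are illustrative assumptions, not part of any proposed API.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

# Toy data, only for illustration.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

bag = BaggingClassifier(n_estimators=5, random_state=0).fit(X, y)

# One array of drawn sample indices per fitted estimator,
# regenerated on access rather than stored during fit().
in_bag = bag.estimators_samples_

# The out-of-bag samples of an estimator are the indices it never drew.
all_indices = np.arange(X.shape[0])
oob = [np.setdiff1d(all_indices, idx) for idx in in_bag]
```

Each entry of oob can then be used to index X and y when inspecting what a given estimator was never trained on.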
Describe your proposed solution
Add a similar estimators_samples_ property that regenerates the sample indices on the fly, so that the indices do not have to be stored on the fitted estimator.
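The "on the fly" idea can be sketched independently of scikit-learn: if each tree keeps only the integer seed used for its bootstrap draw, the indices can be regenerated on demand. The helper name regenerate_indices and the seed values below are hypothetical, and this is not a description of how scikit-learn implements it internally.

```python
import numpy as np


def regenerate_indices(seed, n_samples):
    """Redraw a bootstrap sample from a stored seed (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    return rng.randint(0, n_samples, n_samples)


# Only the seeds need to be kept, one small integer per tree.
tree_seeds = [11, 22, 33]  # illustrative values
n_samples = 100

first_call = [regenerate_indices(s, n_samples) for s in tree_seeds]
second_call = [regenerate_indices(s, n_samples) for s in tree_seeds]

# The same seed always reproduces the same draw, so nothing else is stored.
reproducible = all(np.array_equal(a, b) for a, b in zip(first_call, second_call))
```

The trade-off is a small recomputation cost on every access in exchange for not holding n_estimators index arrays in memory.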
Describe alternatives you've considered, if relevant
A user would have to keep track of the per-tree sample indices outside of the class, which is error-prone.
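A rough sketch of what that external bookkeeping might look like. This is hand-rolled bagging of DecisionTreeClassifier, used here as a stand-in and not equivalent to RandomForest; all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, n_features=5, random_state=0)
n_samples = X.shape[0]
rng = np.random.RandomState(0)

trees = []
in_bag_per_tree = []  # bookkeeping that must stay aligned with `trees`
for _ in range(5):
    # Bootstrap draw: sample n_samples indices with replacement.
    idx = rng.randint(0, n_samples, n_samples)
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))
    in_bag_per_tree.append(idx)

# Samples that the first tree never saw.
oob_0 = np.setdiff1d(np.arange(n_samples), in_bag_per_tree[0])
```

The two lists have to be kept in sync by hand, and the user has to reimplement the sampling that the ensemble class already performs, which is where mistakes tend to creep in.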
Additional context
Xref on BaggingRegressor: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html#sklearn.ensemble.BaggingRegressor.estimators_samples_