Adding estimators_samples_ property to ensemble of tree methods such as RandomForest and ExtraTrees #26716

@adam2392

Describe the workflow you want to enable

BaggingClassifier/BaggingRegressor, which by default are basically ensembles of trees, can generate the sampled indices for each estimator on the fly via estimators_samples_.
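For reference, here is a minimal sketch of the existing bagging behaviour. The toy dataset and the variable names (in_bag, oob) are only illustrative; estimators_samples_ is the documented property on the bagging estimators.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

# Toy data, chosen only for illustration.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# The default base estimator is a decision tree.
bag = BaggingClassifier(n_estimators=10, random_state=0).fit(X, y)

# One array of in-bag sample indices per fitted estimator,
# regenerated on access rather than stored at fit time.
in_bag = bag.estimators_samples_

# The samples each estimator did not see are the complement.
all_indices = np.arange(X.shape[0])
oob = [np.setdiff1d(all_indices, idx) for idx in in_bag]
```

Each entry of oob then holds the indices that the corresponding estimator never saw during fit().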

I am wondering whether this could be added to all the forest-based methods (e.g. ExtraTrees and RandomForest), since one is often interested in analyzing the samples that each tree did not see.

The current forests have the option of computing the OOB decision function during fit(), but we lose the information about which sample indices were OOB for each tree.
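To illustrate the gap, here is a small sketch of what a forest exposes today when fitted with oob_score=True. The data and variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data, chosen only for illustration.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

rf = RandomForestClassifier(
    n_estimators=50, oob_score=True, random_state=0
).fit(X, y)

# Aggregated OOB class probabilities, shape (n_samples, n_classes).
# This is averaged over the trees for which each sample was OOB,
# so it does not say which trees those were.
oob_proba = rf.oob_decision_function_
```

The result is one row per training sample, already aggregated across trees, so the per-tree OOB indices cannot be recovered from it.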

Describe your proposed solution

Add a similar estimators_samples_ property that generates the sample indices on the fly, so that the sampled data does not have to be stored.

Describe alternatives you've considered, if relevant

Otherwise, a user would have to keep track of this outside of the class, which is error-prone.
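As a rough sketch of what that external bookkeeping might look like, the helper below draws the bootstrap indices by hand and records them next to each tree. The function fit_trees_with_recorded_samples is hypothetical and is not how RandomForest is implemented internally; it only shows the manual tracking a user would need.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_trees_with_recorded_samples(X, y, n_trees=10, seed=0):
    """Hypothetical helper: hand-rolled bagging that keeps the indices."""
    rng = np.random.RandomState(seed)
    n_samples = X.shape[0]
    trees, in_bag = [], []
    for _ in range(n_trees):
        # Bootstrap: draw n_samples indices with replacement.
        idx = rng.randint(0, n_samples, n_samples)
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=rng.randint(np.iinfo(np.int32).max),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
        in_bag.append(idx)  # the user must store this themselves
    return trees, in_bag


X, y = make_classification(n_samples=100, n_features=5, random_state=0)
trees, in_bag = fit_trees_with_recorded_samples(X, y)

# Per-tree OOB indices, available only because they were recorded.
oob = [np.setdiff1d(np.arange(X.shape[0]), idx) for idx in in_bag]
```

Keeping trees and in_bag aligned, and consistent with the exact data passed to fit, is the part that is easy to get wrong by hand.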

Additional context

Xref on BaggingRegressor: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html#sklearn.ensemble.BaggingRegressor.estimators_samples_
