Random Forest: Memory increases linearly with n_estimators #8244

@constantinpape

Description

I have benchmarked different random forest implementations and found that scikit-learn's RAM consumption increases linearly with n_estimators:
https://github.com/constantinpape/rf_benchmarks#sklearn-ram-issues
This leads to memory errors even for relatively small feature matrices when using several estimators.
See https://github.com/constantinpape/rf_benchmarks/blob/master/scripts/profile_sklearn_mem.py
for the script I used for profiling.
Probably a copy of the feature matrix is being made for every tree.
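
To put a number on this guess, here is a rough back-of-the-envelope sketch of what one feature-matrix copy per tree would cost. The helper name and the example shape are hypothetical and not taken from the benchmark; the 4-byte item size assumes float32 features.

```python
def copied_matrix_bytes(n_samples, n_features, n_estimators, itemsize=4):
    """Bytes needed if every tree held its own copy of the feature matrix.

    itemsize=4 assumes float32 features (an assumption of this sketch).
    """
    return n_samples * n_features * itemsize * n_estimators


# Hypothetical shape: 100,000 samples x 100 features, 100 trees.
one_copy = copied_matrix_bytes(100_000, 100, 1)
all_copies = copied_matrix_bytes(100_000, 100, 100)
print(f"one copy: {one_copy / 1e6:.0f} MB, one copy per tree: {all_copies / 1e9:.0f} GB")
```

Under this assumption a matrix of only 40 MB would already need 4 GB with 100 trees, which would match the linear growth I am seeing.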
I am using version 0.18.1.
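
A possible way to narrow this down, independent of the profiling script linked above, is to compare the serialized size of the fitted forest against the size of the feature matrix for a few values of n_estimators. This is only a minimal sketch: the data is synthetic, the shapes are arbitrary, `fitted_forest_size` is a hypothetical helper, and pickled size is only a proxy for in-memory size.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fitted_forest_size(n_estimators, X, y):
    """Return the pickled size in bytes of a forest fitted on (X, y)."""
    forest = RandomForestClassifier(n_estimators=n_estimators, random_state=0)
    forest.fit(X, y)
    return len(pickle.dumps(forest))


# Synthetic stand-in data; not the data used in the benchmark.
rng = np.random.RandomState(0)
X = rng.rand(2000, 20)
y = rng.randint(0, 2, size=2000)

# Fit forests of increasing size and record how large each fitted model is.
sizes = {n: fitted_forest_size(n, X, y) for n in (5, 10, 20)}
for n, size in sizes.items():
    print(
        f"n_estimators={n:3d}  pickled forest={size / 1e6:.2f} MB  "
        f"feature matrix={X.nbytes / 1e6:.2f} MB"
    )
```

If the serialized forest already grows with n_estimators while the feature matrix stays fixed, at least part of the growth would come from the fitted trees themselves rather than from copies of the input.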
