feature_importances_ should be a method in the ideal design

This issue is not meant to be very practical, just a place to share my thoughts.

I believe `feature_importances_` should have been designed as `get_feature_importances()` (which is, perhaps, funny because I think the `get_feature_names` design is pretty broken too), for the following reasons:
* calculating feature importances can be costly, so they should not be computed unnecessarily at `fit` time (and in some cases they already are not)
* there are often multiple ways to calculate feature importances (as simple as the choice of norm for `coef_`), and (as long as they depend on the same sufficient statistics) the user may reasonably not decide which is appropriate until after `fit`. Thus `get_feature_importances` could take parameters to choose its method. Meta-estimators such as `SelectFromModel` and `RFE` currently have parameters for how they should interpret `coef_` as feature importances, but really these are parameters that should be passed to the linear model's `get_feature_importances`; the model itself should know how to summarise its `coef_`, and doing so gets more complicated once we have multi-output `coef_`.
* it is semantically different from other fitted attributes: it is not a sufficient statistic on which the estimator bases its predictions
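
To make the idea concrete, here is a rough sketch of what a method-based design might look like for a linear model. Nothing here is an existing scikit-learn API: the class `LinearModelSketch`, the method `get_feature_importances`, and its `norm_order` parameter are all hypothetical names, and summarising a multi-output `coef_` by a per-feature norm is only one possible choice.

```python
import numpy as np


class LinearModelSketch:
    """Hypothetical estimator used only for illustration; not part of scikit-learn."""

    def __init__(self, coef):
        # Stand-in for a fitted model: coef_ has shape (n_outputs, n_features).
        self.coef_ = np.atleast_2d(np.asarray(coef, dtype=float))

    def get_feature_importances(self, norm_order=1):
        """Summarise coef_ per feature, computed on demand rather than at fit time.

        norm_order is a hypothetical parameter: the order of the vector norm
        taken across outputs for each feature (1, 2, np.inf, ...).
        """
        return np.linalg.norm(self.coef_, ord=norm_order, axis=0)


# Two outputs, three features.
model = LinearModelSketch([[3.0, 0.0, -1.0],
                           [4.0, 0.0, 2.0]])

l1_importances = model.get_feature_importances(norm_order=1)  # [7., 0., 3.]
l2_importances = model.get_feature_importances(norm_order=2)  # first entry is 5.0
```

With this shape, a meta-estimator would only need to forward the user's choice to the model instead of interpreting `coef_` itself, and nothing is computed until someone asks for it.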

I don't think there is currently sufficient motivation to change, but I could be persuaded.

Ping @kmike?

feature_importances_ should be a method in the ideal design #9606
