Rewrite user-guide to clarify feature_importances_ are impurity based#14530

Closed
shahules786 wants to merge 4 commits into scikit-learn:master from shahules786:new_branch

@shahules786 shahules786 commented Jul 31, 2019

Reference Issues/PRs

closes #14528

What does this implement/fix? Explain your changes.

Rewrote the user guide to clarify feature importance as permutation importance, including a short, precise explanation of how it is calculated.

Any other comments?

@glemaitre
Member

@maverick100 Could you edit your original post to add the issue number, so we can see which issue you are solving?

@shahules786 shahules786 changed the title Rewrite user-guide to clarify feature_importances_ are impurity based Rewrite user-guide to clarify feature_importances_ are impurity based #14528 Jul 31, 2019
@shahules786
Author

> @maverick100 Could you edit your original post to add the issue number, so we can see which issue you are solving?

Yes, done.

@glemaitre glemaitre changed the title Rewrite user-guide to clarify feature_importances_ are impurity based #14528 Rewrite user-guide to clarify feature_importances_ are impurity based Jul 31, 2019
The verbosity level

- loss : string, optional
+ loss : string, optional (default="hinge")
Member

Could you revert all the changes in this file? This is not related to this PR.

The importance of a feature is computed as the (normalized) total
reduction of the criterion brought by that feature.
It is also known as the Gini importance.
We measure the importance of a feature by calculating
Member

This is wrong; the docstring is correct here. `feature_importances_` is the Gini importance.
What needs to change is the User Guide: wherever it says "feature importance", we should add a note that this is the Gini (impurity-based) importance.
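
To make the docstring's definition concrete, here is a minimal sketch that recomputes the normalized total impurity reduction per feature from a fitted tree and compares it with `feature_importances_`. The helper name `impurity_importances` and the choice of the iris dataset are illustrative assumptions, not part of this PR; the sketch relies on the public arrays exposed by `tree_` and on `n_features_in_`, so it assumes a reasonably recent scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def impurity_importances(fitted_tree):
    """Recompute normalized impurity-based importances (hypothetical helper)."""
    t = fitted_tree.tree_
    importances = np.zeros(fitted_tree.n_features_in_)
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaf node: no split, so no impurity reduction
            continue
        # Sample-weighted impurity decrease produced by this split.
        decrease = (
            t.weighted_n_node_samples[node] * t.impurity[node]
            - t.weighted_n_node_samples[left] * t.impurity[left]
            - t.weighted_n_node_samples[right] * t.impurity[right]
        )
        importances[t.feature[node]] += decrease
    total = importances.sum()
    # Normalize so the importances sum to 1 (when any split exists).
    return importances / total if total > 0 else importances


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
manual = impurity_importances(clf)
print(np.allclose(manual, clf.feature_importances_))
```

With `criterion="gini"` the accumulated quantity is the Gini impurity reduction, which is why the result is called the Gini importance.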

And as @amueller mentioned, whenever possible we should change our examples and User Guide to use feature importance based on random permutation instead.
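
As a rough illustration of the difference between the two measures, the sketch below puts the impurity-based `feature_importances_` next to a permutation-based estimate computed on held-out data. It assumes scikit-learn 0.22 or later, where `sklearn.inspection.permutation_importance` is available (that release postdates this discussion); the dataset, model, and parameter values are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

# Impurity-based (Gini) importance: derived from the training data only.
gini = forest.feature_importances_

# Permutation importance: mean drop in test score when one feature is shuffled.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=10, random_state=0
)
perm = result.importances_mean

for i, (g, p) in enumerate(zip(gini, perm)):
    print(f"feature {i}: gini={g:.3f}  permutation={p:.3f}")
```

The impurity-based values always sum to 1 and never see the test set, whereas the permutation values are score differences on held-out data and can be zero or slightly negative for uninformative features.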

@amueller
Member

amueller commented Aug 5, 2019

Are you planning on working on this issue? The PR currently doesn't contain any relevant changes, so I suggest closing it, unless you're planning to add changes soon.