Skip to content

[MRG] Permutation importance support sample weight #16906

Merged
rth merged 8 commits intoscikit-learn:masterfrom
RoeiKa:permutation_importance_sample_weight
Aug 14, 2020
Merged

[MRG] Permutation importance support sample weight #16906
rth merged 8 commits intoscikit-learn:masterfrom
RoeiKa:permutation_importance_sample_weight

Conversation

@RoeiKa
Copy link
Copy Markdown
Contributor

@RoeiKa RoeiKa commented Apr 12, 2020

Reference Issues/PRs

None

What does this implement/fix? Explain your changes.

This PR adds sample weight support for the permutation_importance function.

Any other comments?

@RoeiKa RoeiKa changed the title Permutation importance support sample weight [MRG]Permutation importance support sample weight Apr 12, 2020
@RoeiKa RoeiKa changed the title [MRG]Permutation importance support sample weight [MRG] Permutation importance support sample weight Apr 12, 2020
@rth rth requested a review from thomasjpfan July 15, 2020 21:44
Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @RoeiKa !

Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an entry to the change log at doc/whats_new/v0.24.rst with tag ||. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.

@RoeiKa
Copy link
Copy Markdown
Contributor Author

RoeiKa commented Jul 17, 2020

An entry was added to doc/whats_new/v0.24.rst

else:
X_permuted[:, col_idx] = X_permuted[shuffling_idx, col_idx]
feature_score = scorer(estimator, X_permuted, y)
feature_score = scorer(estimator, X_permuted, y, sample_weight)
Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan Aug 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be a little more careful here. If a user passes a scorer that is a callable without sample_weight, this would break their code. What about:

Suggested change
feature_score = scorer(estimator, X_permuted, y, sample_weight)
if sample_weight is not None
feature_score = scorer(estimator, X_permuted, y,
sample_weight=sample_weight)
else:
feature_score = scorer(estimator, X_permuted, y)

And a small test to make sure passing a callable scorer with the signature:

def my_scorer(estimator, X, y):
	return 1

does not error.

Copy link
Copy Markdown
Contributor Author

@RoeiKa RoeiKa Aug 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. The only issue is that both permutation_importance and _calculate_permutation_scores use this call, so the suggested correction would have to be implemented in both functions.

An alternative solution is to create a new protected function that do this logic, for example:

        def _scorer_weights_robust(scorer, estimator, X, y, sample_weight):
            if sample_weight is not None:
                return scorer(estimator, X, y, sample_weight)
            return scorer(estimator, X, y)

then we can use it like this:

feature_score = _scorer_weights_robust(scorer, estimator, X_permuted, y, sample_weight)

What do you think?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay lets create this function to be called in both spots.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

We will need to wait for another review before we can merge this.

@rth
Copy link
Copy Markdown
Member

rth commented Aug 14, 2020

LGTM, thanks @RoeiKa !

@rth rth merged commit 2c0bd62 into scikit-learn:master Aug 14, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants