This repository was archived by the owner on Nov 27, 2023. It is now read-only.

Use sample_weight in the metric calculations in model validations #1

Merged
vecchp merged 8 commits into master from DAS-1145_sample_wt_validation
Feb 15, 2019

Conversation

@ryan-deak-zefr

Purpose

To correct metric calculations in model validation (and cross validation) by attempting to use sample_weight in the metric calculations when sample_weight is supplied in fit_params.

Reference Issues/PRs

Fixes scikit-learn#4632. See also scikit-learn#10806.

What does this implement/fix? Explain your changes.

This change always uses the same sample_weight used in training to weight the metric calculations in _fit_and_score and the functions it calls, so it should be thought of as changing the default behavior of things like cross validation. Unlike PR scikit-learn#10806, this change is isolated to _validation.py and doesn't touch _search.py. This is very intentional.

Any other comments?

This PR provides a simple test showing how unweighted metrics have very unfavorable consequences when models are trained with importance weights but not scored with the same weights. The test shows that the current omission of sample_weight from the metric calculations considerably overestimates the accuracy (0.5 unweighted vs. 9.99999e-07 weighted).
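The gap the test demonstrates is easy to reproduce by hand. The following sketch (illustrative data only, not the PR's actual test fixture) scores a constant prediction with and without weights, using values chosen so the weighted accuracy lands at the 9.99999e-07 figure quoted above:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Two samples: the constant-0 prediction is correct on the first, but
# nearly all of the importance weight sits on the second, which it misses.
y_true = np.array([0, 1])
y_pred = np.array([0, 0])
sample_weight = np.array([1.0, 1e6])

unweighted = accuracy_score(y_true, y_pred)
weighted = accuracy_score(y_true, y_pred, sample_weight=sample_weight)

print(unweighted)  # 0.5
print(weighted)    # 1 / 1000001, i.e. ~9.99999e-07
```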

@ghost ghost added the wip label Feb 14, 2019
@ghost ghost assigned ryan-deak-zefr Feb 14, 2019
Score returned by ``scorer`` applied to ``X`` and ``y`` given
``sample_weight``.
"""
if sample_weight is None or np.all(sample_weight == 1):
Author

This np.all(sample_weight == 1) check is here so that if sample_weight is all ones, scoring behaves the same as omitting it. This allows us to use scorers that don't support sample_weight.

if 'sample_weight' in str(e):
raise TypeError(
(
"Attempted to use 'sample_weight' for training "
Author

Repackages error with informative error message.


@zexuan-zhou zexuan-zhou left a comment


Love the error handling part that deals with some scorers not supporting sample_weight

else:
score = scorer(estimator, X, y)
else:
try:
Author

Is there any benefit to passing sample_weight as a keyword argument? E.g.

kwargs = {'sample_weight': sample_weight}

# ...

score = scorer(estimator, X, y, **kwargs)

The reason I ask is that the other PR that did importance weighting (https://github.com/scikit-learn/scikit-learn/pull/10806/files#diff-60033c11a662f460e1567effd5faa6f0R584) had this pattern, and I don't know whether that's important or not.

What does everyone think?


# Unnecessary extra test to illustrate that this is not the desired
# metric value (but previously what has been historically returned).
assert train_metric != np.sum(y[train]) / y[train].shape[0]
Author

This is unnecessary, but I like it for illustrative purposes.
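The assertion can be checked by hand with toy data (hypothetical values, not the PR's fixture): for a constant-1 predictor, the historical unweighted metric is exactly np.sum(y[train]) / y[train].shape[0], while the weighted metric can be arbitrarily far away:

```python
import numpy as np

y = np.array([0, 0, 1, 1])
train = np.arange(4)
pred = np.ones_like(y)                # constant-1 predictor
w = np.array([1.0, 1.0, 1e-6, 1e-6])  # weight sits on the zeros

# Historical unweighted value: the fraction of ones in the train labels.
unweighted = np.sum(y[train]) / y[train].shape[0]
# Weighted accuracy of the same predictions.
weighted = np.average(pred[train] == y[train], weights=w[train])

print(unweighted)  # 0.5
print(weighted)    # ~1e-06
```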


@vecchp vecchp left a comment


+1

@vecchp vecchp merged commit b364f40 into master Feb 15, 2019
@vecchp vecchp deleted the DAS-1145_sample_wt_validation branch February 15, 2019 19:03


Development

Successfully merging this pull request may close these issues.

Should cross-validation scoring take sample-weights into account?

3 participants