Closes DAS-1146: Sample weight metric calculations #3
Conversation
ryan-deak-zefr
left a comment
This works to fix some of the issues but there are presumably much deeper methodological issues that might make this PR moot.
return train_sample_weight, test_sample_weight

def _apply_scorer(estimator, X, y, scorer, sample_weight):
Same as in the sklearn PR for DAS-1145.
y_train,
scorer,
error_score,
test_sample_weight=None,
Ordering is consistent with the test-then-train parameter order established above.
scorer,
return_train_score,
):
if "sample_weight" in fit_params:
At the top since this is used in both branches.
#
# Each value in the fit_params dict is a 2-tuple where the
# data representation is in the second dimension (dim 1).
sample_weight = fit_params["sample_weight"][1]
Not sure about getting this from some cache or something.
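An illustrative sketch of the 2-tuple convention the comment above describes (the token name is made up; this is not dask_ml's actual graph internals):

```python
import numpy as np

# Each fit_params value is assumed to be a (token, data) 2-tuple,
# so the weights array itself sits at index 1.
fit_params = {
    "sample_weight": ("sample_weight-token-0", np.array([1.0, 2.0, 0.5]))
}

if "sample_weight" in fit_params:
    sample_weight = fit_params["sample_weight"][1]
else:
    sample_weight = None
```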
params=None,
fit_params=None,
return_train_score=True,
sample_weight=None,
Place at the end and make it None by default to make it "backward compatible".
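The backward-compatibility point can be shown with a toy function (hypothetical name, not the PR's code): adding the new argument last with a `None` default leaves existing call sites unchanged while enabling weighted behavior for new ones.

```python
import numpy as np

def weighted_accuracy(y_true, y_pred, sample_weight=None):
    # New trailing argument defaults to None, so existing
    # two-argument call sites keep their old unweighted behavior.
    correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
    if sample_weight is None:
        return correct.mean()
    w = np.asarray(sample_weight, dtype=float)
    return (correct * w).sum() / w.sum()

old_style = weighted_accuracy([0, 1, 1], [0, 1, 0])             # 2/3
new_style = weighted_accuracy([0, 1, 1], [0, 1, 0], [1, 1, 4])  # 2/6
```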
clf.fit(X, y)

@pytest.mark.parametrize(
This test kind of subsumes the test in the sklearn PR because it tests the situation with one fold metric calculation and multiple fold metric calculations that need to be combined. It does this by testing at the dcv.GridSearchCV level rather than at the level of functions in dask_ml/model_selection/methods.py.
),
],
)
def test_sample_weight_in_metrics(cv_ind, exp_acc):
NOTE: If this test were run against the current master of dask_ml, it would fail: the metric returned in every case is 0.5, which is definitely incorrect since it disregards sample_weight entirely.
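A minimal illustration (data made up) of how ignoring weights can pin the metric at 0.5 while the weighted value differs:

```python
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 1])
y_pred = np.array([0, 0])  # one right, one wrong
w = np.array([3.0, 1.0])

unweighted = accuracy_score(y_true, y_pred)                 # 0.5
weighted = accuracy_score(y_true, y_pred, sample_weight=w)  # 3/4 = 0.75
```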
----------
sample_weight : array-like
    Sample weights. Should contain all sample weights needed for
    training and testing. May be None.
Why is it sensical for sample_weight to be None in a function called _get_fold_sample_weights?
train_sample_weight = None
test_sample_weight = None
else:
    # "0" is the train split, "1" is the test split.
Is this a convention somewhere else?
from itertools import product
from multiprocessing import cpu_count

# sklearn.metrics.make_scorer
What is the point of these comments?
vecchp
left a comment
+1. Let's just make sure we issue the sklearn PR first before this goes out.
This PR has the same tests as the scikit-learn PR: ZEFR-INC/scikit-learn#2, so it should be working very similarly.
Summary
This PR addresses the fact that dask_ml (like sklearn) doesn't use importance weights when calculating metrics while scoring folds in cross validation. This affects all CV methods.
Changes
This PR only addresses the use of sample_weight in the metric calculations in cross fold validation. It doesn't address the issue of combining the metrics across folds, which is currently done by simple averaging rather than by weighted averaging using the sum of the sample weights for a fold as the weight.
Tests
Adds test_sample_weight_in_metrics in tests/model_selection/dask_searchcv/test_model_selection.py. This test is written to pass given the current way of combining metrics across folds (which is likely not the best / correct way to do it).
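The fold-combination issue left open here can be shown with two hypothetical folds: simple averaging and averaging weighted by each fold's total sample weight generally disagree (all numbers below are made up for illustration).

```python
import numpy as np

# Per-fold test metrics and each fold's total test sample weight
fold_scores = np.array([0.5, 1.0])
fold_weights = np.array([1.0, 3.0])

simple_mean = fold_scores.mean()  # (0.5 + 1.0) / 2 = 0.75
weighted_mean = (fold_scores * fold_weights).sum() / fold_weights.sum()
# (0.5 * 1 + 1.0 * 3) / 4 = 0.875
```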