sample weight support for robust regression via weighted percentile algo #10
33052ba to f07f8ad
maybe this could go in utils.stats.
What do you think of working with quantile instead of percentile?
@arjoly what exactly do you propose? Changing the name from percentile to quantile and using fractions instead of 0-100?
I agree that would be nicer -- I did it like this to be consistent with scipy.stats.mstats.scoreatpercentile
As you propose, I would rename the function.
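To make the percentile vs. quantile question concrete, here is a minimal sketch of a weighted quantile that takes a fraction `q` in [0, 1] instead of 0-100. The name `weighted_quantile` and its signature are hypothetical and are not taken from this PR; the rule used (return the smallest value whose cumulative weight reaches `q` times the total weight) is one common definition, assumed here for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_quantile(values, weights, q=0.5):
    """Smallest value whose cumulative weight reaches q * total weight.

    `q` is a fraction in [0, 1]; q=0.5 gives a weighted median.
    Hypothetical helper for illustration, not the function from this PR.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a fraction in [0, 1]")
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    # Drop samples with non-positive weight so they can never be returned.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("need at least one sample with positive weight")
    # Running total of the weights, in order of increasing value.
    cumulative = list(accumulate(w for _, w in pairs))
    target = q * cumulative[-1]
    # First position whose cumulative weight is >= target.
    idx = bisect_left(cumulative, target)
    return pairs[idx][0]


print(weighted_quantile([1, 2, 3, 4], [1, 1, 1, 1], q=0.5))
print(weighted_quantile([1, 2, 3, 4], [1, 1, 1, 10], q=0.5))
```

With integer weights this definition gives the same answer as repeating each sample `weight` times, which is a convenient property to check in tests.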
Do you have tests for this?
There is currently no test for the robust regression losses with sample weights, right?
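As a rough idea of what such a test could check: for the least-absolute-deviation loss, the constant prediction that minimises the weighted absolute error is a weighted median of the targets. The sketch below verifies this by brute force over the observed targets. The function names are hypothetical and the code is independent of the loss classes in this PR.

```python
def weighted_absolute_error(y_true, y_pred, sample_weight):
    """Weighted mean of |y_true - y_pred| (weights normalised by their sum)."""
    total = sum(sample_weight)
    return sum(
        w * abs(t - p) for t, p, w in zip(y_true, y_pred, sample_weight)
    ) / total


def best_constant(y_true, sample_weight):
    """Brute-force the constant prediction with the lowest weighted L1 loss.

    A weighted L1 loss always attains its minimum at one of the observed
    targets, so it is enough to try each distinct target as a candidate.
    Ties are resolved in favour of the smallest candidate.
    """
    n = len(y_true)
    return min(
        sorted(set(y_true)),
        key=lambda c: weighted_absolute_error(y_true, [c] * n, sample_weight),
    )


y = [1, 2, 3, 100]
print(best_constant(y, [1, 1, 1, 1]))   # uniform weights
print(best_constant(y, [1, 1, 1, 10]))  # the outlier carries most of the weight
```

With uniform weights the outlier at 100 is ignored, while giving it most of the weight moves the optimum onto it, which is the behaviour a sample-weighted robust loss should show.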
+1 for adding those features to the sample_weight PR, but this needs to be properly tested. It would be interesting at some point to evaluate the use of models that support sample_weight. A new example on covariate shift correction would be great, although probably not for the GB w/ sample_weight PR itself.
Added robust regression tests for Boston housing and some tests for weighted percentile.
@ogrisel a covariate shift example would indeed be great -- does anybody have a nice dataset for this? One could use a synthetic checkerboard dataset where one changes P(x) (i.e. the probability that you draw an example from one of the checkerboard cells).
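A minimal sketch of the checkerboard idea, assuming a 2x2 board on [0, 2) x [0, 2) with the label given by the cell colour. All names and the cell probabilities are made up for illustration. Because P(x) is known per cell, the importance weight of a training sample is simply P_test(cell) / P_train(cell), and those weights could then be passed as `sample_weight` when fitting.

```python
import random


def sample_checkerboard(n_samples, cell_probs, n_cells=2, seed=0):
    """Draw points from a checkerboard with a chosen probability per cell.

    `cell_probs` lists one probability per cell in row-major order.
    Returns the points, their labels (cell colour) and their cell indices.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n_cells) for j in range(n_cells)]
    chosen = rng.choices(range(len(cells)), weights=cell_probs, k=n_samples)
    X, y = [], []
    for c in chosen:
        i, j = cells[c]
        # Uniform position inside the selected cell.
        X.append((i + rng.random(), j + rng.random()))
        y.append((i + j) % 2)
    return X, y, chosen


def importance_weights(cell_ids, p_train, p_test):
    """Per-sample weight P_test(cell) / P_train(cell) for training samples."""
    return [p_test[c] / p_train[c] for c in cell_ids]


p_train = [0.25, 0.25, 0.25, 0.25]  # training set: uniform over cells
p_test = [0.4, 0.1, 0.1, 0.4]       # test set: P(x) shifted, P(y|x) unchanged
X_train, y_train, train_cells = sample_checkerboard(1000, p_train, seed=0)
X_test, y_test, _ = sample_checkerboard(1000, p_test, seed=1)
weights = importance_weights(train_cells, p_train, p_test)
```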
We could use one of the existing datasets and create an artificial train / test split that introduces a shift. For instance we could use the Boston dataset and use in the test set samples with higher tax rate (the TAX feature) with a higher likelihood than in the training set.
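One way such a split could be sketched: send each sample to the test set with a probability that grows linearly with the chosen feature. The helper `shifted_split`, the 0.1 / 0.9 probability bounds and the synthetic `tax` values below are all assumptions for illustration. No real dataset is loaded here.

```python
import random


def shifted_split(feature, low=0.1, high=0.9, seed=0):
    """Split sample indices so that high feature values favour the test set.

    The probability of landing in the test set rises linearly from `low`
    (at the smallest feature value) to `high` (at the largest one).
    """
    rng = random.Random(seed)
    lo, hi = min(feature), max(feature)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant feature
    train_idx, test_idx = [], []
    for i, v in enumerate(feature):
        p_test = low + (high - low) * (v - lo) / span
        if rng.random() < p_test:
            test_idx.append(i)
        else:
            train_idx.append(i)
    return train_idx, test_idx


# Synthetic stand-in for a tax-rate column (values are made up).
value_rng = random.Random(42)
tax = [value_rng.uniform(180, 720) for _ in range(2000)]
train_idx, test_idx = shifted_split(tax)

mean_train = sum(tax[i] for i in train_idx) / len(train_idx)
mean_test = sum(tax[i] for i in test_idx) / len(test_idx)
print(round(mean_train), round(mean_test))
```

The test set ends up with a visibly higher average feature value than the training set, while the relationship between features and target is left untouched, which is what a covariate shift example needs.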