Tweedie deviance loss for tree based models #16668
Description
Describe the workflow you want to enable
If the target y is (approximately) Poisson, Gamma or, more generally, Tweedie distributed, it would be beneficial for tree based regressors to support Tweedie deviance loss functions as a splitting criterion. This partially addresses #5975.
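For reference, a minimal sketch of the Tweedie unit deviance that such a splitting criterion would evaluate. The function names `tweedie_unit_deviance` and `mean_deviance` are hypothetical and not part of scikit-learn; the formulas are the standard ones for power 0 (normal), 1 (Poisson), 2 (Gamma) and the general case.

```python
import math


def tweedie_unit_deviance(y, mu, power):
    """Unit Tweedie deviance d(y, mu) for a single observation (sketch)."""
    if power == 0:  # normal distribution: squared error, any real mu
        return (y - mu) ** 2
    if mu <= 0:
        raise ValueError("mu must be strictly positive for power != 0")
    if power == 1:  # Poisson; y * log(y / mu) is taken as 0 when y == 0
        ylogy = y * math.log(y / mu) if y > 0 else 0.0
        return 2 * (ylogy - y + mu)
    if power == 2:  # Gamma; requires y > 0
        return 2 * (math.log(mu / y) + y / mu - 1)
    # general case, e.g. compound Poisson-Gamma for 1 < power < 2
    return 2 * (
        max(y, 0) ** (2 - power) / ((1 - power) * (2 - power))
        - y * mu ** (1 - power) / (1 - power)
        + mu ** (2 - power) / (2 - power)
    )


def mean_deviance(ys, mu, power):
    """Node impurity: mean deviance of the targets against one node value."""
    return sum(tweedie_unit_deviance(y, mu, power) for y in ys) / len(ys)
```

A candidate split would then be scored by comparing the parent's `mean_deviance` (with `mu` set to the node mean) against the weighted impurities of the two children.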
Describe your proposed solution
Ideally, one first implements
- differentiable loss functions (see "A common private module for differentiable loss functions used as objective functions in estimators", #15123)
and then adds the different loss criteria to the tree based models:
- DecisionTreeRegressor (poisson only): [MRG] ENH add Poisson splitting criterion for single trees #17386
- RandomForestRegressor (poisson only): ENH Adds Poisson criterion in RandomForestRegressor #19304, #19836
- GradientBoostingRegressor
- HistGradientBoostingRegressor (poisson and gamma but no other tweedie cases): ENH Poisson loss for HistGradientBoostingRegressor #16692
Open for Discussion
For Poisson and Tweedie deviance with 1<=power<2, the target y may be zero while the prediction y_pred must be strictly larger than zero. A tree might find a split where one node has y=0 for all samples in that node, resulting naively in y_pred = mean(y) = 0 for that node. I see 3 different solutions to that:
- Use a log-link function, i.e. predict y_pred = np.exp(tree). See ENH Poisson loss for HistGradientBoostingRegressor #16692 for HistGradientBoostingRegressor. This may not be an option for DecisionTreeRegressor.
- Use a splitting rule that forbids splits where one node has sum(y)=0. One might also introduce some option like min_y_weight, such that splits with sum(sample_weight*y) < min_y_weight are forbidden.
- Use some form of parent child average y_pred = a * mean(y) + (1-a) * y_pred_parent and forbid further splits, see [1]. (Bayes/credibility theory motivates setting a = sum(sample_weight*y)/(gamma+sum(sample_weight*y)) for some hyperparameter gamma.)
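The second and third options can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `split_allowed` and `shrunk_prediction` are hypothetical helper names, `mean(y)` is read as the sample-weighted mean, and `gamma` is assumed to be strictly positive.

```python
def split_allowed(y_left, w_left, y_right, w_right, min_y_weight=0.0):
    """Forbid splits where a child has sum(sample_weight * y) equal to zero
    or below the (hypothetical) threshold min_y_weight."""
    s_left = sum(w * y for w, y in zip(w_left, y_left))
    s_right = sum(w * y for w, y in zip(w_right, y_right))
    return all(s > 0 and s >= min_y_weight for s in (s_left, s_right))


def shrunk_prediction(y, w, y_pred_parent, gamma):
    """Parent child average with credibility weight a = S / (gamma + S),
    where S = sum(sample_weight * y). Assumes gamma > 0."""
    s = sum(wi * yi for wi, yi in zip(w, y))
    mean_y = s / sum(w)  # weighted mean of the targets in the node
    a = s / (gamma + s)
    return a * mean_y + (1 - a) * y_pred_parent


# A node containing only zeros falls back to the parent's prediction.
print(shrunk_prediction([0.0, 0.0], [1.0, 1.0], y_pred_parent=0.5, gamma=1.0))
```

With this weighting, a node with S = 0 gets a = 0, so its prediction equals the (strictly positive) parent prediction rather than zero.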
There is also a dirty solution that allows y_pred=0 but uses the value max(eps, y_pred) in the loss function for some tiny value of eps.
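A minimal sketch of this workaround for the Poisson case, reading the clamp as a lower bound on the prediction (i.e. `max(eps, y_pred)`), since the goal is to keep it strictly positive. The function name `poisson_deviance_clipped` and the default `eps` are assumptions made for illustration.

```python
import math


def poisson_deviance_clipped(y, y_pred, eps=1e-10):
    """Poisson unit deviance with the prediction clamped from below by eps."""
    mu = max(eps, y_pred)  # keeps log(y / mu) finite when y_pred == 0
    ylogy = y * math.log(y / mu) if y > 0 else 0.0
    return 2 * (ylogy - y + mu)


# Finite, but large and dependent on the choice of eps.
print(poisson_deviance_clipped(1.0, 0.0))
```

The loss stays finite for y > 0 and y_pred = 0, but its magnitude is driven by eps, which is part of what makes this approach unattractive.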
References
[1] R rpart library, chapter 8 Poisson regression