ENH P/R/F should be able to ignore a majority class in the multiclass case #1983
Description
P/R/F are well known for handling class imbalance in the binary classification case. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case. In particular, the documentation currently states that micro-averaged P = R = F, but this no longer holds once a negative class is ignored, and it should be possible to ignore a negative class under any of the average settings.
Indeed, I think the pos_label argument is a mistake (except that a default value can be provided more reliably for it than for neg_label): it only applies to the binary case and overrides the average setting, whereas neg_label would apply to all multiclass averaging methods.
It should be easy to implement: treat the problem as multilabel and delete the neg_label column from the label indicator matrix. That is, it becomes the case where each instance is assigned either zero or one label.
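The idea can be sketched in plain Python as below. This is only an illustration of the column-deletion approach, not scikit-learn code: the function name `prf_ignoring` and its `neg_label` parameter are hypothetical, and only micro and macro averaging are covered.

```python
def indicator_matrix(y, labels):
    """Build a 0/1 matrix with one row per sample and one column per label."""
    return [[int(value == label) for label in labels] for value in y]


def prf_ignoring(y_true, y_pred, neg_label=None, average="micro"):
    """Hypothetical helper: P/R/F with the neg_label column deleted.

    Not part of scikit-learn; a sketch of the proposal only.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_rows = indicator_matrix(y_true, labels)
    pred_rows = indicator_matrix(y_pred, labels)

    # Delete the neg_label column, so a sample labelled neg_label
    # becomes an all-zero row (an instance with no label).
    keep = [j for j, label in enumerate(labels) if label != neg_label]

    # Per-label counts over the remaining columns.
    tp = [sum(t[j] * p[j] for t, p in zip(true_rows, pred_rows)) for j in keep]
    n_pred = [sum(p[j] for p in pred_rows) for j in keep]
    n_true = [sum(t[j] for t in true_rows) for j in keep]

    if average == "micro":
        # Pool the counts across labels before taking ratios.
        tp, n_pred, n_true = [sum(tp)], [sum(n_pred)], [sum(n_true)]

    def ratio(num, den):
        return num / den if den else 0.0

    precision = [ratio(a, b) for a, b in zip(tp, n_pred)]
    recall = [ratio(a, b) for a, b in zip(tp, n_true)]
    f_score = [ratio(2 * p * r, p + r) for p, r in zip(precision, recall)]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    # With micro averaging each list has one entry; with macro
    # averaging this is the unweighted mean over the kept labels.
    return mean(precision), mean(recall), mean(f_score)


y_true = ["O", "O", "O", "A", "B", "A"]
y_pred = ["O", "A", "A", "A", "O", "B"]

print(prf_ignoring(y_true, y_pred))                 # all classes counted
print(prf_ignoring(y_true, y_pred, neg_label="O"))  # "O" ignored
```

On this toy data, micro-averaged precision and recall are both 1/3 when every class is counted, but become 1/4 and 1/3 respectively once "O" is ignored, which is the asymmetry described above.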
The tricky part is the interface: should pos_label be deprecated? Deprecation makes sense, since pos_label and neg_label should never both be needed at once. But if so, how do we ensure the binary case still works by default?