flip_y in make_classification is misleading #14088
Closed
Description
The documentation describes flip_y as "The fraction of samples whose class are randomly exchanged."
So with two classes, one would expect that setting flip_y to 0.1 flips (exchanges) 10% of the labels, as the name flip_y suggests. However, the source code instead assigns random labels to 10% of the samples. About half of those random labels match the original ones, so only about 5% of the labels end up flipped.
This doesn't seem like a big issue at first, but we have seen many people confused by flip_y in a Kaggle competition: https://www.kaggle.com/c/instant-gratification.
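
The arithmetic above can be checked with a small simulation. This is a minimal sketch that only mimics the behavior described in this issue (select each sample with probability flip_y, then assign it a uniformly random class); it is not scikit-learn's code, and the helper name `relabel_fraction` is hypothetical.

```python
import random


def relabel_fraction(labels, flip_y, n_classes, seed=0):
    """Hypothetical helper: pick each sample with probability flip_y and
    overwrite its label with a class drawn uniformly at random."""
    rng = random.Random(seed)
    noisy = list(labels)
    for i in range(len(noisy)):
        if rng.random() < flip_y:
            # The new label may equal the old one, so not every
            # selected sample actually changes class.
            noisy[i] = rng.randrange(n_classes)
    return noisy


n_samples, n_classes, flip_y = 100_000, 2, 0.1

# Balanced clean labels: 0, 1, 0, 1, ...
clean = [i % n_classes for i in range(n_samples)]
noisy = relabel_fraction(clean, flip_y, n_classes)

# Fraction of samples whose label really differs from the original.
changed = sum(c != n for c, n in zip(clean, noisy)) / n_samples
print(f"requested flip_y = {flip_y}, labels actually changed = {changed:.3f}")
```

Under this assumption the expected fraction of changed labels is flip_y * (1 - 1/n_classes), which is about half of flip_y for two classes and approaches flip_y as the number of classes grows.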