Discretizer #5778
Closed
Description
Binarizer transforms continuous values to two states (0 or 1). It would be nice to generalize this to an arbitrary number of states K.
This preprocessor would produce a scipy sparse matrix of shape (n_samples, K * n_features) using the one-of-K encoding. The K bins could be delimited by thresholds spaced uniformly between the min and max of each feature, or placed at the K-quantiles.
For example, using uniformly chosen thresholds, if min=0, max=1.0 and K=3, a feature value between 0 and 0.33 would be encoded as [1, 0, 0], a value between 0.33 and 0.66 as [0, 1, 0] and a value between 0.66 and 1.0 as [0, 0, 1].
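To make the proposal concrete, here is a minimal sketch of the encoding described above, written with NumPy and SciPy only. The function name `discretize_one_of_k` and its `strategy` parameter are hypothetical, chosen for illustration; this is not an existing scikit-learn API. It also assumes that a value falling exactly on a threshold goes to the upper bin, which the description above leaves open.

```python
import numpy as np
from scipy import sparse


def discretize_one_of_k(X, K, strategy="uniform"):
    """Hypothetical helper: one-of-K encode each feature of X into K bins.

    Returns a CSR matrix of shape (n_samples, K * n_features).
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    rows = np.arange(n_samples)
    blocks = []
    for j in range(n_features):
        col = X[:, j]
        if strategy == "uniform":
            # K + 1 edges spaced evenly between the feature's min and max.
            edges = np.linspace(col.min(), col.max(), K + 1)
        elif strategy == "quantile":
            # K + 1 edges placed at evenly spaced quantiles of the feature.
            edges = np.quantile(col, np.linspace(0.0, 1.0, K + 1))
        else:
            raise ValueError("strategy must be 'uniform' or 'quantile'")
        # Only the K - 1 interior edges act as thresholds, so every value
        # (including the min and max) lands in a bin index in [0, K - 1].
        bins = np.searchsorted(edges[1:-1], col, side="right")
        # One nonzero per row: a 1 in the column of the selected bin.
        block = sparse.csr_matrix(
            (np.ones(n_samples), (rows, bins)), shape=(n_samples, K)
        )
        blocks.append(block)
    # Concatenate the per-feature blocks column-wise.
    return sparse.hstack(blocks, format="csr")


# The example from the description: min=0, max=1.0, K=3.
X = np.array([[0.0], [0.1], [0.5], [0.9], [1.0]])
encoded = discretize_one_of_k(X, K=3)
print(encoded.toarray())
# [[1. 0. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 1.]]
```

A real transformer would learn the edges in `fit` and reuse them in `transform`, so that unseen data is binned with the training thresholds; the sketch computes them on the fly for brevity.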
My use case is that this encoding might be more meaningful than continuous values when using PolynomialFeatures.
Possibly related to #1062.