UnaryEncoder #8628
Closed
Labels
Moderate (Anything that requires some knowledge of conventions and best practices), New Feature, module:preprocessing
Description
I'm sure we've discussed this before, but I'm not sure where, and there does not appear to be an active PR. For ordinal (and discretized; see #7668) features, a "unary" encoding (is there a better name for this?) is more informative than a one-hot encoding. For an ordinal feature x taking k values 0, ..., k - 1, this creates k - 1 binary features such that the ith is active if x > i (for i = 0, ..., k - 2). Below is an initial implementation.
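import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_array


class UnaryEncoder(BaseEstimator, TransformerMixin):
    def __init__(self, n_values):
        self.n_values = n_values

    def fit(self, X, y=None):
        # Nothing to learn: n_values is supplied by the user.
        return self

    def transform(self, X):
        # Thresholds 0, ..., n_values - 2 give n_values - 1 outputs per input feature.
        values = np.arange(self.n_values - 1)
        X = check_array(X)
        # For each input column, output i is active where the value exceeds i.
        Xt = np.hstack([values < X[:, i, None] for i in range(X.shape[1])])
        # Cast the boolean mask to integers so the result is a 0/1 array.
        return Xt.astype(int)

>>> UnaryEncoder(3).fit_transform([[0], [1], [2]])
array([[0, 0],
       [1, 0],
       [1, 1]])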
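To illustrate why a unary encoding might be considered more informative than one-hot for ordinal data, here is a minimal NumPy-only sketch. The helpers `unary_encode` and `one_hot_encode` are hypothetical names introduced for this comparison and are not part of scikit-learn; the sketch assumes a single 1-D feature with integer values in 0, ..., k - 1.

```python
import numpy as np


def unary_encode(x, k):
    """Hypothetical helper: unary-encode 1-D ordinal values in 0..k-1."""
    x = np.asarray(x)
    thresholds = np.arange(k - 1)  # 0, ..., k - 2
    # Output i is 1 where x > i, giving k - 1 columns.
    return (x[:, None] > thresholds).astype(int)


def one_hot_encode(x, k):
    """Hypothetical helper: one-hot encode the same values for comparison."""
    x = np.asarray(x)
    # Output j is 1 where x == j, giving k columns.
    return (x[:, None] == np.arange(k)).astype(int)


x = np.array([0, 1, 2, 3])
unary = unary_encode(x, k=4)
one_hot = one_hot_encode(x, k=4)

# Hamming distance from the encoding of 0 to every other encoding.
unary_dist = np.abs(unary - unary[0]).sum(axis=1)
one_hot_dist = np.abs(one_hot - one_hot[0]).sum(axis=1)

print(unary_dist)    # [0 1 2 3]
print(one_hot_dist)  # [0 2 2 2]
```

Under these assumptions, the distance between two unary codes equals the difference between the original ordinal values, whereas any two distinct one-hot codes are equally far apart, so the ordering is lost.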